2024-03-24 13:55:52

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 00/11] net/smc: SMC intra-OS shortcut with loopback-ism

This patch set acts as the second part of the new version of [1] (The first
part can be referred from [2]), the updated things of this version are listed
at the end.

- Background

SMC-D is now used in IBM z with ISM function to optimize network interconnect
for intra-CPC communications. Inspired by this, we try to make SMC-D available
on the non-s390 architecture through a software-implemented Emulated-ISM device,
that is the loopback-ism device here, to accelerate inter-process or
inter-containers communication within the same OS instance.

- Design

This patch set includes 3 parts:

- Patch #1: some prepare work for loopback-ism.
- Patch #2-#7: implement loopback-ism device. Noted that loopback-ism now
serves only SMC and no userspace interface exposed.
- Patch #8-#11: memory copy optimization for intra-OS scenario.

The loopback-ism device is designed as an ISMv2 device and not be limited to
a specific net namespace, ends of both inter-process connection (1/1' in diagram
below) or inter-container connection (2/2' in diagram below) can find the same
available loopback-ism and choose it during the CLC handshake.

Container 1 (ns1) Container 2 (ns2)
+-----------------------------------------+ +-------------------------+
| +-------+ +-------+ +-------+ | | +-------+ |
| | App A | | App B | | App C | | | | App D |<-+ |
| +-------+ +---^---+ +-------+ | | +-------+ |(2') |
| |127.0.0.1 (1')| |192.168.0.11 192.168.0.12| |
| (1)| +--------+ | +--------+ |(2) | | +--------+ +--------+ |
| `-->| lo |-` | eth0 |<-` | | | lo | | eth0 | |
+---------+--|---^-+---+-----|--+---------+ +-+--------+---+-^------+-+
| | | |
Kernel | | | |
+----+-------v---+-----------v----------------------------------+---+----+
| | TCP | |
| | | |
| +--------------------------------------------------------------+ |
| |
| +--------------+ |
| | smc loopback | |
+---------------------------+--------------+-----------------------------+

loopback-ism device creates DMBs (shared memory) for each connection peer.
Since data transfer occurs within the same kernel, the sndbuf of each peer
is only a descriptor and point to the same memory region as peer DMB, so that
the data copy from sndbuf to peer DMB can be avoided in loopback-ism case.

Container 1 (ns1) Container 2 (ns2)
+-----------------------------------------+ +-------------------------+
| +-------+ | | +-------+ |
| | App C |-----+ | | | App D | |
| +-------+ | | | +-^-----+ |
| | | | | |
| (2) | | | (2') | |
| | | | | |
+---------------|-------------------------+ +----------|--------------+
| |
Kernel | |
+---------------|-----------------------------------------|--------------+
| +--------+ +--v-----+ +--------+ +--------+ |
| |dmb_desc| |snd_desc| |dmb_desc| |snd_desc| |
| +-----|--+ +--|-----+ +-----|--+ +--------+ |
| +-----|--+ | +-----|--+ |
| | DMB C | +---------------------------------| DMB D | |
| +--------+ +--------+ |
| |
| +--------------+ |
| | smc loopback | |
+---------------------------+--------------+-----------------------------+

- Benchmark Test

* Test environments:
- VM with Intel Xeon Platinum 8 core 2.50GHz, 16 GiB mem.
- SMC sndbuf/DMB size 1MB.

* Test object:
- TCP: run on TCP loopback.
- SMC lo: run on SMC loopback-ism.

1. ipc-benchmark (see [3])

- ./<foo> -c 1000000 -s 100

TCP SMC-lo
Message
rate (msg/s) 81908 143128(+74.74%)

2. sockperf

- serv: <smc_run> taskset -c <cpu> sockperf sr --tcp
- clnt: <smc_run> taskset -c <cpu> sockperf { tp | pp } --tcp --msg-size={ 64000 for tp | 14 for pp } -i 127.0.0.1 -t 30

TCP SMC-lo
Bandwidth(MBps) 5082.40 8134.22(+60.05%)
Latency(us) 5.956 3.308(-44.46%)

3. nginx/wrk

- serv: <smc_run> nginx
- clnt: <smc_run> wrk -t 8 -c 1000 -d 30 http://127.0.0.1:80

TCP SMC-lo
Requests/s 190113.20 248735.41(+30.83%)

4. redis-benchmark

- serv: <smc_run> redis-server
- clnt: <smc_run> redis-benchmark -h 127.0.0.1 -q -t set,get -n 400000 -c 200 -d 1024

TCP SMC-lo
GET(Requests/s) 89505.48 117577.90(+31.36%)
SET(Requests/s) 89847.26 120336.95(+33.94%)


Change log:

RFC v5->RFC v4:
- Patch #2: minor changes in description of config SMC_LO and comments.
- Patch #10: minor changes in comments and if(smc_ism_support_dmb_nocopy())
check in smcd_cdc_msg_send().
- Patch #3: change smc_lo_generate_id() to smc_lo_generate_ids() and SMC_LO_CHID
to SMC_LO_RESERVED_CHID.
- Patch #5: memcpy while holding the ldev->dmb_ht_lock.
- Some expression changes in commit logs.

RFC v4->v3:
Link: https://lore.kernel.org/netdev/[email protected]/
- The merge window of v6.9 is open, so post this series as an RFC.
- Patch #6: since some information fed back by smc_nl_handle_smcd_dev() dose
not apply to Emulated-ISM (including loopback-ism here), loopback-ism is
not exposed through smc netlink for the time being. we may refactor this
part when smc netlink interface is updated.

v3->v2:
Link: https://lore.kernel.org/netdev/[email protected]/
- Patch #11: use tasklet_schedule(&conn->rx_tsklet) instead of smcd_cdc_rx_handler()
to avoid possible recursive locking of conn->send_lock and use {read|write}_lock_bh()
to acquire dmb_ht_lock.

v2->v1:
Link: https://lore.kernel.org/netdev/[email protected]/
- All the patches: changed the term virtual-ISM to Emulated-ISM as defined by SMCv2.1.
- Patch #3: optimized the description of SMC_LO config. Avoid exposing loopback-ism
to sysfs and remove all the knobs until future definition clear.
- Patch #3: try to make lockdep happy by using read_lock_bh() in smc_lo_move_data().
- Patch #6: defaultly use physical contiguous DMB buffers.
- Patch #11: defaultly enable DMB no-copy for loopback-ism and free the DMB in
unregister_dmb or detach_dmb when dmb_node->refcnt reaches 0, instead of using
wait_event to keep waiting in unregister_dmb.

v1->RFC:
Link: https://lore.kernel.org/netdev/[email protected]/
- Patch #9: merge rx_bytes and tx_bytes as xfer_bytes statistics:
/sys/devices/virtual/smc/loopback-ism/xfer_bytes
- Patch #10: add support_dmb_nocopy operation to check if SMC-D device supports
merging sndbuf with peer DMB.
- Patch #13 & #14: introduce loopback-ism device control of DMB memory type and
control of whether to merge sndbuf and DMB. They can be respectively set by:
/sys/devices/virtual/smc/loopback-ism/dmb_type
/sys/devices/virtual/smc/loopback-ism/dmb_copy
The motivation for these two control is that a performance bottleneck was
found when using vzalloced DMB and sndbuf is merged with DMB, and there are
many CPUs and CONFIG_HARDENED_USERCOPY is set [4]. The bottleneck is caused
by the lock contention in vmap_area_lock [5] which is involved in memcpy_from_msg()
or memcpy_to_msg(). Currently, Uladzislau Rezki is working on mitigating the
vmap lock contention [6]. It has significant effects, but using virtual memory
still has additional overhead compared to using physical memory.
So this new version provides controls of dmb_type and dmb_copy to suit
different scenarios.
- Some minor changes and comments improvements.

RFC->old version([1]):
Link: https://lore.kernel.org/netdev/[email protected]/
- Patch #1: improve the loopback-ism dump, it shows as follows now:
# smcd d
FID Type PCI-ID PCHID InUse #LGs PNET-ID
0000 0 loopback-ism ffff No 0
- Patch #3: introduce the smc_ism_set_v2_capable() helper and set
smc_ism_v2_capable when ISMv2 or virtual ISM is registered,
regardless of whether there is already a device in smcd device list.
- Patch #3: loopback-ism will be added into /sys/devices/virtual/smc/loopback-ism/.
- Patch #8: introduce the runtime switch /sys/devices/virtual/smc/loopback-ism/active
to activate or deactivate the loopback-ism.
- Patch #9: introduce the statistics of loopback-ism by
/sys/devices/virtual/smc/loopback-ism/{{tx|rx}_tytes|dmbs_cnt}.
- Some minor changes and comments improvements.

[1] https://lore.kernel.org/netdev/[email protected]/
[2] https://lore.kernel.org/netdev/[email protected]/
[3] https://github.com/goldsborough/ipc-bench
[4] https://lore.kernel.org/all/[email protected]/
[5] https://lore.kernel.org/all/[email protected]/
[6] https://lore.kernel.org/all/[email protected]/

Wen Gu (11):
net/smc: decouple ism_client from SMC-D DMB registration
net/smc: introduce loopback-ism for SMC intra-OS shortcut
net/smc: implement ID-related operations of loopback-ism
net/smc: implement some unsupported operations of loopback-ism
net/smc: implement DMB-related operations of loopback-ism
net/smc: ignore loopback-ism when dumping SMC-D devices
net/smc: register loopback-ism into SMC-D device list
net/smc: add operations to merge sndbuf with peer DMB
net/smc: {at|de}tach sndbuf to peer DMB if supported
net/smc: adapt cursor update when sndbuf and peer DMB are merged
net/smc: implement DMB-merged operations of loopback-ism

drivers/s390/net/ism_drv.c | 2 +-
include/net/smc.h | 7 +-
net/smc/Kconfig | 13 ++
net/smc/Makefile | 2 +-
net/smc/af_smc.c | 28 ++-
net/smc/smc_cdc.c | 34 ++-
net/smc/smc_core.c | 61 ++++-
net/smc/smc_core.h | 1 +
net/smc/smc_ism.c | 60 ++++-
net/smc/smc_ism.h | 10 +
net/smc/smc_loopback.c | 461 +++++++++++++++++++++++++++++++++++++
net/smc/smc_loopback.h | 52 +++++
12 files changed, 715 insertions(+), 16 deletions(-)
create mode 100644 net/smc/smc_loopback.c
create mode 100644 net/smc/smc_loopback.h

--
2.32.0.3.g01195cf9f



2024-03-24 13:56:07

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 01/11] net/smc: decouple ism_client from SMC-D DMB registration

The struct 'ism_client' is specialized for s390 platform firmware ISM.
So replace it with 'void' to make SMCD DMB registration helper generic
for both Emulated-ISM and existing ISM.

Signed-off-by: Wen Gu <[email protected]>
---
drivers/s390/net/ism_drv.c | 2 +-
include/net/smc.h | 4 ++--
net/smc/smc_ism.c | 7 ++-----
3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c
index 2c8e964425dc..9b2a52913e76 100644
--- a/drivers/s390/net/ism_drv.c
+++ b/drivers/s390/net/ism_drv.c
@@ -726,7 +726,7 @@ static int smcd_query_rgid(struct smcd_dev *smcd, struct smcd_gid *rgid,
}

static int smcd_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb,
- struct ism_client *client)
+ void *client)
{
return ism_register_dmb(smcd->priv, (struct ism_dmb *)dmb, client);
}
diff --git a/include/net/smc.h b/include/net/smc.h
index c9dcb30e3fd9..6273c3a8b24a 100644
--- a/include/net/smc.h
+++ b/include/net/smc.h
@@ -50,7 +50,6 @@ struct smcd_dmb {
#define ISM_ERROR 0xFFFF

struct smcd_dev;
-struct ism_client;

struct smcd_gid {
u64 gid;
@@ -61,7 +60,7 @@ struct smcd_ops {
int (*query_remote_gid)(struct smcd_dev *dev, struct smcd_gid *rgid,
u32 vid_valid, u32 vid);
int (*register_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb,
- struct ism_client *client);
+ void *client);
int (*unregister_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb);
int (*add_vlan_id)(struct smcd_dev *dev, u64 vlan_id);
int (*del_vlan_id)(struct smcd_dev *dev, u64 vlan_id);
@@ -81,6 +80,7 @@ struct smcd_ops {
struct smcd_dev {
const struct smcd_ops *ops;
void *priv;
+ void *client;
struct list_head list;
spinlock_t lock;
struct smc_connection **conn;
diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c
index ac88de2a06a0..051726586730 100644
--- a/net/smc/smc_ism.c
+++ b/net/smc/smc_ism.c
@@ -222,7 +222,6 @@ int smc_ism_unregister_dmb(struct smcd_dev *smcd, struct smc_buf_desc *dmb_desc)
int smc_ism_register_dmb(struct smc_link_group *lgr, int dmb_len,
struct smc_buf_desc *dmb_desc)
{
-#if IS_ENABLED(CONFIG_ISM)
struct smcd_dmb dmb;
int rc;

@@ -231,7 +230,7 @@ int smc_ism_register_dmb(struct smc_link_group *lgr, int dmb_len,
dmb.sba_idx = dmb_desc->sba_idx;
dmb.vlan_id = lgr->vlan_id;
dmb.rgid = lgr->peer_gid.gid;
- rc = lgr->smcd->ops->register_dmb(lgr->smcd, &dmb, &smc_ism_client);
+ rc = lgr->smcd->ops->register_dmb(lgr->smcd, &dmb, lgr->smcd->client);
if (!rc) {
dmb_desc->sba_idx = dmb.sba_idx;
dmb_desc->token = dmb.dmb_tok;
@@ -240,9 +239,6 @@ int smc_ism_register_dmb(struct smc_link_group *lgr, int dmb_len,
dmb_desc->len = dmb.dmb_len;
}
return rc;
-#else
- return 0;
-#endif
}

static int smc_nl_handle_smcd_dev(struct smcd_dev *smcd,
@@ -446,6 +442,7 @@ static void smcd_register_dev(struct ism_dev *ism)
if (!smcd)
return;
smcd->priv = ism;
+ smcd->client = &smc_ism_client;
ism_set_priv(ism, &smc_ism_client, smcd);
if (smc_pnetid_by_dev_port(&ism->pdev->dev, 0, smcd->pnetid))
smc_pnetid_by_table_smcd(smcd);
--
2.32.0.3.g01195cf9f


2024-03-24 13:56:36

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 02/11] net/smc: introduce loopback-ism for SMC intra-OS shortcut

This introduces a kind of Emulated-ISM device named loopback-ism for
SMCv2.1. The loopback-ism device is currently exclusive for SMC usage,
and aims to provide an SMC shortcut for sockets within the same kernel,
leading to improved intra-OS traffic performance. Configuration of this
feature is managed through the config SMC_LO.

Signed-off-by: Wen Gu <[email protected]>
---
net/smc/Kconfig | 13 ++++
net/smc/Makefile | 2 +-
net/smc/af_smc.c | 12 ++-
net/smc/smc_loopback.c | 164 +++++++++++++++++++++++++++++++++++++++++
net/smc/smc_loopback.h | 33 +++++++++
5 files changed, 222 insertions(+), 2 deletions(-)
create mode 100644 net/smc/smc_loopback.c
create mode 100644 net/smc/smc_loopback.h

diff --git a/net/smc/Kconfig b/net/smc/Kconfig
index 746be3996768..ba5e6a2dd2fd 100644
--- a/net/smc/Kconfig
+++ b/net/smc/Kconfig
@@ -20,3 +20,16 @@ config SMC_DIAG
smcss.

if unsure, say Y.
+
+config SMC_LO
+ bool "SMC intra-OS shortcut with loopback-ism"
+ depends on SMC
+ default n
+ help
+ SMC_LO enables the creation of an Emulated-ISM device named
+ loopback-ism in SMC and makes use of it for transferring data
+ when communication occurs within the same OS. This helps in
+ convenient testing of SMC-D since loopback-ism is independent
+ of architecture or hardware.
+
+ if unsure, say N.
diff --git a/net/smc/Makefile b/net/smc/Makefile
index 875efcd126a2..a8c37111abe1 100644
--- a/net/smc/Makefile
+++ b/net/smc/Makefile
@@ -4,5 +4,5 @@ obj-$(CONFIG_SMC) += smc.o
obj-$(CONFIG_SMC_DIAG) += smc_diag.o
smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o smc_llc.o
smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o smc_netlink.o smc_stats.o
-smc-y += smc_tracepoint.o
+smc-y += smc_tracepoint.o smc_loopback.o
smc-$(CONFIG_SYSCTL) += smc_sysctl.o
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 4b52b3b159c0..751a8fdd6eb0 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -53,6 +53,7 @@
#include "smc_stats.h"
#include "smc_tracepoint.h"
#include "smc_sysctl.h"
+#include "smc_loopback.h"

static DEFINE_MUTEX(smc_server_lgr_pending); /* serialize link group
* creation on server
@@ -3557,15 +3558,23 @@ static int __init smc_init(void)
goto out_sock;
}

+ rc = smc_loopback_init();
+ if (rc) {
+ pr_err("%s: smc_loopback_init fails with %d\n", __func__, rc);
+ goto out_ib;
+ }
+
rc = tcp_register_ulp(&smc_ulp_ops);
if (rc) {
pr_err("%s: tcp_ulp_register fails with %d\n", __func__, rc);
- goto out_ib;
+ goto out_lo;
}

static_branch_enable(&tcp_have_smc);
return 0;

+out_lo:
+ smc_loopback_exit();
out_ib:
smc_ib_unregister_client();
out_sock:
@@ -3603,6 +3612,7 @@ static void __exit smc_exit(void)
tcp_unregister_ulp(&smc_ulp_ops);
sock_unregister(PF_SMC);
smc_core_exit();
+ smc_loopback_exit();
smc_ib_unregister_client();
smc_ism_exit();
destroy_workqueue(smc_close_wq);
diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
new file mode 100644
index 000000000000..3990f689ae79
--- /dev/null
+++ b/net/smc/smc_loopback.c
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Shared Memory Communications Direct over loopback-ism device.
+ *
+ * Functions for loopback-ism device.
+ *
+ * Copyright (c) 2024, Alibaba Inc.
+ *
+ * Author: Wen Gu <[email protected]>
+ * Tony Lu <[email protected]>
+ *
+ */
+
+#include <linux/device.h>
+#include <linux/types.h>
+#include <net/smc.h>
+
+#include "smc_ism.h"
+#include "smc_loopback.h"
+
+#if IS_ENABLED(CONFIG_SMC_LO)
+static const char smc_lo_dev_name[] = "loopback-ism";
+static struct smc_lo_dev *lo_dev;
+
+static const struct smcd_ops lo_ops = {
+ .query_remote_gid = NULL,
+ .register_dmb = NULL,
+ .unregister_dmb = NULL,
+ .add_vlan_id = NULL,
+ .del_vlan_id = NULL,
+ .set_vlan_required = NULL,
+ .reset_vlan_required = NULL,
+ .signal_event = NULL,
+ .move_data = NULL,
+ .supports_v2 = NULL,
+ .get_local_gid = NULL,
+ .get_chid = NULL,
+ .get_dev = NULL,
+};
+
+static struct smcd_dev *smcd_lo_alloc_dev(const struct smcd_ops *ops,
+ int max_dmbs)
+{
+ struct smcd_dev *smcd;
+
+ smcd = kzalloc(sizeof(*smcd), GFP_KERNEL);
+ if (!smcd)
+ return NULL;
+
+ smcd->conn = kcalloc(max_dmbs, sizeof(struct smc_connection *),
+ GFP_KERNEL);
+ if (!smcd->conn)
+ goto out_smcd;
+
+ smcd->ops = ops;
+
+ spin_lock_init(&smcd->lock);
+ spin_lock_init(&smcd->lgr_lock);
+ INIT_LIST_HEAD(&smcd->vlan);
+ INIT_LIST_HEAD(&smcd->lgr_list);
+ init_waitqueue_head(&smcd->lgrs_deleted);
+ return smcd;
+
+out_smcd:
+ kfree(smcd);
+ return NULL;
+}
+
+static int smcd_lo_register_dev(struct smc_lo_dev *ldev)
+{
+ struct smcd_dev *smcd;
+
+ smcd = smcd_lo_alloc_dev(&lo_ops, SMC_LO_MAX_DMBS);
+ if (!smcd)
+ return -ENOMEM;
+ ldev->smcd = smcd;
+ smcd->priv = ldev;
+
+ /* TODO:
+ * register loopback-ism to smcd_dev list.
+ */
+ return 0;
+}
+
+static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev)
+{
+ struct smcd_dev *smcd = ldev->smcd;
+
+ /* TODO:
+ * unregister loopback-ism from smcd_dev list.
+ */
+ kfree(smcd->conn);
+ kfree(smcd);
+}
+
+static int smc_lo_dev_init(struct smc_lo_dev *ldev)
+{
+ return smcd_lo_register_dev(ldev);
+}
+
+static void smc_lo_dev_exit(struct smc_lo_dev *ldev)
+{
+ smcd_lo_unregister_dev(ldev);
+}
+
+static void smc_lo_dev_release(struct device *dev)
+{
+ struct smc_lo_dev *ldev =
+ container_of(dev, struct smc_lo_dev, dev);
+
+ kfree(ldev);
+}
+
+static int smc_lo_dev_probe(void)
+{
+ struct smc_lo_dev *ldev;
+ int ret;
+
+ ldev = kzalloc(sizeof(*ldev), GFP_KERNEL);
+ if (!ldev)
+ return -ENOMEM;
+
+ ldev->dev.parent = NULL;
+ ldev->dev.release = smc_lo_dev_release;
+ device_initialize(&ldev->dev);
+ dev_set_name(&ldev->dev, smc_lo_dev_name);
+
+ ret = smc_lo_dev_init(ldev);
+ if (ret)
+ goto free_dev;
+
+ lo_dev = ldev; /* global loopback device */
+ return 0;
+
+free_dev:
+ put_device(&ldev->dev);
+ return ret;
+}
+
+static void smc_lo_dev_remove(void)
+{
+ if (!lo_dev)
+ return;
+
+ smc_lo_dev_exit(lo_dev);
+ put_device(&lo_dev->dev); /* device_initialize in smc_lo_dev_probe */
+}
+#endif
+
+int smc_loopback_init(void)
+{
+#if IS_ENABLED(CONFIG_SMC_LO)
+ return smc_lo_dev_probe();
+#else
+ return 0;
+#endif
+}
+
+void smc_loopback_exit(void)
+{
+#if IS_ENABLED(CONFIG_SMC_LO)
+ smc_lo_dev_remove();
+#endif
+}
diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h
new file mode 100644
index 000000000000..77980650b2bd
--- /dev/null
+++ b/net/smc/smc_loopback.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Shared Memory Communications Direct over loopback-ism device.
+ *
+ * SMC-D loopback-ism device structure definitions.
+ *
+ * Copyright (c) 2024, Alibaba Inc.
+ *
+ * Author: Wen Gu <[email protected]>
+ * Tony Lu <[email protected]>
+ *
+ */
+
+#ifndef _SMC_LOOPBACK_H
+#define _SMC_LOOPBACK_H
+
+#include <linux/device.h>
+#include <linux/err.h>
+#include <net/smc.h>
+
+#if IS_ENABLED(CONFIG_SMC_LO)
+#define SMC_LO_MAX_DMBS 5000
+
+struct smc_lo_dev {
+ struct smcd_dev *smcd;
+ struct device dev;
+};
+#endif
+
+int smc_loopback_init(void);
+void smc_loopback_exit(void);
+
+#endif /* _SMC_LOOPBACK_H */
--
2.32.0.3.g01195cf9f


2024-03-24 13:57:10

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism

This implements some operations that loopback-ism does not support
currently:

- vlan operations, since there is no strong use-case for it.
- signal_event operations, since there is no event to be processed by
the loopback-ism device.

Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_loopback.c | 36 +++++++++++++++++++++++++++++++-----
1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
index 3eb623e030eb..4b5e864ebca3 100644
--- a/net/smc/smc_loopback.c
+++ b/net/smc/smc_loopback.c
@@ -49,6 +49,32 @@ static int smc_lo_query_rgid(struct smcd_dev *smcd, struct smcd_gid *rgid,
return 0;
}

+static int smc_lo_add_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
+{
+ return -EOPNOTSUPP;
+}
+
+static int smc_lo_del_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
+{
+ return -EOPNOTSUPP;
+}
+
+static int smc_lo_set_vlan_required(struct smcd_dev *smcd)
+{
+ return -EOPNOTSUPP;
+}
+
+static int smc_lo_reset_vlan_required(struct smcd_dev *smcd)
+{
+ return -EOPNOTSUPP;
+}
+
+static int smc_lo_signal_event(struct smcd_dev *dev, struct smcd_gid *rgid,
+ u32 trigger_irq, u32 event_code, u64 info)
+{
+ return 0;
+}
+
static int smc_lo_supports_v2(void)
{
return SMC_LO_V2_CAPABLE;
@@ -77,11 +103,11 @@ static const struct smcd_ops lo_ops = {
.query_remote_gid = smc_lo_query_rgid,
.register_dmb = NULL,
.unregister_dmb = NULL,
- .add_vlan_id = NULL,
- .del_vlan_id = NULL,
- .set_vlan_required = NULL,
- .reset_vlan_required = NULL,
- .signal_event = NULL,
+ .add_vlan_id = smc_lo_add_vlan_id,
+ .del_vlan_id = smc_lo_del_vlan_id,
+ .set_vlan_required = smc_lo_set_vlan_required,
+ .reset_vlan_required = smc_lo_reset_vlan_required,
+ .signal_event = smc_lo_signal_event,
.move_data = NULL,
.supports_v2 = smc_lo_supports_v2,
.get_local_gid = smc_lo_get_local_gid,
--
2.32.0.3.g01195cf9f


2024-03-24 13:57:19

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 03/11] net/smc: implement ID-related operations of loopback-ism

This implements operations related to IDs for the loopback-ism device.
loopback-ism uses an Extended GID that is a 128-bit GID instead of the
existing ISM 64-bit GID, and uses the CHID defined with the reserved
value 0xFFFF.

Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_loopback.c | 62 ++++++++++++++++++++++++++++++++++++++----
net/smc/smc_loopback.h | 3 ++
2 files changed, 60 insertions(+), 5 deletions(-)

diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
index 3990f689ae79..3eb623e030eb 100644
--- a/net/smc/smc_loopback.c
+++ b/net/smc/smc_loopback.c
@@ -19,11 +19,62 @@
#include "smc_loopback.h"

#if IS_ENABLED(CONFIG_SMC_LO)
+#define SMC_LO_V2_CAPABLE 0x1 /* loopback-ism acts as ISMv2 */
+
static const char smc_lo_dev_name[] = "loopback-ism";
static struct smc_lo_dev *lo_dev;

+static void smc_lo_generate_ids(struct smc_lo_dev *ldev)
+{
+ struct smcd_gid *lgid = &ldev->local_gid;
+ uuid_t uuid;
+
+ uuid_gen(&uuid);
+ memcpy(&lgid->gid, &uuid, sizeof(lgid->gid));
+ memcpy(&lgid->gid_ext, (u8 *)&uuid + sizeof(lgid->gid),
+ sizeof(lgid->gid_ext));
+
+ ldev->chid = SMC_LO_RESERVED_CHID;
+}
+
+static int smc_lo_query_rgid(struct smcd_dev *smcd, struct smcd_gid *rgid,
+ u32 vid_valid, u32 vid)
+{
+ struct smc_lo_dev *ldev = smcd->priv;
+
+ /* rgid should be the same as lgid */
+ if (!ldev || rgid->gid != ldev->local_gid.gid ||
+ rgid->gid_ext != ldev->local_gid.gid_ext)
+ return -ENETUNREACH;
+ return 0;
+}
+
+static int smc_lo_supports_v2(void)
+{
+ return SMC_LO_V2_CAPABLE;
+}
+
+static void smc_lo_get_local_gid(struct smcd_dev *smcd,
+ struct smcd_gid *smcd_gid)
+{
+ struct smc_lo_dev *ldev = smcd->priv;
+
+ smcd_gid->gid = ldev->local_gid.gid;
+ smcd_gid->gid_ext = ldev->local_gid.gid_ext;
+}
+
+static u16 smc_lo_get_chid(struct smcd_dev *smcd)
+{
+ return ((struct smc_lo_dev *)smcd->priv)->chid;
+}
+
+static struct device *smc_lo_get_dev(struct smcd_dev *smcd)
+{
+ return &((struct smc_lo_dev *)smcd->priv)->dev;
+}
+
static const struct smcd_ops lo_ops = {
- .query_remote_gid = NULL,
+ .query_remote_gid = smc_lo_query_rgid,
.register_dmb = NULL,
.unregister_dmb = NULL,
.add_vlan_id = NULL,
@@ -32,10 +83,10 @@ static const struct smcd_ops lo_ops = {
.reset_vlan_required = NULL,
.signal_event = NULL,
.move_data = NULL,
- .supports_v2 = NULL,
- .get_local_gid = NULL,
- .get_chid = NULL,
- .get_dev = NULL,
+ .supports_v2 = smc_lo_supports_v2,
+ .get_local_gid = smc_lo_get_local_gid,
+ .get_chid = smc_lo_get_chid,
+ .get_dev = smc_lo_get_dev,
};

static struct smcd_dev *smcd_lo_alloc_dev(const struct smcd_ops *ops,
@@ -95,6 +146,7 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev)

static int smc_lo_dev_init(struct smc_lo_dev *ldev)
{
+ smc_lo_generate_ids(ldev);
return smcd_lo_register_dev(ldev);
}

diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h
index 77980650b2bd..11868e5ac732 100644
--- a/net/smc/smc_loopback.h
+++ b/net/smc/smc_loopback.h
@@ -20,10 +20,13 @@

#if IS_ENABLED(CONFIG_SMC_LO)
#define SMC_LO_MAX_DMBS 5000
+#define SMC_LO_RESERVED_CHID 0xFFFF

struct smc_lo_dev {
struct smcd_dev *smcd;
struct device dev;
+ u16 chid;
+ struct smcd_gid local_gid;
};
#endif

--
2.32.0.3.g01195cf9f


2024-03-24 13:57:57

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 05/11] net/smc: implement DMB-related operations of loopback-ism

This implements DMB (un)registration and data move operations of
loopback-ism device.

Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_loopback.c | 129 ++++++++++++++++++++++++++++++++++++++++-
net/smc/smc_loopback.h | 13 +++++
2 files changed, 139 insertions(+), 3 deletions(-)

diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
index 4b5e864ebca3..8ee02cfd7fb4 100644
--- a/net/smc/smc_loopback.c
+++ b/net/smc/smc_loopback.c
@@ -15,11 +15,13 @@
#include <linux/types.h>
#include <net/smc.h>

+#include "smc_cdc.h"
#include "smc_ism.h"
#include "smc_loopback.h"

#if IS_ENABLED(CONFIG_SMC_LO)
#define SMC_LO_V2_CAPABLE 0x1 /* loopback-ism acts as ISMv2 */
+#define SMC_DMA_ADDR_INVALID (~(dma_addr_t)0)

static const char smc_lo_dev_name[] = "loopback-ism";
static struct smc_lo_dev *lo_dev;
@@ -49,6 +51,93 @@ static int smc_lo_query_rgid(struct smcd_dev *smcd, struct smcd_gid *rgid,
return 0;
}

+static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb,
+ void *client_priv)
+{
+ struct smc_lo_dmb_node *dmb_node, *tmp_node;
+ struct smc_lo_dev *ldev = smcd->priv;
+ int sba_idx, rc;
+
+ /* check space for new dmb */
+ for_each_clear_bit(sba_idx, ldev->sba_idx_mask, SMC_LO_MAX_DMBS) {
+ if (!test_and_set_bit(sba_idx, ldev->sba_idx_mask))
+ break;
+ }
+ if (sba_idx == SMC_LO_MAX_DMBS)
+ return -ENOSPC;
+
+ dmb_node = kzalloc(sizeof(*dmb_node), GFP_KERNEL);
+ if (!dmb_node) {
+ rc = -ENOMEM;
+ goto err_bit;
+ }
+
+ dmb_node->sba_idx = sba_idx;
+ dmb_node->len = dmb->dmb_len;
+ dmb_node->cpu_addr = kzalloc(dmb_node->len, GFP_KERNEL |
+ __GFP_NOWARN | __GFP_NORETRY |
+ __GFP_NOMEMALLOC);
+ if (!dmb_node->cpu_addr) {
+ rc = -ENOMEM;
+ goto err_node;
+ }
+ dmb_node->dma_addr = SMC_DMA_ADDR_INVALID;
+
+again:
+ /* add new dmb into hash table */
+ get_random_bytes(&dmb_node->token, sizeof(dmb_node->token));
+ write_lock_bh(&ldev->dmb_ht_lock);
+ hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb_node->token) {
+ if (tmp_node->token == dmb_node->token) {
+ write_unlock_bh(&ldev->dmb_ht_lock);
+ goto again;
+ }
+ }
+ hash_add(ldev->dmb_ht, &dmb_node->list, dmb_node->token);
+ write_unlock_bh(&ldev->dmb_ht_lock);
+
+ dmb->sba_idx = dmb_node->sba_idx;
+ dmb->dmb_tok = dmb_node->token;
+ dmb->cpu_addr = dmb_node->cpu_addr;
+ dmb->dma_addr = dmb_node->dma_addr;
+ dmb->dmb_len = dmb_node->len;
+
+ return 0;
+
+err_node:
+ kfree(dmb_node);
+err_bit:
+ clear_bit(sba_idx, ldev->sba_idx_mask);
+ return rc;
+}
+
+static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb)
+{
+ struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node;
+ struct smc_lo_dev *ldev = smcd->priv;
+
+ /* remove dmb from hash table */
+ write_lock_bh(&ldev->dmb_ht_lock);
+ hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb->dmb_tok) {
+ if (tmp_node->token == dmb->dmb_tok) {
+ dmb_node = tmp_node;
+ break;
+ }
+ }
+ if (!dmb_node) {
+ write_unlock_bh(&ldev->dmb_ht_lock);
+ return -EINVAL;
+ }
+ hash_del(&dmb_node->list);
+ write_unlock_bh(&ldev->dmb_ht_lock);
+
+ clear_bit(dmb_node->sba_idx, ldev->sba_idx_mask);
+ kfree(dmb_node->cpu_addr);
+ kfree(dmb_node);
+
+ return 0;
+}
+
static int smc_lo_add_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
{
return -EOPNOTSUPP;
@@ -75,6 +164,38 @@ static int smc_lo_signal_event(struct smcd_dev *dev, struct smcd_gid *rgid,
return 0;
}

+static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok,
+ unsigned int idx, bool sf, unsigned int offset,
+ void *data, unsigned int size)
+{
+ struct smc_lo_dmb_node *rmb_node = NULL, *tmp_node;
+ struct smc_lo_dev *ldev = smcd->priv;
+ struct smc_connection *conn;
+
+ read_lock_bh(&ldev->dmb_ht_lock);
+ hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb_tok) {
+ if (tmp_node->token == dmb_tok) {
+ rmb_node = tmp_node;
+ break;
+ }
+ }
+ if (!rmb_node) {
+ read_unlock_bh(&ldev->dmb_ht_lock);
+ return -EINVAL;
+ }
+ memcpy((char *)rmb_node->cpu_addr + offset, data, size);
+ read_unlock_bh(&ldev->dmb_ht_lock);
+
+ if (sf) {
+ conn = smcd->conn[rmb_node->sba_idx];
+ if (conn && !conn->killed)
+ tasklet_schedule(&conn->rx_tsklet);
+ else
+ return -EPIPE;
+ }
+ return 0;
+}
+
static int smc_lo_supports_v2(void)
{
return SMC_LO_V2_CAPABLE;
@@ -101,14 +222,14 @@ static struct device *smc_lo_get_dev(struct smcd_dev *smcd)

static const struct smcd_ops lo_ops = {
.query_remote_gid = smc_lo_query_rgid,
- .register_dmb = NULL,
- .unregister_dmb = NULL,
+ .register_dmb = smc_lo_register_dmb,
+ .unregister_dmb = smc_lo_unregister_dmb,
.add_vlan_id = smc_lo_add_vlan_id,
.del_vlan_id = smc_lo_del_vlan_id,
.set_vlan_required = smc_lo_set_vlan_required,
.reset_vlan_required = smc_lo_reset_vlan_required,
.signal_event = smc_lo_signal_event,
- .move_data = NULL,
+ .move_data = smc_lo_move_data,
.supports_v2 = smc_lo_supports_v2,
.get_local_gid = smc_lo_get_local_gid,
.get_chid = smc_lo_get_chid,
@@ -173,6 +294,8 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev)
static int smc_lo_dev_init(struct smc_lo_dev *ldev)
{
smc_lo_generate_ids(ldev);
+ rwlock_init(&ldev->dmb_ht_lock);
+ hash_init(ldev->dmb_ht);
return smcd_lo_register_dev(ldev);
}

diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h
index 11868e5ac732..6c4a390430f3 100644
--- a/net/smc/smc_loopback.h
+++ b/net/smc/smc_loopback.h
@@ -20,13 +20,26 @@

#if IS_ENABLED(CONFIG_SMC_LO)
#define SMC_LO_MAX_DMBS 5000
+#define SMC_LO_DMBS_HASH_BITS 12
#define SMC_LO_RESERVED_CHID 0xFFFF

+struct smc_lo_dmb_node {
+ struct hlist_node list;
+ u64 token;
+ u32 len;
+ u32 sba_idx;
+ void *cpu_addr;
+ dma_addr_t dma_addr;
+};
+
struct smc_lo_dev {
struct smcd_dev *smcd;
struct device dev;
u16 chid;
struct smcd_gid local_gid;
+ rwlock_t dmb_ht_lock;
+ DECLARE_BITMAP(sba_idx_mask, SMC_LO_MAX_DMBS);
+ DECLARE_HASHTABLE(dmb_ht, SMC_LO_DMBS_HASH_BITS);
};
#endif

--
2.32.0.3.g01195cf9f


2024-03-24 13:58:07

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 06/11] net/smc: ignore loopback-ism when dumping SMC-D devices

Since loopback-ism is not a PCI device, the PCI information fed back by
smc_nl_handle_smcd_dev() does not apply to loopback-ism. So currently
ignore loopback-ism when dumping SMC-D devices. The netlink function of
loopback-ism will be refactored when SMC netlink interface is updated.

Link: https://lore.kernel.org/r/[email protected]/
Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_ism.c | 2 ++
net/smc/smc_ism.h | 5 +++++
2 files changed, 7 insertions(+)

diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c
index 051726586730..97f6ae4f9b23 100644
--- a/net/smc/smc_ism.c
+++ b/net/smc/smc_ism.c
@@ -318,6 +318,8 @@ static void smc_nl_prep_smcd_dev(struct smcd_dev_list *dev_list,
list_for_each_entry(smcd, &dev_list->list, list) {
if (num < snum)
goto next;
+ if (smc_ism_is_loopback(smcd))
+ goto next;
if (smc_nl_handle_smcd_dev(smcd, skb, cb))
goto errout;
next:
diff --git a/net/smc/smc_ism.h b/net/smc/smc_ism.h
index 165cd013404b..322973527c61 100644
--- a/net/smc/smc_ism.h
+++ b/net/smc/smc_ism.h
@@ -84,4 +84,9 @@ static inline bool smc_ism_is_emulated(struct smcd_dev *smcd)
return __smc_ism_is_emulated(chid);
}

+static inline bool smc_ism_is_loopback(struct smcd_dev *smcd)
+{
+ return (smcd->ops->get_chid(smcd) == 0xFFFF);
+}
+
#endif
--
2.32.0.3.g01195cf9f


2024-03-24 13:58:20

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 09/11] net/smc: {at|de}tach sndbuf to peer DMB if supported

If the device used by SMC-D supports merging local sndbuf to peer DMB,
then create sndbuf descriptor and attach it to peer DMB once peer
token is obtained, and detach and free the sndbuf descriptor when the
connection is freed.

Signed-off-by: Wen Gu <[email protected]>
---
net/smc/af_smc.c | 16 ++++++++++++
net/smc/smc_core.c | 61 +++++++++++++++++++++++++++++++++++++++++++++-
net/smc/smc_core.h | 1 +
3 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 751a8fdd6eb0..fce7a5b2ce5c 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -1438,6 +1438,14 @@ static int smc_connect_ism(struct smc_sock *smc,
}

smc_conn_save_peer_info(smc, aclc);
+
+ if (smc_ism_support_dmb_nocopy(smc->conn.lgr->smcd)) {
+ rc = smcd_buf_attach(smc);
+ if (rc) {
+ rc = SMC_CLC_DECL_MEM; /* try to fallback */
+ goto connect_abort;
+ }
+ }
smc_close_init(smc);
smc_rx_init(smc);
smc_tx_init(smc);
@@ -2542,6 +2550,14 @@ static void smc_listen_work(struct work_struct *work)
mutex_unlock(&smc_server_lgr_pending);
}
smc_conn_save_peer_info(new_smc, cclc);
+
+ if (ini->is_smcd &&
+ smc_ism_support_dmb_nocopy(new_smc->conn.lgr->smcd)) {
+ rc = smcd_buf_attach(new_smc);
+ if (rc)
+ goto out_decl;
+ }
+
smc_listen_out_connected(new_smc);
SMC_STAT_SERV_SUCC_INC(sock_net(newclcsock->sk), ini);
goto out_free;
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 9b84d5897aa5..fafdb97adfad 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -1149,6 +1149,20 @@ static void smcr_buf_unuse(struct smc_buf_desc *buf_desc, bool is_rmb,
}
}

+static void smcd_buf_detach(struct smc_connection *conn)
+{
+ struct smcd_dev *smcd = conn->lgr->smcd;
+ u64 peer_token = conn->peer_token;
+
+ if (!conn->sndbuf_desc)
+ return;
+
+ smc_ism_detach_dmb(smcd, peer_token);
+
+ kfree(conn->sndbuf_desc);
+ conn->sndbuf_desc = NULL;
+}
+
static void smc_buf_unuse(struct smc_connection *conn,
struct smc_link_group *lgr)
{
@@ -1192,6 +1206,8 @@ void smc_conn_free(struct smc_connection *conn)
if (lgr->is_smcd) {
if (!list_empty(&lgr->list))
smc_ism_unset_conn(conn);
+ if (smc_ism_support_dmb_nocopy(lgr->smcd))
+ smcd_buf_detach(conn);
tasklet_kill(&conn->rx_tsklet);
} else {
smc_cdc_wait_pend_tx_wr(conn);
@@ -1445,6 +1461,8 @@ static void smc_conn_kill(struct smc_connection *conn, bool soft)
smc_sk_wake_ups(smc);
if (conn->lgr->is_smcd) {
smc_ism_unset_conn(conn);
+ if (smc_ism_support_dmb_nocopy(conn->lgr->smcd))
+ smcd_buf_detach(conn);
if (soft)
tasklet_kill(&conn->rx_tsklet);
else
@@ -2464,12 +2482,18 @@ int smc_buf_create(struct smc_sock *smc, bool is_smcd)
int rc;

/* create send buffer */
+ if (is_smcd &&
+ smc_ism_support_dmb_nocopy(smc->conn.lgr->smcd))
+ goto create_rmb;
+
rc = __smc_buf_create(smc, is_smcd, false);
if (rc)
return rc;
+
+create_rmb:
/* create rmb */
rc = __smc_buf_create(smc, is_smcd, true);
- if (rc) {
+ if (rc && smc->conn.sndbuf_desc) {
down_write(&smc->conn.lgr->sndbufs_lock);
list_del(&smc->conn.sndbuf_desc->list);
up_write(&smc->conn.lgr->sndbufs_lock);
@@ -2479,6 +2503,41 @@ int smc_buf_create(struct smc_sock *smc, bool is_smcd)
return rc;
}

+int smcd_buf_attach(struct smc_sock *smc)
+{
+ struct smc_connection *conn = &smc->conn;
+ struct smcd_dev *smcd = conn->lgr->smcd;
+ u64 peer_token = conn->peer_token;
+ struct smc_buf_desc *buf_desc;
+ int rc;
+
+ buf_desc = kzalloc(sizeof(*buf_desc), GFP_KERNEL);
+ if (!buf_desc)
+ return -ENOMEM;
+
+ /* The ghost sndbuf_desc describes the same memory region as
+ * peer RMB. Its lifecycle is consistent with the connection's
+ * and it will be freed with the connections instead of the
+ * link group.
+ */
+ rc = smc_ism_attach_dmb(smcd, peer_token, buf_desc);
+ if (rc)
+ goto free;
+
+ smc->sk.sk_sndbuf = buf_desc->len;
+ buf_desc->cpu_addr =
+ (u8 *)buf_desc->cpu_addr + sizeof(struct smcd_cdc_msg);
+ buf_desc->len -= sizeof(struct smcd_cdc_msg);
+ conn->sndbuf_desc = buf_desc;
+ conn->sndbuf_desc->used = 1;
+ atomic_set(&conn->sndbuf_space, conn->sndbuf_desc->len);
+ return 0;
+
+free:
+ kfree(buf_desc);
+ return rc;
+}
+
static inline int smc_rmb_reserve_rtoken_idx(struct smc_link_group *lgr)
{
int i;
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 1f175376037b..d93cf51dbd7c 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -557,6 +557,7 @@ void smc_smcd_terminate(struct smcd_dev *dev, struct smcd_gid *peer_gid,
void smc_smcd_terminate_all(struct smcd_dev *dev);
void smc_smcr_terminate_all(struct smc_ib_device *smcibdev);
int smc_buf_create(struct smc_sock *smc, bool is_smcd);
+int smcd_buf_attach(struct smc_sock *smc);
int smc_uncompress_bufsize(u8 compressed);
int smc_rmb_rtoken_handling(struct smc_connection *conn, struct smc_link *link,
struct smc_clc_msg_accept_confirm *clc);
--
2.32.0.3.g01195cf9f


2024-03-24 13:59:00

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 08/11] net/smc: add operations to merge sndbuf with peer DMB

In some scenarios using Emulated-ISM device, sndbuf can share the same
physical memory region with peer DMB to avoid data copy from one side
to the other. In such case the sndbuf is only a descriptor that
describes the shared memory and does not actually occupy memory, it's
more like a ghost buffer.

+----------+ +----------+
| socket A | | socket B |
+----------+ +----------+
| |
+--------+ +--------+
| sndbuf | | DMB |
| desc | | desc |
+--------+ +--------+
| |
| +----v-----+
+--------------------------> memory |
+----------+

So here introduces three new SMC-D device operations to check if this
feature is supported by device, and to {attach|detach} ghost sndbuf to
peer DMB. For now only loopback-ism supports this.

Signed-off-by: Wen Gu <[email protected]>
---
include/net/smc.h | 3 +++
net/smc/smc_ism.c | 40 ++++++++++++++++++++++++++++++++++++++++
net/smc/smc_ism.h | 4 ++++
3 files changed, 47 insertions(+)

diff --git a/include/net/smc.h b/include/net/smc.h
index 6273c3a8b24a..01387631d8a6 100644
--- a/include/net/smc.h
+++ b/include/net/smc.h
@@ -62,6 +62,9 @@ struct smcd_ops {
int (*register_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb,
void *client);
int (*unregister_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb);
+ int (*support_dmb_nocopy)(struct smcd_dev *dev);
+ int (*attach_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb);
+ int (*detach_dmb)(struct smcd_dev *dev, u64 token);
int (*add_vlan_id)(struct smcd_dev *dev, u64 vlan_id);
int (*del_vlan_id)(struct smcd_dev *dev, u64 vlan_id);
int (*set_vlan_required)(struct smcd_dev *dev);
diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c
index 38469ff53346..182748927681 100644
--- a/net/smc/smc_ism.c
+++ b/net/smc/smc_ism.c
@@ -246,6 +246,46 @@ int smc_ism_register_dmb(struct smc_link_group *lgr, int dmb_len,
return rc;
}

+bool smc_ism_support_dmb_nocopy(struct smcd_dev *smcd)
+{
+ /* for now only loopback-ism supports
+ * merging sndbuf with peer DMB to avoid
+ * data copies between them.
+ */
+ return (smcd->ops->support_dmb_nocopy &&
+ smcd->ops->support_dmb_nocopy(smcd));
+}
+
+int smc_ism_attach_dmb(struct smcd_dev *dev, u64 token,
+ struct smc_buf_desc *dmb_desc)
+{
+ struct smcd_dmb dmb;
+ int rc = 0;
+
+ if (!dev->ops->attach_dmb)
+ return -EINVAL;
+
+ memset(&dmb, 0, sizeof(dmb));
+ dmb.dmb_tok = token;
+ rc = dev->ops->attach_dmb(dev, &dmb);
+ if (!rc) {
+ dmb_desc->sba_idx = dmb.sba_idx;
+ dmb_desc->token = dmb.dmb_tok;
+ dmb_desc->cpu_addr = dmb.cpu_addr;
+ dmb_desc->dma_addr = dmb.dma_addr;
+ dmb_desc->len = dmb.dmb_len;
+ }
+ return rc;
+}
+
+int smc_ism_detach_dmb(struct smcd_dev *dev, u64 token)
+{
+ if (!dev->ops->detach_dmb)
+ return -EINVAL;
+
+ return dev->ops->detach_dmb(dev, token);
+}
+
static int smc_nl_handle_smcd_dev(struct smcd_dev *smcd,
struct sk_buff *skb,
struct netlink_callback *cb)
diff --git a/net/smc/smc_ism.h b/net/smc/smc_ism.h
index e6f57e5e1ef9..6763133dd8d0 100644
--- a/net/smc/smc_ism.h
+++ b/net/smc/smc_ism.h
@@ -48,6 +48,10 @@ int smc_ism_put_vlan(struct smcd_dev *dev, unsigned short vlan_id);
int smc_ism_register_dmb(struct smc_link_group *lgr, int buf_size,
struct smc_buf_desc *dmb_desc);
int smc_ism_unregister_dmb(struct smcd_dev *dev, struct smc_buf_desc *dmb_desc);
+bool smc_ism_support_dmb_nocopy(struct smcd_dev *smcd);
+int smc_ism_attach_dmb(struct smcd_dev *dev, u64 token,
+ struct smc_buf_desc *dmb_desc);
+int smc_ism_detach_dmb(struct smcd_dev *dev, u64 token);
int smc_ism_signal_shutdown(struct smc_link_group *lgr);
void smc_ism_get_system_eid(u8 **eid);
u16 smc_ism_get_chid(struct smcd_dev *dev);
--
2.32.0.3.g01195cf9f


2024-03-24 13:59:23

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 10/11] net/smc: adapt cursor update when sndbuf and peer DMB are merged

If the local sndbuf shares the same physical memory with peer DMB,
the cursor update processing needs to be adapted to ensure that the
data to be consumed won't be overwritten.

So in this case, the fin_curs and sndbuf_space that were originally
updated after sending the CDC message should be modified to not be
update until the peer updates cons_curs.

Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_cdc.c | 34 +++++++++++++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index 3c06625ceb20..5e95347ae497 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -18,6 +18,7 @@
#include "smc_tx.h"
#include "smc_rx.h"
#include "smc_close.h"
+#include "smc_ism.h"

/********************************** send *************************************/

@@ -255,6 +256,14 @@ int smcd_cdc_msg_send(struct smc_connection *conn)
return rc;
smc_curs_copy(&conn->rx_curs_confirmed, &curs, conn);
conn->local_rx_ctrl.prod_flags.cons_curs_upd_req = 0;
+
+ if (smc_ism_support_dmb_nocopy(conn->lgr->smcd))
+ /* if local sndbuf shares the same memory region with
+ * peer DMB, then don't update the tx_curs_fin
+ * and sndbuf_space until peer has consumed the data.
+ */
+ return rc;
+
/* Calculate transmitted data and increment free send buffer space */
diff = smc_curs_diff(conn->sndbuf_desc->len, &conn->tx_curs_fin,
&conn->tx_curs_sent);
@@ -323,7 +332,7 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc,
{
union smc_host_cursor cons_old, prod_old;
struct smc_connection *conn = &smc->conn;
- int diff_cons, diff_prod;
+ int diff_cons, diff_prod, diff_tx;

smc_curs_copy(&prod_old, &conn->local_rx_ctrl.prod, conn);
smc_curs_copy(&cons_old, &conn->local_rx_ctrl.cons, conn);
@@ -339,6 +348,29 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc,
atomic_add(diff_cons, &conn->peer_rmbe_space);
/* guarantee 0 <= peer_rmbe_space <= peer_rmbe_size */
smp_mb__after_atomic();
+
+ /* if local sndbuf shares the same memory region with
+ * peer RMB, then update tx_curs_fin and sndbuf_space
+ * here since peer has already consumed the data.
+ */
+ if (conn->lgr->is_smcd &&
+ smc_ism_support_dmb_nocopy(conn->lgr->smcd)) {
+ /* Calculate consumed data and
+ * increment free send buffer space.
+ */
+ diff_tx = smc_curs_diff(conn->sndbuf_desc->len,
+ &conn->tx_curs_fin,
+ &conn->local_rx_ctrl.cons);
+ /* increase local sndbuf space and fin_curs */
+ smp_mb__before_atomic();
+ atomic_add(diff_tx, &conn->sndbuf_space);
+ /* guarantee 0 <= sndbuf_space <= sndbuf_desc->len */
+ smp_mb__after_atomic();
+ smc_curs_copy(&conn->tx_curs_fin,
+ &conn->local_rx_ctrl.cons, conn);
+
+ smc_tx_sndbuf_nonfull(smc);
+ }
}

diff_prod = smc_curs_diff(conn->rmb_desc->len, &prod_old,
--
2.32.0.3.g01195cf9f


2024-03-24 13:59:33

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 07/11] net/smc: register loopback-ism into SMC-D device list

After loopback-ism device gets ready, add it to the SMC-D device list as
an ISMv2 device.

Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_ism.c | 11 +++++++----
net/smc/smc_ism.h | 1 +
net/smc/smc_loopback.c | 20 +++++++++++++-------
3 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c
index 97f6ae4f9b23..38469ff53346 100644
--- a/net/smc/smc_ism.c
+++ b/net/smc/smc_ism.c
@@ -91,6 +91,11 @@ bool smc_ism_is_v2_capable(void)
return smc_ism_v2_capable;
}

+void smc_ism_set_v2_capable(void)
+{
+ smc_ism_v2_capable = true;
+}
+
/* Set a connection using this DMBE. */
void smc_ism_set_conn(struct smc_connection *conn)
{
@@ -449,11 +454,9 @@ static void smcd_register_dev(struct ism_dev *ism)
if (smc_pnetid_by_dev_port(&ism->pdev->dev, 0, smcd->pnetid))
smc_pnetid_by_table_smcd(smcd);

+ if (smcd->ops->supports_v2())
+ smc_ism_set_v2_capable();
mutex_lock(&smcd_dev_list.mutex);
- if (list_empty(&smcd_dev_list.list)) {
- if (smcd->ops->supports_v2())
- smc_ism_v2_capable = true;
- }
/* sort list: devices without pnetid before devices with pnetid */
if (smcd->pnetid[0])
list_add_tail(&smcd->list, &smcd_dev_list.list);
diff --git a/net/smc/smc_ism.h b/net/smc/smc_ism.h
index 322973527c61..e6f57e5e1ef9 100644
--- a/net/smc/smc_ism.h
+++ b/net/smc/smc_ism.h
@@ -52,6 +52,7 @@ int smc_ism_signal_shutdown(struct smc_link_group *lgr);
void smc_ism_get_system_eid(u8 **eid);
u16 smc_ism_get_chid(struct smcd_dev *dev);
bool smc_ism_is_v2_capable(void);
+void smc_ism_set_v2_capable(void);
int smc_ism_init(void);
void smc_ism_exit(void);
int smcd_nl_get_device(struct sk_buff *skb, struct netlink_callback *cb);
diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
index 8ee02cfd7fb4..5b35e68d9cdf 100644
--- a/net/smc/smc_loopback.c
+++ b/net/smc/smc_loopback.c
@@ -273,10 +273,12 @@ static int smcd_lo_register_dev(struct smc_lo_dev *ldev)
return -ENOMEM;
ldev->smcd = smcd;
smcd->priv = ldev;
-
- /* TODO:
- * register loopback-ism to smcd_dev list.
- */
+ smc_ism_set_v2_capable();
+ mutex_lock(&smcd_dev_list.mutex);
+ list_add(&smcd->list, &smcd_dev_list.list);
+ mutex_unlock(&smcd_dev_list.mutex);
+ pr_warn_ratelimited("smc: adding smcd device %s\n",
+ smc_lo_dev_name);
return 0;
}

@@ -284,9 +286,13 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev)
{
struct smcd_dev *smcd = ldev->smcd;

- /* TODO:
- * unregister loopback-ism from smcd_dev list.
- */
+ pr_warn_ratelimited("smc: removing smcd device %s\n",
+ smc_lo_dev_name);
+ smcd->going_away = 1;
+ smc_smcd_terminate_all(smcd);
+ mutex_lock(&smcd_dev_list.mutex);
+ list_del_init(&smcd->list);
+ mutex_unlock(&smcd_dev_list.mutex);
kfree(smcd->conn);
kfree(smcd);
}
--
2.32.0.3.g01195cf9f


2024-03-24 13:59:46

by Wen Gu

[permalink] [raw]
Subject: [RFC PATCH net-next v5 11/11] net/smc: implement DMB-merged operations of loopback-ism

This implements operations related to merging sndbuf with peer DMB in
loopback-ism. The DMB won't be freed until no sndbuf is attached to it.

Signed-off-by: Wen Gu <[email protected]>
---
net/smc/smc_loopback.c | 120 +++++++++++++++++++++++++++++++++++------
net/smc/smc_loopback.h | 3 ++
2 files changed, 108 insertions(+), 15 deletions(-)

diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
index 5b35e68d9cdf..994fe39930ad 100644
--- a/net/smc/smc_loopback.c
+++ b/net/smc/smc_loopback.c
@@ -21,6 +21,7 @@

#if IS_ENABLED(CONFIG_SMC_LO)
#define SMC_LO_V2_CAPABLE 0x1 /* loopback-ism acts as ISMv2 */
+#define SMC_LO_SUPPORT_NOCOPY 0x1
#define SMC_DMA_ADDR_INVALID (~(dma_addr_t)0)

static const char smc_lo_dev_name[] = "loopback-ism";
@@ -82,6 +83,7 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb,
goto err_node;
}
dmb_node->dma_addr = SMC_DMA_ADDR_INVALID;
+ refcount_set(&dmb_node->refcnt, 1);

again:
/* add new dmb into hash table */
@@ -95,6 +97,7 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb,
}
hash_add(ldev->dmb_ht, &dmb_node->list, dmb_node->token);
write_unlock_bh(&ldev->dmb_ht_lock);
+ atomic_inc(&ldev->dmb_cnt);

dmb->sba_idx = dmb_node->sba_idx;
dmb->dmb_tok = dmb_node->token;
@@ -111,13 +114,29 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb,
return rc;
}

+static void __smc_lo_unregister_dmb(struct smc_lo_dev *ldev,
+ struct smc_lo_dmb_node *dmb_node)
+{
+ /* remove dmb from hash table */
+ write_lock_bh(&ldev->dmb_ht_lock);
+ hash_del(&dmb_node->list);
+ write_unlock_bh(&ldev->dmb_ht_lock);
+
+ clear_bit(dmb_node->sba_idx, ldev->sba_idx_mask);
+ kvfree(dmb_node->cpu_addr);
+ kfree(dmb_node);
+
+ if (atomic_dec_and_test(&ldev->dmb_cnt))
+ wake_up(&ldev->ldev_release);
+}
+
static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb)
{
struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node;
struct smc_lo_dev *ldev = smcd->priv;

- /* remove dmb from hash table */
- write_lock_bh(&ldev->dmb_ht_lock);
+ /* find dmb from hash table */
+ read_lock_bh(&ldev->dmb_ht_lock);
hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb->dmb_tok) {
if (tmp_node->token == dmb->dmb_tok) {
dmb_node = tmp_node;
@@ -125,16 +144,76 @@ static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb)
}
}
if (!dmb_node) {
- write_unlock_bh(&ldev->dmb_ht_lock);
+ read_unlock_bh(&ldev->dmb_ht_lock);
return -EINVAL;
}
- hash_del(&dmb_node->list);
- write_unlock_bh(&ldev->dmb_ht_lock);
+ read_unlock_bh(&ldev->dmb_ht_lock);

- clear_bit(dmb_node->sba_idx, ldev->sba_idx_mask);
- kfree(dmb_node->cpu_addr);
- kfree(dmb_node);
+ if (refcount_dec_and_test(&dmb_node->refcnt))
+ __smc_lo_unregister_dmb(ldev, dmb_node);
+ return 0;
+}
+
+static int smc_lo_support_dmb_nocopy(struct smcd_dev *smcd)
+{
+ return SMC_LO_SUPPORT_NOCOPY;
+}
+
+static int smc_lo_attach_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb)
+{
+ struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node;
+ struct smc_lo_dev *ldev = smcd->priv;
+
+ /* find dmb_node according to dmb->dmb_tok */
+ read_lock_bh(&ldev->dmb_ht_lock);
+ hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb->dmb_tok) {
+ if (tmp_node->token == dmb->dmb_tok) {
+ dmb_node = tmp_node;
+ break;
+ }
+ }
+ if (!dmb_node) {
+ read_unlock_bh(&ldev->dmb_ht_lock);
+ return -EINVAL;
+ }
+ read_unlock_bh(&ldev->dmb_ht_lock);
+
+ if (!refcount_inc_not_zero(&dmb_node->refcnt))
+ /* the dmb is being unregistered, but has
+ * not been removed from the hash table.
+ */
+ return -EINVAL;

+ /* provide dmb information */
+ dmb->sba_idx = dmb_node->sba_idx;
+ dmb->dmb_tok = dmb_node->token;
+ dmb->cpu_addr = dmb_node->cpu_addr;
+ dmb->dma_addr = dmb_node->dma_addr;
+ dmb->dmb_len = dmb_node->len;
+ return 0;
+}
+
+static int smc_lo_detach_dmb(struct smcd_dev *smcd, u64 token)
+{
+ struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node;
+ struct smc_lo_dev *ldev = smcd->priv;
+
+ /* find dmb_node according to dmb->dmb_tok */
+ read_lock_bh(&ldev->dmb_ht_lock);
+ hash_for_each_possible(ldev->dmb_ht, tmp_node, list, token) {
+ if (tmp_node->token == token) {
+ dmb_node = tmp_node;
+ break;
+ }
+ }
+ if (!dmb_node) {
+ read_unlock_bh(&ldev->dmb_ht_lock);
+ return -EINVAL;
+ }
+ read_unlock_bh(&ldev->dmb_ht_lock);
+
+ if (refcount_dec_and_test(&dmb_node->refcnt))
+ __smc_lo_unregister_dmb(ldev, dmb_node);
return 0;
}

@@ -172,6 +251,12 @@ static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok,
struct smc_lo_dev *ldev = smcd->priv;
struct smc_connection *conn;

+ if (!sf)
+ /* since sndbuf is merged with peer DMB, there is
+ * no need to copy data from sndbuf to peer DMB.
+ */
+ return 0;
+
read_lock_bh(&ldev->dmb_ht_lock);
hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb_tok) {
if (tmp_node->token == dmb_tok) {
@@ -186,13 +271,10 @@ static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok,
memcpy((char *)rmb_node->cpu_addr + offset, data, size);
read_unlock_bh(&ldev->dmb_ht_lock);

- if (sf) {
- conn = smcd->conn[rmb_node->sba_idx];
- if (conn && !conn->killed)
- tasklet_schedule(&conn->rx_tsklet);
- else
- return -EPIPE;
- }
+ conn = smcd->conn[rmb_node->sba_idx];
+ if (!conn || conn->killed)
+ return -EPIPE;
+ tasklet_schedule(&conn->rx_tsklet);
return 0;
}

@@ -224,6 +306,9 @@ static const struct smcd_ops lo_ops = {
.query_remote_gid = smc_lo_query_rgid,
.register_dmb = smc_lo_register_dmb,
.unregister_dmb = smc_lo_unregister_dmb,
+ .support_dmb_nocopy = smc_lo_support_dmb_nocopy,
+ .attach_dmb = smc_lo_attach_dmb,
+ .detach_dmb = smc_lo_detach_dmb,
.add_vlan_id = smc_lo_add_vlan_id,
.del_vlan_id = smc_lo_del_vlan_id,
.set_vlan_required = smc_lo_set_vlan_required,
@@ -302,12 +387,17 @@ static int smc_lo_dev_init(struct smc_lo_dev *ldev)
smc_lo_generate_ids(ldev);
rwlock_init(&ldev->dmb_ht_lock);
hash_init(ldev->dmb_ht);
+ atomic_set(&ldev->dmb_cnt, 0);
+ init_waitqueue_head(&ldev->ldev_release);
+
return smcd_lo_register_dev(ldev);
}

static void smc_lo_dev_exit(struct smc_lo_dev *ldev)
{
smcd_lo_unregister_dev(ldev);
+ if (atomic_read(&ldev->dmb_cnt))
+ wait_event(ldev->ldev_release, !atomic_read(&ldev->dmb_cnt));
}

static void smc_lo_dev_release(struct device *dev)
diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h
index 6c4a390430f3..9a1c5eee5bbc 100644
--- a/net/smc/smc_loopback.h
+++ b/net/smc/smc_loopback.h
@@ -30,6 +30,7 @@ struct smc_lo_dmb_node {
u32 sba_idx;
void *cpu_addr;
dma_addr_t dma_addr;
+ refcount_t refcnt;
};

struct smc_lo_dev {
@@ -37,9 +38,11 @@ struct smc_lo_dev {
struct device dev;
u16 chid;
struct smcd_gid local_gid;
+ atomic_t dmb_cnt;
rwlock_t dmb_ht_lock;
DECLARE_BITMAP(sba_idx_mask, SMC_LO_MAX_DMBS);
DECLARE_HASHTABLE(dmb_ht, SMC_LO_DMBS_HASH_BITS);
+ wait_queue_head_t ldev_release;
};
#endif

--
2.32.0.3.g01195cf9f


2024-04-03 06:36:01

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 00/11] net/smc: SMC intra-OS shortcut with loopback-ism



On 2024/3/24 21:55, Wen Gu wrote:
> This patch set acts as the second part of the new version of [1] (The first
> part can be referred from [2]), the updated things of this version are listed
> at the end.

> Change log:
>
> RFC v5->RFC v4:
> - Patch #2: minor changes in description of config SMC_LO and comments.
> - Patch #10: minor changes in comments and if(smc_ism_support_dmb_nocopy())
> check in smcd_cdc_msg_send().
> - Patch #3: change smc_lo_generate_id() to smc_lo_generate_ids() and SMC_LO_CHID
> to SMC_LO_RESERVED_CHID.
> - Patch #5: memcpy while holding the ldev->dmb_ht_lock.
> - Some expression changes in commit logs.
>

Hi, Jan. Do you have any comments on this version and should I post a new
patch series without 'RFC'? Thank you.

2024-04-03 11:10:43

by Gerd Bayer

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 00/11] net/smc: SMC intra-OS shortcut with loopback-ism

On Wed, 2024-04-03 at 14:35 +0800, Wen Gu wrote:
>
>
> On 2024/3/24 21:55, Wen Gu wrote:
> > This patch set acts as the second part of the new version of [1]
> > (The first
> > part can be referred from [2]), the updated things of this version
> > are listed
> > at the end.
>
> > Change log:
> >
> > RFC v5->RFC v4:
> > - Patch #2: minor changes in description of config SMC_LO and
> > comments.
> > - Patch #10: minor changes in comments and
> > if(smc_ism_support_dmb_nocopy())
> >    check in smcd_cdc_msg_send().
> > - Patch #3: change smc_lo_generate_id() to smc_lo_generate_ids()
> > and SMC_LO_CHID
> >    to SMC_LO_RESERVED_CHID.
> > - Patch #5: memcpy while holding the ldev->dmb_ht_lock.
> > - Some expression changes in commit logs.
> >
>
> Hi, Jan. Do you have any comments on this version and should I post a
> new patch series without 'RFC'? Thank you.

Hi Wen,

Jan has been out sick for a little while now, and Wenjia is expected
back from a longer vacation tomorrow. So if you could hold off until
begin of next week, Wenjia might have some more feedback.

In the meantime, I'm looking at your patchset...

Thank you, Gerd



2024-04-03 11:29:45

by Gerd Bayer

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 02/11] net/smc: introduce loopback-ism for SMC intra-OS shortcut

On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
> Configuration of this feature is managed through the config SMC_LO.

Hi Wen,

you could omit building smc_loopback.o if CONFIG_SMC_LO was not set if
you included the following fixup-patch. I think it's cleaner to put a
few parts of net/smc/af_smc.c under conditional compile rather than
have most of the contents of net/smc/smc_loopback.c under #ifdef.

From 11a9cfce550f0c4df10eafdd30aa4226d4d522a8 Mon Sep 17 00:00:00 2001
From: Gerd Bayer <[email protected]>
Date: Wed, 3 Apr 2024 11:43:36 +0200
Subject: [PATCH] fixup! net/smc: introduce loopback-ism for SMC intra-
OS
shortcut

---
net/smc/Makefile | 5 +++--
net/smc/af_smc.c | 6 ++++++
net/smc/smc_loopback.c | 8 --------
3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/smc/Makefile b/net/smc/Makefile
index a8c37111abe1..3b73c9d561bb 100644
--- a/net/smc/Makefile
+++ b/net/smc/Makefile
@@ -4,5 +4,6 @@ obj-$(CONFIG_SMC) += smc.o
obj-$(CONFIG_SMC_DIAG) += smc_diag.o
smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o
smc_llc.o
smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o
smc_netlink.o smc_stats.o
-smc-y += smc_tracepoint.o smc_loopback.o
-smc-$(CONFIG_SYSCTL) += smc_sysctl.o
+smc-y += smc_tracepoint.o
+smc-$(CONFIG_SMC_LO) += smc_loopback.o
+smc-$(CONFIG_SYSCTL) += smc_sysctl.o
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index fce7a5b2ce5c..bcbf600cd271 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -3574,11 +3574,13 @@ static int __init smc_init(void)
goto out_sock;
}

+#if IS_ENABLED(CONFIG_SMC_LO)
rc = smc_loopback_init();
if (rc) {
pr_err("%s: smc_loopback_init fails with %d\n",
__func__, rc);
goto out_ib;
}
+#endif

rc = tcp_register_ulp(&smc_ulp_ops);
if (rc) {
@@ -3590,8 +3592,10 @@ static int __init smc_init(void)
return 0;

out_lo:
+#if IS_ENABLED(CONFIG_SMC_LO)
smc_loopback_exit();
out_ib:
+#endif
smc_ib_unregister_client();
out_sock:
sock_unregister(PF_SMC);
@@ -3628,7 +3632,9 @@ static void __exit smc_exit(void)
tcp_unregister_ulp(&smc_ulp_ops);
sock_unregister(PF_SMC);
smc_core_exit();
+#if IS_ENABLED(CONFIG_SMC_LO)
smc_loopback_exit();
+#endif
smc_ib_unregister_client();
smc_ism_exit();
destroy_workqueue(smc_close_wq);
diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
index 994fe39930ad..8d0181635ded 100644
--- a/net/smc/smc_loopback.c
+++ b/net/smc/smc_loopback.c
@@ -19,7 +19,6 @@
#include "smc_ism.h"
#include "smc_loopback.h"

-#if IS_ENABLED(CONFIG_SMC_LO)
#define SMC_LO_V2_CAPABLE 0x1 /* loopback-ism acts as ISMv2 */
#define SMC_LO_SUPPORT_NOCOPY 0x1
#define SMC_DMA_ADDR_INVALID (~(dma_addr_t)0)
@@ -442,20 +441,13 @@ static void smc_lo_dev_remove(void)
smc_lo_dev_exit(lo_dev);
put_device(&lo_dev->dev); /* device_initialize in
smc_lo_dev_probe */
}
-#endif

int smc_loopback_init(void)
{
-#if IS_ENABLED(CONFIG_SMC_LO)
return smc_lo_dev_probe();
-#else
- return 0;
-#endif
}

void smc_loopback_exit(void)
{
-#if IS_ENABLED(CONFIG_SMC_LO)
smc_lo_dev_remove();
-#endif
}
--
2.40.1


Even nicer would be, if smc_loopback_init() and smc_loopback_exit()
would have an NO-OP implementation { } straight in
net/smc/smc_loopback.h if CONFIG_SMC_LO was not enabled. That way, one
could leave out the #if ENABLED's in net/smc/af_smc.c entirely.

I think that's more kernel-style like...
Gerd

2024-04-03 16:32:59

by Gerd Bayer

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism

On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
> This implements some operations that loopback-ism does not support
> currently:
> - vlan operations, since there is no strong use-case for it.
> - signal_event operations, since there is no event to be processed
> by the loopback-ism device.

Hi Wen,

I wonder if the these operations that are not supported by loopback-ism
should rather be marked "optional" in the struct smcd_ops, and the
calling code should call these only when they are implemented.

Of course this would mean more changes to net/smc/smc_core.c - but
loopback-ism could omit these "boiler-plate" functions.

>  
> +static int smc_lo_add_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static int smc_lo_del_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static int smc_lo_set_vlan_required(struct smcd_dev *smcd)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static int smc_lo_reset_vlan_required(struct smcd_dev *smcd)
> +{
> + return -EOPNOTSUPP;
> +}
> +
> +static int smc_lo_signal_event(struct smcd_dev *dev, struct smcd_gid
> *rgid,
> +        u32 trigger_irq, u32 event_code, u64
> info)
> +{
> + return 0;
> +}
> +

Just a pattern that I saw elsewhere in the kernel...

Thanks,
Gerd

2024-04-03 17:54:25

by Gerd Bayer

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 05/11] net/smc: implement DMB-related operations of loopback-ism

On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:

When I instrumented this to see, why I still see tons of my other
temporary instrumentation messages from the "ism" driver, I found that
in my setup loopback-ism is used rather infrequently.

I suspect this is due to how the SMC proposals are constructed in
net/smc/af_smc.c and net/smc/smc_pnet.c - and later evaluated in
smc_check_ism_v2_match() - where there is a first-come-first-serve
selection.

I wonder if one should change that to favour loopback-ism over "real"
ISM devices - and how this could be achieved elegantly.

Just some food for thought... Probably little you can do on x86.

Thanks,
Gerd

> +static int smc_lo_register_dmb(struct smcd_dev *smcd, struct
> smcd_dmb *dmb,
> +        void *client_priv)
> +{
> + struct smc_lo_dmb_node *dmb_node, *tmp_node;
> + struct smc_lo_dev *ldev = smcd->priv;
> + int sba_idx, rc;
> +
> + /* check space for new dmb */
> + for_each_clear_bit(sba_idx, ldev->sba_idx_mask,
> SMC_LO_MAX_DMBS) {
> + if (!test_and_set_bit(sba_idx, ldev->sba_idx_mask))
> + break;
> + }
> + if (sba_idx == SMC_LO_MAX_DMBS)
> + return -ENOSPC;
> +
> + dmb_node = kzalloc(sizeof(*dmb_node), GFP_KERNEL);
> + if (!dmb_node) {
> + rc = -ENOMEM;
> + goto err_bit;
> + }
> +
> + dmb_node->sba_idx = sba_idx;
> + dmb_node->len = dmb->dmb_len;
> + dmb_node->cpu_addr = kzalloc(dmb_node->len, GFP_KERNEL |
> +      __GFP_NOWARN | __GFP_NORETRY |
> +      __GFP_NOMEMALLOC);
> + if (!dmb_node->cpu_addr) {
> + rc = -ENOMEM;
> + goto err_node;
> + }
> + dmb_node->dma_addr = SMC_DMA_ADDR_INVALID;
> +
> +again:
> + /* add new dmb into hash table */
> + get_random_bytes(&dmb_node->token, sizeof(dmb_node->token));
> + write_lock_bh(&ldev->dmb_ht_lock);
> + hash_for_each_possible(ldev->dmb_ht, tmp_node, list,
> dmb_node->token) {
> + if (tmp_node->token == dmb_node->token) {
> + write_unlock_bh(&ldev->dmb_ht_lock);
> + goto again;
> + }
> + }
> + hash_add(ldev->dmb_ht, &dmb_node->list, dmb_node->token);
> + write_unlock_bh(&ldev->dmb_ht_lock);
> +
> + dmb->sba_idx = dmb_node->sba_idx;
> + dmb->dmb_tok = dmb_node->token;
> + dmb->cpu_addr = dmb_node->cpu_addr;
> + dmb->dma_addr = dmb_node->dma_addr;
> + dmb->dmb_len = dmb_node->len;
> +
> + return 0;
> +
> +err_node:
> + kfree(dmb_node);
> +err_bit:
> + clear_bit(sba_idx, ldev->sba_idx_mask);
> + return rc;
> +}
> +
> +static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct
> smcd_dmb *dmb)
> +{
> + struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node;
> + struct smc_lo_dev *ldev = smcd->priv;
> +
> + /* remove dmb from hash table */
> + write_lock_bh(&ldev->dmb_ht_lock);
> + hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb-
> >dmb_tok) {
> + if (tmp_node->token == dmb->dmb_tok) {
> + dmb_node = tmp_node;
> + break;
> + }
> + }
> + if (!dmb_node) {
> + write_unlock_bh(&ldev->dmb_ht_lock);
> + return -EINVAL;
> + }
> + hash_del(&dmb_node->list);
> + write_unlock_bh(&ldev->dmb_ht_lock);
> +
> + clear_bit(dmb_node->sba_idx, ldev->sba_idx_mask);
> + kfree(dmb_node->cpu_addr);
> + kfree(dmb_node);
> +
> + return 0;
> +}
> +
>  static int smc_lo_add_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
>  {
>   return -EOPNOTSUPP;
> @@ -75,6 +164,38 @@ static int smc_lo_signal_event(struct smcd_dev
> *dev, struct smcd_gid *rgid,
>   return 0;
>  }
>  
> +static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok,
> +     unsigned int idx, bool sf, unsigned int
> offset,
> +     void *data, unsigned int size)
> +{
> + struct smc_lo_dmb_node *rmb_node = NULL, *tmp_node;
> + struct smc_lo_dev *ldev = smcd->priv;
> + struct smc_connection *conn;
> +
> + read_lock_bh(&ldev->dmb_ht_lock);
> + hash_for_each_possible(ldev->dmb_ht, tmp_node, list,
> dmb_tok) {
> + if (tmp_node->token == dmb_tok) {
> + rmb_node = tmp_node;
> + break;
> + }
> + }
> + if (!rmb_node) {
> + read_unlock_bh(&ldev->dmb_ht_lock);
> + return -EINVAL;
> + }
> + memcpy((char *)rmb_node->cpu_addr + offset, data, size);
> + read_unlock_bh(&ldev->dmb_ht_lock);
> +
> + if (sf) {
> + conn = smcd->conn[rmb_node->sba_idx];
> + if (conn && !conn->killed)
> + tasklet_schedule(&conn->rx_tsklet);
> + else
> + return -EPIPE;
> + }
> + return 0;
> +}
> +
>  static int smc_lo_supports_v2(void)
>  {
>   return SMC_LO_V2_CAPABLE;
> @@ -101,14 +222,14 @@ static struct device *smc_lo_get_dev(struct
> smcd_dev *smcd)
>  
>  static const struct smcd_ops lo_ops = {
>   .query_remote_gid = smc_lo_query_rgid,
> - .register_dmb = NULL,
> - .unregister_dmb = NULL,
> + .register_dmb = smc_lo_register_dmb,
> + .unregister_dmb = smc_lo_unregister_dmb,
>   .add_vlan_id = smc_lo_add_vlan_id,
>   .del_vlan_id = smc_lo_del_vlan_id,
>   .set_vlan_required = smc_lo_set_vlan_required,
>   .reset_vlan_required = smc_lo_reset_vlan_required,
>   .signal_event = smc_lo_signal_event,
> - .move_data = NULL,
> + .move_data = smc_lo_move_data,
>   .supports_v2 = smc_lo_supports_v2,
>   .get_local_gid = smc_lo_get_local_gid,
>   .get_chid = smc_lo_get_chid,
> @@ -173,6 +294,8 @@ static void smcd_lo_unregister_dev(struct
> smc_lo_dev *ldev)
>  static int smc_lo_dev_init(struct smc_lo_dev *ldev)
>  {
>   smc_lo_generate_ids(ldev);
> + rwlock_init(&ldev->dmb_ht_lock);
> + hash_init(ldev->dmb_ht);
>   return smcd_lo_register_dev(ldev);
>  }
>  
> diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h
> index 11868e5ac732..6c4a390430f3 100644
> --- a/net/smc/smc_loopback.h
> +++ b/net/smc/smc_loopback.h
> @@ -20,13 +20,26 @@
>  
>  #if IS_ENABLED(CONFIG_SMC_LO)
>  #define SMC_LO_MAX_DMBS 5000
> +#define SMC_LO_DMBS_HASH_BITS 12
>  #define SMC_LO_RESERVED_CHID 0xFFFF
>  
> +struct smc_lo_dmb_node {
> + struct hlist_node list;
> + u64 token;
> + u32 len;
> + u32 sba_idx;
> + void *cpu_addr;
> + dma_addr_t dma_addr;
> +};
> +
>  struct smc_lo_dev {
>   struct smcd_dev *smcd;
>   struct device dev;
>   u16 chid;
>   struct smcd_gid local_gid;
> + rwlock_t dmb_ht_lock;
> + DECLARE_BITMAP(sba_idx_mask, SMC_LO_MAX_DMBS);
> + DECLARE_HASHTABLE(dmb_ht, SMC_LO_DMBS_HASH_BITS);
>  };
>  #endif
>  


2024-04-04 09:04:12

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 02/11] net/smc: introduce loopback-ism for SMC intra-OS shortcut



On 2024/4/3 19:27, Gerd Bayer wrote:
> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>> Configuration of this feature is managed through the config SMC_LO.
>
> Hi Wen,
>
> you could omit building smc_loopback.o if CONFIG_SMC_LO was not set if
> you included the following fixup-patch. I think it's cleaner to put a
> few parts of net/smc/af_smc.c under conditional compile rather than
> have most of the contents of net/smc/smc_loopback.c under #ifdef.
>

Hi Gerd. That is a really good suggestion. It is much cleaner in this way.
I will improve it in next version. Thank you very much!

> From 11a9cfce550f0c4df10eafdd30aa4226d4d522a8 Mon Sep 17 00:00:00 2001
> From: Gerd Bayer <[email protected]>
> Date: Wed, 3 Apr 2024 11:43:36 +0200
> Subject: [PATCH] fixup! net/smc: introduce loopback-ism for SMC intra-
> OS
> shortcut
>
> ---
> net/smc/Makefile | 5 +++--
> net/smc/af_smc.c | 6 ++++++
> net/smc/smc_loopback.c | 8 --------
> 3 files changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/net/smc/Makefile b/net/smc/Makefile
> index a8c37111abe1..3b73c9d561bb 100644
> --- a/net/smc/Makefile
> +++ b/net/smc/Makefile
> @@ -4,5 +4,6 @@ obj-$(CONFIG_SMC) += smc.o
> obj-$(CONFIG_SMC_DIAG) += smc_diag.o
> smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o
> smc_llc.o
> smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o
> smc_netlink.o smc_stats.o
> -smc-y += smc_tracepoint.o smc_loopback.o
> -smc-$(CONFIG_SYSCTL) += smc_sysctl.o
> +smc-y += smc_tracepoint.o
> +smc-$(CONFIG_SMC_LO) += smc_loopback.o
> +smc-$(CONFIG_SYSCTL) += smc_sysctl.o
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index fce7a5b2ce5c..bcbf600cd271 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -3574,11 +3574,13 @@ static int __init smc_init(void)
> goto out_sock;
> }
>
> +#if IS_ENABLED(CONFIG_SMC_LO)
> rc = smc_loopback_init();
> if (rc) {
> pr_err("%s: smc_loopback_init fails with %d\n",
> __func__, rc);
> goto out_ib;
> }
> +#endif
>
> rc = tcp_register_ulp(&smc_ulp_ops);
> if (rc) {
> @@ -3590,8 +3592,10 @@ static int __init smc_init(void)
> return 0;
>
> out_lo:
> +#if IS_ENABLED(CONFIG_SMC_LO)
> smc_loopback_exit();
> out_ib:
> +#endif
> smc_ib_unregister_client();
> out_sock:
> sock_unregister(PF_SMC);
> @@ -3628,7 +3632,9 @@ static void __exit smc_exit(void)
> tcp_unregister_ulp(&smc_ulp_ops);
> sock_unregister(PF_SMC);
> smc_core_exit();
> +#if IS_ENABLED(CONFIG_SMC_LO)
> smc_loopback_exit();
> +#endif
> smc_ib_unregister_client();
> smc_ism_exit();
> destroy_workqueue(smc_close_wq);
> diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c
> index 994fe39930ad..8d0181635ded 100644
> --- a/net/smc/smc_loopback.c
> +++ b/net/smc/smc_loopback.c
> @@ -19,7 +19,6 @@
> #include "smc_ism.h"
> #include "smc_loopback.h"
>
> -#if IS_ENABLED(CONFIG_SMC_LO)
> #define SMC_LO_V2_CAPABLE 0x1 /* loopback-ism acts as ISMv2 */
> #define SMC_LO_SUPPORT_NOCOPY 0x1
> #define SMC_DMA_ADDR_INVALID (~(dma_addr_t)0)
> @@ -442,20 +441,13 @@ static void smc_lo_dev_remove(void)
> smc_lo_dev_exit(lo_dev);
> put_device(&lo_dev->dev); /* device_initialize in
> smc_lo_dev_probe */
> }
> -#endif
>
> int smc_loopback_init(void)
> {
> -#if IS_ENABLED(CONFIG_SMC_LO)
> return smc_lo_dev_probe();
> -#else
> - return 0;
> -#endif
> }
>
> void smc_loopback_exit(void)
> {
> -#if IS_ENABLED(CONFIG_SMC_LO)
> smc_lo_dev_remove();
> -#endif
> }

2024-04-04 09:36:32

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism



On 2024/4/4 00:25, Gerd Bayer wrote:
> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>> This implements some operations that loopback-ism does not support
>> currently:
>> - vlan operations, since there is no strong use-case for it.
>> - signal_event operations, since there is no event to be processed
>> by the loopback-ism device.
>
> Hi Wen,
>
> I wonder if the these operations that are not supported by loopback-ism
> should rather be marked "optional" in the struct smcd_ops, and the
> calling code should call these only when they are implemented.
>
> Of course this would mean more changes to net/smc/smc_core.c - but
> loopback-ism could omit these "boiler-plate" functions.
>

Hi Gerd.

Thank you for the thoughts! I agree that checks like 'if(smcd->ops->xxx)'
can avoid the device driver from implementing unsupported operations. But I
am afraid that which operations need to be defined as 'optional' may differ
from different device perspectives (e.g. for loopback-ism they are vlan-related
opts and signal_event). So I perfer to simply let the smc protocol assume
that all operations have been implemented, and let drivers to decide which
ones are unsupported in implementation. What do you think?

Thanks!

>>
>> +static int smc_lo_add_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
>> +{
>> + return -EOPNOTSUPP;
>> +}
>> +
>> +static int smc_lo_del_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
>> +{
>> + return -EOPNOTSUPP;
>> +}
>> +
>> +static int smc_lo_set_vlan_required(struct smcd_dev *smcd)
>> +{
>> + return -EOPNOTSUPP;
>> +}
>> +
>> +static int smc_lo_reset_vlan_required(struct smcd_dev *smcd)
>> +{
>> + return -EOPNOTSUPP;
>> +}
>> +
>> +static int smc_lo_signal_event(struct smcd_dev *dev, struct smcd_gid
>> *rgid,
>> +        u32 trigger_irq, u32 event_code, u64
>> info)
>> +{
>> + return 0;
>> +}
>> +
>
> Just a pattern that I saw elsewhere in the kernel...
>
> Thanks,
> Gerd

2024-04-04 10:20:32

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 05/11] net/smc: implement DMB-related operations of loopback-ism



On 2024/4/4 01:20, Gerd Bayer wrote:
> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>
> When I instrumented this to see, why I still see tons of my other
> temporary instrumentation messages from the "ism" driver, I found that
> in my setup loopback-ism is used rather infrequently.
>
> I suspect this is due to how the SMC proposals are constructed in
> net/smc/af_smc.c and net/smc/smc_pnet.c - and later evaluated in
> smc_check_ism_v2_match() - where there is a first-come-first-serve
> selection.
>
> I wonder if one should change that to favour loopback-ism over "real"
> ISM devices - and how this could be achieved elegantly.
>
> Just some food for thought... Probably little you can do on x86.
>

Yes, it is about the priority of available ISM devices, and now it
is decided by their order in the smcd_dev_list. The later registered
ISMv2 devices(without pnetid) will be added to the beginning of the
list (see smcd_register_dev()). So there is a probability that
loopback-ism will not be ranked first, since it is added into list
earlier during smc_init().

If we have the runtime switch of loopback-ism, we can re-active the
loopback-ism, that make it be re-added into the beginning of the dev
list and be chosen first. Or a new netlink command to adjust the slot
order of available ISM devices in the list. As we discussed before,
that could be tasks in stage 1 or stage 2.

Thanks!

> Thanks,
> Gerd
>
>> +static int smc_lo_register_dmb(struct smcd_dev *smcd, struct
>> smcd_dmb *dmb,
>> +        void *client_priv)
>> +{
>> + struct smc_lo_dmb_node *dmb_node, *tmp_node;
>> + struct smc_lo_dev *ldev = smcd->priv;
>> + int sba_idx, rc;
>> +
>> + /* check space for new dmb */
>> + for_each_clear_bit(sba_idx, ldev->sba_idx_mask,
>> SMC_LO_MAX_DMBS) {
>> + if (!test_and_set_bit(sba_idx, ldev->sba_idx_mask))
>> + break;
>> + }
>> + if (sba_idx == SMC_LO_MAX_DMBS)
>> + return -ENOSPC;
>> +
>> + dmb_node = kzalloc(sizeof(*dmb_node), GFP_KERNEL);
>> + if (!dmb_node) {
>> + rc = -ENOMEM;
>> + goto err_bit;
>> + }
>> +
>> + dmb_node->sba_idx = sba_idx;
>> + dmb_node->len = dmb->dmb_len;
>> + dmb_node->cpu_addr = kzalloc(dmb_node->len, GFP_KERNEL |
>> +      __GFP_NOWARN | __GFP_NORETRY |
>> +      __GFP_NOMEMALLOC);
>> + if (!dmb_node->cpu_addr) {
>> + rc = -ENOMEM;
>> + goto err_node;
>> + }
>> + dmb_node->dma_addr = SMC_DMA_ADDR_INVALID;
>> +
>> +again:
>> + /* add new dmb into hash table */
>> + get_random_bytes(&dmb_node->token, sizeof(dmb_node->token));
>> + write_lock_bh(&ldev->dmb_ht_lock);
>> + hash_for_each_possible(ldev->dmb_ht, tmp_node, list,
>> dmb_node->token) {
>> + if (tmp_node->token == dmb_node->token) {
>> + write_unlock_bh(&ldev->dmb_ht_lock);
>> + goto again;
>> + }
>> + }
>> + hash_add(ldev->dmb_ht, &dmb_node->list, dmb_node->token);
>> + write_unlock_bh(&ldev->dmb_ht_lock);
>> +
>> + dmb->sba_idx = dmb_node->sba_idx;
>> + dmb->dmb_tok = dmb_node->token;
>> + dmb->cpu_addr = dmb_node->cpu_addr;
>> + dmb->dma_addr = dmb_node->dma_addr;
>> + dmb->dmb_len = dmb_node->len;
>> +
>> + return 0;
>> +
>> +err_node:
>> + kfree(dmb_node);
>> +err_bit:
>> + clear_bit(sba_idx, ldev->sba_idx_mask);
>> + return rc;
>> +}
>> +
>> +static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct
>> smcd_dmb *dmb)
>> +{
>> + struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node;
>> + struct smc_lo_dev *ldev = smcd->priv;
>> +
>> + /* remove dmb from hash table */
>> + write_lock_bh(&ldev->dmb_ht_lock);
>> + hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb-
>>> dmb_tok) {
>> + if (tmp_node->token == dmb->dmb_tok) {
>> + dmb_node = tmp_node;
>> + break;
>> + }
>> + }
>> + if (!dmb_node) {
>> + write_unlock_bh(&ldev->dmb_ht_lock);
>> + return -EINVAL;
>> + }
>> + hash_del(&dmb_node->list);
>> + write_unlock_bh(&ldev->dmb_ht_lock);
>> +
>> + clear_bit(dmb_node->sba_idx, ldev->sba_idx_mask);
>> + kfree(dmb_node->cpu_addr);
>> + kfree(dmb_node);
>> +
>> + return 0;
>> +}
>> +
>>  static int smc_lo_add_vlan_id(struct smcd_dev *smcd, u64 vlan_id)
>>  {
>>   return -EOPNOTSUPP;
>> @@ -75,6 +164,38 @@ static int smc_lo_signal_event(struct smcd_dev
>> *dev, struct smcd_gid *rgid,
>>   return 0;
>>  }
>>
>> +static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok,
>> +     unsigned int idx, bool sf, unsigned int
>> offset,
>> +     void *data, unsigned int size)
>> +{
>> + struct smc_lo_dmb_node *rmb_node = NULL, *tmp_node;
>> + struct smc_lo_dev *ldev = smcd->priv;
>> + struct smc_connection *conn;
>> +
>> + read_lock_bh(&ldev->dmb_ht_lock);
>> + hash_for_each_possible(ldev->dmb_ht, tmp_node, list,
>> dmb_tok) {
>> + if (tmp_node->token == dmb_tok) {
>> + rmb_node = tmp_node;
>> + break;
>> + }
>> + }
>> + if (!rmb_node) {
>> + read_unlock_bh(&ldev->dmb_ht_lock);
>> + return -EINVAL;
>> + }
>> + memcpy((char *)rmb_node->cpu_addr + offset, data, size);
>> + read_unlock_bh(&ldev->dmb_ht_lock);
>> +
>> + if (sf) {
>> + conn = smcd->conn[rmb_node->sba_idx];
>> + if (conn && !conn->killed)
>> + tasklet_schedule(&conn->rx_tsklet);
>> + else
>> + return -EPIPE;
>> + }
>> + return 0;
>> +}
>> +
>>  static int smc_lo_supports_v2(void)
>>  {
>>   return SMC_LO_V2_CAPABLE;
>> @@ -101,14 +222,14 @@ static struct device *smc_lo_get_dev(struct
>> smcd_dev *smcd)
>>
>>  static const struct smcd_ops lo_ops = {
>>   .query_remote_gid = smc_lo_query_rgid,
>> - .register_dmb = NULL,
>> - .unregister_dmb = NULL,
>> + .register_dmb = smc_lo_register_dmb,
>> + .unregister_dmb = smc_lo_unregister_dmb,
>>   .add_vlan_id = smc_lo_add_vlan_id,
>>   .del_vlan_id = smc_lo_del_vlan_id,
>>   .set_vlan_required = smc_lo_set_vlan_required,
>>   .reset_vlan_required = smc_lo_reset_vlan_required,
>>   .signal_event = smc_lo_signal_event,
>> - .move_data = NULL,
>> + .move_data = smc_lo_move_data,
>>   .supports_v2 = smc_lo_supports_v2,
>>   .get_local_gid = smc_lo_get_local_gid,
>>   .get_chid = smc_lo_get_chid,
>> @@ -173,6 +294,8 @@ static void smcd_lo_unregister_dev(struct
>> smc_lo_dev *ldev)
>>  static int smc_lo_dev_init(struct smc_lo_dev *ldev)
>>  {
>>   smc_lo_generate_ids(ldev);
>> + rwlock_init(&ldev->dmb_ht_lock);
>> + hash_init(ldev->dmb_ht);
>>   return smcd_lo_register_dev(ldev);
>>  }
>>
>> diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h
>> index 11868e5ac732..6c4a390430f3 100644
>> --- a/net/smc/smc_loopback.h
>> +++ b/net/smc/smc_loopback.h
>> @@ -20,13 +20,26 @@
>>
>>  #if IS_ENABLED(CONFIG_SMC_LO)
>>  #define SMC_LO_MAX_DMBS 5000
>> +#define SMC_LO_DMBS_HASH_BITS 12
>>  #define SMC_LO_RESERVED_CHID 0xFFFF
>>
>> +struct smc_lo_dmb_node {
>> + struct hlist_node list;
>> + u64 token;
>> + u32 len;
>> + u32 sba_idx;
>> + void *cpu_addr;
>> + dma_addr_t dma_addr;
>> +};
>> +
>>  struct smc_lo_dev {
>>   struct smcd_dev *smcd;
>>   struct device dev;
>>   u16 chid;
>>   struct smcd_gid local_gid;
>> + rwlock_t dmb_ht_lock;
>> + DECLARE_BITMAP(sba_idx_mask, SMC_LO_MAX_DMBS);
>> + DECLARE_HASHTABLE(dmb_ht, SMC_LO_DMBS_HASH_BITS);
>>  };
>>  #endif
>>

2024-04-04 10:40:19

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 00/11] net/smc: SMC intra-OS shortcut with loopback-ism



On 2024/4/3 19:10, Gerd Bayer wrote:
> On Wed, 2024-04-03 at 14:35 +0800, Wen Gu wrote:
>>
>>
>> On 2024/3/24 21:55, Wen Gu wrote:
>>> This patch set acts as the second part of the new version of [1]
>>> (The first
>>> part can be referred from [2]), the updated things of this version
>>> are listed
>>> at the end.
>>
>>> Change log:
>>>
>>> RFC v5->RFC v4:
>>> - Patch #2: minor changes in description of config SMC_LO and
>>> comments.
>>> - Patch #10: minor changes in comments and
>>> if(smc_ism_support_dmb_nocopy())
>>>    check in smcd_cdc_msg_send().
>>> - Patch #3: change smc_lo_generate_id() to smc_lo_generate_ids()
>>> and SMC_LO_CHID
>>>    to SMC_LO_RESERVED_CHID.
>>> - Patch #5: memcpy while holding the ldev->dmb_ht_lock.
>>> - Some expression changes in commit logs.
>>>
>>
>> Hi, Jan. Do you have any comments on this version and should I post a
>> new patch series without 'RFC'? Thank you.
>
> Hi Wen,
>
> Jan has been out sick for a little while now, and Wenjia is expected
> back from a longer vacation tomorrow. So if you could hold off until
> begin of next week, Wenjia might have some more feedback.
>
> In the meantime, I'm looking at your patchset...
>
> Thank you, Gerd
>

Hi Gerd,

Thank you for the information and comments! I guess I will post a new
version at the beginning of next week.

Thanks!

2024-04-04 11:32:50

by Niklas Schnelle

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 05/11] net/smc: implement DMB-related operations of loopback-ism

On Thu, 2024-04-04 at 18:20 +0800, Wen Gu wrote:
>
> On 2024/4/4 01:20, Gerd Bayer wrote:
> > On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
> >
> > When I instrumented this to see, why I still see tons of my other
> > temporary instrumentation messages from the "ism" driver, I found that
> > in my setup loopback-ism is used rather infrequently.
> >
> > I suspect this is due to how the SMC proposals are constructed in
> > net/smc/af_smc.c and net/smc/smc_pnet.c - and later evaluated in
> > smc_check_ism_v2_match() - where there is a first-come-first-serve
> > selection.
> >
> > I wonder if one should change that to favour loopback-ism over "real"
> > ISM devices - and how this could be achieved elegantly.
> >
> > Just some food for thought... Probably little you can do on x86.
> >
>
> Yes, it is about the priority of available ISM devices, and now it
> is decided by their order in the smcd_dev_list. The later registered
> ISMv2 devices(without pnetid) will be added to the beginning of the
> list (see smcd_register_dev()). So there is a probability that
> loopback-ism will not be ranked first, since it is added into list
> earlier during smc_init().
>
> If we have the runtime switch of loopback-ism, we can re-active the
> loopback-ism, that make it be re-added into the beginning of the dev
> list and be chosen first. Or a new netlink command to adjust the slot
> order of available ISM devices in the list. As we discussed before,
> that could be tasks in stage 1 or stage 2.
>
> Thanks!

Maybe when adding the ISM devices we could instead make sure that all
ISM devices are added after loopback and loopback is added in the
beginning. I think loopback should always be preferred and would
consider it a bug if it isn't faster too. Between virtio-ism and ISM it
may be less clear so maybe for stage 2 we would want a priority setting
and then insert ordered by priority. Thoughts?

2024-04-04 11:52:48

by Niklas Schnelle

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism

On Thu, 2024-04-04 at 17:32 +0800, Wen Gu wrote:
>
> On 2024/4/4 00:25, Gerd Bayer wrote:
> > On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
> > > This implements some operations that loopback-ism does not support
> > > currently:
> > > - vlan operations, since there is no strong use-case for it.
> > > - signal_event operations, since there is no event to be processed
> > > by the loopback-ism device.
> >
> > Hi Wen,
> >
> > I wonder if the these operations that are not supported by loopback-ism
> > should rather be marked "optional" in the struct smcd_ops, and the
> > calling code should call these only when they are implemented.
> >
> > Of course this would mean more changes to net/smc/smc_core.c - but
> > loopback-ism could omit these "boiler-plate" functions.
> >
>
> Hi Gerd.
>
> Thank you for the thoughts! I agree that checks like 'if(smcd->ops->xxx)'
> can avoid the device driver from implementing unsupported operations. But I
> am afraid that which operations need to be defined as 'optional' may differ
> from different device perspectives (e.g. for loopback-ism they are vlan-related
> opts and signal_event). So I perfer to simply let the smc protocol assume
> that all operations have been implemented, and let drivers to decide which
> ones are unsupported in implementation. What do you think?
>
> Thanks!
>

I agree with Gerd, in my opinion it is better to document ops as
optional and then allow their function pointers to be NULL and check
for that. Acting like they are supported and then they turn out to be
nops to me seems to contradict the principle of least surprises. I also
think we can find a subset of mandatory ops without which SMC-D is
impossible and then everything else should be optional.

As a first guess I think the following options may be mandatory:

* query_remote_gid()
* register_dmb()/unregister_dmb()
* move_data()
For this one could argue that either move_data() or
attach_dmb()/detach_dmb() is required though personally I would
prefer to always have move_data() as a fallback and simple API
* supports_v2()
* get_local_gid()
* get_chid()
* get_dev()
> >

2024-04-04 13:13:29

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism



On 2024/4/4 19:42, Niklas Schnelle wrote:
> On Thu, 2024-04-04 at 17:32 +0800, Wen Gu wrote:
>>
>> On 2024/4/4 00:25, Gerd Bayer wrote:
>>> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>>>> This implements some operations that loopback-ism does not support
>>>> currently:
>>>> - vlan operations, since there is no strong use-case for it.
>>>> - signal_event operations, since there is no event to be processed
>>>> by the loopback-ism device.
>>>
>>> Hi Wen,
>>>
>>> I wonder if the these operations that are not supported by loopback-ism
>>> should rather be marked "optional" in the struct smcd_ops, and the
>>> calling code should call these only when they are implemented.
>>>
>>> Of course this would mean more changes to net/smc/smc_core.c - but
>>> loopback-ism could omit these "boiler-plate" functions.
>>>
>>
>> Hi Gerd.
>>
>> Thank you for the thoughts! I agree that checks like 'if(smcd->ops->xxx)'
>> can avoid the device driver from implementing unsupported operations. But I
>> am afraid that which operations need to be defined as 'optional' may differ
>> from different device perspectives (e.g. for loopback-ism they are vlan-related
>> opts and signal_event). So I perfer to simply let the smc protocol assume
>> that all operations have been implemented, and let drivers to decide which
>> ones are unsupported in implementation. What do you think?
>>
>> Thanks!
>>
>
> I agree with Gerd, in my opinion it is better to document ops as
> optional and then allow their function pointers to be NULL and check
> for that. Acting like they are supported and then they turn out to be
> nops to me seems to contradict the principle of least surprises. I also
> think we can find a subset of mandatory ops without which SMC-D is
> impossible and then everything else should be optional.

I see. If we all agree to classify smcd_ops into mandatory and optional ones,
I'll add a patch to mark the optional ops and check if they are implemented.

>
> As a first guess I think the following options may be mandatory:
>
> * query_remote_gid()
> * register_dmb()/unregister_dmb()
> * move_data()
> For this one could argue that either move_data() or
> attach_dmb()/detach_dmb() is required though personally I would
> prefer to always have move_data() as a fallback and simple API
> * supports_v2()
> * get_local_gid()
> * get_chid()
> * get_dev()
I agree with this classification. Just one point, maybe we can take
supports_v2() as an optional ops, like support_dmb_nocopy()? e.g. if
it is not implemented, we treat it as an ISMv1.

Thanks!

>>>

2024-04-04 13:45:04

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 05/11] net/smc: implement DMB-related operations of loopback-ism



On 2024/4/4 19:27, Niklas Schnelle wrote:
> On Thu, 2024-04-04 at 18:20 +0800, Wen Gu wrote:
>>
>> On 2024/4/4 01:20, Gerd Bayer wrote:
>>> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>>>
>>> When I instrumented this to see, why I still see tons of my other
>>> temporary instrumentation messages from the "ism" driver, I found that
>>> in my setup loopback-ism is used rather infrequently.
>>>
>>> I suspect this is due to how the SMC proposals are constructed in
>>> net/smc/af_smc.c and net/smc/smc_pnet.c - and later evaluated in
>>> smc_check_ism_v2_match() - where there is a first-come-first-serve
>>> selection.
>>>
>>> I wonder if one should change that to favour loopback-ism over "real"
>>> ISM devices - and how this could be achieved elegantly.
>>>
>>> Just some food for thought... Probably little you can do on x86.
>>>
>>
>> Yes, it is about the priority of available ISM devices, and now it
>> is decided by their order in the smcd_dev_list. The later registered
>> ISMv2 devices(without pnetid) will be added to the beginning of the
>> list (see smcd_register_dev()). So there is a probability that
>> loopback-ism will not be ranked first, since it is added into list
>> earlier during smc_init().
>>
>> If we have the runtime switch of loopback-ism, we can re-active the
>> loopback-ism, that make it be re-added into the beginning of the dev
>> list and be chosen first. Or a new netlink command to adjust the slot
>> order of available ISM devices in the list. As we discussed before,
>> that could be tasks in stage 1 or stage 2.
>>
>> Thanks!
>
> Maybe when adding the ISM devices we could instead make sure that all
> ISM devices are added after loopback and loopback is added in the
> beginning. I think loopback should always be preferred and would
> consider it a bug if it isn't faster too. Between virtio-ism and ISM it
> may be less clear so maybe for stage 2 we would want a priority setting
> and then insert ordered by priority. Thoughts?
I have no objection. If we all agree, I will keep it at the beginning of the list.

Thanks!


2024-04-04 15:38:53

by Niklas Schnelle

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism

On Thu, 2024-04-04 at 21:12 +0800, Wen Gu wrote:
>
> On 2024/4/4 19:42, Niklas Schnelle wrote:
> > On Thu, 2024-04-04 at 17:32 +0800, Wen Gu wrote:
> > >
> > > On 2024/4/4 00:25, Gerd Bayer wrote:
> > > > On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
> > > > > This implements some operations that loopback-ism does not support
> > > > > currently:
> > > > > - vlan operations, since there is no strong use-case for it.
> > > > > - signal_event operations, since there is no event to be processed
> > > > > by the loopback-ism device.
> > > >
> > > > Hi Wen,
> > > >
> > > > I wonder if the these operations that are not supported by loopback-ism
> > > > should rather be marked "optional" in the struct smcd_ops, and the
> > > > calling code should call these only when they are implemented.
> > > >
> > > > Of course this would mean more changes to net/smc/smc_core.c - but
> > > > loopback-ism could omit these "boiler-plate" functions.
> > > >
> > >
> > > Hi Gerd.
> > >
> > > Thank you for the thoughts! I agree that checks like 'if(smcd->ops->xxx)'
> > > can avoid the device driver from implementing unsupported operations. But I
> > > am afraid that which operations need to be defined as 'optional' may differ
> > > from different device perspectives (e.g. for loopback-ism they are vlan-related
> > > opts and signal_event). So I perfer to simply let the smc protocol assume
> > > that all operations have been implemented, and let drivers to decide which
> > > ones are unsupported in implementation. What do you think?
> > >
> > > Thanks!
> > >
> >
> > I agree with Gerd, in my opinion it is better to document ops as
> > optional and then allow their function pointers to be NULL and check
> > for that. Acting like they are supported and then they turn out to be
> > nops to me seems to contradict the principle of least surprises. I also
> > think we can find a subset of mandatory ops without which SMC-D is
> > impossible and then everything else should be optional.
>
> I see. If we all agree to classify smcd_ops into mandatory and optional ones,
> I'll add a patch to mark the optional ops and check if they are implemented.

Keep in mind I don't speak for the SMC maintainers but that does sound
reasonable to me.

>
> >
> > As a first guess I think the following options may be mandatory:
> >
> > * query_remote_gid()
> > * register_dmb()/unregister_dmb()
> > * move_data()
> > For this one could argue that either move_data() or
> > attach_dmb()/detach_dmb() is required though personally I would
> > prefer to always have move_data() as a fallback and simple API
> > * supports_v2()
> > * get_local_gid()
> > * get_chid()
> > * get_dev()
> I agree with this classification. Just one point, maybe we can take
> supports_v2() as an optional ops, like support_dmb_nocopy()? e.g. if
> it is not implemented, we treat it as an ISMv1.
>
> Thanks!

Interpreting a NULL supports_v2() as not supporting v2 sounds
reasonable to me.

2024-04-04 16:05:50

by Niklas Schnelle

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 05/11] net/smc: implement DMB-related operations of loopback-ism

On Thu, 2024-04-04 at 21:44 +0800, Wen Gu wrote:
>
> On 2024/4/4 19:27, Niklas Schnelle wrote:
> > On Thu, 2024-04-04 at 18:20 +0800, Wen Gu wrote:
> > >
> > > On 2024/4/4 01:20, Gerd Bayer wrote:
> > > > On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
> > > >
> > > > When I instrumented this to see, why I still see tons of my other
> > > > temporary instrumentation messages from the "ism" driver, I found that
> > > > in my setup loopback-ism is used rather infrequently.
> > > >
> > > > I suspect this is due to how the SMC proposals are constructed in
> > > > net/smc/af_smc.c and net/smc/smc_pnet.c - and later evaluated in
> > > > smc_check_ism_v2_match() - where there is a first-come-first-serve
> > > > selection.
> > > >
> > > > I wonder if one should change that to favour loopback-ism over "real"
> > > > ISM devices - and how this could be achieved elegantly.
> > > >
> > > > Just some food for thought... Probably little you can do on x86.
> > > >
> > >
> > > Yes, it is about the priority of available ISM devices, and now it
> > > is decided by their order in the smcd_dev_list. The later registered
> > > ISMv2 devices(without pnetid) will be added to the beginning of the
> > > list (see smcd_register_dev()). So there is a probability that
> > > loopback-ism will not be ranked first, since it is added into list
> > > earlier during smc_init().
> > >
> > > If we have the runtime switch of loopback-ism, we can re-active the
> > > loopback-ism, that make it be re-added into the beginning of the dev
> > > list and be chosen first. Or a new netlink command to adjust the slot
> > > order of available ISM devices in the list. As we discussed before,
> > > that could be tasks in stage 1 or stage 2.
> > >
> > > Thanks!
> >
> > Maybe when adding the ISM devices we could instead make sure that all
> > ISM devices are added after loopback and loopback is added in the
> > beginning. I think loopback should always be preferred and would
> > consider it a bug if it isn't faster too. Between virtio-ism and ISM it
> > may be less clear so maybe for stage 2 we would want a priority setting
> > and then insert ordered by priority. Thoughts?
> I have no objection. If we all agree, I will keep it at the beginning of the list.
>
> Thanks!
>
>

I think this is a decision that needs to be made by the SMC maintainers
and Wenjia will be back next week. That said the current code basically
prioritizes ISM devices without that having been a conscious and
documented decision. Also note that due to hotplug an LPAR could be
using loopback-ism happily until suddenly an ISM device is hot plugged
and new connections suddenly switch to ISM.

2024-04-09 01:45:54

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism



On 2024/4/4 23:15, Niklas Schnelle wrote:
> On Thu, 2024-04-04 at 21:12 +0800, Wen Gu wrote:
>>
>> On 2024/4/4 19:42, Niklas Schnelle wrote:
>>> On Thu, 2024-04-04 at 17:32 +0800, Wen Gu wrote:
>>>>
>>>> On 2024/4/4 00:25, Gerd Bayer wrote:
>>>>> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>>>>>> This implements some operations that loopback-ism does not support
>>>>>> currently:
>>>>>> - vlan operations, since there is no strong use-case for it.
>>>>>> - signal_event operations, since there is no event to be processed
>>>>>> by the loopback-ism device.
>>>>>
>>>>> Hi Wen,
>>>>>
>>>>> I wonder if the these operations that are not supported by loopback-ism
>>>>> should rather be marked "optional" in the struct smcd_ops, and the
>>>>> calling code should call these only when they are implemented.
>>>>>
>>>>> Of course this would mean more changes to net/smc/smc_core.c - but
>>>>> loopback-ism could omit these "boiler-plate" functions.
>>>>>
>>>>
>>>> Hi Gerd.
>>>>
>>>> Thank you for the thoughts! I agree that checks like 'if(smcd->ops->xxx)'
>>>> can avoid the device driver from implementing unsupported operations. But I
>>>> am afraid that which operations need to be defined as 'optional' may differ
>>>> from different device perspectives (e.g. for loopback-ism they are vlan-related
>>>> opts and signal_event). So I perfer to simply let the smc protocol assume
>>>> that all operations have been implemented, and let drivers to decide which
>>>> ones are unsupported in implementation. What do you think?
>>>>
>>>> Thanks!
>>>>
>>>
>>> I agree with Gerd, in my opinion it is better to document ops as
>>> optional and then allow their function pointers to be NULL and check
>>> for that. Acting like they are supported and then they turn out to be
>>> nops to me seems to contradict the principle of least surprises. I also
>>> think we can find a subset of mandatory ops without which SMC-D is
>>> impossible and then everything else should be optional.
>>
>> I see. If we all agree to classify smcd_ops into mandatory and optional ones,
>> I'll add a patch to mark the optional ops and check if they are implemented.
>
> Keep in mind I don't speak for the SMC maintainers but that does sound
> reasonable to me.
>

Hi Wenjia and Jan, do you have any comments on this and [1]? Thanks!

[1] https://lore.kernel.org/netdev/[email protected]/

>>
>>>
>>> As a first guess I think the following options may be mandatory:
>>>
>>> * query_remote_gid()
>>> * register_dmb()/unregister_dmb()
>>> * move_data()
>>> For this one could argue that either move_data() or
>>> attach_dmb()/detach_dmb() is required though personally I would
>>> prefer to always have move_data() as a fallback and simple API
>>> * supports_v2()
>>> * get_local_gid()
>>> * get_chid()
>>> * get_dev()
>> I agree with this classification. Just one point, maybe we can take
>> supports_v2() as an optional ops, like support_dmb_nocopy()? e.g. if
>> it is not implemented, we treat it as an ISMv1.
>>
>> Thanks!
>
> Interpreting a NULL supports_v2() as not supporting v2 sounds
> reasonable to me.

2024-04-11 07:46:49

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 00/11] net/smc: SMC intra-OS shortcut with loopback-ism



On 2024/4/3 19:10, Gerd Bayer wrote:
> On Wed, 2024-04-03 at 14:35 +0800, Wen Gu wrote:
>>
>>
>> On 2024/3/24 21:55, Wen Gu wrote:
>>> This patch set acts as the second part of the new version of [1]
>>> (The first
>>> part can be referred from [2]), the updated things of this version
>>> are listed
>>> at the end.
>>
>>> Change log:
>>>
>>> RFC v5->RFC v4:
>>> - Patch #2: minor changes in description of config SMC_LO and
>>> comments.
>>> - Patch #10: minor changes in comments and
>>> if(smc_ism_support_dmb_nocopy())
>>>    check in smcd_cdc_msg_send().
>>> - Patch #3: change smc_lo_generate_id() to smc_lo_generate_ids()
>>> and SMC_LO_CHID
>>>    to SMC_LO_RESERVED_CHID.
>>> - Patch #5: memcpy while holding the ldev->dmb_ht_lock.
>>> - Some expression changes in commit logs.
>>>
>>
>> Hi, Jan. Do you have any comments on this version and should I post a
>> new patch series without 'RFC'? Thank you.
>
> Hi Wen,
>
> Jan has been out sick for a little while now, and Wenjia is expected
> back from a longer vacation tomorrow. So if you could hold off until
> begin of next week, Wenjia might have some more feedback.
>
> In the meantime, I'm looking at your patchset...
>
> Thank you, Gerd
>

Hi Gerd, is there any further information? I am wondering if I
should wait for more feedback from SMC maintainers. Thanks!


Hi Wenjia, when it's convenient for you, could you please confirm
if [1] and [2] need to be included in the next version? Thanks!

[1] https://lore.kernel.org/netdev/[email protected]/
[2] https://lore.kernel.org/netdev/[email protected]/

2024-04-11 09:33:33

by Wenjia Zhang

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 00/11] net/smc: SMC intra-OS shortcut with loopback-ism



On 11.04.24 09:45, Wen Gu wrote:
>
>
> On 2024/4/3 19:10, Gerd Bayer wrote:
>> On Wed, 2024-04-03 at 14:35 +0800, Wen Gu wrote:
>>>
>>>
>>> On 2024/3/24 21:55, Wen Gu wrote:
>>>> This patch set acts as the second part of the new version of [1]
>>>> (The first
>>>> part can be referred from [2]), the updated things of this version
>>>> are listed
>>>> at the end.
>>>
>>>> Change log:
>>>>
>>>> RFC v5->RFC v4:
>>>> - Patch #2: minor changes in description of config SMC_LO and
>>>> comments.
>>>> - Patch #10: minor changes in comments and
>>>> if(smc_ism_support_dmb_nocopy())
>>>>     check in smcd_cdc_msg_send().
>>>> - Patch #3: change smc_lo_generate_id() to smc_lo_generate_ids()
>>>> and SMC_LO_CHID
>>>>     to SMC_LO_RESERVED_CHID.
>>>> - Patch #5: memcpy while holding the ldev->dmb_ht_lock.
>>>> - Some expression changes in commit logs.
>>>>
>>>
>>> Hi, Jan. Do you have any comments on this version and should I post a
>>> new patch series without 'RFC'? Thank you.
>>
>> Hi Wen,
>>
>> Jan has been out sick for a little while now, and Wenjia is expected
>> back from a longer vacation tomorrow. So if you could hold off until
>> begin of next week, Wenjia might have some more feedback.
>>
>> In the meantime, I'm looking at your patchset...
>>
>> Thank you, Gerd
>>
>
> Hi Gerd, is there any further information? I am wondering if I
> should wait for more feedback from SMC maintainers. Thanks!
>
>
> Hi Wenjia, when it's convenient for you, could you please confirm
> if [1] and [2] need to be included in the next version? Thanks!
>
> [1]
> https://lore.kernel.org/netdev/[email protected]/
> [2]
> https://lore.kernel.org/netdev/[email protected]/
>

Hi Wen,

I'm just back, thank you for the patience!

Firstly I want to thank Gerd and Niklas for review and bringing up these
points!

Here are some of my options on that:

To [1]:
I agree to document the ops as otional if it must not be supported.
Since I don't really have any ideas, the classification souds reasonable
to me. Going to the details, what about to take following options as
mandatory:

* query_remote_gid()
* register_dmb()/unregister_dmb()
* move_data() : I do see the necessary here.
* get_local_gid()
* get_chid()
* get_dev()

To [2]:
I also agree to keep the ism-loopback at the very beginning of the List.
That acting is also what I imaged previously. Thank you, gerd, again for
testing it and find it out!

Thanks,
Wenjia

2024-04-11 09:57:06

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 00/11] net/smc: SMC intra-OS shortcut with loopback-ism



On 2024/4/11 17:32, Wenjia Zhang wrote:
>
>
> On 11.04.24 09:45, Wen Gu wrote:
>>
>>
>> On 2024/4/3 19:10, Gerd Bayer wrote:
>>> On Wed, 2024-04-03 at 14:35 +0800, Wen Gu wrote:
>>>>
>>>>
>>>> On 2024/3/24 21:55, Wen Gu wrote:
>>>>> This patch set acts as the second part of the new version of [1]
>>>>> (The first
>>>>> part can be referred from [2]), the updated things of this version
>>>>> are listed
>>>>> at the end.
>>>>
>>>>> Change log:
>>>>>
>>>>> RFC v5->RFC v4:
>>>>> - Patch #2: minor changes in description of config SMC_LO and
>>>>> comments.
>>>>> - Patch #10: minor changes in comments and
>>>>> if(smc_ism_support_dmb_nocopy())
>>>>>     check in smcd_cdc_msg_send().
>>>>> - Patch #3: change smc_lo_generate_id() to smc_lo_generate_ids()
>>>>> and SMC_LO_CHID
>>>>>     to SMC_LO_RESERVED_CHID.
>>>>> - Patch #5: memcpy while holding the ldev->dmb_ht_lock.
>>>>> - Some expression changes in commit logs.
>>>>>
>>>>
>>>> Hi, Jan. Do you have any comments on this version and should I post a
>>>> new patch series without 'RFC'? Thank you.
>>>
>>> Hi Wen,
>>>
>>> Jan has been out sick for a little while now, and Wenjia is expected
>>> back from a longer vacation tomorrow. So if you could hold off until
>>> begin of next week, Wenjia might have some more feedback.
>>>
>>> In the meantime, I'm looking at your patchset...
>>>
>>> Thank you, Gerd
>>>
>>
>> Hi Gerd, is there any further information? I am wondering if I
>> should wait for more feedback from SMC maintainers. Thanks!
>>
>>
>> Hi Wenjia, when it's convenient for you, could you please confirm
>> if [1] and [2] need to be included in the next version? Thanks!
>>
>> [1] https://lore.kernel.org/netdev/[email protected]/
>> [2] https://lore.kernel.org/netdev/[email protected]/
>>
>
> Hi Wen,
>
> I'm just back, thank you for the patience!
>
> Firstly I want to thank Gerd and Niklas for review and bringing up these points!
>
> Here are some of my options on that:
>
> To [1]:
> I agree to document the ops as otional if it must not be supported. Since I don't really have any ideas, the
> classification souds reasonable to me. Going to the details, what about to take following options as mandatory:
>
> * query_remote_gid()
> * register_dmb()/unregister_dmb()
> * move_data() : I do see the necessary here.
> * get_local_gid()
> * get_chid()
> * get_dev()
>
> To [2]:
> I also agree to keep the ism-loopback at the very beginning of the List. That acting is also what I imaged previously.
> Thank you, gerd, again for testing it and find it out!
>
> Thanks,
> Wenjia

Hi Wenjia, welcome back! :)

OK, then I will take these in my next version. Thank you all!

2024-04-11 11:13:00

by Alexandra Winter

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism



On 09.04.24 03:44, Wen Gu wrote:
>
>
> On 2024/4/4 23:15, Niklas Schnelle wrote:
>> On Thu, 2024-04-04 at 21:12 +0800, Wen Gu wrote:
>>>
>>> On 2024/4/4 19:42, Niklas Schnelle wrote:
>>>> On Thu, 2024-04-04 at 17:32 +0800, Wen Gu wrote:
>>>>>
>>>>> On 2024/4/4 00:25, Gerd Bayer wrote:
>>>>>> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>>>>>>> This implements some operations that loopback-ism does not support
>>>>>>> currently:
>>>>>>>     - vlan operations, since there is no strong use-case for it.
>>>>>>>     - signal_event operations, since there is no event to be processed
>>>>>>> by the loopback-ism device.
>>>>>>
>>>>>> Hi Wen,
>>>>>>
>>>>>> I wonder if the these operations that are not supported by loopback-ism
>>>>>> should rather be marked "optional" in the struct smcd_ops, and the
>>>>>> calling code should call these only when they are implemented.
>>>>>>
>>>>>> Of course this would mean more changes to net/smc/smc_core.c - but
>>>>>> loopback-ism could omit these "boiler-plate" functions.
>>>>>>
>>>>>
>>>>> Hi Gerd.
>>>>>
>>>>> Thank you for the thoughts! I agree that checks like 'if(smcd->ops->xxx)'
>>>>> can avoid the device driver from implementing unsupported operations. But I
>>>>> am afraid that which operations need to be defined as 'optional' may differ
>>>>> from different device perspectives (e.g. for loopback-ism they are vlan-related
>>>>> opts and signal_event). So I perfer to simply let the smc protocol assume
>>>>> that all operations have been implemented, and let drivers to decide which
>>>>> ones are unsupported in implementation. What do you think?
>>>>>
>>>>> Thanks!
>>>>>
>>>>
>>>> I agree with Gerd, in my opinion it is better to document ops as
>>>> optional and then allow their function pointers to be NULL and check
>>>> for that. Acting like they are supported and then they turn out to be
>>>> nops to me seems to contradict the principle of least surprises. I also
>>>> think we can find a subset of mandatory ops without which SMC-D is
>>>> impossible and then everything else should be optional.
>>>
>>> I see. If we all agree to classify smcd_ops into mandatory and optional ones,
>>> I'll add a patch to mark the optional ops and check if they are implemented.
>>
>> Keep in mind I don't speak for the SMC maintainers but that does sound
>> reasonable to me.
>>
>
> Hi Wenjia and Jan, do you have any comments on this and [1]? Thanks!
>
> [1] https://lore.kernel.org/netdev/[email protected]/
>
>>>
>>>>
>>>> As a first guess I think the following options may be mandatory:
>>>>
>>>> * query_remote_gid()
>>>> * register_dmb()/unregister_dmb()
>>>> * move_data()
>>>>     For this one could argue that either move_data() or
>>>>     attach_dmb()/detach_dmb() is required though personally I would
>>>>     prefer to always have move_data() as a fallback and simple API
>>>> * supports_v2()
>>>> * get_local_gid()
>>>> * get_chid()
>>>> * get_dev()
>>> I agree with this classification. Just one point, maybe we can take
>>> supports_v2() as an optional ops, like support_dmb_nocopy()? e.g. if
>>> it is not implemented, we treat it as an ISMv1.
>>>
>>> Thanks!
>>
>> Interpreting a NULL supports_v2() as not supporting v2 sounds
>> reasonable to me.
>

Let me add my thoughts to the discussion:
For the vlan operations and signal_event operations that loopback-ism does
not support:
I like the idea to set the ops to NULL and make sure the caller checks that
and can live with it. That is readable and efficient.

I don't think there is a need to discuss a strategy now, which ops could be
optional in the future. This is all inside the kernel. loopback-ism is even
inside the smc module. Such comments in the code get outdated very easily.

I would propose to mark those as optional struct smcd_ops, where all callers can
handle a NULL pointer and still be productive.
Future support of other devices for SMC-D can update that.




2024-04-12 02:05:46

by Wen Gu

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism



On 2024/4/11 19:12, Alexandra Winter wrote:
>
>
> On 09.04.24 03:44, Wen Gu wrote:
>>
>>
>> On 2024/4/4 23:15, Niklas Schnelle wrote:
>>> On Thu, 2024-04-04 at 21:12 +0800, Wen Gu wrote:
>>>>
>>>> On 2024/4/4 19:42, Niklas Schnelle wrote:
>>>>> On Thu, 2024-04-04 at 17:32 +0800, Wen Gu wrote:
>>>>>>
>>>>>> On 2024/4/4 00:25, Gerd Bayer wrote:
>>>>>>> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>>>>>>>> This implements some operations that loopback-ism does not support
>>>>>>>> currently:
>>>>>>>>     - vlan operations, since there is no strong use-case for it.
>>>>>>>>     - signal_event operations, since there is no event to be processed
>>>>>>>> by the loopback-ism device.
>>>>>>>
>>>>>>> Hi Wen,
>>>>>>>
>>>>>>> I wonder if the these operations that are not supported by loopback-ism
>>>>>>> should rather be marked "optional" in the struct smcd_ops, and the
>>>>>>> calling code should call these only when they are implemented.
>>>>>>>
>>>>>>> Of course this would mean more changes to net/smc/smc_core.c - but
>>>>>>> loopback-ism could omit these "boiler-plate" functions.
>>>>>>>
>>>>>>
>>>>>> Hi Gerd.
>>>>>>
>>>>>> Thank you for the thoughts! I agree that checks like 'if(smcd->ops->xxx)'
>>>>>> can avoid the device driver from implementing unsupported operations. But I
>>>>>> am afraid that which operations need to be defined as 'optional' may differ
>>>>>> from different device perspectives (e.g. for loopback-ism they are vlan-related
>>>>>> opts and signal_event). So I perfer to simply let the smc protocol assume
>>>>>> that all operations have been implemented, and let drivers to decide which
>>>>>> ones are unsupported in implementation. What do you think?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>
>>>>> I agree with Gerd, in my opinion it is better to document ops as
>>>>> optional and then allow their function pointers to be NULL and check
>>>>> for that. Acting like they are supported and then they turn out to be
>>>>> nops to me seems to contradict the principle of least surprises. I also
>>>>> think we can find a subset of mandatory ops without which SMC-D is
>>>>> impossible and then everything else should be optional.
>>>>
>>>> I see. If we all agree to classify smcd_ops into mandatory and optional ones,
>>>> I'll add a patch to mark the optional ops and check if they are implemented.
>>>
>>> Keep in mind I don't speak for the SMC maintainers but that does sound
>>> reasonable to me.
>>>
>>
>> Hi Wenjia and Jan, do you have any comments on this and [1]? Thanks!
>>
>> [1] https://lore.kernel.org/netdev/[email protected]/
>>
>>>>
>>>>>
>>>>> As a first guess I think the following options may be mandatory:
>>>>>
>>>>> * query_remote_gid()
>>>>> * register_dmb()/unregister_dmb()
>>>>> * move_data()
>>>>>     For this one could argue that either move_data() or
>>>>>     attach_dmb()/detach_dmb() is required though personally I would
>>>>>     prefer to always have move_data() as a fallback and simple API
>>>>> * supports_v2()
>>>>> * get_local_gid()
>>>>> * get_chid()
>>>>> * get_dev()
>>>> I agree with this classification. Just one point, maybe we can take
>>>> supports_v2() as an optional ops, like support_dmb_nocopy()? e.g. if
>>>> it is not implemented, we treat it as an ISMv1.
>>>>
>>>> Thanks!
>>>
>>> Interpreting a NULL supports_v2() as not supporting v2 sounds
>>> reasonable to me.
>>
>
> Let me add my thoughts to the discussion:
> For the vlan operations and signal_event operations that loopback-ism does
> not support:
> I like the idea to set the ops to NULL and make sure the caller checks that
> and can live with it. That is readable and efficient.
>
> I don't think there is a need to discuss a strategy now, which ops could be
> optional in the future. This is all inside the kernel. loopback-ism is even
> inside the smc module. Such comments in the code get outdated very easily.
>
> I would propose to mark those as optional struct smcd_ops, where all callers can
> handle a NULL pointer and still be productive.
> Future support of other devices for SMC-D can update that.
>
>

Hi Sandy, just to confirm if I understand you correctly.

You are proposing that don't draw a conclusion about the classification now,
but supplementally mark which one become a optional operation in struct smcd_ops
during the introduction of new devices for SMC-D.

2024-04-12 12:21:11

by Wenjia Zhang

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism



On 12.04.24 04:02, Wen Gu wrote:
>
>
> On 2024/4/11 19:12, Alexandra Winter wrote:
>>
>>
>> On 09.04.24 03:44, Wen Gu wrote:
>>>
>>>
>>> On 2024/4/4 23:15, Niklas Schnelle wrote:
>>>> On Thu, 2024-04-04 at 21:12 +0800, Wen Gu wrote:
>>>>>
>>>>> On 2024/4/4 19:42, Niklas Schnelle wrote:
>>>>>> On Thu, 2024-04-04 at 17:32 +0800, Wen Gu wrote:
>>>>>>>
>>>>>>> On 2024/4/4 00:25, Gerd Bayer wrote:
>>>>>>>> On Sun, 2024-03-24 at 21:55 +0800, Wen Gu wrote:
>>>>>>>>> This implements some operations that loopback-ism does not support
>>>>>>>>> currently:
>>>>>>>>>      - vlan operations, since there is no strong use-case for it.
>>>>>>>>>      - signal_event operations, since there is no event to be
>>>>>>>>> processed
>>>>>>>>> by the loopback-ism device.
>>>>>>>>
>>>>>>>> Hi Wen,
>>>>>>>>
>>>>>>>> I wonder if the these operations that are not supported by
>>>>>>>> loopback-ism
>>>>>>>> should rather be marked "optional" in the struct smcd_ops, and the
>>>>>>>> calling code should call these only when they are implemented.
>>>>>>>>
>>>>>>>> Of course this would mean more changes to net/smc/smc_core.c - but
>>>>>>>> loopback-ism could omit these "boiler-plate" functions.
>>>>>>>>
>>>>>>>
>>>>>>> Hi Gerd.
>>>>>>>
>>>>>>> Thank you for the thoughts! I agree that checks like
>>>>>>> 'if(smcd->ops->xxx)'
>>>>>>> can avoid the device driver from implementing unsupported
>>>>>>> operations. But I
>>>>>>> am afraid that which operations need to be defined as 'optional'
>>>>>>> may differ
>>>>>>> from different device perspectives (e.g. for loopback-ism they
>>>>>>> are vlan-related
>>>>>>> opts and signal_event). So I perfer to simply let the smc
>>>>>>> protocol assume
>>>>>>> that all operations have been implemented, and let drivers to
>>>>>>> decide which
>>>>>>> ones are unsupported in implementation. What do you think?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>
>>>>>> I agree with Gerd, in my opinion it is better to document ops as
>>>>>> optional and then allow their function pointers to be NULL and check
>>>>>> for that. Acting like they are supported and then they turn out to be
>>>>>> nops to me seems to contradict the principle of least surprises. I
>>>>>> also
>>>>>> think we can find a subset of mandatory ops without which SMC-D is
>>>>>> impossible and then everything else should be optional.
>>>>>
>>>>> I see. If we all agree to classify smcd_ops into mandatory and
>>>>> optional ones,
>>>>> I'll add a patch to mark the optional ops and check if they are
>>>>> implemented.
>>>>
>>>> Keep in mind I don't speak for the SMC maintainers but that does sound
>>>> reasonable to me.
>>>>
>>>
>>> Hi Wenjia and Jan, do you have any comments on this and [1]? Thanks!
>>>
>>> [1]
>>> https://lore.kernel.org/netdev/[email protected]/
>>>
>>>>>
>>>>>>
>>>>>> As a first guess I think the following options may be mandatory:
>>>>>>
>>>>>> * query_remote_gid()
>>>>>> * register_dmb()/unregister_dmb()
>>>>>> * move_data()
>>>>>>      For this one could argue that either move_data() or
>>>>>>      attach_dmb()/detach_dmb() is required though personally I would
>>>>>>      prefer to always have move_data() as a fallback and simple API
>>>>>> * supports_v2()
>>>>>> * get_local_gid()
>>>>>> * get_chid()
>>>>>> * get_dev()
>>>>> I agree with this classification. Just one point, maybe we can take
>>>>> supports_v2() as an optional ops, like support_dmb_nocopy()? e.g. if
>>>>> it is not implemented, we treat it as an ISMv1.
>>>>>
>>>>> Thanks!
>>>>
>>>> Interpreting a NULL supports_v2() as not supporting v2 sounds
>>>> reasonable to me.
>>>
>>
>> Let me add my thoughts to the discussion:
>> For the vlan operations and signal_event operations that loopback-ism
>> does
>> not support:
>> I like the idea to set the ops to NULL and make sure the caller checks
>> that
>> and can live with it. That is readable and efficient.
>>
>> I don't think there is a need to discuss a strategy now, which ops
>> could be
>> optional in the future. This is all inside the kernel. loopback-ism is
>> even
>> inside the smc module. Such comments in the code get outdated very
>> easily.
>>
>> I would propose to mark those as optional struct smcd_ops, where all
>> callers can
>> handle a NULL pointer and still be productive.
>> Future support of other devices for SMC-D can update that.
>>
>>
>
> Hi Sandy, just to confirm if I understand you correctly.
>
> You are proposing that don't draw a conclusion about the classification
> now,
> but supplementally mark which one become a optional operation in struct
> smcd_ops
> during the introduction of new devices for SMC-D.

@Sandy, could you please elaborate your proposal, or comfirm what Wen
interpreted is what you mean?
If it is like what he said. IMO, I don't think it's necessary to dicuss
further on which ops could be mandatory or optional. It's actually clear
to me which are mandatory. And the classification should be much cleaner
for our code. However, I agree that the classification is not really in
the scope of this patches series. Especially if it is too expensive to
rebuild it, we do need a seperate set of cleanup patches to do it. Thus,
I'd like to let Wen take the decisions by ihmself. Any objections?

@All, if anyone has any strong opinion, I appreciate it if you could
bring up your options as soon as possible. That would help us to
accelerate the whole process.

Thanks,
Wenjia


2024-04-12 14:58:43

by Alexandra Winter

[permalink] [raw]
Subject: Re: [RFC PATCH net-next v5 04/11] net/smc: implement some unsupported operations of loopback-ism



On 12.04.24 04:02, Wen Gu wrote:
> Hi Sandy, just to confirm if I understand you correctly.
>
> You are proposing that don't draw a conclusion about the classification now,
> but supplementally mark which one become a optional operation in struct smcd_ops
> during the introduction of new devices for SMC-D.

Yes.