2015-04-20 08:29:10

by Michael Wang

Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers


Since v4:
* Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
Roland, Ira and Steve :-) Please remind me if I missed anything :-P
* Fix logic issues in 3#, 14#
* Refine 3#, 4#, 5# with label 'free'
* Rework 10# to stop using port 1 when a port is already assigned

There is plenty of lengthy code that checks the transport type of an IB
device, or the link-layer type of its port, when what we actually want to
know is whether a particular management capability/feature is supported by
the device/port.

Thus, instead of inferring, we should have our own mechanism for checking IB
management capabilities/protocols/features; several proposals are listed below.

This patch set reforms the method of getting the transport type: we now use
query_transport() instead of inferring it from the transport and the link
layer separately, and we define a new transport type to make the concept
more reasonable.

Mapping List:
            node-type   link-layer   old-transport   new-transport
nes         RNIC        ETH          IWARP           IWARP
amso1100    RNIC        ETH          IWARP           IWARP
cxgb3       RNIC        ETH          IWARP           IWARP
cxgb4       RNIC        ETH          IWARP           IWARP
usnic       USNIC_UDP   ETH          USNIC_UDP       USNIC_UDP
ocrdma      IB_CA       ETH          IB              IBOE
mlx4        IB_CA       IB/ETH       IB              IB/IBOE
mlx5        IB_CA       IB           IB              IB
ehca        IB_CA       IB           IB              IB
ipath       IB_CA       IB           IB              IB
mthca       IB_CA       IB           IB              IB
qib         IB_CA       IB           IB              IB

For example:
if (transport == IB) && (link-layer == ETH)
will now become:
if (query_transport() == IBOE)

Thus we will be able to get rid of the separate transport and link-layer
checks, which will make it easier to add a new protocol/technology (like
OPA). With the management helpers introduced here, the IB management logic
will also be clearer and easier to extend.
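
To make the example above concrete, here is a minimal standalone sketch of
the old inference next to the new single query. The enum and function names
are hypothetical stand-ins that only mirror the identifiers in this series;
this is not the kernel code itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel enums; names mirror the patch. */
enum transport {
	TRANSPORT_IB,
	TRANSPORT_IWARP,
	TRANSPORT_USNIC,
	TRANSPORT_USNIC_UDP,
	TRANSPORT_IBOE,
};

enum link_layer {
	LINK_LAYER_INFINIBAND,
	LINK_LAYER_ETHERNET,
};

/* Old style: infer "IBoE" from two independent properties. */
static bool old_is_iboe(enum transport legacy_transport, enum link_layer ll)
{
	return legacy_transport == TRANSPORT_IB && ll == LINK_LAYER_ETHERNET;
}

/* New style: the port reports its transport directly. */
static bool new_is_iboe(enum transport queried)
{
	return queried == TRANSPORT_IBOE;
}

/* Mapping-list rule for IB_CA nodes: the link layer selects IB or IBOE. */
static enum transport new_transport_for_ca(enum link_layer ll)
{
	return ll == LINK_LAYER_ETHERNET ? TRANSPORT_IBOE : TRANSPORT_IB;
}
```

For an IB_CA port both predicates agree; the difference is that the new one
needs only the value returned by the per-port query.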

Highlights:
The patch set covers a wide range of IB code, so suggestions from those who
are familiar with a particular part would be invaluable ;-)

Patches 1#~15# contain all the logic reform, 16#~25# introduce the
management helpers, and 26#~27# do the cleanup.

The patches haven't been tested yet; we would appreciate a Tested-by from
anyone who has this HW :-)

Doug suggested the bitmask mechanism:
https://www.mail-archive.com/[email protected]/msg23765.html
which could be the plan for future reform; we prefer that to be another
series which focuses on semantics and performance.

This patch set is somewhat 'bloated' now and it may be a good time for
staging. I'd like to suggest we focus on improving the existing helpers and
push all further reforms into the next series ;-)

Proposals:
Sean:
https://www.mail-archive.com/[email protected]/msg23339.html
Doug:
https://www.mail-archive.com/[email protected]/msg23418.html
https://www.mail-archive.com/[email protected]/msg23765.html
Jason:
https://www.mail-archive.com/[email protected]/msg23425.html

Michael Wang (27):
IB/Verbs: Implement new callback query_transport()
IB/Verbs: Implement raw management helpers
IB/Verbs: Reform IB-core mad/agent/user_mad
IB/Verbs: Reform IB-core cm
IB/Verbs: Reform IB-core sa_query
IB/Verbs: Reform IB-core multicast
IB/Verbs: Reform IB-ulp ipoib
IB/Verbs: Reform IB-ulp xprtrdma
IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
IB/Verbs: Reform cm related part in IB-core cma/ucm
IB/Verbs: Reform route related part in IB-core cma
IB/Verbs: Reform mcast related part in IB-core cma
IB/Verbs: Reserve legacy transport type in 'dev_addr'
IB/Verbs: Reform cma_acquire_dev()
IB/Verbs: Reform rest part in IB-core cma
IB/Verbs: Use management helper cap_ib_mad()
IB/Verbs: Use management helper cap_ib_smi()
IB/Verbs: Use management helper cap_ib_cm()
IB/Verbs: Use management helper cap_iw_cm()
IB/Verbs: Use management helper cap_ib_sa()
IB/Verbs: Use management helper cap_ib_mcast()
IB/Verbs: Use management helper cap_ipoib()
IB/Verbs: Use management helper cap_read_multi_sge()
IB/Verbs: Use management helper cap_af_ib()
IB/Verbs: Use management helper cap_eth_ah()
IB/Verbs: Clean up rdma_ib_or_iboe()
IB/Verbs: Cleanup rdma_node_get_transport()

---
drivers/infiniband/core/agent.c | 4
drivers/infiniband/core/cm.c | 26 +-
drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
drivers/infiniband/core/device.c | 1
drivers/infiniband/core/mad.c | 51 ++--
drivers/infiniband/core/multicast.c | 18 -
drivers/infiniband/core/sa_query.c | 41 +--
drivers/infiniband/core/sysfs.c | 8
drivers/infiniband/core/ucm.c | 5
drivers/infiniband/core/ucma.c | 27 --
drivers/infiniband/core/user_mad.c | 32 +-
drivers/infiniband/core/uverbs_cmd.c | 6
drivers/infiniband/core/verbs.c | 33 --
drivers/infiniband/hw/amso1100/c2_provider.c | 7
drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
drivers/infiniband/hw/cxgb4/provider.c | 7
drivers/infiniband/hw/ehca/ehca_hca.c | 6
drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
drivers/infiniband/hw/ehca/ehca_main.c | 1
drivers/infiniband/hw/ipath/ipath_verbs.c | 7
drivers/infiniband/hw/mlx4/main.c | 10
drivers/infiniband/hw/mlx5/main.c | 7
drivers/infiniband/hw/mthca/mthca_provider.c | 7
drivers/infiniband/hw/nes/nes_verbs.c | 6
drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
drivers/infiniband/hw/qib/qib_verbs.c | 7
drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
include/rdma/ib_verbs.h | 204 +++++++++++++++-
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
35 files changed, 584 insertions(+), 368 deletions(-)


2015-04-20 08:33:18

by Michael Wang

Subject: [PATCH v5 01/27] IB/Verbs: Implement new callback query_transport()


Add a new callback, query_transport(), and implement it for each HW driver.

Mapping List:
            node-type   link-layer   old-transport   new-transport
nes         RNIC        ETH          IWARP           IWARP
amso1100    RNIC        ETH          IWARP           IWARP
cxgb3       RNIC        ETH          IWARP           IWARP
cxgb4       RNIC        ETH          IWARP           IWARP
usnic       USNIC_UDP   ETH          USNIC_UDP       USNIC_UDP
ocrdma      IB_CA       ETH          IB              IBOE
mlx4        IB_CA       IB/ETH       IB              IB/IBOE
mlx5        IB_CA       IB           IB              IB
ehca        IB_CA       IB           IB              IB
ipath       IB_CA       IB           IB              IB
mthca       IB_CA       IB           IB              IB
qib         IB_CA       IB           IB              IB

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/device.c | 1 +
drivers/infiniband/core/verbs.c | 4 +++-
drivers/infiniband/hw/amso1100/c2_provider.c | 7 +++++++
drivers/infiniband/hw/cxgb3/iwch_provider.c | 7 +++++++
drivers/infiniband/hw/cxgb4/provider.c | 7 +++++++
drivers/infiniband/hw/ehca/ehca_hca.c | 6 ++++++
drivers/infiniband/hw/ehca/ehca_iverbs.h | 3 +++
drivers/infiniband/hw/ehca/ehca_main.c | 1 +
drivers/infiniband/hw/ipath/ipath_verbs.c | 7 +++++++
drivers/infiniband/hw/mlx4/main.c | 10 ++++++++++
drivers/infiniband/hw/mlx5/main.c | 7 +++++++
drivers/infiniband/hw/mthca/mthca_provider.c | 7 +++++++
drivers/infiniband/hw/nes/nes_verbs.c | 6 ++++++
drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 +
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6 ++++++
drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3 +++
drivers/infiniband/hw/qib/qib_verbs.c | 7 +++++++
drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 +
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6 ++++++
drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2 ++
include/rdma/ib_verbs.h | 7 ++++++-
21 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 18c1ece..a9587c4 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -76,6 +76,7 @@ static int ib_device_check_mandatory(struct ib_device *device)
} mandatory_table[] = {
IB_MANDATORY_FUNC(query_device),
IB_MANDATORY_FUNC(query_port),
+ IB_MANDATORY_FUNC(query_transport),
IB_MANDATORY_FUNC(query_pkey),
IB_MANDATORY_FUNC(query_gid),
IB_MANDATORY_FUNC(alloc_pd),
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index f93eb8d..626c9cf 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -133,14 +133,16 @@ enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_
if (device->get_link_layer)
return device->get_link_layer(device, port_num);

- switch (rdma_node_get_transport(device->node_type)) {
+ switch (device->query_transport(device, port_num)) {
case RDMA_TRANSPORT_IB:
return IB_LINK_LAYER_INFINIBAND;
+ case RDMA_TRANSPORT_IBOE:
case RDMA_TRANSPORT_IWARP:
case RDMA_TRANSPORT_USNIC:
case RDMA_TRANSPORT_USNIC_UDP:
return IB_LINK_LAYER_ETHERNET;
default:
+ BUG();
return IB_LINK_LAYER_UNSPECIFIED;
}
}
diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c
index bdf3507..d46bbb0 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -99,6 +99,12 @@ static int c2_query_port(struct ib_device *ibdev,
return 0;
}

+static enum rdma_transport_type
+c2_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IWARP;
+}
+
static int c2_query_pkey(struct ib_device *ibdev,
u8 port, u16 index, u16 * pkey)
{
@@ -801,6 +807,7 @@ int c2_register_device(struct c2_dev *dev)
dev->ibdev.dma_device = &dev->pcidev->dev;
dev->ibdev.query_device = c2_query_device;
dev->ibdev.query_port = c2_query_port;
+ dev->ibdev.query_transport = c2_query_transport;
dev->ibdev.query_pkey = c2_query_pkey;
dev->ibdev.query_gid = c2_query_gid;
dev->ibdev.alloc_ucontext = c2_alloc_ucontext;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 811b24a..09682e9e 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1232,6 +1232,12 @@ static int iwch_query_port(struct ib_device *ibdev,
return 0;
}

+static enum rdma_transport_type
+iwch_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IWARP;
+}
+
static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
char *buf)
{
@@ -1385,6 +1391,7 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.dma_device = &(dev->rdev.rnic_info.pdev->dev);
dev->ibdev.query_device = iwch_query_device;
dev->ibdev.query_port = iwch_query_port;
+ dev->ibdev.query_transport = iwch_query_transport;
dev->ibdev.query_pkey = iwch_query_pkey;
dev->ibdev.query_gid = iwch_query_gid;
dev->ibdev.alloc_ucontext = iwch_alloc_ucontext;
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index 66bd6a2..a445e0d 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -390,6 +390,12 @@ static int c4iw_query_port(struct ib_device *ibdev, u8 port,
return 0;
}

+static enum rdma_transport_type
+c4iw_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IWARP;
+}
+
static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
char *buf)
{
@@ -506,6 +512,7 @@ int c4iw_register_device(struct c4iw_dev *dev)
dev->ibdev.dma_device = &(dev->rdev.lldi.pdev->dev);
dev->ibdev.query_device = c4iw_query_device;
dev->ibdev.query_port = c4iw_query_port;
+ dev->ibdev.query_transport = c4iw_query_transport;
dev->ibdev.query_pkey = c4iw_query_pkey;
dev->ibdev.query_gid = c4iw_query_gid;
dev->ibdev.alloc_ucontext = c4iw_alloc_ucontext;
diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index 9ed4d25..d5a34a6 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -242,6 +242,12 @@ query_port1:
return ret;
}

+enum rdma_transport_type
+ehca_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
int ehca_query_sma_attr(struct ehca_shca *shca,
u8 port, struct ehca_sma_attr *attr)
{
diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
index 22f79af..cec945f 100644
--- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
+++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
@@ -49,6 +49,9 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props);
int ehca_query_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props);

+enum rdma_transport_type
+ehca_query_transport(struct ib_device *device, u8 port_num);
+
int ehca_query_sma_attr(struct ehca_shca *shca, u8 port,
struct ehca_sma_attr *attr);

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index cd8d290..60e0a09 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -467,6 +467,7 @@ static int ehca_init_device(struct ehca_shca *shca)
shca->ib_device.dma_device = &shca->ofdev->dev;
shca->ib_device.query_device = ehca_query_device;
shca->ib_device.query_port = ehca_query_port;
+ shca->ib_device.query_transport = ehca_query_transport;
shca->ib_device.query_gid = ehca_query_gid;
shca->ib_device.query_pkey = ehca_query_pkey;
/* shca->in_device.modify_device = ehca_modify_device */
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
index 44ea939..58d36e3 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
@@ -1638,6 +1638,12 @@ static int ipath_query_port(struct ib_device *ibdev,
return 0;
}

+static enum rdma_transport_type
+ipath_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
static int ipath_modify_device(struct ib_device *device,
int device_modify_mask,
struct ib_device_modify *device_modify)
@@ -2140,6 +2146,7 @@ int ipath_register_ib_device(struct ipath_devdata *dd)
dev->query_device = ipath_query_device;
dev->modify_device = ipath_modify_device;
dev->query_port = ipath_query_port;
+ dev->query_transport = ipath_query_transport;
dev->modify_port = ipath_modify_port;
dev->query_pkey = ipath_query_pkey;
dev->query_gid = ipath_query_gid;
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index b972c0b..e1424ad 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -420,6 +420,15 @@ static int mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
return __mlx4_ib_query_port(ibdev, port, props, 0);
}

+static enum rdma_transport_type
+mlx4_ib_query_transport(struct ib_device *device, u8 port_num)
+{
+ struct mlx4_dev *dev = to_mdev(device)->dev;
+
+ return dev->caps.port_mask[port_num] == MLX4_PORT_TYPE_IB ?
+ RDMA_TRANSPORT_IB : RDMA_TRANSPORT_IBOE;
+}
+
int __mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
union ib_gid *gid, int netw_view)
{
@@ -2201,6 +2210,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)

ibdev->ib_dev.query_device = mlx4_ib_query_device;
ibdev->ib_dev.query_port = mlx4_ib_query_port;
+ ibdev->ib_dev.query_transport = mlx4_ib_query_transport;
ibdev->ib_dev.get_link_layer = mlx4_ib_port_link_layer;
ibdev->ib_dev.query_gid = mlx4_ib_query_gid;
ibdev->ib_dev.query_pkey = mlx4_ib_query_pkey;
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index cc4ac1e..209c796 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -351,6 +351,12 @@ out:
return err;
}

+static enum rdma_transport_type
+mlx5_ib_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
static int mlx5_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
union ib_gid *gid)
{
@@ -1336,6 +1342,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)

dev->ib_dev.query_device = mlx5_ib_query_device;
dev->ib_dev.query_port = mlx5_ib_query_port;
+ dev->ib_dev.query_transport = mlx5_ib_query_transport;
dev->ib_dev.query_gid = mlx5_ib_query_gid;
dev->ib_dev.query_pkey = mlx5_ib_query_pkey;
dev->ib_dev.modify_device = mlx5_ib_modify_device;
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 415f8e1..67ac6a4 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -179,6 +179,12 @@ static int mthca_query_port(struct ib_device *ibdev,
return err;
}

+static enum rdma_transport_type
+mthca_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
static int mthca_modify_device(struct ib_device *ibdev,
int mask,
struct ib_device_modify *props)
@@ -1281,6 +1287,7 @@ int mthca_register_device(struct mthca_dev *dev)
dev->ib_dev.dma_device = &dev->pdev->dev;
dev->ib_dev.query_device = mthca_query_device;
dev->ib_dev.query_port = mthca_query_port;
+ dev->ib_dev.query_transport = mthca_query_transport;
dev->ib_dev.modify_device = mthca_modify_device;
dev->ib_dev.modify_port = mthca_modify_port;
dev->ib_dev.query_pkey = mthca_query_pkey;
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index c0d0296..8df5b61 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -606,6 +606,11 @@ static int nes_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr
return 0;
}

+static enum rdma_transport_type
+nes_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IWARP;
+}

/**
* nes_query_pkey
@@ -3879,6 +3884,7 @@ struct nes_ib_device *nes_init_ofa_device(struct net_device *netdev)
nesibdev->ibdev.dev.parent = &nesdev->pcidev->dev;
nesibdev->ibdev.query_device = nes_query_device;
nesibdev->ibdev.query_port = nes_query_port;
+ nesibdev->ibdev.query_transport = nes_query_transport;
nesibdev->ibdev.query_pkey = nes_query_pkey;
nesibdev->ibdev.query_gid = nes_query_gid;
nesibdev->ibdev.alloc_ucontext = nes_alloc_ucontext;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index 7a2b59a..9f4d182 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -244,6 +244,7 @@ static int ocrdma_register_device(struct ocrdma_dev *dev)
/* mandatory verbs. */
dev->ibdev.query_device = ocrdma_query_device;
dev->ibdev.query_port = ocrdma_query_port;
+ dev->ibdev.query_transport = ocrdma_query_transport;
dev->ibdev.modify_port = ocrdma_modify_port;
dev->ibdev.query_gid = ocrdma_query_gid;
dev->ibdev.get_link_layer = ocrdma_link_layer;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 8771755..73bace4 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -187,6 +187,12 @@ int ocrdma_query_port(struct ib_device *ibdev,
return 0;
}

+enum rdma_transport_type
+ocrdma_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IBOE;
+}
+
int ocrdma_modify_port(struct ib_device *ibdev, u8 port, int mask,
struct ib_port_modify *props)
{
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
index b8f7853..4a81b63 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
@@ -41,6 +41,9 @@ int ocrdma_query_port(struct ib_device *, u8 port, struct ib_port_attr *props);
int ocrdma_modify_port(struct ib_device *, u8 port, int mask,
struct ib_port_modify *props);

+enum rdma_transport_type
+ocrdma_query_transport(struct ib_device *device, u8 port_num);
+
void ocrdma_get_guid(struct ocrdma_dev *, u8 *guid);
int ocrdma_query_gid(struct ib_device *, u8 port,
int index, union ib_gid *gid);
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 4a35998..caad665 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1650,6 +1650,12 @@ static int qib_query_port(struct ib_device *ibdev, u8 port,
return 0;
}

+static enum rdma_transport_type
+qib_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
static int qib_modify_device(struct ib_device *device,
int device_modify_mask,
struct ib_device_modify *device_modify)
@@ -2184,6 +2190,7 @@ int qib_register_ib_device(struct qib_devdata *dd)
ibdev->query_device = qib_query_device;
ibdev->modify_device = qib_modify_device;
ibdev->query_port = qib_query_port;
+ ibdev->query_transport = qib_query_transport;
ibdev->modify_port = qib_modify_port;
ibdev->query_pkey = qib_query_pkey;
ibdev->query_gid = qib_query_gid;
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
index 0d0f986..03ea9f3 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
@@ -360,6 +360,7 @@ static void *usnic_ib_device_add(struct pci_dev *dev)

us_ibdev->ib_dev.query_device = usnic_ib_query_device;
us_ibdev->ib_dev.query_port = usnic_ib_query_port;
+ us_ibdev->ib_dev.query_transport = usnic_ib_query_transport;
us_ibdev->ib_dev.query_pkey = usnic_ib_query_pkey;
us_ibdev->ib_dev.query_gid = usnic_ib_query_gid;
us_ibdev->ib_dev.get_link_layer = usnic_ib_port_link_layer;
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 53bd6a2..ff9a5f7 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -348,6 +348,12 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
return 0;
}

+enum rdma_transport_type
+usnic_ib_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_USNIC_UDP;
+}
+
int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr)
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
index bb864f5..0b1633b 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
@@ -27,6 +27,8 @@ int usnic_ib_query_device(struct ib_device *ibdev,
struct ib_device_attr *props);
int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props);
+enum rdma_transport_type
+usnic_ib_query_transport(struct ib_device *device, u8 port_num);
int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 65994a1..d54f91e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -75,10 +75,13 @@ enum rdma_node_type {
};

enum rdma_transport_type {
+ /* legacy for users */
RDMA_TRANSPORT_IB,
RDMA_TRANSPORT_IWARP,
RDMA_TRANSPORT_USNIC,
- RDMA_TRANSPORT_USNIC_UDP
+ RDMA_TRANSPORT_USNIC_UDP,
+ /* new transport */
+ RDMA_TRANSPORT_IBOE,
};

__attribute_const__ enum rdma_transport_type
@@ -1501,6 +1504,8 @@ struct ib_device {
int (*query_port)(struct ib_device *device,
u8 port_num,
struct ib_port_attr *port_attr);
+ enum rdma_transport_type (*query_transport)(struct ib_device *device,
+ u8 port_num);
enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
u8 port_num);
int (*query_gid)(struct ib_device *device,
--
2.1.0
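
As a rough illustration of what patch 01 sets up, the sketch below models a
per-port transport callback and a registration-time check that rejects a
device without one. Everything here (struct mock_device, check_mandatory(),
the port_is_eth array, the -1 return value) is a hypothetical userspace
stand-in, not the kernel's struct ib_device or ib_device_check_mandatory().

```c
#include <stddef.h>

/* Hypothetical stand-in for enum rdma_transport_type. */
enum transport {
	TRANSPORT_IB,
	TRANSPORT_IWARP,
	TRANSPORT_USNIC,
	TRANSPORT_USNIC_UDP,
	TRANSPORT_IBOE,
};

/* Hypothetical stand-in for the relevant part of struct ib_device. */
struct mock_device {
	int phys_port_cnt;
	const int *port_is_eth;	/* indexed by port number, entry 0 unused */
	enum transport (*query_transport)(struct mock_device *dev,
					  unsigned char port_num);
};

/* Fixed-transport driver, like the iWARP drivers in the patch. */
static enum transport iwarp_query_transport(struct mock_device *dev,
					    unsigned char port_num)
{
	(void)dev;
	(void)port_num;
	return TRANSPORT_IWARP;
}

/* Mixed-port driver: the answer depends on the port, as with mlx4. */
static enum transport mixed_query_transport(struct mock_device *dev,
					    unsigned char port_num)
{
	return dev->port_is_eth[port_num] ? TRANSPORT_IBOE : TRANSPORT_IB;
}

/* Registration-time check: refuse a device that lacks the callback. */
static int check_mandatory(const struct mock_device *dev)
{
	return dev->query_transport ? 0 : -1;
}
```

Making the callback mandatory is what lets core code call it without a NULL
check, which the helpers in the following patches rely on.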

2015-04-20 08:33:28

by Michael Wang

Subject: [PATCH v5 02/27] IB/Verbs: Implement raw management helpers


Add raw helpers:
rdma_tech_ib
rdma_tech_iboe
rdma_tech_iwarp
rdma_ib_or_iboe (transition, clean up later)
to help us detect which technology the port supports.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
include/rdma/ib_verbs.h | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d54f91e..a12e876 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1748,6 +1748,31 @@ int ib_query_port(struct ib_device *device,
enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device,
u8 port_num);

+static inline int rdma_tech_ib(struct ib_device *device, u8 port_num)
+{
+ return device->query_transport(device, port_num)
+ == RDMA_TRANSPORT_IB;
+}
+
+static inline int rdma_tech_iboe(struct ib_device *device, u8 port_num)
+{
+ return device->query_transport(device, port_num)
+ == RDMA_TRANSPORT_IBOE;
+}
+
+static inline int rdma_tech_iwarp(struct ib_device *device, u8 port_num)
+{
+ return device->query_transport(device, port_num)
+ == RDMA_TRANSPORT_IWARP;
+}
+
+static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
+{
+ enum rdma_transport_type tp = device->query_transport(device, port_num);
+
+ return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0
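
The raw helpers in patch 02 are thin predicates over the per-port query. A
minimal userspace sketch of the same shape follows; struct mock_device and
the unprefixed function names are assumptions made for illustration, with
the transport stored in a plain per-port array instead of being returned by
a driver callback.

```c
/* Hypothetical stand-in for enum rdma_transport_type. */
enum transport {
	TRANSPORT_IB,
	TRANSPORT_IWARP,
	TRANSPORT_USNIC,
	TRANSPORT_USNIC_UDP,
	TRANSPORT_IBOE,
};

#define MAX_PORTS 8

/* Hypothetical device: one transport per port, indexed by port number. */
struct mock_device {
	enum transport port_transport[MAX_PORTS + 1];
};

static enum transport query_transport(const struct mock_device *dev,
				      unsigned char port_num)
{
	return dev->port_transport[port_num];
}

static int tech_ib(const struct mock_device *dev, unsigned char port_num)
{
	return query_transport(dev, port_num) == TRANSPORT_IB;
}

static int tech_iboe(const struct mock_device *dev, unsigned char port_num)
{
	return query_transport(dev, port_num) == TRANSPORT_IBOE;
}

static int tech_iwarp(const struct mock_device *dev, unsigned char port_num)
{
	return query_transport(dev, port_num) == TRANSPORT_IWARP;
}

/* Transitional helper: true for either InfiniBand flavour. */
static int ib_or_iboe(const struct mock_device *dev, unsigned char port_num)
{
	enum transport tp = query_transport(dev, port_num);

	return tp == TRANSPORT_IB || tp == TRANSPORT_IBOE;
}
```

Because every predicate takes a port number, a caller can treat ports of the
same device differently, which a device-wide node-type check cannot do.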

2015-04-20 08:36:19

by Michael Wang

Subject: [PATCH v5 03/27] IB/Verbs: Reform IB-core mad/agent/user_mad


Use raw management helpers to reform IB-core mad/agent/user_mad.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/agent.c | 2 +-
drivers/infiniband/core/mad.c | 43 +++++++++++++++++++-------------------
drivers/infiniband/core/user_mad.c | 26 ++++++++++++++++-------
3 files changed, 41 insertions(+), 30 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index f6d2961..ffdef4d 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
goto error1;
}

- if (rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND) {
+ if (rdma_tech_ib(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 74c30f4..1822932 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);

cq_size = mad_sendq_size + mad_recvq_size;
- has_smi = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND;
+ has_smi = rdma_tech_ib(device, port_num);
if (has_smi)
cq_size *= 2;

@@ -3057,9 +3057,6 @@ static void ib_mad_init_device(struct ib_device *device)
{
int start, end, i;

- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
-
if (device->node_type == RDMA_NODE_IB_SWITCH) {
start = 0;
end = 0;
@@ -3069,6 +3066,9 @@ static void ib_mad_init_device(struct ib_device *device)
}

for (i = start; i <= end; i++) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;
+
if (ib_mad_port_open(device, i)) {
dev_err(&device->dev, "Couldn't open port %d\n", i);
goto error;
@@ -3086,40 +3086,39 @@ error_agent:
dev_err(&device->dev, "Couldn't close port %d\n", i);

error:
- i--;
+ while (--i >= start) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;

- while (i >= start) {
if (ib_agent_port_close(device, i))
dev_err(&device->dev,
"Couldn't close port %d for agents\n", i);
if (ib_mad_port_close(device, i))
dev_err(&device->dev, "Couldn't close port %d\n", i);
- i--;
}
}

static void ib_mad_remove_device(struct ib_device *device)
{
- int i, num_ports, cur_port;
-
- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int start, end, i;

if (device->node_type == RDMA_NODE_IB_SWITCH) {
- num_ports = 1;
- cur_port = 0;
+ start = 0;
+ end = 0;
} else {
- num_ports = device->phys_port_cnt;
- cur_port = 1;
+ start = 1;
+ end = device->phys_port_cnt;
}
- for (i = 0; i < num_ports; i++, cur_port++) {
- if (ib_agent_port_close(device, cur_port))
+
+ for (i = start; i <= end; i++) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;
+
+ if (ib_agent_port_close(device, i))
dev_err(&device->dev,
- "Couldn't close port %d for agents\n",
- cur_port);
- if (ib_mad_port_close(device, cur_port))
- dev_err(&device->dev, "Couldn't close port %d\n",
- cur_port);
+ "Couldn't close port %d for agents\n", i);
+ if (ib_mad_port_close(device, i))
+ dev_err(&device->dev, "Couldn't close port %d\n", i);
}
}

diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 928cdd2..aa8b334 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1273,9 +1273,7 @@ static void ib_umad_add_one(struct ib_device *device)
{
struct ib_umad_device *umad_dev;
int s, e, i;
-
- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int count = 0;

if (device->node_type == RDMA_NODE_IB_SWITCH)
s = e = 0;
@@ -1296,21 +1294,33 @@ static void ib_umad_add_one(struct ib_device *device)
umad_dev->end_port = e;

for (i = s; i <= e; ++i) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;
+
umad_dev->port[i - s].umad_dev = umad_dev;

if (ib_umad_init_port(device, i, umad_dev,
&umad_dev->port[i - s]))
goto err;
+
+ count++;
}

+ if (!count)
+ goto free;
+
ib_set_client_data(device, &umad_client, umad_dev);

return;

err:
- while (--i >= s)
- ib_umad_kill_port(&umad_dev->port[i - s]);
+ while (--i >= s) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;

+ ib_umad_kill_port(&umad_dev->port[i - s]);
+ }
+free:
kobject_put(&umad_dev->kobj);
}

@@ -1322,8 +1332,10 @@ static void ib_umad_remove_one(struct ib_device *device)
if (!umad_dev)
return;

- for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i)
- ib_umad_kill_port(&umad_dev->port[i]);
+ for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
+ if (rdma_ib_or_iboe(device, i + umad_dev->start_port))
+ ib_umad_kill_port(&umad_dev->port[i]);
+ }

kobject_put(&umad_dev->kobj);
}
--
2.1.0
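
The per-port pattern that patch 03 applies to ib_mad_init_device() can be
sketched in isolation as below: open only the supported ports, and on
failure unwind only the supported ports that were already opened. The
struct, the supported[] flags and the port_open()/port_close() functions are
hypothetical stand-ins for the MAD and agent calls.

```c
#define MAX_PORTS 8

/* Hypothetical device state, for illustration only. */
struct mock_device {
	int phys_port_cnt;
	int supported[MAX_PORTS + 1];	/* 1 if the port is IB or IBoE */
	int fail_on_port;		/* port whose open fails, 0 for none */
	int opened[MAX_PORTS + 1];
	int closes;			/* number of port_close() calls */
};

static int port_open(struct mock_device *dev, int port)
{
	if (port == dev->fail_on_port)
		return -1;
	dev->opened[port] = 1;
	return 0;
}

static void port_close(struct mock_device *dev, int port)
{
	dev->opened[port] = 0;
	dev->closes++;
}

static int init_device(struct mock_device *dev)
{
	int start = 1, end = dev->phys_port_cnt, i;

	for (i = start; i <= end; i++) {
		if (!dev->supported[i])
			continue;
		if (port_open(dev, i))
			goto error;
	}
	return 0;

error:
	/* Port i failed and was never opened; unwind the ports below it. */
	while (--i >= start) {
		if (!dev->supported[i])
			continue;
		port_close(dev, i);
	}
	return -1;
}
```

Keeping the decrement in the loop condition matters once the body can
`continue`: a trailing `i--` at the end of the body would be skipped for an
unsupported port and the loop would never terminate.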

2015-04-20 08:33:53

by Michael Wang

Subject: [PATCH v5 04/27] IB/Verbs: Reform IB-core cm


Use raw management helpers to reform IB-core cm.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cm.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..3c10b75 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3761,9 +3761,7 @@ static void cm_add_one(struct ib_device *ib_device)
unsigned long flags;
int ret;
u8 i;
-
- if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int count = 0;

cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
ib_device->phys_port_cnt, GFP_KERNEL);
@@ -3783,6 +3781,9 @@ static void cm_add_one(struct ib_device *ib_device)

set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+ if (!rdma_ib_or_iboe(ib_device, i))
+ continue;
+
port = kzalloc(sizeof *port, GFP_KERNEL);
if (!port)
goto error1;
@@ -3809,7 +3810,13 @@ static void cm_add_one(struct ib_device *ib_device)
ret = ib_modify_port(ib_device, i, 0, &port_modify);
if (ret)
goto error3;
+
+ count++;
}
+
+ if (!count)
+ goto free;
+
ib_set_client_data(ib_device, &cm_client, cm_dev);

write_lock_irqsave(&cm.device_lock, flags);
@@ -3825,11 +3832,15 @@ error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
while (--i) {
+ if (!rdma_ib_or_iboe(ib_device, i))
+ continue;
+
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
cm_remove_port_fs(port);
}
+free:
device_unregister(cm_dev->device);
kfree(cm_dev);
}
@@ -3853,6 +3864,9 @@ static void cm_remove_one(struct ib_device *ib_device)
write_unlock_irqrestore(&cm.device_lock, flags);

for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+ if (!rdma_ib_or_iboe(ib_device, i))
+ continue;
+
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
--
2.1.0

2015-04-20 08:34:31

by Michael Wang

Subject: [PATCH v5 05/27] IB/Verbs: Reform IB-core sa_query


Use raw management helpers to reform IB-core sa_query.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/sa_query.c | 29 +++++++++++++++++------------
1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index c38f030..60dc7aa 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - sa_dev->start_port];

- if (rdma_port_get_link_layer(handler->device, port->port_num) != IB_LINK_LAYER_INFINIBAND)
+ if (WARN_ON(!rdma_tech_ib(handler->device, port->port_num)))
return;

spin_lock_irqsave(&port->ah_lock, flags);
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;

- force_grh = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_ETHERNET;
+ force_grh = rdma_tech_iboe(device, port_num);

if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
@@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
{
struct ib_sa_device *sa_dev;
int s, e, i;
-
- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int count = 0;

if (device->node_type == RDMA_NODE_IB_SWITCH)
s = e = 0;
@@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)

for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
- if (rdma_port_get_link_layer(device, i + 1) != IB_LINK_LAYER_INFINIBAND)
+ if (!rdma_tech_ib(device, i + 1))
continue;

sa_dev->port[i].sm_ah = NULL;
@@ -1189,8 +1187,13 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;

INIT_WORK(&sa_dev->port[i].update_task, update_sm_ah);
+
+ count++;
}

+ if (!count)
+ goto free;
+
ib_set_client_data(device, &sa_client, sa_dev);

/*
@@ -1204,19 +1207,21 @@ static void ib_sa_add_one(struct ib_device *device)
if (ib_register_event_handler(&sa_dev->event_handler))
goto err;

- for (i = 0; i <= e - s; ++i)
- if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+ for (i = 0; i <= e - s; ++i) {
+ if (rdma_tech_ib(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
+ }

return;

err:
- while (--i >= 0)
- if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+ while (--i >= 0) {
+ if (rdma_tech_ib(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
+ }

+free:
kfree(sa_dev);
-
return;
}

@@ -1233,7 +1238,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);

for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
- if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND) {
+ if (rdma_tech_ib(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
--
2.1.0

2015-04-20 08:34:54

by Michael Wang

Subject: [PATCH v5 06/27] IB/Verbs: Reform IB-core multicast


Use raw management helpers to reform IB-core multicast.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/multicast.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index fa17b55..24d93f5 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,8 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
int index;

dev = container_of(handler, struct mcast_device, event_handler);
- if (rdma_port_get_link_layer(dev->device, event->element.port_num) !=
- IB_LINK_LAYER_INFINIBAND)
+ if (WARN_ON(!rdma_tech_ib(dev->device, event->element.port_num)))
return;

index = event->element.port_num - dev->start_port;
@@ -808,9 +807,6 @@ static void mcast_add_one(struct ib_device *device)
int i;
int count = 0;

- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
-
dev = kmalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
GFP_KERNEL);
if (!dev)
@@ -824,8 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}

for (i = 0; i <= dev->end_port - dev->start_port; i++) {
- if (rdma_port_get_link_layer(device, dev->start_port + i) !=
- IB_LINK_LAYER_INFINIBAND)
+ if (!rdma_tech_ib(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -863,8 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);

for (i = 0; i <= dev->end_port - dev->start_port; i++) {
- if (rdma_port_get_link_layer(device, dev->start_port + i) ==
- IB_LINK_LAYER_INFINIBAND) {
+ if (rdma_tech_ib(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
--
2.1.0

2015-04-20 08:35:23

by Michael Wang

Subject: [PATCH v5 07/27] IB/Verbs: Reform IB-ulp ipoib


Use raw management helpers to reform IB-ulp ipoib.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 58b5aa3..60b379d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1654,9 +1654,7 @@ static void ipoib_add_one(struct ib_device *device)
struct net_device *dev;
struct ipoib_dev_priv *priv;
int s, e, p;
-
- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int count = 0;

dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
if (!dev_list)
@@ -1673,15 +1671,21 @@ static void ipoib_add_one(struct ib_device *device)
}

for (p = s; p <= e; ++p) {
- if (rdma_port_get_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
+ if (!rdma_tech_ib(device, p))
continue;
dev = ipoib_add_port("ib%d", device, p);
if (!IS_ERR(dev)) {
priv = netdev_priv(dev);
list_add_tail(&priv->list, dev_list);
+ count++;
}
}

+ if (!count) {
+ kfree(dev_list);
+ return;
+ }
+
ib_set_client_data(device, &ipoib_client, dev_list);
}

@@ -1690,9 +1694,6 @@ static void ipoib_remove_one(struct ib_device *device)
struct ipoib_dev_priv *priv, *tmp;
struct list_head *dev_list;

- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
-
dev_list = ib_get_client_data(device, &ipoib_client);
if (!dev_list)
return;
--
2.1.0

2015-04-20 08:35:56

by Michael Wang

Subject: [PATCH v5 08/27] IB/Verbs: Reform IB-ulp xprtrdma


Use raw management helpers to reform IB-ulp xprtrdma.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 +--
net/sunrpc/xprtrdma/svc_rdma_transport.c | 45 +++++++++++++-------------------
2 files changed, 19 insertions(+), 29 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f9f13a3..a5bed5b 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,8 +117,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,

static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
{
- if (rdma_node_get_transport(xprt->sc_cm_id->device->node_type) ==
- RDMA_TRANSPORT_IWARP)
+ if (rdma_tech_iwarp(xprt->sc_cm_id->device, xprt->sc_cm_id->port_num))
return 1;
else
return min_t(int, sge_count, xprt->sc_max_sge);
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index f609c1c..a09b7a1 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -851,7 +851,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
struct ib_qp_init_attr qp_attr;
struct ib_device_attr devattr;
int uninitialized_var(dma_mr_acc);
- int need_dma_mr;
+ int need_dma_mr = 0;
int ret;
int i;

@@ -985,35 +985,26 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
/*
* Determine if a DMA MR is required and if so, what privs are required
*/
- switch (rdma_node_get_transport(newxprt->sc_cm_id->device->node_type)) {
- case RDMA_TRANSPORT_IWARP:
- newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
- if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
- need_dma_mr = 1;
- dma_mr_acc =
- (IB_ACCESS_LOCAL_WRITE |
- IB_ACCESS_REMOTE_WRITE);
- } else if (!(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
- need_dma_mr = 1;
- dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
- } else
- need_dma_mr = 0;
- break;
- case RDMA_TRANSPORT_IB:
- if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
- need_dma_mr = 1;
- dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
- } else if (!(devattr.device_cap_flags &
- IB_DEVICE_LOCAL_DMA_LKEY)) {
- need_dma_mr = 1;
- dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
- } else
- need_dma_mr = 0;
- break;
- default:
+ if (!rdma_tech_iwarp(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num) &&
+ !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num))
goto errout;
+
+ if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
+ !(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
+ need_dma_mr = 1;
+ dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
+ if (rdma_tech_iwarp(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num) &&
+ !(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG))
+ dma_mr_acc |= IB_ACCESS_REMOTE_WRITE;
}

+ if (rdma_tech_iwarp(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num))
+ newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
+
/* Create the DMA MR if needed, otherwise, use the DMA LKEY */
if (need_dma_mr) {
/* Register all of physical memory */
--
2.1.0

2015-04-20 08:36:25

by Michael Wang

Subject: [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs


Use raw management helpers to reform IB-core verbs/uverbs_cmd/sysfs.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/sysfs.c | 8 ++------
drivers/infiniband/core/uverbs_cmd.c | 6 ++++--
drivers/infiniband/core/verbs.c | 6 ++----
3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index cbd0383..8570180 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -248,14 +248,10 @@ static ssize_t phys_state_show(struct ib_port *p, struct port_attribute *unused,
static ssize_t link_layer_show(struct ib_port *p, struct port_attribute *unused,
char *buf)
{
- switch (rdma_port_get_link_layer(p->ibdev, p->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
+ if (rdma_tech_ib(p->ibdev, p->port_num))
return sprintf(buf, "%s\n", "InfiniBand");
- case IB_LINK_LAYER_ETHERNET:
+ else
return sprintf(buf, "%s\n", "Ethernet");
- default:
- return sprintf(buf, "%s\n", "Unknown");
- }
}

static PORT_ATTR_RO(state);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index a9f0489..5dc90aa 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -515,8 +515,10 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
resp.active_width = attr.active_width;
resp.active_speed = attr.active_speed;
resp.phys_state = attr.phys_state;
- resp.link_layer = rdma_port_get_link_layer(file->device->ib_dev,
- cmd.port_num);
+ resp.link_layer = rdma_tech_ib(file->device->ib_dev,
+ cmd.port_num) ?
+ IB_LINK_LAYER_INFINIBAND :
+ IB_LINK_LAYER_ETHERNET;

if (copy_to_user((void __user *) (unsigned long) cmd.response,
&resp, sizeof resp))
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 626c9cf..7264860 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -200,11 +200,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
u32 flow_class;
u16 gid_index;
int ret;
- int is_eth = (rdma_port_get_link_layer(device, port_num) ==
- IB_LINK_LAYER_ETHERNET);

memset(ah_attr, 0, sizeof *ah_attr);
- if (is_eth) {
+ if (rdma_tech_iboe(device, port_num)) {
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;

@@ -873,7 +871,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
union ib_gid sgid;

if ((*qp_attr_mask & IB_QP_AV) &&
- (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == IB_LINK_LAYER_ETHERNET)) {
+ (rdma_tech_iboe(qp->device, qp_attr->ah_attr.port_num))) {
ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
qp_attr->ah_attr.grh.sgid_index, &sgid);
if (ret)
--
2.1.0

2015-04-20 08:36:46

by Michael Wang

Subject: [PATCH v5 10/27] IB/Verbs: Reform cm related part in IB-core cma/ucm


Use raw management helpers to reform cm related part in IB-core cma/ucm.

A few checks concern the cm type of the device rather than the capability
of a particular port. Passing port 1 directly works for now, but it cannot
support devices that mix cm types across ports in the future.
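
To illustrate the limitation, here is a minimal standalone sketch. All names in it (enum sketch_tech, struct sketch_dev, sketch_ib_or_iboe(), sketch_dev_uses_ib_cm()) are hypothetical and only model the behaviour of the rdma_ib_or_iboe(device, 1) calls in this patch; they are not kernel API.

```c
#include <stdbool.h>

/* Hypothetical model of a device whose ports may use different technologies. */
enum sketch_tech { SKETCH_TECH_IB, SKETCH_TECH_IBOE, SKETCH_TECH_IWARP };

struct sketch_dev {
	unsigned char phys_port_cnt;	/* ports are numbered 1..phys_port_cnt */
	enum sketch_tech port_tech[8];	/* port_tech[p - 1] describes port p */
};

/* Per-port check, modeled on rdma_ib_or_iboe(device, port). */
bool sketch_ib_or_iboe(const struct sketch_dev *dev, unsigned char port)
{
	enum sketch_tech tech;

	if (port < 1 || port > dev->phys_port_cnt)
		return false;

	tech = dev->port_tech[port - 1];
	return tech == SKETCH_TECH_IB || tech == SKETCH_TECH_IBOE;
}

/*
 * Device-wide shortcut: only port 1 is consulted, so the answer is
 * right only while every port of the device uses the same cm type.
 */
bool sketch_dev_uses_ib_cm(const struct sketch_dev *dev)
{
	return sketch_ib_or_iboe(dev, 1);
}
```

For a device whose port 1 is iWARP and port 2 is IB, the shortcut answers "no" although port 2 would qualify, which is why the call sites that already know the port pass id->port_num instead.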

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 81 +++++++++++++------------------------------
drivers/infiniband/core/ucm.c | 3 +-
2 files changed, 26 insertions(+), 58 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..815e41b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -735,8 +735,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
int ret = 0;

id_priv = container_of(id, struct rdma_id_private, id);
- switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
else
@@ -745,19 +744,15 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,

if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
} else
ret = iw_cm_init_qp_attr(id_priv->cm_id.iw, qp_attr,
qp_attr_mask);
- break;
- default:
+ } else
ret = -ENOSYS;
- break;
- }

return ret;
}
@@ -1037,17 +1032,12 @@ void rdma_destroy_id(struct rdma_cm_id *id)
mutex_unlock(&id_priv->handler_mutex);

if (id_priv->cma_dev) {
- switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id_priv->id.device, 1)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
- break;
- default:
- break;
}
cma_leave_mc_groups(id_priv);
cma_release_dev(id_priv);
@@ -1626,7 +1616,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv,
int ret;

if (cma_family(id_priv) == AF_IB &&
- rdma_node_get_transport(cma_dev->device->node_type) != RDMA_TRANSPORT_IB)
+ !rdma_ib_or_iboe(cma_dev->device, 1))
return;

id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
@@ -2028,7 +2018,7 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
mutex_lock(&lock);
list_for_each_entry(cur_dev, &dev_list, list) {
if (cma_family(id_priv) == AF_IB &&
- rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
+ !rdma_ib_or_iboe(cur_dev->device, 1))
continue;

if (!cma_dev)
@@ -2060,7 +2050,7 @@ port_found:
goto out;

id_priv->id.route.addr.dev_addr.dev_type =
- (rdma_port_get_link_layer(cma_dev->device, p) == IB_LINK_LAYER_INFINIBAND) ?
+ (rdma_tech_ib(cma_dev->device, p)) ?
ARPHRD_INFINIBAND : ARPHRD_ETHER;

rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
@@ -2537,18 +2527,15 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)

id_priv->backlog = backlog;
if (id->device) {
- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, 1)) {
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, 1)) {
ret = cma_iw_listen(id_priv, backlog);
if (ret)
goto err;
- break;
- default:
+ } else {
ret = -ENOSYS;
goto err;
}
@@ -2884,20 +2871,15 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
ret = cma_connect_ib(id_priv, conn_param);
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num))
ret = cma_connect_iw(id_priv, conn_param);
- break;
- default:
+ else
ret = -ENOSYS;
- break;
- }
if (ret)
goto err;

@@ -3000,8 +2982,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD) {
if (conn_param)
ret = cma_send_sidr_rep(id_priv, IB_SIDR_SUCCESS,
@@ -3017,14 +2998,10 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
else
ret = cma_rep_recv(id_priv);
}
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num))
ret = cma_accept_iw(id_priv, conn_param);
- break;
- default:
+ else
ret = -ENOSYS;
- break;
- }

if (ret)
goto reject;
@@ -3068,8 +3045,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
if (!id_priv->cm_id.ib)
return -EINVAL;

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
private_data, private_data_len);
@@ -3077,15 +3053,12 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
ret = ib_send_cm_rej(id_priv->cm_id.ib,
IB_CM_REJ_CONSUMER_DEFINED, NULL,
0, private_data, private_data_len);
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num)) {
ret = iw_cm_reject(id_priv->cm_id.iw,
private_data, private_data_len);
- break;
- default:
+ } else
ret = -ENOSYS;
- break;
- }
+
return ret;
}
EXPORT_SYMBOL(rdma_reject);
@@ -3099,22 +3072,18 @@ int rdma_disconnect(struct rdma_cm_id *id)
if (!id_priv->cm_id.ib)
return -EINVAL;

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
ret = cma_modify_qp_err(id_priv);
if (ret)
goto out;
/* Initiate or respond to a disconnect. */
if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num)) {
ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
- break;
- default:
+ } else
ret = -EINVAL;
- break;
- }
+
out:
return ret;
}
diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
index f2f6393..70e0ccb 100644
--- a/drivers/infiniband/core/ucm.c
+++ b/drivers/infiniband/core/ucm.c
@@ -1253,8 +1253,7 @@ static void ib_ucm_add_one(struct ib_device *device)
dev_t base;
struct ib_ucm_device *ucm_dev;

- if (!device->alloc_ucontext ||
- rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
+ if (!device->alloc_ucontext || !rdma_ib_or_iboe(device, 1))
return;

ucm_dev = kzalloc(sizeof *ucm_dev, GFP_KERNEL);
--
2.1.0

2015-04-20 08:37:20

by Michael Wang

Subject: [PATCH v5 11/27] IB/Verbs: Reform route related part in IB-core cma


Use raw management helpers to reform route related part in IB-core cma.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 31 ++++++++-----------------------
drivers/infiniband/core/ucma.c | 25 ++++++-------------------
2 files changed, 14 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 815e41b..fa69f34 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -923,13 +923,9 @@ static inline int cma_user_data_offset(struct rdma_id_private *id_priv)

static void cma_cancel_route(struct rdma_id_private *id_priv)
{
- switch (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
+ if (rdma_tech_ib(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
- break;
- default:
- break;
}
}

@@ -1957,26 +1953,15 @@ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms)
return -EINVAL;

atomic_inc(&id_priv->refcount);
- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
- switch (rdma_port_get_link_layer(id->device, id->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
- ret = cma_resolve_ib_route(id_priv, timeout_ms);
- break;
- case IB_LINK_LAYER_ETHERNET:
- ret = cma_resolve_iboe_route(id_priv);
- break;
- default:
- ret = -ENOSYS;
- }
- break;
- case RDMA_TRANSPORT_IWARP:
+ if (rdma_tech_ib(id->device, id->port_num))
+ ret = cma_resolve_ib_route(id_priv, timeout_ms);
+ else if (rdma_tech_iboe(id->device, id->port_num))
+ ret = cma_resolve_iboe_route(id_priv);
+ else if (rdma_tech_iwarp(id->device, id->port_num))
ret = cma_resolve_iw_route(id_priv, timeout_ms);
- break;
- default:
+ else
ret = -ENOSYS;
- break;
- }
+
if (ret)
goto err;

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 45d67e9..7331c6c 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -722,26 +722,13 @@ static ssize_t ucma_query_route(struct ucma_file *file,

resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;
- switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
- switch (rdma_port_get_link_layer(ctx->cm_id->device,
- ctx->cm_id->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
- ucma_copy_ib_route(&resp, &ctx->cm_id->route);
- break;
- case IB_LINK_LAYER_ETHERNET:
- ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
- break;
- default:
- break;
- }
- break;
- case RDMA_TRANSPORT_IWARP:
+
+ if (rdma_tech_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+ ucma_copy_ib_route(&resp, &ctx->cm_id->route);
+ else if (rdma_tech_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
+ ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
+ else if (rdma_tech_iwarp(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iw_route(&resp, &ctx->cm_id->route);
- break;
- default:
- break;
- }

out:
if (copy_to_user((void __user *)(unsigned long)cmd.response,
--
2.1.0

2015-04-20 08:37:41

by Michael Wang

Subject: [PATCH v5 12/27] IB/Verbs: Reform mcast related part in IB-core cma


Use raw management helpers to reform mcast related part in IB-core cma.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 56 ++++++++++++++-----------------------------
1 file changed, 18 insertions(+), 38 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index fa69f34..a89c246 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -997,17 +997,12 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
mc = container_of(id_priv->mc_list.next,
struct cma_multicast, list);
list_del(&mc->list);
- switch (rdma_port_get_link_layer(id_priv->cma_dev->device, id_priv->id.port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
+ if (rdma_tech_ib(id_priv->cma_dev->device,
+ id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
- break;
- case IB_LINK_LAYER_ETHERNET:
+ } else
kref_put(&mc->mcref, release_mc);
- break;
- default:
- break;
- }
}
}

@@ -3314,24 +3309,13 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
list_add(&mc->list, &id_priv->mc_list);
spin_unlock(&id_priv->lock);

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
- switch (rdma_port_get_link_layer(id->device, id->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
- ret = cma_join_ib_multicast(id_priv, mc);
- break;
- case IB_LINK_LAYER_ETHERNET:
- kref_init(&mc->mcref);
- ret = cma_iboe_join_multicast(id_priv, mc);
- break;
- default:
- ret = -EINVAL;
- }
- break;
- default:
+ if (rdma_tech_iboe(id->device, id->port_num)) {
+ kref_init(&mc->mcref);
+ ret = cma_iboe_join_multicast(id_priv, mc);
+ } else if (rdma_tech_ib(id->device, id->port_num))
+ ret = cma_join_ib_multicast(id_priv, mc);
+ else
ret = -ENOSYS;
- break;
- }

if (ret) {
spin_lock_irq(&id_priv->lock);
@@ -3359,19 +3343,15 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
ib_detach_mcast(id->qp,
&mc->multicast.ib->rec.mgid,
be16_to_cpu(mc->multicast.ib->rec.mlid));
- if (rdma_node_get_transport(id_priv->cma_dev->device->node_type) == RDMA_TRANSPORT_IB) {
- switch (rdma_port_get_link_layer(id->device, id->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
- ib_sa_free_multicast(mc->multicast.ib);
- kfree(mc);
- break;
- case IB_LINK_LAYER_ETHERNET:
- kref_put(&mc->mcref, release_mc);
- break;
- default:
- break;
- }
- }
+
+ BUG_ON(id_priv->cma_dev->device != id->device);
+
+ if (rdma_tech_ib(id->device, id->port_num)) {
+ ib_sa_free_multicast(mc->multicast.ib);
+ kfree(mc);
+ } else if (rdma_tech_iboe(id->device, id->port_num))
+ kref_put(&mc->mcref, release_mc);
+
return;
}
}
--
2.1.0

2015-04-20 08:38:07

by Michael Wang

Subject: [PATCH v5 13/27] IB/Verbs: Reserve legacy transport type in 'dev_addr'


Keep the legacy transport type in the 'transport' member
of 'struct rdma_dev_addr' until we are sure it is no
longer needed.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ebac646..6195bf6 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -244,14 +244,35 @@ static inline void cma_set_ip_ver(struct cma_hdr *hdr, u8 ip_ver)
hdr->ip_version = (ip_ver << 4) | (hdr->ip_version & 0xF);
}

+static inline void cma_set_legacy_transport(struct rdma_cm_id *id)
+{
+ switch (id->device->node_type) {
+ case RDMA_NODE_IB_CA:
+ case RDMA_NODE_IB_SWITCH:
+ case RDMA_NODE_IB_ROUTER:
+ id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IB;
+ break;
+ case RDMA_NODE_RNIC:
+ id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IWARP;
+ break;
+ case RDMA_NODE_USNIC:
+ id->route.addr.dev_addr.transport = RDMA_TRANSPORT_USNIC;
+ break;
+ case RDMA_NODE_USNIC_UDP:
+ id->route.addr.dev_addr.transport = RDMA_TRANSPORT_USNIC_UDP;
+ break;
+ default:
+ BUG();
+ }
+}
+
static void cma_attach_to_dev(struct rdma_id_private *id_priv,
struct cma_device *cma_dev)
{
atomic_inc(&cma_dev->refcount);
id_priv->cma_dev = cma_dev;
id_priv->id.device = cma_dev->device;
- id_priv->id.route.addr.dev_addr.transport =
- rdma_node_get_transport(cma_dev->device->node_type);
+ cma_set_legacy_transport(&id_priv->id);
list_add_tail(&id_priv->list, &cma_dev->id_list);
}

--
2.1.0

2015-04-20 08:38:30

by Michael Wang

Subject: [PATCH v5 14/27] IB/Verbs: Reform cma_acquire_dev()


Reform cma_acquire_dev() with management helpers, and introduce
cma_validate_port() to make the code cleaner.
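
The validation logic can be sketched in standalone C as below. This is an approximation under assumptions, not the kernel code: struct vp_port, struct vp_dev, vp_find_gid() and vp_validate_port() are hypothetical, vp_find_gid() stands in for ib_find_cached_gid(), and the want_ib flag stands in for the dev_type == ARPHRD_INFINIBAND comparison. Unlike the patch, the sketch returns early when the GID lookup fails, before comparing ports.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define VP_GID_LEN 16

/* Hypothetical model: each port has a technology flag and one GID. */
struct vp_port {
	bool is_ib;				/* plays the role of rdma_tech_ib() */
	unsigned char gid[VP_GID_LEN];
};

struct vp_dev {
	unsigned char phys_port_cnt;		/* ports are numbered 1..phys_port_cnt */
	struct vp_port port[4];
};

/* Stand-in for the GID cache lookup: reports which port owns the GID. */
int vp_find_gid(const struct vp_dev *dev, const unsigned char *gid,
		unsigned char *found_port)
{
	unsigned char p;

	for (p = 1; p <= dev->phys_port_cnt; p++) {
		if (!memcmp(dev->port[p - 1].gid, gid, VP_GID_LEN)) {
			*found_port = p;
			return 0;
		}
	}
	return -ENOENT;
}

/*
 * A port is valid when its technology matches the address type and
 * the GID is found on exactly that port.
 */
int vp_validate_port(const struct vp_dev *dev, unsigned char port,
		     const unsigned char *gid, bool want_ib)
{
	unsigned char found_port = 0;
	int ret;

	if (port < 1 || port > dev->phys_port_cnt)
		return -ENODEV;

	/* IB address on a non-IB port, or the other way around */
	if (want_ib != dev->port[port - 1].is_ib)
		return -ENODEV;

	ret = vp_find_gid(dev, gid, &found_port);
	if (ret)
		return ret;

	return (found_port == port) ? 0 : -ENODEV;
}
```

With the check factored out like this, the listen-id fast path and the loop over all devices and ports can share one call instead of repeating the link-layer and transport comparisons.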

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 68 +++++++++++++++++++++++++------------------
1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 6195bf6..44e7bb9 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -370,18 +370,35 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
return ret;
}

+static inline int cma_validate_port(struct ib_device *device, u8 port,
+ union ib_gid *gid, int dev_type)
+{
+ u8 found_port;
+ int ret = -ENODEV;
+
+ if ((dev_type == ARPHRD_INFINIBAND) && !rdma_tech_ib(device, port))
+ return ret;
+
+ if ((dev_type != ARPHRD_INFINIBAND) && rdma_tech_ib(device, port))
+ return ret;
+
+ ret = ib_find_cached_gid(device, gid, &found_port, NULL);
+ if (port != found_port)
+ return -ENODEV;
+
+ return ret;
+}
+
static int cma_acquire_dev(struct rdma_id_private *id_priv,
struct rdma_id_private *listen_id_priv)
{
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
struct cma_device *cma_dev;
- union ib_gid gid, iboe_gid;
+ union ib_gid gid, iboe_gid, *gidp;
int ret = -ENODEV;
- u8 port, found_port;
- enum rdma_link_layer dev_ll = dev_addr->dev_type == ARPHRD_INFINIBAND ?
- IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
+ u8 port;

- if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
+ if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
id_priv->id.ps == RDMA_PS_IPOIB)
return -EINVAL;

@@ -391,41 +408,36 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,

memcpy(&gid, dev_addr->src_dev_addr +
rdma_addr_gid_offset(dev_addr), sizeof gid);
- if (listen_id_priv &&
- rdma_port_get_link_layer(listen_id_priv->id.device,
- listen_id_priv->id.port_num) == dev_ll) {
+
+ if (listen_id_priv) {
cma_dev = listen_id_priv->cma_dev;
port = listen_id_priv->id.port_num;
- if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
- rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
- ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
- &found_port, NULL);
- else
- ret = ib_find_cached_gid(cma_dev->device, &gid,
- &found_port, NULL);
+ gidp = rdma_tech_iboe(cma_dev->device, port) ?
+ &iboe_gid : &gid;

- if (!ret && (port == found_port)) {
- id_priv->id.port_num = found_port;
+ ret = cma_validate_port(cma_dev->device, port, gidp,
+ dev_addr->dev_type);
+ if (!ret) {
+ id_priv->id.port_num = port;
goto out;
}
}
+
list_for_each_entry(cma_dev, &dev_list, list) {
for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port) {
if (listen_id_priv &&
listen_id_priv->cma_dev == cma_dev &&
listen_id_priv->id.port_num == port)
continue;
- if (rdma_port_get_link_layer(cma_dev->device, port) == dev_ll) {
- if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
- rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
- ret = ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
- else
- ret = ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
-
- if (!ret && (port == found_port)) {
- id_priv->id.port_num = found_port;
- goto out;
- }
+
+ gidp = rdma_tech_iboe(cma_dev->device, port) ?
+ &iboe_gid : &gid;
+
+ ret = cma_validate_port(cma_dev->device, port, gidp,
+ dev_addr->dev_type);
+ if (!ret) {
+ id_priv->id.port_num = port;
+ goto out;
}
}
}
--
2.1.0

2015-04-20 08:38:57

by Michael Wang

Subject: [PATCH v5 15/27] IB/Verbs: Reform rest part in IB-core cma


Use the raw management helpers to reform the remaining parts of IB-core cma.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 44e7bb9..ec64b97 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -468,10 +468,10 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
pkey = ntohs(addr->sib_pkey);

list_for_each_entry(cur_dev, &dev_list, list) {
- if (rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
- continue;
-
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
+ if (!rdma_ib_or_iboe(cur_dev->device, p))
+ continue;
+
if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
continue;

@@ -666,10 +666,9 @@ static int cma_modify_qp_rtr(struct rdma_id_private *id_priv,
if (ret)
goto out;

- if (rdma_node_get_transport(id_priv->cma_dev->device->node_type)
- == RDMA_TRANSPORT_IB &&
- rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)
- == IB_LINK_LAYER_ETHERNET) {
+ BUG_ON(id_priv->cma_dev->device != id_priv->id.device);
+
+ if (rdma_tech_iboe(id_priv->id.device, id_priv->id.port_num)) {
ret = rdma_addr_find_smac_by_sgid(&sgid, qp_attr.smac, NULL);

if (ret)
@@ -733,11 +732,10 @@ static int cma_ib_init_qp_attr(struct rdma_id_private *id_priv,
int ret;
u16 pkey;

- if (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num) ==
- IB_LINK_LAYER_INFINIBAND)
- pkey = ib_addr_get_pkey(dev_addr);
- else
+ if (rdma_tech_iboe(id_priv->id.device, id_priv->id.port_num))
pkey = 0xffff;
+ else
+ pkey = ib_addr_get_pkey(dev_addr);

ret = ib_find_cached_pkey(id_priv->id.device, id_priv->id.port_num,
pkey, &qp_attr->pkey_index);
--
2.1.0

2015-04-20 08:39:23

by Michael Wang

Subject: [PATCH v5 16/27] IB/Verbs: Use management helper cap_ib_mad()


Introduce the helper cap_ib_mad() to check whether a port of an
IB device supports Infiniband Management Datagrams.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
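Note (not part of the commit): the per-port pattern that the mad.c and
user_mad.c hunks below rely on can be sketched as follows. This is a
user-space model under stated assumptions: all model_* names and the
fail_port parameter are hypothetical, and "opening" a port is reduced to
setting a flag. It shows why the same capability check has to guard both the
setup loop and the unwind loop.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_PORTS 4

/* Hypothetical stand-ins; the real code queries the device per port. */
enum model_transport { MODEL_IB, MODEL_IBOE, MODEL_IWARP };

struct model_device {
	int start_port, end_port;
	enum model_transport tp[MODEL_MAX_PORTS];	/* indexed by port - start_port */
	bool opened[MODEL_MAX_PORTS];
};

/* Models cap_ib_mad(): MAD is available on IB and IBoE ports. */
static inline bool model_cap_ib_mad(const struct model_device *dev, int port)
{
	enum model_transport tp;

	assert(port >= dev->start_port && port <= dev->end_port);
	tp = dev->tp[port - dev->start_port];
	return tp == MODEL_IB || tp == MODEL_IBOE;
}

/*
 * Open every MAD-capable port. fail_port simulates an open failure on that
 * port (pass 0 for "never fail"). Returns the number of opened ports, or -1
 * after undoing the ports opened so far.
 */
static inline int model_mad_init(struct model_device *dev, int fail_port)
{
	int opened = 0;
	int i;

	for (i = dev->start_port; i <= dev->end_port; i++) {
		if (!model_cap_ib_mad(dev, i))
			continue;
		if (i == fail_port)
			goto error;
		dev->opened[i - dev->start_port] = true;
		opened++;
	}
	return opened;

error:
	/* Unwind with the same check, so skipped ports are never closed. */
	while (--i >= dev->start_port) {
		if (!model_cap_ib_mad(dev, i))
			continue;
		dev->opened[i - dev->start_port] = false;
	}
	return -1;
}
```

Because the setup and teardown loops share one predicate, changing what
"supports MAD" means later only requires touching the helper.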
drivers/infiniband/core/mad.c | 6 +++---
drivers/infiniband/core/user_mad.c | 6 +++---
include/rdma/ib_verbs.h | 15 +++++++++++++++
3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 1822932..4315aeb 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3066,7 +3066,7 @@ static void ib_mad_init_device(struct ib_device *device)
}

for (i = start; i <= end; i++) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

if (ib_mad_port_open(device, i)) {
@@ -3087,7 +3087,7 @@ error_agent:

error:
while (--i >= start) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

if (ib_agent_port_close(device, i))
@@ -3111,7 +3111,7 @@ static void ib_mad_remove_device(struct ib_device *device)
}

for (i = start; i <= end; i++) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

if (ib_agent_port_close(device, i))
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 71fc8ba..b52884b 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1294,7 +1294,7 @@ static void ib_umad_add_one(struct ib_device *device)
umad_dev->end_port = e;

for (i = s; i <= e; ++i) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

umad_dev->port[i - s].umad_dev = umad_dev;
@@ -1317,7 +1317,7 @@ static void ib_umad_add_one(struct ib_device *device)

err:
while (--i >= s) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

ib_umad_kill_port(&umad_dev->port[i - s]);
@@ -1335,7 +1335,7 @@ static void ib_umad_remove_one(struct ib_device *device)
return;

for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
- if (rdma_ib_or_iboe(device, i))
+ if (cap_ib_mad(device, i))
ib_umad_kill_port(&umad_dev->port[i]);
}

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index a12e876..624e963 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1773,6 +1773,21 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

+/**
+ * cap_ib_mad - Check if the port of device has the capability Infiniband
+ * Management Datagrams.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Management Datagrams.
+ */
+static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
+{
+ return rdma_ib_or_iboe(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-20 08:39:44

by Michael Wang

Subject: [PATCH v5 17/27] IB/Verbs: Use management helper cap_ib_smi()


Introduce the helper cap_ib_smi() to check whether a port of an
IB device supports the Infiniband Subnet Management Interface.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/agent.c | 2 +-
drivers/infiniband/core/mad.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index ffdef4d..61471ee 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
goto error1;
}

- if (rdma_tech_ib(device, port_num)) {
+ if (cap_ib_smi(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 4315aeb..ee3a05e 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);

cq_size = mad_sendq_size + mad_recvq_size;
- has_smi = rdma_tech_ib(device, port_num);
+ has_smi = cap_ib_smi(device, port_num);
if (has_smi)
cq_size *= 2;

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 624e963..873b9a6 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1788,6 +1788,21 @@ static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
return rdma_ib_or_iboe(device, port_num);
}

+/**
+ * cap_ib_smi - Check if the port of device has the capability Infiniband
+ * Subnet Management Interface.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Subnet Management Interface.
+ */
+static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_ib(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-20 08:40:12

by Michael Wang

Subject: [PATCH v5 18/27] IB/Verbs: Use management helper cap_ib_cm()


Introduce the helper cap_ib_cm() to check whether a port of an
IB device supports the Infiniband Communication Manager.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cm.c | 6 +++---
drivers/infiniband/core/cma.c | 19 +++++++++----------
drivers/infiniband/core/ucm.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
4 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 3c10b75..eae4c9f 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3781,7 +3781,7 @@ static void cm_add_one(struct ib_device *ib_device)

set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
- if (!rdma_ib_or_iboe(ib_device, i))
+ if (!cap_ib_cm(ib_device, i))
continue;

port = kzalloc(sizeof *port, GFP_KERNEL);
@@ -3832,7 +3832,7 @@ error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
while (--i) {
- if (!rdma_ib_or_iboe(ib_device, i))
+ if (!cap_ib_cm(ib_device, i))
continue;

port = cm_dev->port[i-1];
@@ -3864,7 +3864,7 @@ static void cm_remove_one(struct ib_device *ib_device)
write_unlock_irqrestore(&cm.device_lock, flags);

for (i = 1; i <= ib_device->phys_port_cnt; i++) {
- if (!rdma_ib_or_iboe(ib_device, i))
+ if (!cap_ib_cm(ib_device, i))
continue;

port = cm_dev->port[i-1];
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ec64b97..ff59dbc 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -766,7 +766,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
int ret = 0;

id_priv = container_of(id, struct rdma_id_private, id);
- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
else
@@ -1054,7 +1054,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
mutex_unlock(&id_priv->handler_mutex);

if (id_priv->cma_dev) {
- if (rdma_ib_or_iboe(id_priv->id.device, 1)) {
+ if (cap_ib_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
} else if (rdma_tech_iwarp(id_priv->id.device, 1)) {
@@ -1637,8 +1637,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv,
struct rdma_cm_id *id;
int ret;

- if (cma_family(id_priv) == AF_IB &&
- !rdma_ib_or_iboe(cma_dev->device, 1))
+ if (cma_family(id_priv) == AF_IB && !cap_ib_cm(cma_dev->device, 1))
return;

id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
@@ -2029,7 +2028,7 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
mutex_lock(&lock);
list_for_each_entry(cur_dev, &dev_list, list) {
if (cma_family(id_priv) == AF_IB &&
- !rdma_ib_or_iboe(cur_dev->device, 1))
+ !cap_ib_cm(cur_dev->device, 1))
continue;

if (!cma_dev)
@@ -2538,7 +2537,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)

id_priv->backlog = backlog;
if (id->device) {
- if (rdma_ib_or_iboe(id->device, 1)) {
+ if (cap_ib_cm(id->device, 1)) {
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
@@ -2882,7 +2881,7 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}

- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
@@ -2993,7 +2992,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}

- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD) {
if (conn_param)
ret = cma_send_sidr_rep(id_priv, IB_SIDR_SUCCESS,
@@ -3056,7 +3055,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
if (!id_priv->cm_id.ib)
return -EINVAL;

- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
private_data, private_data_len);
@@ -3083,7 +3082,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
if (!id_priv->cm_id.ib)
return -EINVAL;

- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
ret = cma_modify_qp_err(id_priv);
if (ret)
goto out;
diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
index 70e0ccb..f7290c8 100644
--- a/drivers/infiniband/core/ucm.c
+++ b/drivers/infiniband/core/ucm.c
@@ -1253,7 +1253,7 @@ static void ib_ucm_add_one(struct ib_device *device)
dev_t base;
struct ib_ucm_device *ucm_dev;

- if (!device->alloc_ucontext || !rdma_ib_or_iboe(device, 1))
+ if (!device->alloc_ucontext || !cap_ib_cm(device, 1))
return;

ucm_dev = kzalloc(sizeof *ucm_dev, GFP_KERNEL);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 873b9a6..6805e3e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1803,6 +1803,21 @@ static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
return rdma_tech_ib(device, port_num);
}

+/**
+ * cap_ib_cm - Check if the port of device has the capability Infiniband
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Communication Manager.
+ */
+static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
+{
+ return rdma_ib_or_iboe(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-20 08:40:34

by Michael Wang

Subject: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()


Introduce the helper cap_iw_cm() to check whether a port of an
IB device supports the IWARP Communication Manager.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 14 +++++++-------
include/rdma/ib_verbs.h | 15 +++++++++++++++
2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ff59dbc..dd37b4a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -775,7 +775,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,

if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
- } else if (rdma_tech_iwarp(id->device, id->port_num)) {
+ } else if (cap_iw_cm(id->device, id->port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
@@ -1057,7 +1057,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
if (cap_ib_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
- } else if (rdma_tech_iwarp(id_priv->id.device, 1)) {
+ } else if (cap_iw_cm(id_priv->id.device, 1)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
}
@@ -2541,7 +2541,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
- } else if (rdma_tech_iwarp(id->device, 1)) {
+ } else if (cap_iw_cm(id->device, 1)) {
ret = cma_iw_listen(id_priv, backlog);
if (ret)
goto err;
@@ -2886,7 +2886,7 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
ret = cma_connect_ib(id_priv, conn_param);
- } else if (rdma_tech_iwarp(id->device, id->port_num))
+ } else if (cap_iw_cm(id->device, id->port_num))
ret = cma_connect_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -3008,7 +3008,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
else
ret = cma_rep_recv(id_priv);
}
- } else if (rdma_tech_iwarp(id->device, id->port_num))
+ } else if (cap_iw_cm(id->device, id->port_num))
ret = cma_accept_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -3063,7 +3063,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
ret = ib_send_cm_rej(id_priv->cm_id.ib,
IB_CM_REJ_CONSUMER_DEFINED, NULL,
0, private_data, private_data_len);
- } else if (rdma_tech_iwarp(id->device, id->port_num)) {
+ } else if (cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_reject(id_priv->cm_id.iw,
private_data, private_data_len);
} else
@@ -3089,7 +3089,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
/* Initiate or respond to a disconnect. */
if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
- } else if (rdma_tech_iwarp(id->device, id->port_num)) {
+ } else if (cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
} else
ret = -EINVAL;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6805e3e..e4999f6 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1818,6 +1818,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
return rdma_ib_or_iboe(device, port_num);
}

+/**
+ * cap_iw_cm - Check if the port of device has the capability IWARP
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support IWARP
+ * Communication Manager.
+ */
+static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_iwarp(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-20 08:40:58

by Michael Wang

Subject: [PATCH v5 20/27] IB/Verbs: Use management helper cap_ib_sa()


Introduce the helper cap_ib_sa() to check whether a port of an
IB device supports Infiniband Subnet Administration.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 4 ++--
drivers/infiniband/core/sa_query.c | 10 +++++-----
drivers/infiniband/core/ucma.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index dd37b4a..b92f81b 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -954,7 +954,7 @@ static inline int cma_user_data_offset(struct rdma_id_private *id_priv)

static void cma_cancel_route(struct rdma_id_private *id_priv)
{
- if (rdma_tech_ib(id_priv->id.device, id_priv->id.port_num)) {
+ if (cap_ib_sa(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
}
@@ -1978,7 +1978,7 @@ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms)
return -EINVAL;

atomic_inc(&id_priv->refcount);
- if (rdma_tech_ib(id->device, id->port_num))
+ if (cap_ib_sa(id->device, id->port_num))
ret = cma_resolve_ib_route(id_priv, timeout_ms);
else if (rdma_tech_iboe(id->device, id->port_num))
ret = cma_resolve_iboe_route(id_priv);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 60dc7aa..f14a66f 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - sa_dev->start_port];

- if (WARN_ON(!rdma_tech_ib(handler->device, port->port_num)))
+ if (WARN_ON(!cap_ib_sa(handler->device, port->port_num)))
return;

spin_lock_irqsave(&port->ah_lock, flags);
@@ -1173,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)

for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
- if (!rdma_tech_ib(device, i + 1))
+ if (!cap_ib_sa(device, i + 1))
continue;

sa_dev->port[i].sm_ah = NULL;
@@ -1208,7 +1208,7 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;

for (i = 0; i <= e - s; ++i) {
- if (rdma_tech_ib(device, i + 1))
+ if (cap_ib_sa(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
}

@@ -1216,7 +1216,7 @@ static void ib_sa_add_one(struct ib_device *device)

err:
while (--i >= 0) {
- if (rdma_tech_ib(device, i + 1))
+ if (cap_ib_sa(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
}

@@ -1238,7 +1238,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);

for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
- if (rdma_tech_ib(device, i + 1)) {
+ if (cap_ib_sa(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 7331c6c..bed7957 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -723,7 +723,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;

- if (rdma_tech_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+ if (cap_ib_sa(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_ib_route(&resp, &ctx->cm_id->route);
else if (rdma_tech_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e4999f6..de3a168 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1833,6 +1833,21 @@ static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
return rdma_tech_iwarp(device, port_num);
}

+/**
+ * cap_ib_sa - Check if the port of device has the capability Infiniband
+ * Subnet Administration.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Subnet Administration.
+ */
+static inline int cap_ib_sa(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_ib(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-20 08:41:21

by Michael Wang

Subject: [PATCH v5 21/27] IB/Verbs: Use management helper cap_ib_mcast()


Introduce the helper cap_ib_mcast() to check whether a port of an
IB device supports Infiniband Multicast.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
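Note (not part of the commit): with cap_ib_mcast() delegating to cap_ib_sa(),
the helpers introduced in patches 16 to 21 form a small layering over the
transport checks. The sketch below summarises that layering in user space. It
is an illustration under stated assumptions: the model_* names and the
bitmask are hypothetical, and the real helpers take a device and a port
number rather than a transport value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-port transport returned by the device. */
enum model_transport { MODEL_IB, MODEL_IBOE, MODEL_IWARP, MODEL_USNIC_UDP };

/* Hypothetical bitmask, used only to print or compare the result. */
enum model_cap {
	MODEL_CAP_MAD   = 1 << 0,
	MODEL_CAP_SMI   = 1 << 1,
	MODEL_CAP_CM    = 1 << 2,
	MODEL_CAP_IW_CM = 1 << 3,
	MODEL_CAP_SA    = 1 << 4,
	MODEL_CAP_MCAST = 1 << 5,
};

/* Raw helpers: one question about the transport each. */
static inline bool model_tech_ib(enum model_transport tp)
{
	return tp == MODEL_IB;
}

static inline bool model_tech_iwarp(enum model_transport tp)
{
	return tp == MODEL_IWARP;
}

static inline bool model_ib_or_iboe(enum model_transport tp)
{
	return tp == MODEL_IB || tp == MODEL_IBOE;
}

/* Capability helpers: named after the feature, built on the raw helpers. */
static inline bool model_cap_ib_sa(enum model_transport tp)
{
	return model_tech_ib(tp);
}

static inline bool model_cap_ib_mcast(enum model_transport tp)
{
	return model_cap_ib_sa(tp);	/* multicast needs SA, as in this patch */
}

/* Collect every capability a port with the given transport would report. */
static inline unsigned int model_caps(enum model_transport tp)
{
	unsigned int caps = 0;

	assert(tp <= MODEL_USNIC_UDP);
	if (model_ib_or_iboe(tp))
		caps |= MODEL_CAP_MAD | MODEL_CAP_CM;
	if (model_tech_ib(tp))
		caps |= MODEL_CAP_SMI;
	if (model_cap_ib_sa(tp))
		caps |= MODEL_CAP_SA;
	if (model_cap_ib_mcast(tp))
		caps |= MODEL_CAP_MCAST;
	if (model_tech_iwarp(tp))
		caps |= MODEL_CAP_IW_CM;
	return caps;
}
```

Callers ask about the feature they need, so a future technology can
advertise, say, SA without being native IB by changing a single helper.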
drivers/infiniband/core/cma.c | 6 +++---
drivers/infiniband/core/multicast.c | 6 +++---
include/rdma/ib_verbs.h | 15 +++++++++++++++
3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8484ae3..58ec946 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1028,7 +1028,7 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
mc = container_of(id_priv->mc_list.next,
struct cma_multicast, list);
list_del(&mc->list);
- if (rdma_tech_ib(id_priv->cma_dev->device,
+ if (cap_ib_mcast(id_priv->cma_dev->device,
id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
@@ -3342,7 +3342,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
if (rdma_tech_iboe(id->device, id->port_num)) {
kref_init(&mc->mcref);
ret = cma_iboe_join_multicast(id_priv, mc);
- } else if (rdma_tech_ib(id->device, id->port_num))
+ } else if (cap_ib_mcast(id->device, id->port_num))
ret = cma_join_ib_multicast(id_priv, mc);
else
ret = -ENOSYS;
@@ -3376,7 +3376,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)

BUG_ON(id_priv->cma_dev->device != id->device);

- if (rdma_tech_ib(id->device, id->port_num)) {
+ if (cap_ib_mcast(id->device, id->port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
} else if (rdma_tech_iboe(id->device, id->port_num))
diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index 24d93f5..bdc1880 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,7 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
int index;

dev = container_of(handler, struct mcast_device, event_handler);
- if (WARN_ON(!rdma_tech_ib(dev->device, event->element.port_num)))
+ if (WARN_ON(!cap_ib_mcast(dev->device, event->element.port_num)))
return;

index = event->element.port_num - dev->start_port;
@@ -820,7 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}

for (i = 0; i <= dev->end_port - dev->start_port; i++) {
- if (!rdma_tech_ib(device, dev->start_port + i))
+ if (!cap_ib_mcast(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -858,7 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);

for (i = 0; i <= dev->end_port - dev->start_port; i++) {
- if (rdma_tech_ib(device, dev->start_port + i)) {
+ if (cap_ib_mcast(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index de3a168..6e354df 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1848,6 +1848,21 @@ static inline int cap_ib_sa(struct ib_device *device, u8 port_num)
return rdma_tech_ib(device, port_num);
}

+/**
+ * cap_ib_mcast - Check if the port of device has the capability Infiniband
+ * Multicast.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Multicast.
+ */
+static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
+{
+ return cap_ib_sa(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-20 08:41:43

by Michael Wang

Subject: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()


Introduce the helper cap_ipoib() to check whether a port of an
IB device supports IP over Infiniband.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 60b379d..a9812df 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1671,7 +1671,7 @@ static void ipoib_add_one(struct ib_device *device)
}

for (p = s; p <= e; ++p) {
- if (!rdma_tech_ib(device, p))
+ if (!cap_ipoib(device, p))
continue;
dev = ipoib_add_port("ib%d", device, p);
if (!IS_ERR(dev)) {
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6e354df..d0ae08e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1863,6 +1863,21 @@ static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
return cap_ib_sa(device, port_num);
}

+/**
+ * cap_ipoib - Check if the port of device has the capability
+ * IP over Infiniband.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support
+ * IP over Infiniband.
+ */
+static inline int cap_ipoib(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_ib(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-20 08:42:13

by Michael Wang

Subject: [PATCH v5 23/27] IB/Verbs: Use management helper cap_read_multi_sge()


Introduce the helper cap_read_multi_sge() to check whether a port of an
IB device supports RDMA Read with Multiple Scatter-Gather Entries.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
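Note (not part of the commit): the effect on rdma_read_max_sge() can be
sketched in user space as below. This is a minimal model under stated
assumptions: the model_* names are hypothetical, and the transport is passed
in directly instead of being looked up from a device and port.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-port transport. */
enum model_transport { MODEL_IB, MODEL_IBOE, MODEL_IWARP };

/* Models cap_read_multi_sge(): every transport except iWARP qualifies. */
static inline bool model_cap_read_multi_sge(enum model_transport tp)
{
	return tp != MODEL_IWARP;
}

/* Models rdma_read_max_sge(): how many SGEs one RDMA Read may carry. */
static inline int model_read_max_sge(enum model_transport tp,
				     int sge_count, int max_sge)
{
	assert(sge_count >= 1 && max_sge >= 1);

	/* Without the capability, each RDMA Read is limited to one SGE. */
	if (!model_cap_read_multi_sge(tp))
		return 1;

	/* Otherwise use as many SGEs as requested, capped by the transport. */
	return sge_count < max_sge ? sge_count : max_sge;
}
```

The caller now states the property it depends on (multi-SGE reads) instead
of naming the one technology that lacks it.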
include/rdma/ib_verbs.h | 15 +++++++++++++++
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 ++-
2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d0ae08e..074f66d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1878,6 +1878,21 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
return rdma_tech_ib(device, port_num);
}

+/**
+ * cap_read_multi_sge - Check if the port of device has the capability
+ * RDMA Read Multiple Scatter-Gather Entries.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support
+ * RDMA Read Multiple Scatter-Gather Entries.
+ */
+static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
+{
+ return !rdma_tech_iwarp(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index a5bed5b..7711b7a 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,7 +117,8 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,

static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
{
- if (rdma_tech_iwarp(xprt->sc_cm_id->device, xprt->sc_cm_id->port_num))
+ if (!cap_read_multi_sge(xprt->sc_cm_id->device,
+ xprt->sc_cm_id->port_num))
return 1;
else
return min_t(int, sge_count, xprt->sc_max_sge);
--
2.1.0
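
To make the effect of this patch concrete, here is a minimal user-space sketch of the helper and its caller. It is not the kernel code: struct mock_device and mock_query_transport() are hypothetical stand-ins for struct ib_device and its query_transport() callback, and the svcxprt_rdma fields are flattened into plain arguments.

```c
#include <stdint.h>

/* Stand-in for the kernel enum; only the names matter for this sketch. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

/* Hypothetical mock device: one transport per port, ports numbered from 1. */
struct mock_device {
	uint8_t phys_port_cnt;
	enum rdma_transport_type port_transport[4];
};

/* Hypothetical replacement for the device->query_transport() callback. */
static enum rdma_transport_type
mock_query_transport(const struct mock_device *dev, uint8_t port_num)
{
	return dev->port_transport[port_num - 1];
}

static int rdma_tech_iwarp(const struct mock_device *dev, uint8_t port_num)
{
	return mock_query_transport(dev, port_num) == RDMA_TRANSPORT_IWARP;
}

/* Same rule as the patch: every port except an iWARP port qualifies. */
static int cap_read_multi_sge(const struct mock_device *dev, uint8_t port_num)
{
	return !rdma_tech_iwarp(dev, port_num);
}

/* Mirrors rdma_read_max_sge(): a single SGE without the capability,
 * otherwise the smaller of the requested and the supported count. */
static int rdma_read_max_sge(const struct mock_device *dev, uint8_t port_num,
			     int sge_count, int max_sge)
{
	if (!cap_read_multi_sge(dev, port_num))
		return 1;
	return sge_count < max_sge ? sge_count : max_sge;
}
```

With a mock iWARP port the caller is limited to one SGE regardless of sge_count; with an IB port it gets min(sge_count, max_sge), which is the behaviour the hunk above preserves.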

2015-04-20 08:42:40

by Michael Wang

Subject: [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib()


Introduce the helper cap_af_ib() to check whether a port of an
IB device supports native InfiniBand addressing (AF_IB).

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 05d148e..9c1f5b72 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -469,7 +469,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)

list_for_each_entry(cur_dev, &dev_list, list) {
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
- if (!rdma_ib_or_iboe(cur_dev->device, p))
+ if (!cap_af_ib(cur_dev->device, p))
continue;

if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 074f66d..9cfab09 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1879,6 +1879,21 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
}

/**
+ * cap_af_ib - Check if the port of device has the capability
+ * Native Infiniband Address.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support
+ * Native Infiniband Address.
+ */
+static inline int cap_af_ib(struct ib_device *device, u8 port_num)
+{
+ return rdma_ib_or_iboe(device, port_num);
+}
+
+/**
* cap_read_multi_sge - Check if the port of device has the capability
* RDMA Read Multiple Scatter-Gather Entries.
*
--
2.1.0

2015-04-20 08:43:09

by Michael Wang

Subject: [PATCH v5 25/27] IB/Verbs: Use management helper cap_eth_ah()


Introduce the helper cap_eth_ah() to check whether a port of an
IB device supports Ethernet Address Handles.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 2 +-
drivers/infiniband/core/sa_query.c | 2 +-
drivers/infiniband/core/verbs.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 9c1f5b72..b9f7ccc 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -732,7 +732,7 @@ static int cma_ib_init_qp_attr(struct rdma_id_private *id_priv,
int ret;
u16 pkey;

- if (rdma_tech_iboe(id_priv->id.device, id_priv->id.port_num))
+ if (cap_eth_ah(id_priv->id.device, id_priv->id.port_num))
pkey = 0xffff;
else
pkey = ib_addr_get_pkey(dev_addr);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index f14a66f..063c17c 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;

- force_grh = rdma_tech_iboe(device, port_num);
+ force_grh = cap_eth_ah(device, port_num);

if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 7264860..ee4b5cb 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -202,7 +202,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
int ret;

memset(ah_attr, 0, sizeof *ah_attr);
- if (rdma_tech_iboe(device, port_num)) {
+ if (cap_eth_ah(device, port_num)) {
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 9cfab09..45050cb 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1894,6 +1894,21 @@ static inline int cap_af_ib(struct ib_device *device, u8 port_num)
}

/**
+ * cap_eth_ah - Check if the port of device has the capability
+ * Ethernet Address Handler.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support
+ * Ethernet Address Handler.
+ */
+static inline int cap_eth_ah(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_iboe(device, port_num);
+}
+
+/**
* cap_read_multi_sge - Check if the port of device has the capability
* RDMA Read Multiple Scatter-Gather Entries.
*
--
2.1.0
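
As an illustration of what the sa_query.c hunk relies on, the sketch below reduces the force_grh decision to a pure function. It is only a sketch: the transport is passed in directly instead of being queried from a device and port, ah_flags_for_path() is a hypothetical name, and IB_AH_GRH is redefined locally for the example.

```c
#include <stdint.h>

/* Stand-in for the kernel enum; only the names matter for this sketch. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

/* Local stand-in for the kernel's address handle flag. */
#define IB_AH_GRH 1

/* Simplified cap_eth_ah(): takes the transport rather than (device, port). */
static int cap_eth_ah(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IBOE;
}

/* Hypothetical helper mirroring the test in ib_init_ah_from_path():
 * a GRH is used when the path crosses more than one hop, or always
 * on a port that uses Ethernet address handles. */
static int ah_flags_for_path(enum rdma_transport_type tp, uint8_t hop_limit)
{
	int force_grh = cap_eth_ah(tp);

	return (hop_limit > 1 || force_grh) ? IB_AH_GRH : 0;
}
```

On an IBoE port the flag is set even for hop_limit 0; on a native IB port it is set only when hop_limit is greater than 1.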

2015-04-20 08:43:32

by Michael Wang

Subject: [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe()


We have finished introducing the cap_XX() helpers, so the raw helper
rdma_ib_or_iboe() is no longer necessary; clean it up.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
include/rdma/ib_verbs.h | 19 +++++++++----------
net/sunrpc/xprtrdma/svc_rdma_transport.c | 6 ++++--
2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 45050cb..0c0a4f0 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1766,13 +1766,6 @@ static inline int rdma_tech_iwarp(struct ib_device *device, u8 port_num)
== RDMA_TRANSPORT_IWARP;
}

-static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
-{
- enum rdma_transport_type tp = device->query_transport(device, port_num);
-
- return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
-}
-
/**
* cap_ib_mad - Check if the port of device has the capability Infiniband
* Management Datagrams.
@@ -1785,7 +1778,9 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
*/
static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
{
- return rdma_ib_or_iboe(device, port_num);
+ enum rdma_transport_type tp = device->query_transport(device, port_num);
+
+ return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

/**
@@ -1815,7 +1810,9 @@ static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
*/
static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
{
- return rdma_ib_or_iboe(device, port_num);
+ enum rdma_transport_type tp = device->query_transport(device, port_num);
+
+ return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

/**
@@ -1890,7 +1887,9 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
*/
static inline int cap_af_ib(struct ib_device *device, u8 port_num)
{
- return rdma_ib_or_iboe(device, port_num);
+ enum rdma_transport_type tp = device->query_transport(device, port_num);
+
+ return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

/**
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index a09b7a1..8af6f92 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -987,8 +987,10 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
*/
if (!rdma_tech_iwarp(newxprt->sc_cm_id->device,
newxprt->sc_cm_id->port_num) &&
- !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
- newxprt->sc_cm_id->port_num))
+ !rdma_tech_ib(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num) &&
+ !rdma_tech_iboe(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num))
goto errout;

if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
--
2.1.0
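
The svc_rdma_transport.c hunk replaces one combined test with two separate ones. A quick way to convince oneself that the rewrite is behaviour-preserving is to compare both conditions over every transport value. The sketch below does that in user space; the helpers take the transport directly rather than (device, port), and old_unsupported(), new_unsupported() and checks_agree() are hypothetical names used only here.

```c
/* Stand-in for the kernel enum; only the names matter for this sketch. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

static int rdma_tech_ib(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IB;
}

static int rdma_tech_iboe(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IBOE;
}

static int rdma_tech_iwarp(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IWARP;
}

/* The helper removed by this patch. */
static int rdma_ib_or_iboe(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE;
}

/* Condition in svc_rdma_accept() before the patch. */
static int old_unsupported(enum rdma_transport_type tp)
{
	return !rdma_tech_iwarp(tp) && !rdma_ib_or_iboe(tp);
}

/* Condition in svc_rdma_accept() after the patch. */
static int new_unsupported(enum rdma_transport_type tp)
{
	return !rdma_tech_iwarp(tp) && !rdma_tech_ib(tp) && !rdma_tech_iboe(tp);
}

/* Returns 1 if both conditions agree for every transport type. */
static int checks_agree(void)
{
	int tp;

	for (tp = RDMA_TRANSPORT_IB; tp <= RDMA_TRANSPORT_IBOE; tp++) {
		if (old_unsupported((enum rdma_transport_type)tp) !=
		    new_unsupported((enum rdma_transport_type)tp))
			return 0;
	}
	return 1;
}
```

Both forms jump to errout only for the usNIC transports, so removing rdma_ib_or_iboe() does not change which devices svc_rdma_accept() rejects.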

2015-04-20 08:43:57

by Michael Wang

Subject: [PATCH v5 27/27] IB/Verbs: Cleanup rdma_node_get_transport()


We have got rid of all the users of the legacy rdma_node_get_transport();
now clean it up.

Cc: Hal Rosenstock <[email protected]>
Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/verbs.c | 21 ---------------------
include/rdma/ib_verbs.h | 3 ---
2 files changed, 24 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index ee4b5cb..bbea0c0 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -107,27 +107,6 @@ __attribute_const__ int ib_rate_to_mbps(enum ib_rate rate)
}
EXPORT_SYMBOL(ib_rate_to_mbps);

-__attribute_const__ enum rdma_transport_type
-rdma_node_get_transport(enum rdma_node_type node_type)
-{
- switch (node_type) {
- case RDMA_NODE_IB_CA:
- case RDMA_NODE_IB_SWITCH:
- case RDMA_NODE_IB_ROUTER:
- return RDMA_TRANSPORT_IB;
- case RDMA_NODE_RNIC:
- return RDMA_TRANSPORT_IWARP;
- case RDMA_NODE_USNIC:
- return RDMA_TRANSPORT_USNIC;
- case RDMA_NODE_USNIC_UDP:
- return RDMA_TRANSPORT_USNIC_UDP;
- default:
- BUG();
- return 0;
- }
-}
-EXPORT_SYMBOL(rdma_node_get_transport);
-
enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_num)
{
if (device->get_link_layer)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 0c0a4f0..f2ea6e7 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -84,9 +84,6 @@ enum rdma_transport_type {
RDMA_TRANSPORT_IBOE,
};

-__attribute_const__ enum rdma_transport_type
-rdma_node_get_transport(enum rdma_node_type node_type);
-
enum rdma_link_layer {
IB_LINK_LAYER_UNSPECIFIED,
IB_LINK_LAYER_INFINIBAND,
--
2.1.0
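
For readers wondering why the node-type mapping had to go, the sketch below contrasts the removed inference with a per-port answer. It is an illustration under assumptions: legacy_node_get_transport() copies the switch statement deleted above (with BUG() replaced by a -1 return), while sketch_query_transport() is a hypothetical stand-in that follows the mapping list in the cover letter; in the series itself each driver supplies its own query_transport() callback.

```c
/* Stand-ins for the kernel enums; only the names matter for this sketch. */
enum rdma_node_type {
	RDMA_NODE_IB_CA = 1,
	RDMA_NODE_IB_SWITCH,
	RDMA_NODE_IB_ROUTER,
	RDMA_NODE_RNIC,
	RDMA_NODE_USNIC,
	RDMA_NODE_USNIC_UDP,
};

enum rdma_link_layer {
	IB_LINK_LAYER_UNSPECIFIED,
	IB_LINK_LAYER_INFINIBAND,
	IB_LINK_LAYER_ETHERNET,
};

enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

/* Legacy inference from the node type alone (returns -1 instead of BUG()). */
static int legacy_node_get_transport(enum rdma_node_type node_type)
{
	switch (node_type) {
	case RDMA_NODE_IB_CA:
	case RDMA_NODE_IB_SWITCH:
	case RDMA_NODE_IB_ROUTER:
		return RDMA_TRANSPORT_IB;
	case RDMA_NODE_RNIC:
		return RDMA_TRANSPORT_IWARP;
	case RDMA_NODE_USNIC:
		return RDMA_TRANSPORT_USNIC;
	case RDMA_NODE_USNIC_UDP:
		return RDMA_TRANSPORT_USNIC_UDP;
	default:
		return -1;
	}
}

/* Hypothetical per-port answer following the cover letter's mapping list:
 * an IB node type on an Ethernet link layer is reported as IBOE. */
static int sketch_query_transport(enum rdma_node_type node_type,
				  enum rdma_link_layer link_layer)
{
	int tp = legacy_node_get_transport(node_type);

	if (tp == RDMA_TRANSPORT_IB && link_layer == IB_LINK_LAYER_ETHERNET)
		return RDMA_TRANSPORT_IBOE;
	return tp;
}
```

An ocrdma-style port (IB_CA node type on Ethernet) is reported as IB by the legacy function but as IBOE by the per-port query, which is the distinction callers previously had to reconstruct from two separate checks.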

2015-04-20 14:00:58

by Steve Wise

Subject: Re: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()

On 4/20/2015 3:40 AM, Michael Wang wrote:
> Introduce helper cap_iw_cm() to help us check if the port of an
> IB device support IWARP Communication Manager.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/cma.c | 14 +++++++-------
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 2 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index ff59dbc..dd37b4a 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -775,7 +775,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
>
> if (qp_attr->qp_state == IB_QPS_RTR)
> qp_attr->rq_psn = id_priv->seq_num;
> - } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> + } else if (cap_iw_cm(id->device, id->port_num)) {
> if (!id_priv->cm_id.iw) {
> qp_attr->qp_access_flags = 0;
> *qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
> @@ -1057,7 +1057,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> if (cap_ib_cm(id_priv->id.device, 1)) {
> if (id_priv->cm_id.ib)
> ib_destroy_cm_id(id_priv->cm_id.ib);
> - } else if (rdma_tech_iwarp(id_priv->id.device, 1)) {
> + } else if (cap_iw_cm(id_priv->id.device, 1)) {
> if (id_priv->cm_id.iw)
> iw_destroy_cm_id(id_priv->cm_id.iw);
> }
> @@ -2541,7 +2541,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
> ret = cma_ib_listen(id_priv);
> if (ret)
> goto err;
> - } else if (rdma_tech_iwarp(id->device, 1)) {
> + } else if (cap_iw_cm(id->device, 1)) {
> ret = cma_iw_listen(id_priv, backlog);
> if (ret)
> goto err;
> @@ -2886,7 +2886,7 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> ret = cma_resolve_ib_udp(id_priv, conn_param);
> else
> ret = cma_connect_ib(id_priv, conn_param);
> - } else if (rdma_tech_iwarp(id->device, id->port_num))
> + } else if (cap_iw_cm(id->device, id->port_num))
> ret = cma_connect_iw(id_priv, conn_param);
> else
> ret = -ENOSYS;
> @@ -3008,7 +3008,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> else
> ret = cma_rep_recv(id_priv);
> }
> - } else if (rdma_tech_iwarp(id->device, id->port_num))
> + } else if (cap_iw_cm(id->device, id->port_num))
> ret = cma_accept_iw(id_priv, conn_param);
> else
> ret = -ENOSYS;
> @@ -3063,7 +3063,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
> ret = ib_send_cm_rej(id_priv->cm_id.ib,
> IB_CM_REJ_CONSUMER_DEFINED, NULL,
> 0, private_data, private_data_len);
> - } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> + } else if (cap_iw_cm(id->device, id->port_num)) {
> ret = iw_cm_reject(id_priv->cm_id.iw,
> private_data, private_data_len);
> } else
> @@ -3089,7 +3089,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
> /* Initiate or respond to a disconnect. */
> if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
> ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
> - } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> + } else if (cap_iw_cm(id->device, id->port_num)) {
> ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
> } else
> ret = -EINVAL;
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 6805e3e..e4999f6 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1818,6 +1818,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
> return rdma_ib_or_iboe(device, port_num);
> }
>
> +/**
> + * cap_iw_cm - Check if the port of device has the capability IWARP
> + * Communication Manager.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support IWARP
> + * Communication Manager.
> + */
> +static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
> +{
> + return rdma_tech_iwarp(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>

iWARP devices _must_ support the IWCM so cap_iw_cm() is not really useful.

2015-04-20 15:16:18

by Michael Wang

Subject: Re: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()

On 04/20/2015 04:00 PM, Steve Wise wrote:
> On 4/20/2015 3:40 AM, Michael Wang wrote:
[snip]
>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>> index 6805e3e..e4999f6 100644
>> --- a/include/rdma/ib_verbs.h
>> +++ b/include/rdma/ib_verbs.h
>> @@ -1818,6 +1818,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
>> return rdma_ib_or_iboe(device, port_num);
>> }
>> +/**
>> + * cap_iw_cm - Check if the port of device has the capability IWARP
>> + * Communication Manager.
>> + *
>> + * @device: Device to be checked
>> + * @port_num: Port number of the device
>> + *
>> + * Return 0 when port of the device don't support IWARP
>> + * Communication Manager.
>> + */
>> +static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
>> +{
>> + return rdma_tech_iwarp(device, port_num);
>> +}
>> +
>> int ib_query_gid(struct ib_device *device,
>> u8 port_num, int index, union ib_gid *gid);
>>
>
> iWARP devices _must_ support the IWCM so cap_iw_cm() is not really useful.

Sean suggested adding this helper to pair with cap_ib_cm(); maybe there are
some maintainability considerations?

I also prefer it this way, since it makes the code more readable ;-)

Regards,
Michael Wang

>
>

2015-04-20 15:52:00

by Tom Tucker

Subject: Re: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()

On 4/20/15 10:16 AM, Michael Wang wrote:
> On 04/20/2015 04:00 PM, Steve Wise wrote:
>> On 4/20/2015 3:40 AM, Michael Wang wrote:
> [snip]
>>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>>> index 6805e3e..e4999f6 100644
>>> --- a/include/rdma/ib_verbs.h
>>> +++ b/include/rdma/ib_verbs.h
>>> @@ -1818,6 +1818,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
>>> return rdma_ib_or_iboe(device, port_num);
>>> }
>>> +/**
>>> + * cap_iw_cm - Check if the port of device has the capability IWARP
>>> + * Communication Manager.
>>> + *
>>> + * @device: Device to be checked
>>> + * @port_num: Port number of the device
>>> + *
>>> + * Return 0 when port of the device don't support IWARP
>>> + * Communication Manager.
>>> + */
>>> +static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
>>> +{
>>> + return rdma_tech_iwarp(device, port_num);
>>> +}
>>> +
>>> int ib_query_gid(struct ib_device *device,
>>> u8 port_num, int index, union ib_gid *gid);
>>>
>> iWARP devices _must_ support the IWCM so cap_iw_cm() is not really useful.
> Sean suggested to add this helper paired with cap_ib_cm(), may be there are
> some consideration on maintainability?
>
> Me too also prefer this way to make the code more readable ;-)

It's more consistent, but not necessarily more readable -- if by
readability we mean understanding.

If the reader knows how the transports work, then the reader would be
confused by the addition of a check that is always true. For the reader
that doesn't know, the addition of the check implies that the support is
optional, which it is not.

Tom

> Regards,
> Michael Wang
>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2015-04-20 16:19:44

by Jason Gunthorpe

Subject: Re: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()

On Mon, Apr 20, 2015 at 10:51:58AM -0500, Tom Tucker wrote:
> On 4/20/15 10:16 AM, Michael Wang wrote:
> >On 04/20/2015 04:00 PM, Steve Wise wrote:
> >>On 4/20/2015 3:40 AM, Michael Wang wrote:
> >[snip]
> >>>diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> >>>index 6805e3e..e4999f6 100644
> >>>+++ b/include/rdma/ib_verbs.h
> >>>@@ -1818,6 +1818,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
> >>> return rdma_ib_or_iboe(device, port_num);
> >>> }
> >>> +/**
> >>>+ * cap_iw_cm - Check if the port of device has the capability IWARP
> >>>+ * Communication Manager.
> >>>+ *
> >>>+ * @device: Device to be checked
> >>>+ * @port_num: Port number of the device
> >>>+ *
> >>>+ * Return 0 when port of the device don't support IWARP
> >>>+ * Communication Manager.
> >>>+ */
> >>>+static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
> >>>+{
> >>>+ return rdma_tech_iwarp(device, port_num);
> >>>+}
> >>>+
> >>> int ib_query_gid(struct ib_device *device,
> >>> u8 port_num, int index, union ib_gid *gid);
> >>iWARP devices _must_ support the IWCM so cap_iw_cm() is not really useful.
> >Sean suggested to add this helper paired with cap_ib_cm(), may be there are
> >some consideration on maintainability?
> >
> >Me too also prefer this way to make the code more readable ;-)
>
> It's more consistent, but not necessarily more readable -- if by
> readability we mean understanding.
>
> If the reader knows how the transports work, then the reader would
> be confused by the addition of a check that is always true. For the
> reader that doesn't know, the addition of the check implies that the
> support is optional, which it is not.

No, it says this code is concerned with the unique parts of iWarp
related to CM, not the other unique parts of iWarp. The check isn't
always true, it is just always true on iWarp devices.

That became the problem with the old way of just saying 'is iWarp'
(and others). There are too many differences, the why became lost in
many places.

There are now too many standards, and several do not have public docs,
to keep relying on a mess of 'is standard' tests.

Jason
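
The point about naming the reason for a check can be seen in how the CM dispatch in cma.c reads once both helpers exist. The following is a user-space sketch only: the helpers take the transport directly instead of (device, port), and enum cm_path and pick_cm_path() are hypothetical names standing in for the branches of rdma_connect().

```c
#include <errno.h>

/* Stand-in for the kernel enum; only the names matter for this sketch. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

/* Simplified capability helpers taking the transport directly. */
static int cap_ib_cm(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE;
}

static int cap_iw_cm(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IWARP;
}

/* Hypothetical labels for the two connection manager code paths. */
enum cm_path {
	CM_PATH_IB,
	CM_PATH_IW,
};

/* Mirrors the if / else if / else shape used in rdma_connect():
 * each branch names the CM it needs, not the transport it expects. */
static int pick_cm_path(enum rdma_transport_type tp)
{
	if (cap_ib_cm(tp))
		return CM_PATH_IB;
	if (cap_iw_cm(tp))
		return CM_PATH_IW;
	return -ENOSYS;
}
```

cap_iw_cm() is true for every iWARP port and false for everything else, so the branch behaves exactly like the old rdma_tech_iwarp() test while stating that the code depends on the IWCM rather than on some other iWARP property.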

2015-04-20 16:39:07

by Tom Tucker

Subject: Re: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()

On 4/20/15 11:19 AM, Jason Gunthorpe wrote:
> On Mon, Apr 20, 2015 at 10:51:58AM -0500, Tom Tucker wrote:
>> On 4/20/15 10:16 AM, Michael Wang wrote:
>>> On 04/20/2015 04:00 PM, Steve Wise wrote:
>>>> On 4/20/2015 3:40 AM, Michael Wang wrote:
>>> [snip]
>>>>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>>>>> index 6805e3e..e4999f6 100644
>>>>> +++ b/include/rdma/ib_verbs.h
>>>>> @@ -1818,6 +1818,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
>>>>> return rdma_ib_or_iboe(device, port_num);
>>>>> }
>>>>> +/**
>>>>> + * cap_iw_cm - Check if the port of device has the capability IWARP
>>>>> + * Communication Manager.
>>>>> + *
>>>>> + * @device: Device to be checked
>>>>> + * @port_num: Port number of the device
>>>>> + *
>>>>> + * Return 0 when port of the device don't support IWARP
>>>>> + * Communication Manager.
>>>>> + */
>>>>> +static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
>>>>> +{
>>>>> + return rdma_tech_iwarp(device, port_num);
>>>>> +}
>>>>> +
>>>>> int ib_query_gid(struct ib_device *device,
>>>>> u8 port_num, int index, union ib_gid *gid);
>>>> iWARP devices _must_ support the IWCM so cap_iw_cm() is not really useful.
>>> Sean suggested to add this helper paired with cap_ib_cm(), may be there are
>>> some consideration on maintainability?
>>>
>>> Me too also prefer this way to make the code more readable ;-)
>> It's more consistent, but not necessarily more readable -- if by
>> readability we mean understanding.
>>
>> If the reader knows how the transports work, then the reader would
>> be confused by the addition of a check that is always true. For the
>> reader that doesn't know, the addition of the check implies that the
>> support is optional, which it is not.
> No, it says this code is concerned with the unique parts of iWarp
> related to CM, not the other unique parts of iWarp. The check isn't
> aways true, it is just always true on iWarp devices.
>
> That became the problem with the old way of just saying 'is iWarp'
> (and others). There are too many differences, the why became lost in
> many places.
>
> There are now too many standards, and several do not have public docs,
> to keep relying on a mess of 'is standard' tests.

You're right Jason, this gets called with the device handle so it's only
true for iwarp.

> Jason

2015-04-20 17:04:38

by Hal Rosenstock

Subject: Re: [PATCH v5 04/27] IB/Verbs: Reform IB-core cm

On 4/20/2015 4:33 AM, Michael Wang wrote:
>
> Use raw management helpers to reform IB-core cm.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/cm.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index e28a494..3c10b75 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -3761,9 +3761,7 @@ static void cm_add_one(struct ib_device *ib_device)
> unsigned long flags;
> int ret;
> u8 i;
> -
> - if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;

Nit: Should the int count line be moved above the u8 i declaration so the
declarations are naturally aligned?

-- Hal

<snip...>

2015-04-21 05:42:01

by Devesh Sharma

Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

Hi Michael,

Is there a specific git branch available from which to pull all the patches?

-Regards
Devesh

> -----Original Message-----
> From: [email protected] [mailto:linux-rdma-
> [email protected]] On Behalf Of Michael Wang
> Sent: Monday, April 20, 2015 1:59 PM
> To: Roland Dreier; Sean Hefty; Hal Rosenstock; [email protected];
> [email protected]; [email protected]
> Cc: Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph Raisch; Mike
> Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran;
> Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford; Michael Wang
> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>
>
> Since v4:
> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> Roland, Ira and Steve :-) Please remind me if anything missed :-P
> * Fix logical issue inside 3#, 14#
> * Refine 3#, 4#, 5# with label 'free'
> * Rework 10# to stop using port 1 when port already assigned
>
> There are plenty of lengthy code to check the transport type of IB device, or the
> link layer type of it's port, but actually we are just speculating whether a
> particular management/feature is supported by the device/port.
>
> Thus instead of inferring, we should have our own mechanism for IB
> management capability/protocol/feature checking, several proposals below.
>
> This patch set will reform the method of getting transport type, we will now
> using query_transport() instead of inferring from transport and link layer
> respectively, also we defined the new transport type to make the concept more
> reasonable.
>
> Mapping List:
> node-type link-layer old-transport new-transport
> nes RNIC ETH IWARP IWARP
> amso1100 RNIC ETH IWARP IWARP
> cxgb3 RNIC ETH IWARP IWARP
> cxgb4 RNIC ETH IWARP IWARP
> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> ocrdma IB_CA ETH IB IBOE
> mlx4 IB_CA IB/ETH IB IB/IBOE
> mlx5 IB_CA IB IB IB
> ehca IB_CA IB IB IB
> ipath IB_CA IB IB IB
> mthca IB_CA IB IB IB
> qib IB_CA IB IB IB
>
> For example:
> if (transport == IB) && (link-layer == ETH)
> will now become:
> if (query_transport() == IBOE)
>
> Thus we will be able to get rid of the respective transport and link-layer
> checking, and it will help us to add new protocol/Technology (like OPA) more
> easier, also with the introduced management helpers, IB management logical
> will be more clear and easier for extending.
>
> Highlights:
> The patch set covered a wide range of IB stuff, thus for those who are
> familiar with the particular part, your suggestion would be invaluable ;-)
>
> Patch 1#~15# included all the logical reform, 16#~25# introduced the
> management helpers, 26#~27# do clean up.
>
> Patches haven't been tested yet, we appreciate if any one who have these
> HW willing to provide his Tested-by :-)
>
> Doug suggested the bitmask mechanism:
> https://www.mail-archive.com/linux-
> [email protected]/msg23765.html
> which could be the plan for future reforming, we prefer that to be another
> series which focus on semantic and performance.
>
> This patch-set is somewhat 'bloated' now and it may be a good timing for
> staging, I'd like to suggest we focus on improving existed helpers and push
> all the further reforms into next series ;-)
>
> Proposals:
> Sean:
> https://www.mail-archive.com/linux-
> [email protected]/msg23339.html
> Doug:
> https://www.mail-archive.com/linux-
> [email protected]/msg23418.html
> https://www.mail-archive.com/linux-
> [email protected]/msg23765.html
> Jason:
> https://www.mail-archive.com/linux-
> [email protected]/msg23425.html
>
> Michael Wang (27):
> IB/Verbs: Implement new callback query_transport()
> IB/Verbs: Implement raw management helpers
> IB/Verbs: Reform IB-core mad/agent/user_mad
> IB/Verbs: Reform IB-core cm
> IB/Verbs: Reform IB-core sa_query
> IB/Verbs: Reform IB-core multicast
> IB/Verbs: Reform IB-ulp ipoib
> IB/Verbs: Reform IB-ulp xprtrdma
> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> IB/Verbs: Reform cm related part in IB-core cma/ucm
> IB/Verbs: Reform route related part in IB-core cma
> IB/Verbs: Reform mcast related part in IB-core cma
> IB/Verbs: Reserve legacy transport type in 'dev_addr'
> IB/Verbs: Reform cma_acquire_dev()
> IB/Verbs: Reform rest part in IB-core cma
> IB/Verbs: Use management helper cap_ib_mad()
> IB/Verbs: Use management helper cap_ib_smi()
> IB/Verbs: Use management helper cap_ib_cm()
> IB/Verbs: Use management helper cap_iw_cm()
> IB/Verbs: Use management helper cap_ib_sa()
> IB/Verbs: Use management helper cap_ib_mcast()
> IB/Verbs: Use management helper cap_ipoib()
> IB/Verbs: Use management helper cap_read_multi_sge()
> IB/Verbs: Use management helper cap_af_ib()
> IB/Verbs: Use management helper cap_eth_ah()
> IB/Verbs: Clean up rdma_ib_or_iboe()
> IB/Verbs: Cleanup rdma_node_get_transport()
>
> ---
> drivers/infiniband/core/agent.c | 4
> drivers/infiniband/core/cm.c | 26 +-
> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
> drivers/infiniband/core/device.c | 1
> drivers/infiniband/core/mad.c | 51 ++--
> drivers/infiniband/core/multicast.c | 18 -
> drivers/infiniband/core/sa_query.c | 41 +--
> drivers/infiniband/core/sysfs.c | 8
> drivers/infiniband/core/ucm.c | 5
> drivers/infiniband/core/ucma.c | 27 --
> drivers/infiniband/core/user_mad.c | 32 +-
> drivers/infiniband/core/uverbs_cmd.c | 6
> drivers/infiniband/core/verbs.c | 33 --
> drivers/infiniband/hw/amso1100/c2_provider.c | 7
> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
> drivers/infiniband/hw/cxgb4/provider.c | 7
> drivers/infiniband/hw/ehca/ehca_hca.c | 6
> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
> drivers/infiniband/hw/ehca/ehca_main.c | 1
> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
> drivers/infiniband/hw/mlx4/main.c | 10
> drivers/infiniband/hw/mlx5/main.c | 7
> drivers/infiniband/hw/mthca/mthca_provider.c | 7
> drivers/infiniband/hw/nes/nes_verbs.c | 6
> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
> drivers/infiniband/hw/qib/qib_verbs.c | 7
> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
> include/rdma/ib_verbs.h | 204 +++++++++++++++-
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-21 05:58:56

by Devesh Sharma

Subject: RE: [PATCH v5 13/27] IB/Verbs: Reserve legacy transport type in 'dev_addr'

> -----Original Message-----
> From: [email protected] [mailto:linux-rdma-
> [email protected]] On Behalf Of Michael Wang
> Sent: Monday, April 20, 2015 2:08 PM
> To: Roland Dreier; Sean Hefty; [email protected]; linux-
> [email protected]; [email protected]
> Cc: Michael Wang; Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph
> Raisch; Mike Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz;
> Haggai Eran; Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford
> Subject: [PATCH v5 13/27] IB/Verbs: Reserve legacy transport type in
> 'dev_addr'
>
>
> Reserve the legacy transport type for the 'transport' member of 'struct
> rdma_dev_addr' until we make sure this is no longer needed.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/cma.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index ebac646..6195bf6 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -244,14 +244,35 @@ static inline void cma_set_ip_ver(struct cma_hdr
> *hdr, u8 ip_ver)
> hdr->ip_version = (ip_ver << 4) | (hdr->ip_version & 0xF); }
>
> +static inline void cma_set_legacy_transport(struct rdma_cm_id *id) {
> + switch (id->device->node_type) {
> + case RDMA_NODE_IB_CA:
> + case RDMA_NODE_IB_SWITCH:
> + case RDMA_NODE_IB_ROUTER:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IB;

What about the IBOE transport, am I missing something here? As of today ocrdma exports node_type as RDMA_NODE_IB_CA, so here the transport will be set to RDMA_TRANSPORT_IB.
Should it be RDMA_TRANSPORT_IBOE?

> + break;
> + case RDMA_NODE_RNIC:
> + id->route.addr.dev_addr.transport =
> RDMA_TRANSPORT_IWARP;
> + break;
> + case RDMA_NODE_USNIC:
> + id->route.addr.dev_addr.transport =
> RDMA_TRANSPORT_USNIC;
> + break;
> + case RDMA_NODE_USNIC_UDP:
> + id->route.addr.dev_addr.transport =
> RDMA_TRANSPORT_USNIC_UDP;
> + break;
> + default:
> + BUG();
> + }
> +}
> +
> static void cma_attach_to_dev(struct rdma_id_private *id_priv,
> struct cma_device *cma_dev)
> {
> atomic_inc(&cma_dev->refcount);
> id_priv->cma_dev = cma_dev;
> id_priv->id.device = cma_dev->device;
> - id_priv->id.route.addr.dev_addr.transport =
> - rdma_node_get_transport(cma_dev->device->node_type);
> + cma_set_legacy_transport(&id_priv->id);
> list_add_tail(&id_priv->list, &cma_dev->id_list); }
>
> --
> 2.1.0

2015-04-21 06:16:21

by Devesh Sharma

[permalink] [raw]
Subject: RE: [PATCH v5 14/27] IB/Verbs: Reform cma_acquire_dev()

Looks good, I would like to test with ocrdma before confirming.

> -----Original Message-----
> From: [email protected] [mailto:linux-rdma-
> [email protected]] On Behalf Of Michael Wang
> Sent: Monday, April 20, 2015 2:08 PM
> To: Roland Dreier; Sean Hefty; [email protected]; linux-
> [email protected]; [email protected]
> Cc: Michael Wang; Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph
> Raisch; Mike Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz;
> Haggai Eran; Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford
> Subject: [PATCH v5 14/27] IB/Verbs: Reform cma_acquire_dev()
>
>
> Reform cma_acquire_dev() with management helpers, introduce
> cma_validate_port() to make the code more clean.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/cma.c | 68 +++++++++++++++++++++++++----------------
> --
> 1 file changed, 40 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 6195bf6..44e7bb9 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -370,18 +370,35 @@ static int cma_translate_addr(struct sockaddr *addr,
> struct rdma_dev_addr *dev_a
> return ret;
> }
>
> +static inline int cma_validate_port(struct ib_device *device, u8 port,
> + union ib_gid *gid, int dev_type) {
> + u8 found_port;
> + int ret = -ENODEV;
> +
> + if ((dev_type == ARPHRD_INFINIBAND) && !rdma_tech_ib(device,
> port))
> + return ret;
> +
> + if ((dev_type != ARPHRD_INFINIBAND) && rdma_tech_ib(device, port))
> + return ret;
> +
> + ret = ib_find_cached_gid(device, gid, &found_port, NULL);
> + if (port != found_port)
> + return -ENODEV;
> +
> + return ret;
> +}
> +
> static int cma_acquire_dev(struct rdma_id_private *id_priv,
> struct rdma_id_private *listen_id_priv) {
> struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
> struct cma_device *cma_dev;
> - union ib_gid gid, iboe_gid;
> + union ib_gid gid, iboe_gid, *gidp;
> int ret = -ENODEV;
> - u8 port, found_port;
> - enum rdma_link_layer dev_ll = dev_addr->dev_type ==
> ARPHRD_INFINIBAND ?
> - IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
> + u8 port;
>
> - if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
> + if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
> id_priv->id.ps == RDMA_PS_IPOIB)
> return -EINVAL;
>
> @@ -391,41 +408,36 @@ static int cma_acquire_dev(struct rdma_id_private
> *id_priv,
>
> memcpy(&gid, dev_addr->src_dev_addr +
> rdma_addr_gid_offset(dev_addr), sizeof gid);
> - if (listen_id_priv &&
> - rdma_port_get_link_layer(listen_id_priv->id.device,
> - listen_id_priv->id.port_num) == dev_ll) {
> +
> + if (listen_id_priv) {
> cma_dev = listen_id_priv->cma_dev;
> port = listen_id_priv->id.port_num;
> - if (rdma_node_get_transport(cma_dev->device->node_type)
> == RDMA_TRANSPORT_IB &&
> - rdma_port_get_link_layer(cma_dev->device, port) ==
> IB_LINK_LAYER_ETHERNET)
> - ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
> - &found_port, NULL);
> - else
> - ret = ib_find_cached_gid(cma_dev->device, &gid,
> - &found_port, NULL);
> + gidp = rdma_tech_iboe(cma_dev->device, port) ?
> + &iboe_gid : &gid;
>
> - if (!ret && (port == found_port)) {
> - id_priv->id.port_num = found_port;
> + ret = cma_validate_port(cma_dev->device, port, gidp,
> + dev_addr->dev_type);
> + if (!ret) {
> + id_priv->id.port_num = port;
> goto out;
> }
> }
> +
> list_for_each_entry(cma_dev, &dev_list, list) {
> for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port)
> {
> if (listen_id_priv &&
> listen_id_priv->cma_dev == cma_dev &&
> listen_id_priv->id.port_num == port)
> continue;
> - if (rdma_port_get_link_layer(cma_dev->device, port) == dev_ll) {
> - if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
> - rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
> - ret = ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
> - else
> - ret = ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
> -
> - if (!ret && (port == found_port)) {
> - id_priv->id.port_num = found_port;
> - goto out;
> - }
> +
> + gidp = rdma_tech_iboe(cma_dev->device, port) ?
> + &iboe_gid : &gid;
> +
> + ret = cma_validate_port(cma_dev->device, port, gidp,
> + dev_addr->dev_type);
> + if (!ret) {
> + id_priv->id.port_num = port;
> + goto out;
> }
> }
> }
> --
> 2.1.0

2015-04-21 07:39:22

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()



On 04/20/2015 05:51 PM, Tom Tucker wrote:
[snip]
>>>> int ib_query_gid(struct ib_device *device,
>>>> u8 port_num, int index, union ib_gid *gid);
>>>>
>>> iWARP devices _must_ support the IWCM so cap_iw_cm() is not really useful.
>> Sean suggested to add this helper paired with cap_ib_cm(), may be there are
>> some consideration on maintainability?
>>
>> Me too also prefer this way to make the code more readable ;-)
>
> It's more consistent, but not necessarily more readable -- if by readability we mean understanding.
>
> If the reader knows how the transports work, then the reader would be confused by the addition of a check that is always true. For the reader that doesn't know, the addition of the check implies that the support is optional, which it is not.

The purpose is to make sure folks understand what we really want to check
when they are reviewing the code :-) and to prepare for further reform, which
may no longer rely on the technology type at all. For example, the device could
tell the core layer directly which management it requires with a bitmask :-)
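
To make the bitmask idea concrete, here is a minimal user-space sketch in C. It
is only an illustration under stated assumptions: every name prefixed with
demo_/DEMO_ is hypothetical and does not appear in this patch set or in
ib_verbs.h, and the real helpers in this series take a struct ib_device and
are still derived from the transport type. The sketch shows each port
advertising the management it requires, with a helper such as
demo_cap_iw_cm() becoming a plain bit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; the names are illustrative only. */
enum demo_mgmt_cap {
	DEMO_CAP_IB_MAD = 1u << 0,
	DEMO_CAP_IB_CM  = 1u << 1,
	DEMO_CAP_IW_CM  = 1u << 2,
	DEMO_CAP_IB_SA  = 1u << 3,
};

#define DEMO_MAX_PORTS 2

/* Stand-in for a device: one capability mask per port (ports are 1-based). */
struct demo_device {
	uint8_t  phys_port_cnt;
	uint32_t port_caps[DEMO_MAX_PORTS];
};

/* True only if every bit in 'cap' is set for the given port. */
static inline bool demo_port_has_cap(const struct demo_device *dev,
				     uint8_t port, uint32_t cap)
{
	assert(port >= 1 && port <= dev->phys_port_cnt);
	return (dev->port_caps[port - 1] & cap) == cap;
}

static inline bool demo_cap_ib_cm(const struct demo_device *dev, uint8_t port)
{
	return demo_port_has_cap(dev, port, DEMO_CAP_IB_CM);
}

static inline bool demo_cap_iw_cm(const struct demo_device *dev, uint8_t port)
{
	return demo_port_has_cap(dev, port, DEMO_CAP_IW_CM);
}

/* Assumed example devices: a single-port iWARP RNIC and a dual-port IB HCA. */
static const struct demo_device demo_iwarp_rnic = {
	.phys_port_cnt = 1,
	.port_caps = { DEMO_CAP_IW_CM },
};

static const struct demo_device demo_ib_hca = {
	.phys_port_cnt = 2,
	.port_caps = {
		DEMO_CAP_IB_MAD | DEMO_CAP_IB_CM | DEMO_CAP_IB_SA,
		DEMO_CAP_IB_MAD | DEMO_CAP_IB_CM | DEMO_CAP_IB_SA,
	},
};
```

With a layout like this, the check reads the same for the caller whether or
not the capability is mandatory for a given technology, which is the
consistency argument made above.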

Regards,
Michael Wang

>
> Tom
>
>> Regards,
>> Michael Wang
>>
>>>
>

2015-04-21 07:42:31

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v5 04/27] IB/Verbs: Reform IB-core cm



On 04/20/2015 07:04 PM, Hal Rosenstock wrote:
> On 4/20/2015 4:33 AM, Michael Wang wrote:
>>
>> Use raw management helpers to reform IB-core cm.
>>
>> Cc: Hal Rosenstock <[email protected]>
>> Cc: Steve Wise <[email protected]>
>> Cc: Tom Talpey <[email protected]>
>> Cc: Jason Gunthorpe <[email protected]>
>> Cc: Doug Ledford <[email protected]>
>> Cc: Ira Weiny <[email protected]>
>> Cc: Sean Hefty <[email protected]>
>> Signed-off-by: Michael Wang <[email protected]>
>> ---
>> drivers/infiniband/core/cm.c | 20 +++++++++++++++++---
>> 1 file changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
>> index e28a494..3c10b75 100644
>> --- a/drivers/infiniband/core/cm.c
>> +++ b/drivers/infiniband/core/cm.c
>> @@ -3761,9 +3761,7 @@ static void cm_add_one(struct ib_device *ib_device)
>> unsigned long flags;
>> int ret;
>> u8 i;
>> -
>> - if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
>> - return;
>> + int count = 0;
>
> Nit: Should the int count line be moved above u8 i declaration so
> declarations are naturally aligned ?

Makes sense, will be in the next version :-)

Regards,
Michael Wang

>
> -- Hal
>
> <snip...>
>

2015-04-21 07:47:01

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

Hi, Devesh

On 04/21/2015 07:41 AM, Devesh Sharma wrote:
> Hi Michael,
>
> is there a specific git branch available to pull out all the patches?

Not yet, we may need the maintainer to tell us which branch the series
could be applied to for testing purposes, once we are all satisfied :-)

For now we could 'git am' these patches to 'infiniband.git/for-next'
in order to do testing.

Regards,
Michael Wang

>
> -Regards
> Devesh
>
>> -----Original Message-----
>> From: [email protected] [mailto:linux-rdma-
>> [email protected]] On Behalf Of Michael Wang
>> Sent: Monday, April 20, 2015 1:59 PM
>> To: Roland Dreier; Sean Hefty; Hal Rosenstock; [email protected];
>> [email protected]; [email protected]
>> Cc: Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph Raisch; Mike
>> Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran;
>> Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford; Michael Wang
>> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>>
>>
>> Since v4:
>> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
>> Roland, Ira and Steve :-) Please remind me if anything missed :-P
>> * Fix logical issue inside 3#, 14#
>> * Refine 3#, 4#, 5# with label 'free'
>> * Rework 10# to stop using port 1 when port already assigned
>>
>> There are plenty of lengthy code to check the transport type of IB device, or the
>> link layer type of it's port, but actually we are just speculating whether a
>> particular management/feature is supported by the device/port.
>>
>> Thus instead of inferring, we should have our own mechanism for IB
>> management capability/protocol/feature checking, several proposals below.
>>
>> This patch set will reform the method of getting transport type, we will now
>> using query_transport() instead of inferring from transport and link layer
>> respectively, also we defined the new transport type to make the concept more
>> reasonable.
>>
>> Mapping List:
>> node-type link-layer old-transport new-transport
>> nes RNIC ETH IWARP IWARP
>> amso1100 RNIC ETH IWARP IWARP
>> cxgb3 RNIC ETH IWARP IWARP
>> cxgb4 RNIC ETH IWARP IWARP
>> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
>> ocrdma IB_CA ETH IB IBOE
>> mlx4 IB_CA IB/ETH IB IB/IBOE
>> mlx5 IB_CA IB IB IB
>> ehca IB_CA IB IB IB
>> ipath IB_CA IB IB IB
>> mthca IB_CA IB IB IB
>> qib IB_CA IB IB IB
>>
>> For example:
>> if (transport == IB) && (link-layer == ETH) will now become:
>> if (query_transport() == IBOE)
>>
>> Thus we will be able to get rid of the respective transport and link-layer
>> checking, and it will help us to add new protocol/Technology (like OPA) more
>> easier, also with the introduced management helpers, IB management logical
>> will be more clear and easier for extending.
>>
>> Highlights:
>> The patch set covered a wide range of IB stuff, thus for those who are
>> familiar with the particular part, your suggestion would be invaluable ;-)
>>
>> Patch 1#~15# included all the logical reform, 16#~25# introduced the
>> management helpers, 26#~27# do clean up.
>>
>> Patches haven't been tested yet, we appreciate if any one who have these
>> HW willing to provide his Tested-by :-)
>>
>> Doug suggested the bitmask mechanism:
>> https://www.mail-archive.com/linux-
>> [email protected]/msg23765.html
>> which could be the plan for future reforming, we prefer that to be another
>> series which focus on semantic and performance.
>>
>> This patch-set is somewhat 'bloated' now and it may be a good timing for
>> staging, I'd like to suggest we focus on improving existed helpers and push
>> all the further reforms into next series ;-)
>>
>> Proposals:
>> Sean:
>> https://www.mail-archive.com/linux-
>> [email protected]/msg23339.html
>> Doug:
>> https://www.mail-archive.com/linux-
>> [email protected]/msg23418.html
>> https://www.mail-archive.com/linux-
>> [email protected]/msg23765.html
>> Jason:
>> https://www.mail-archive.com/linux-
>> [email protected]/msg23425.html
>>
>> Michael Wang (27):
>> IB/Verbs: Implement new callback query_transport()
>> IB/Verbs: Implement raw management helpers
>> IB/Verbs: Reform IB-core mad/agent/user_mad
>> IB/Verbs: Reform IB-core cm
>> IB/Verbs: Reform IB-core sa_query
>> IB/Verbs: Reform IB-core multicast
>> IB/Verbs: Reform IB-ulp ipoib
>> IB/Verbs: Reform IB-ulp xprtrdma
>> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
>> IB/Verbs: Reform cm related part in IB-core cma/ucm
>> IB/Verbs: Reform route related part in IB-core cma
>> IB/Verbs: Reform mcast related part in IB-core cma
>> IB/Verbs: Reserve legacy transport type in 'dev_addr'
>> IB/Verbs: Reform cma_acquire_dev()
>> IB/Verbs: Reform rest part in IB-core cma
>> IB/Verbs: Use management helper cap_ib_mad()
>> IB/Verbs: Use management helper cap_ib_smi()
>> IB/Verbs: Use management helper cap_ib_cm()
>> IB/Verbs: Use management helper cap_iw_cm()
>> IB/Verbs: Use management helper cap_ib_sa()
>> IB/Verbs: Use management helper cap_ib_mcast()
>> IB/Verbs: Use management helper cap_ipoib()
>> IB/Verbs: Use management helper cap_read_multi_sge()
>> IB/Verbs: Use management helper cap_af_ib()
>> IB/Verbs: Use management helper cap_eth_ah()
>> IB/Verbs: Clean up rdma_ib_or_iboe()
>> IB/Verbs: Cleanup rdma_node_get_transport()
>>
>> ---
>> drivers/infiniband/core/agent.c | 4
>> drivers/infiniband/core/cm.c | 26 +-
>> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
>> drivers/infiniband/core/device.c | 1
>> drivers/infiniband/core/mad.c | 51 ++--
>> drivers/infiniband/core/multicast.c | 18 -
>> drivers/infiniband/core/sa_query.c | 41 +--
>> drivers/infiniband/core/sysfs.c | 8
>> drivers/infiniband/core/ucm.c | 5
>> drivers/infiniband/core/ucma.c | 27 --
>> drivers/infiniband/core/user_mad.c | 32 +-
>> drivers/infiniband/core/uverbs_cmd.c | 6
>> drivers/infiniband/core/verbs.c | 33 --
>> drivers/infiniband/hw/amso1100/c2_provider.c | 7
>> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
>> drivers/infiniband/hw/cxgb4/provider.c | 7
>> drivers/infiniband/hw/ehca/ehca_hca.c | 6
>> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
>> drivers/infiniband/hw/ehca/ehca_main.c | 1
>> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
>> drivers/infiniband/hw/mlx4/main.c | 10
>> drivers/infiniband/hw/mlx5/main.c | 7
>> drivers/infiniband/hw/mthca/mthca_provider.c | 7
>> drivers/infiniband/hw/nes/nes_verbs.c | 6
>> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
>> drivers/infiniband/hw/qib/qib_verbs.c | 7
>> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
>> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
>> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
>> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
>> include/rdma/ib_verbs.h | 204 +++++++++++++++-
>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
>> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
>> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-21 08:05:51

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v5 13/27] IB/Verbs: Reserve legacy transport type in 'dev_addr'

On 04/21/2015 07:58 AM, Devesh Sharma wrote:
[snip]
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index ebac646..6195bf6 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -244,14 +244,35 @@ static inline void cma_set_ip_ver(struct cma_hdr
>> *hdr, u8 ip_ver)
>> hdr->ip_version = (ip_ver << 4) | (hdr->ip_version & 0xF); }
>>
>> +static inline void cma_set_legacy_transport(struct rdma_cm_id *id) {
>> + switch (id->device->node_type) {
>> + case RDMA_NODE_IB_CA:
>> + case RDMA_NODE_IB_SWITCH:
>> + case RDMA_NODE_IB_ROUTER:
>> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IB;
>
> What about the IBOE transport, am I missing something here? As of today ocrdma exports node_type as RDMA_NODE_IB_CA, so here the transport will be set to RDMA_TRANSPORT_IB.
> Should it be RDMA_TRANSPORT_IBOE?

This part is actually just the old method we used to get the transport type. I'm
not sure how this 'transport' is used, so I reserved the old way for it :-P

Actually, I can't locate any place using it in the core layer, so I guess it
may be used by the user layer or the protocol. As long as these layers use the
old logic, we had better not touch anything ;-)
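
To spell out the difference being discussed, here is a small user-space sketch
in C. It is an assumption-laden illustration, not code from the series: the
demo_/DEMO_ names are hypothetical, and demo_new_transport() derives the result
from node type plus link layer purely for demonstration, whereas in this patch
set the new transport comes from the driver's query_transport() callback. The
legacy function mirrors the node-type switch in cma_set_legacy_transport(),
and the expected results follow the mapping list in the cover letter (ocrdma:
IB_CA on ETH, old transport IB, new transport IBOE).

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel enums, limited to what the demo needs. */
enum demo_node_type  { DEMO_NODE_IB_CA, DEMO_NODE_RNIC, DEMO_NODE_USNIC_UDP };
enum demo_link_layer { DEMO_LL_IB, DEMO_LL_ETH };
enum demo_transport {
	DEMO_TRANSPORT_IB,
	DEMO_TRANSPORT_IBOE,
	DEMO_TRANSPORT_IWARP,
	DEMO_TRANSPORT_USNIC_UDP,
};

/* Legacy behaviour: the transport is inferred from the node type alone. */
static enum demo_transport demo_legacy_transport(enum demo_node_type node)
{
	switch (node) {
	case DEMO_NODE_IB_CA:
		return DEMO_TRANSPORT_IB;
	case DEMO_NODE_RNIC:
		return DEMO_TRANSPORT_IWARP;
	case DEMO_NODE_USNIC_UDP:
		return DEMO_TRANSPORT_USNIC_UDP;
	}
	assert(!"unknown node type");	/* plays the role of BUG() in the patch */
	return DEMO_TRANSPORT_IB;
}

/*
 * New-style result per the cover letter's mapping list: an IB_CA whose
 * port runs over Ethernet reports IBOE; everything else is unchanged.
 */
static enum demo_transport demo_new_transport(enum demo_node_type node,
					      enum demo_link_layer ll)
{
	if (node == DEMO_NODE_IB_CA && ll == DEMO_LL_ETH)
		return DEMO_TRANSPORT_IBOE;
	return demo_legacy_transport(node);
}
```

For an ocrdma-like device (IB_CA over Ethernet), demo_legacy_transport()
yields IB while demo_new_transport() yields IBOE, so 'dev_addr.transport'
keeps the value that any existing consumer would have seen before the series.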

Regards,
Michael Wang

>
>> + break;
>> + case RDMA_NODE_RNIC:
>> + id->route.addr.dev_addr.transport =
>> RDMA_TRANSPORT_IWARP;
>> + break;
>> + case RDMA_NODE_USNIC:
>> + id->route.addr.dev_addr.transport =
>> RDMA_TRANSPORT_USNIC;
>> + break;
>> + case RDMA_NODE_USNIC_UDP:
>> + id->route.addr.dev_addr.transport =
>> RDMA_TRANSPORT_USNIC_UDP;
>> + break;
>> + default:
>> + BUG();
>> + }
>> +}
>> +
>> static void cma_attach_to_dev(struct rdma_id_private *id_priv,
>> struct cma_device *cma_dev)
>> {
>> atomic_inc(&cma_dev->refcount);
>> id_priv->cma_dev = cma_dev;
>> id_priv->id.device = cma_dev->device;
>> - id_priv->id.route.addr.dev_addr.transport =
>> - rdma_node_get_transport(cma_dev->device->node_type);
>> + cma_set_legacy_transport(&id_priv->id);
>> list_add_tail(&id_priv->list, &cma_dev->id_list); }
>>
>> --
>> 2.1.0

2015-04-21 08:09:04

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v5 14/27] IB/Verbs: Reform cma_acquire_dev()



On 04/21/2015 08:15 AM, Devesh Sharma wrote:
> Looks good, I would like to test with ocrdma before confirming.

That's great :-) Any testing would be really helpful, please let
us know if anything is broken.

Regards,
Michael Wang

>
>> -----Original Message-----
>> From: [email protected] [mailto:linux-rdma-
>> [email protected]] On Behalf Of Michael Wang
>> Sent: Monday, April 20, 2015 2:08 PM
>> To: Roland Dreier; Sean Hefty; [email protected]; linux-
>> [email protected]; [email protected]
>> Cc: Michael Wang; Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph
>> Raisch; Mike Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz;
>> Haggai Eran; Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford
>> Subject: [PATCH v5 14/27] IB/Verbs: Reform cma_acquire_dev()
>>
>>
>> Reform cma_acquire_dev() with management helpers, introduce
>> cma_validate_port() to make the code more clean.
>>
>> Cc: Hal Rosenstock <[email protected]>
>> Cc: Steve Wise <[email protected]>
>> Cc: Tom Talpey <[email protected]>
>> Cc: Jason Gunthorpe <[email protected]>
>> Cc: Doug Ledford <[email protected]>
>> Cc: Ira Weiny <[email protected]>
>> Cc: Sean Hefty <[email protected]>
>> Signed-off-by: Michael Wang <[email protected]>
>> ---
>> drivers/infiniband/core/cma.c | 68 +++++++++++++++++++++++++----------------
>> --
>> 1 file changed, 40 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index 6195bf6..44e7bb9 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -370,18 +370,35 @@ static int cma_translate_addr(struct sockaddr *addr,
>> struct rdma_dev_addr *dev_a
>> return ret;
>> }
>>
>> +static inline int cma_validate_port(struct ib_device *device, u8 port,
>> + union ib_gid *gid, int dev_type) {
>> + u8 found_port;
>> + int ret = -ENODEV;
>> +
>> + if ((dev_type == ARPHRD_INFINIBAND) && !rdma_tech_ib(device,
>> port))
>> + return ret;
>> +
>> + if ((dev_type != ARPHRD_INFINIBAND) && rdma_tech_ib(device, port))
>> + return ret;
>> +
>> + ret = ib_find_cached_gid(device, gid, &found_port, NULL);
>> + if (port != found_port)
>> + return -ENODEV;
>> +
>> + return ret;
>> +}
>> +
>> static int cma_acquire_dev(struct rdma_id_private *id_priv,
>> struct rdma_id_private *listen_id_priv) {
>> struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
>> struct cma_device *cma_dev;
>> - union ib_gid gid, iboe_gid;
>> + union ib_gid gid, iboe_gid, *gidp;
>> int ret = -ENODEV;
>> - u8 port, found_port;
>> - enum rdma_link_layer dev_ll = dev_addr->dev_type ==
>> ARPHRD_INFINIBAND ?
>> - IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
>> + u8 port;
>>
>> - if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
>> + if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
>> id_priv->id.ps == RDMA_PS_IPOIB)
>> return -EINVAL;
>>
>> @@ -391,41 +408,36 @@ static int cma_acquire_dev(struct rdma_id_private
>> *id_priv,
>>
>> memcpy(&gid, dev_addr->src_dev_addr +
>> rdma_addr_gid_offset(dev_addr), sizeof gid);
>> - if (listen_id_priv &&
>> - rdma_port_get_link_layer(listen_id_priv->id.device,
>> - listen_id_priv->id.port_num) == dev_ll) {
>> +
>> + if (listen_id_priv) {
>> cma_dev = listen_id_priv->cma_dev;
>> port = listen_id_priv->id.port_num;
>> - if (rdma_node_get_transport(cma_dev->device->node_type)
>> == RDMA_TRANSPORT_IB &&
>> - rdma_port_get_link_layer(cma_dev->device, port) ==
>> IB_LINK_LAYER_ETHERNET)
>> - ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
>> - &found_port, NULL);
>> - else
>> - ret = ib_find_cached_gid(cma_dev->device, &gid,
>> - &found_port, NULL);
>> + gidp = rdma_tech_iboe(cma_dev->device, port) ?
>> + &iboe_gid : &gid;
>>
>> - if (!ret && (port == found_port)) {
>> - id_priv->id.port_num = found_port;
>> + ret = cma_validate_port(cma_dev->device, port, gidp,
>> + dev_addr->dev_type);
>> + if (!ret) {
>> + id_priv->id.port_num = port;
>> goto out;
>> }
>> }
>> +
>> list_for_each_entry(cma_dev, &dev_list, list) {
>> for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port)
>> {
>> if (listen_id_priv &&
>> listen_id_priv->cma_dev == cma_dev &&
>> listen_id_priv->id.port_num == port)
>> continue;
>> - if (rdma_port_get_link_layer(cma_dev->device, port) == dev_ll) {
>> - if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
>> - rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
>> - ret = ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
>> - else
>> - ret = ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
>> -
>> - if (!ret && (port == found_port)) {
>> - id_priv->id.port_num = found_port;
>> - goto out;
>> - }
>> +
>> + gidp = rdma_tech_iboe(cma_dev->device, port) ?
>> + &iboe_gid : &gid;
>> +
>> + ret = cma_validate_port(cma_dev->device, port, gidp,
>> + dev_addr->dev_type);
>> + if (!ret) {
>> + id_priv->id.port_num = port;
>> + goto out;
>> }
>> }
>> }
>> --
>> 2.1.0

2015-04-21 11:04:20

by Devesh Sharma

[permalink] [raw]
Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

Hi Michael,

It would be a great help if you could base your patches on Roland's existing tree and share the branch details to pull from,
just like Chuck Lever does for his nfs-rdma patches.

-Regards
Devesh

> -----Original Message-----
> From: Michael Wang [mailto:[email protected]]
> Sent: Tuesday, April 21, 2015 1:17 PM
> To: Devesh Sharma; Roland Dreier; Sean Hefty; Hal Rosenstock; linux-
> [email protected]; [email protected]; [email protected]
> Cc: Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph Raisch; Mike
> Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran;
> Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford
> Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>
> Hi, Devesh
>
> On 04/21/2015 07:41 AM, Devesh Sharma wrote:
> > Hi Michael,
> >
> > is there a specific git branch available to pull out all the patches?
>
> Not yet, we may need the maintainer to tell us which branch could the series
> been applied for testing purpose, after we all satisfied :-)
>
> For now we could 'git am' these patches to 'infiniband.git/for-next'
> in order to do testing.
>
> Regards,
> Michael Wang
>
> >
> > -Regards
> > Devesh
> >
> >> -----Original Message-----
> >> From: [email protected] [mailto:linux-rdma-
> >> [email protected]] On Behalf Of Michael Wang
> >> Sent: Monday, April 20, 2015 1:59 PM
> >> To: Roland Dreier; Sean Hefty; Hal Rosenstock;
> >> [email protected]; [email protected];
> >> [email protected]
> >> Cc: Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph Raisch; Mike
> >> Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz;
> >> Haggai Eran; Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford;
> >> Michael Wang
> >> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
> >>
> >>
> >> Since v4:
> >> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> >> Roland, Ira and Steve :-) Please remind me if anything missed :-P
> >> * Fix logical issue inside 3#, 14#
> >> * Refine 3#, 4#, 5# with label 'free'
> >> * Rework 10# to stop using port 1 when port already assigned
> >>
> >> There are plenty of lengthy code to check the transport type of IB
> >> device, or the link layer type of it's port, but actually we are just
> >> speculating whether a particular management/feature is supported by the
> device/port.
> >>
> >> Thus instead of inferring, we should have our own mechanism for IB
> >> management capability/protocol/feature checking, several proposals below.
> >>
> >> This patch set will reform the method of getting transport type, we
> >> will now using query_transport() instead of inferring from transport
> >> and link layer respectively, also we defined the new transport type
> >> to make the concept more reasonable.
> >>
> >> Mapping List:
> >> node-type link-layer old-transport new-transport
> >> nes RNIC ETH IWARP IWARP
> >> amso1100 RNIC ETH IWARP IWARP
> >> cxgb3 RNIC ETH IWARP IWARP
> >> cxgb4 RNIC ETH IWARP IWARP
> >> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> >> ocrdma IB_CA ETH IB IBOE
> >> mlx4 IB_CA IB/ETH IB IB/IBOE
> >> mlx5 IB_CA IB IB IB
> >> ehca IB_CA IB IB IB
> >> ipath IB_CA IB IB IB
> >> mthca IB_CA IB IB IB
> >> qib IB_CA IB IB IB
> >>
> >> For example:
> >> if (transport == IB) && (link-layer == ETH) will now become:
> >> if (query_transport() == IBOE)
> >>
> >> Thus we will be able to get rid of the respective transport and
> >> link-layer checking, and it will help us to add new
> >> protocol/Technology (like OPA) more easier, also with the introduced
> >> management helpers, IB management logical will be more clear and easier
> for extending.
> >>
> >> Highlights:
> >> The patch set covered a wide range of IB stuff, thus for those who are
> >> familiar with the particular part, your suggestion would be
> >> invaluable ;-)
> >>
> >> Patch 1#~15# included all the logical reform, 16#~25# introduced the
> >> management helpers, 26#~27# do clean up.
> >>
> >> Patches haven't been tested yet, we appreciate if any one who have these
> >> HW willing to provide his Tested-by :-)
> >>
> >> Doug suggested the bitmask mechanism:
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23765.html
> >> which could be the plan for future reforming, we prefer that to be
> another
> >> series which focus on semantic and performance.
> >>
> >> This patch-set is somewhat 'bloated' now and it may be a good timing for
> >> staging, I'd like to suggest we focus on improving existed helpers and push
> >> all the further reforms into next series ;-)
> >>
> >> Proposals:
> >> Sean:
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23339.html
> >> Doug:
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23418.html
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23765.html
> >> Jason:
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23425.html
> >>
> >> Michael Wang (27):
> >> IB/Verbs: Implement new callback query_transport()
> >> IB/Verbs: Implement raw management helpers
> >> IB/Verbs: Reform IB-core mad/agent/user_mad
> >> IB/Verbs: Reform IB-core cm
> >> IB/Verbs: Reform IB-core sa_query
> >> IB/Verbs: Reform IB-core multicast
> >> IB/Verbs: Reform IB-ulp ipoib
> >> IB/Verbs: Reform IB-ulp xprtrdma
> >> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> >> IB/Verbs: Reform cm related part in IB-core cma/ucm
> >> IB/Verbs: Reform route related part in IB-core cma
> >> IB/Verbs: Reform mcast related part in IB-core cma
> >> IB/Verbs: Reserve legacy transport type in 'dev_addr'
> >> IB/Verbs: Reform cma_acquire_dev()
> >> IB/Verbs: Reform rest part in IB-core cma
> >> IB/Verbs: Use management helper cap_ib_mad()
> >> IB/Verbs: Use management helper cap_ib_smi()
> >> IB/Verbs: Use management helper cap_ib_cm()
> >> IB/Verbs: Use management helper cap_iw_cm()
> >> IB/Verbs: Use management helper cap_ib_sa()
> >> IB/Verbs: Use management helper cap_ib_mcast()
> >> IB/Verbs: Use management helper cap_ipoib()
> >> IB/Verbs: Use management helper cap_read_multi_sge()
> >> IB/Verbs: Use management helper cap_af_ib()
> >> IB/Verbs: Use management helper cap_eth_ah()
> >> IB/Verbs: Clean up rdma_ib_or_iboe()
> >> IB/Verbs: Cleanup rdma_node_get_transport()
> >>
> >> ---
> >> drivers/infiniband/core/agent.c | 4
> >> drivers/infiniband/core/cm.c | 26 +-
> >> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
> >> drivers/infiniband/core/device.c | 1
> >> drivers/infiniband/core/mad.c | 51 ++--
> >> drivers/infiniband/core/multicast.c | 18 -
> >> drivers/infiniband/core/sa_query.c | 41 +--
> >> drivers/infiniband/core/sysfs.c | 8
> >> drivers/infiniband/core/ucm.c | 5
> >> drivers/infiniband/core/ucma.c | 27 --
> >> drivers/infiniband/core/user_mad.c | 32 +-
> >> drivers/infiniband/core/uverbs_cmd.c | 6
> >> drivers/infiniband/core/verbs.c | 33 --
> >> drivers/infiniband/hw/amso1100/c2_provider.c | 7
> >> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
> >> drivers/infiniband/hw/cxgb4/provider.c | 7
> >> drivers/infiniband/hw/ehca/ehca_hca.c | 6
> >> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
> >> drivers/infiniband/hw/ehca/ehca_main.c | 1
> >> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
> >> drivers/infiniband/hw/mlx4/main.c | 10
> >> drivers/infiniband/hw/mlx5/main.c | 7
> >> drivers/infiniband/hw/mthca/mthca_provider.c | 7
> >> drivers/infiniband/hw/nes/nes_verbs.c | 6
> >> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
> >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
> >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
> >> drivers/infiniband/hw/qib/qib_verbs.c | 7
> >> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
> >> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
> >> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
> >> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
> >> include/rdma/ib_verbs.h | 204 +++++++++++++++-
> >> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
> >> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
> >> 35 files changed, 584 insertions(+), 368 deletions(-)
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> >> in the body of a message to [email protected] More majordomo
> >> info at http://vger.kernel.org/majordomo-info.html

2015-04-21 15:06:55

by Tom Tucker

[permalink] [raw]
Subject: Re: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()

On 4/21/15 2:39 AM, Michael Wang wrote:
>
> On 04/20/2015 05:51 PM, Tom Tucker wrote:
> [snip]
>>>>> int ib_query_gid(struct ib_device *device,
>>>>> u8 port_num, int index, union ib_gid *gid);
>>>>>
>>>> iWARP devices _must_ support the IWCM so cap_iw_cm() is not really useful.
>>> Sean suggested to add this helper paired with cap_ib_cm(), may be there are
>>> some consideration on maintainability?
>>>
>>> Me too also prefer this way to make the code more readable ;-)
>> It's more consistent, but not necessarily more readable -- if by readability we mean understanding.
>>
>> If the reader knows how the transports work, then the reader would be confused by the addition of a check that is always true. For the reader that doesn't know, the addition of the check implies that the support is optional, which it is not.
> The purpose is to make sure folks understand what we really want to check
> when they reviewing the code :-) and prepared for the further reform which may
> not rely on technology type any more, for example the device could tell core
> layer directly what management it required with a bitmask :-)
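For reference, the bitmask idea mentioned in the quoted text could look roughly like the userspace sketch below. This is only an illustration under assumptions: the flag names (MGMT_CAP_*), struct port_caps and port_has_cap() are hypothetical and appear nowhere in the patch set; a real implementation would live in the core layer and be filled in by each driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port management capability flags (not from the patches). */
enum mgmt_cap {
	MGMT_CAP_IB_MAD = 1u << 0,
	MGMT_CAP_IB_SMI = 1u << 1,
	MGMT_CAP_IB_CM  = 1u << 2,
	MGMT_CAP_IW_CM  = 1u << 3,
	MGMT_CAP_IB_SA  = 1u << 4,
};

/* Hypothetical per-port record that a driver would fill in at registration. */
struct port_caps {
	uint32_t mgmt;
};

/* True only if every bit in 'caps' is set for this port. */
static inline bool port_has_cap(const struct port_caps *p, uint32_t caps)
{
	return (p->mgmt & caps) == caps;
}
```

With such a mask, a check like cap_iw_cm() would reduce to port_has_cap(port, MGMT_CAP_IW_CM), and the core would no longer need to know which technology sits behind the port.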
Hi Michael,

Thanks for the reply, but my premise was just wrong...I need to review the
whole patch, not just a snippet.

Thanks,
Tom
> Regards,
> Michael Wang
>
>> Tom
>>
>>> Regards,
>>> Michael Wang
>>>

2015-04-21 15:56:52

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers



On 04/21/2015 01:03 PM, Devesh Sharma wrote:
> Hi Michael,
>
> It will be great help if you could base you patches on existing Roland's tree and share to branch details to pull.
> Just like Chuck lever does for his nfs-rdma patches?

I've set up a repository at:
https://github.com/ywang-pb/infiniband-wy
The git URL is:
[email protected]:ywang-pb/infiniband-wy.git

It's based on the latest infiniband/for-next; branch 'mgmt-helpers' contains
this patch set. Please let me know if there are any issues :-)

Regards,
Michael Wang

>
> -Regards
> Devesh
>
>> -----Original Message-----
>> From: Michael Wang [mailto:[email protected]]
>> Sent: Tuesday, April 21, 2015 1:17 PM
>> To: Devesh Sharma; Roland Dreier; Sean Hefty; Hal Rosenstock; linux-
>> [email protected]; [email protected]; [email protected]
>> Cc: Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph Raisch; Mike
>> Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran;
>> Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford
>> Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>>
>> Hi, Devesh
>>
>> On 04/21/2015 07:41 AM, Devesh Sharma wrote:
>>> Hi Michael,
>>>
>>> is there a specific git branch available to pull out all the patches?
>>
>> Not yet, we may need the maintainer to tell us which branch could the series
>> been applied for testing purpose, after we all satisfied :-)
>>
>> For now we could 'git am' these patches to 'infiniband.git/for-next'
>> in order to do testing.
>>
>> Regards,
>> Michael Wang
>>
>>>
>>> -Regards
>>> Devesh
>>>
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:linux-rdma-
>>>> [email protected]] On Behalf Of Michael Wang
>>>> Sent: Monday, April 20, 2015 1:59 PM
>>>> To: Roland Dreier; Sean Hefty; Hal Rosenstock;
>>>> [email protected]; [email protected];
>>>> [email protected]
>>>> Cc: Tom Tucker; Steve Wise; Hoang-Nam Nguyen; Christoph Raisch; Mike
>>>> Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz;
>>>> Haggai Eran; Ira Weiny; Tom Talpey; Jason Gunthorpe; Doug Ledford;
>>>> Michael Wang
>>>> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>>>>
>>>>
>>>> Since v4:
>>>> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
>>>> Roland, Ira and Steve :-) Please remind me if anything missed :-P
>>>> * Fix logical issue inside 3#, 14#
>>>> * Refine 3#, 4#, 5# with label 'free'
>>>> * Rework 10# to stop using port 1 when port already assigned
>>>>
>>>> There are plenty of lengthy code to check the transport type of IB
>>>> device, or the link layer type of it's port, but actually we are just
>>>> speculating whether a particular management/feature is supported by the
>> device/port.
>>>>
>>>> Thus instead of inferring, we should have our own mechanism for IB
>>>> management capability/protocol/feature checking, several proposals below.
>>>>
>>>> This patch set will reform the method of getting transport type, we
>>>> will now using query_transport() instead of inferring from transport
>>>> and link layer respectively, also we defined the new transport type
>>>> to make the concept more reasonable.
>>>>
>>>> Mapping List:
>>>> node-type link-layer old-transport new-transport
>>>> nes RNIC ETH IWARP IWARP
>>>> amso1100 RNIC ETH IWARP IWARP
>>>> cxgb3 RNIC ETH IWARP IWARP
>>>> cxgb4 RNIC ETH IWARP IWARP
>>>> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
>>>> ocrdma IB_CA ETH IB IBOE
>>>> mlx4 IB_CA IB/ETH IB IB/IBOE
>>>> mlx5 IB_CA IB IB IB
>>>> ehca IB_CA IB IB IB
>>>> ipath IB_CA IB IB IB
>>>> mthca IB_CA IB IB IB
>>>> qib IB_CA IB IB IB
>>>>
>>>> For example:
>>>> if (transport == IB) && (link-layer == ETH) will now become:
>>>> if (query_transport() == IBOE)
>>>>
>>>> Thus we will be able to get rid of the respective transport and
>>>> link-layer checking, and it will help us to add new
>>>> protocol/Technology (like OPA) more easier, also with the introduced
>>>> management helpers, IB management logical will be more clear and easier
>> for extending.
>>>>
>>>> Highlights:
>>>> The patch set covered a wide range of IB stuff, thus for those who are
>>>> familiar with the particular part, your suggestion would be
>>>> invaluable ;-)
>>>>
>>>> Patch 1#~15# included all the logical reform, 16#~25# introduced the
>>>> management helpers, 26#~27# do clean up.
>>>>
>>>> Patches haven't been tested yet, we appreciate if any one who have these
>>>> HW willing to provide his Tested-by :-)
>>>>
>>>> Doug suggested the bitmask mechanism:
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23765.html
>>>> which could be the plan for future reforming, we prefer that to be
>> another
>>>> series which focus on semantic and performance.
>>>>
>>>> This patch-set is somewhat 'bloated' now and it may be a good timing for
>>>> staging, I'd like to suggest we focus on improving existed helpers and push
>>>> all the further reforms into next series ;-)
>>>>
>>>> Proposals:
>>>> Sean:
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23339.html
>>>> Doug:
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23418.html
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23765.html
>>>> Jason:
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23425.html
>>>>
>>>> Michael Wang (27):
>>>> IB/Verbs: Implement new callback query_transport()
>>>> IB/Verbs: Implement raw management helpers
>>>> IB/Verbs: Reform IB-core mad/agent/user_mad
>>>> IB/Verbs: Reform IB-core cm
>>>> IB/Verbs: Reform IB-core sa_query
>>>> IB/Verbs: Reform IB-core multicast
>>>> IB/Verbs: Reform IB-ulp ipoib
>>>> IB/Verbs: Reform IB-ulp xprtrdma
>>>> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
>>>> IB/Verbs: Reform cm related part in IB-core cma/ucm
>>>> IB/Verbs: Reform route related part in IB-core cma
>>>> IB/Verbs: Reform mcast related part in IB-core cma
>>>> IB/Verbs: Reserve legacy transport type in 'dev_addr'
>>>> IB/Verbs: Reform cma_acquire_dev()
>>>> IB/Verbs: Reform rest part in IB-core cma
>>>> IB/Verbs: Use management helper cap_ib_mad()
>>>> IB/Verbs: Use management helper cap_ib_smi()
>>>> IB/Verbs: Use management helper cap_ib_cm()
>>>> IB/Verbs: Use management helper cap_iw_cm()
>>>> IB/Verbs: Use management helper cap_ib_sa()
>>>> IB/Verbs: Use management helper cap_ib_mcast()
>>>> IB/Verbs: Use management helper cap_ipoib()
>>>> IB/Verbs: Use management helper cap_read_multi_sge()
>>>> IB/Verbs: Use management helper cap_af_ib()
>>>> IB/Verbs: Use management helper cap_eth_ah()
>>>> IB/Verbs: Clean up rdma_ib_or_iboe()
>>>> IB/Verbs: Cleanup rdma_node_get_transport()
>>>>
>>>> ---
>>>> drivers/infiniband/core/agent.c | 4
>>>> drivers/infiniband/core/cm.c | 26 +-
>>>> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
>>>> drivers/infiniband/core/device.c | 1
>>>> drivers/infiniband/core/mad.c | 51 ++--
>>>> drivers/infiniband/core/multicast.c | 18 -
>>>> drivers/infiniband/core/sa_query.c | 41 +--
>>>> drivers/infiniband/core/sysfs.c | 8
>>>> drivers/infiniband/core/ucm.c | 5
>>>> drivers/infiniband/core/ucma.c | 27 --
>>>> drivers/infiniband/core/user_mad.c | 32 +-
>>>> drivers/infiniband/core/uverbs_cmd.c | 6
>>>> drivers/infiniband/core/verbs.c | 33 --
>>>> drivers/infiniband/hw/amso1100/c2_provider.c | 7
>>>> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
>>>> drivers/infiniband/hw/cxgb4/provider.c | 7
>>>> drivers/infiniband/hw/ehca/ehca_hca.c | 6
>>>> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
>>>> drivers/infiniband/hw/ehca/ehca_main.c | 1
>>>> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
>>>> drivers/infiniband/hw/mlx4/main.c | 10
>>>> drivers/infiniband/hw/mlx5/main.c | 7
>>>> drivers/infiniband/hw/mthca/mthca_provider.c | 7
>>>> drivers/infiniband/hw/nes/nes_verbs.c | 6
>>>> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
>>>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
>>>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
>>>> drivers/infiniband/hw/qib/qib_verbs.c | 7
>>>> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
>>>> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
>>>> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
>>>> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
>>>> include/rdma/ib_verbs.h | 204 +++++++++++++++-
>>>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
>>>> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
>>>> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-21 23:22:06

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs

On Mon, Apr 20, 2015 at 10:36:12AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-core verbs/uverbs_cmd/sysfs.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/sysfs.c | 8 ++------
> drivers/infiniband/core/uverbs_cmd.c | 6 ++++--
> drivers/infiniband/core/verbs.c | 6 ++----
> 3 files changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index cbd0383..8570180 100644
> --- a/drivers/infiniband/core/sysfs.c
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -248,14 +248,10 @@ static ssize_t phys_state_show(struct ib_port *p, struct port_attribute *unused,
> static ssize_t link_layer_show(struct ib_port *p, struct port_attribute *unused,
> char *buf)
> {
> - switch (rdma_port_get_link_layer(p->ibdev, p->port_num)) {
> - case IB_LINK_LAYER_INFINIBAND:
> + if (rdma_tech_ib(p->ibdev, p->port_num))

Is the final intention to remove Link Layer from the rdma stack entirely?

I know that the use of link layer in userspace is just as convoluted as what we
are trying to fix here in the kernel. So it would be nice if we can eventually
get user space cleaned up to not use link layer as it currently does.

However, standard networking tools can report the link layer. So while the
current use of "link layer" via userspace software is wrong I don't think it is
wrong to report this information _to_ userspace.

So unless we intend to completely hide the link layer from userspace I don't
think we should be removing the rdma_port_get_link_layer call. It is still
valid information even though we don't want to use it in most places.
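As a rough illustration of that point, here is a userspace sketch (not the kernel code) of a report that keeps all three outcomes, including "Unknown", instead of collapsing to a two-way IB/Ethernet test. The enum mirrors the kernel's rdma_link_layer values; link_layer_name() and link_layer_show_model() are hypothetical names used only here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the kernel's enum rdma_link_layer. */
enum rdma_link_layer {
	IB_LINK_LAYER_UNSPECIFIED,
	IB_LINK_LAYER_INFINIBAND,
	IB_LINK_LAYER_ETHERNET,
};

/* Hypothetical helper: map a link layer to the string reported to userspace. */
static inline const char *link_layer_name(enum rdma_link_layer ll)
{
	switch (ll) {
	case IB_LINK_LAYER_INFINIBAND:
		return "InfiniBand";
	case IB_LINK_LAYER_ETHERNET:
		return "Ethernet";
	default:
		return "Unknown";
	}
}

/* Model of a sysfs show routine: write the name plus newline, return length. */
static inline int link_layer_show_model(enum rdma_link_layer ll,
					char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", link_layer_name(ll));
}
```

The point of the sketch is only that the value reported to userspace comes from the link layer itself, not from a transport or technology check.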

Ira

> return sprintf(buf, "%s\n", "InfiniBand");
> - case IB_LINK_LAYER_ETHERNET:
> + else
> return sprintf(buf, "%s\n", "Ethernet");
> - default:
> - return sprintf(buf, "%s\n", "Unknown");
> - }
> }
>
> static PORT_ATTR_RO(state);
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index a9f0489..5dc90aa 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -515,8 +515,10 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
> resp.active_width = attr.active_width;
> resp.active_speed = attr.active_speed;
> resp.phys_state = attr.phys_state;
> - resp.link_layer = rdma_port_get_link_layer(file->device->ib_dev,
> - cmd.port_num);
> + resp.link_layer = rdma_tech_ib(file->device->ib_dev,
> + cmd.port_num) ?
> + IB_LINK_LAYER_INFINIBAND :
> + IB_LINK_LAYER_ETHERNET;
>
> if (copy_to_user((void __user *) (unsigned long) cmd.response,
> &resp, sizeof resp))
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 626c9cf..7264860 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -200,11 +200,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
> u32 flow_class;
> u16 gid_index;
> int ret;
> - int is_eth = (rdma_port_get_link_layer(device, port_num) ==
> - IB_LINK_LAYER_ETHERNET);
>
> memset(ah_attr, 0, sizeof *ah_attr);
> - if (is_eth) {
> + if (rdma_tech_iboe(device, port_num)) {
> if (!(wc->wc_flags & IB_WC_GRH))
> return -EPROTOTYPE;
>
> @@ -873,7 +871,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
> union ib_gid sgid;
>
> if ((*qp_attr_mask & IB_QP_AV) &&
> - (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == IB_LINK_LAYER_ETHERNET)) {
> + (rdma_tech_iboe(qp->device, qp_attr->ah_attr.port_num))) {
> ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
> qp_attr->ah_attr.grh.sgid_index, &sgid);
> if (ret)
> --
> 2.1.0

2015-04-22 00:10:25

by Liran Liss

[permalink] [raw]
Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

Hi Michael,

The spirit of this patch-set is great, but I think that we need to clarify some concepts.
Since this will affect the whole patch-set, I am laying out my concerns here instead.

A suggestion for the resulting management helpers is given below.
I believe the result would be much more coherent.
--Liran

In general
========

An ib_dev (or a port of) should be distinguished by 3 qualifiers:
- The link layer:
-- Ethernet (shared by iWARP, USNIC, and ROCE)
-- Infiniband

- The transport (*)
-- IBTA transport (shared by IB and ROCE)
-- iWARP transport
-- USNIC transport

(*) Transport means both:
- The L4 wire protocols (e.g., BTH+ headers of IBTA, optionally encapsulated by UDP in ROCEv2, or the iWARP stack)
- The transport semantics (for example, there are slight semantic differences between IBTA and iWARP)

- The node type (**)
-- CA
-- Switch
-- Router

(**) This has been extended to also encode the transport in the current code.
At least for user-space visible APIs, we might choose to leave this for backward compatibility, but we can consider cleaning up the kernel code.

So, I think that our "old-transport" below is just fine.
No need to change it (and you aren't, since it is currently implemented as a function).

The "new-transport" does not really exist, but is broken into several capability checks of the L4 transport, optionally with conditions on the link type.
I would remove the table below and tell what we really want to achieve:
==> move technology-specific feature-check logic out of the (multiple!) IB code components and various ULPs into per-feature helpers.


Detailed remarks
==============

1) The cap_*_*() helpers should have been introduced directly in patch 02/27.
This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and increases the number of patches in the patch-set.
Do this and remove patches 16-24.

2) The name rdma_tech_* is lame.
rdma_transport_*(), adhering to the above (*) remark, is much better.
For example, both IB and ROCE *do* use the same transport.

3) The name cap_* as it is used above is not accurate.
You use it to describe technology characteristics rather than extendable capabilities.
I would suggest having a single convention for all helpers, such as rdma_has_*() and rdma_is_*().
For example: cap_ib_smi() ==> rdma_has_smi().

4) Remove all capabilities that do not introduce any distinction in the current code.
We can add them as needed later.
This means remove patches:
- [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all IB devices support ipoib
- [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all IB devices support AF_IB.

On the other hand:
- rdma_has_multicast() makes sense, since iWARP doesn’t support it.
- cap_ib_sa() might make sense to cut code even further in the CMA, since RoCE has a GSI but no SA.

5) Do not modify link_layer_show() in [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
It *is* the link layer!

6) Remove cap_read_multi_sge
It is not a device/port feature, but a transport capability.
Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in 'enum ib_device_cap_flags'.

7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper cap_eth_ah().
Address handles that refer to Ethernet links always have Ethernet addressing.

In the CMA code, using rdma_tech_iboe() is just fine. This is how you define cap_eth_ah() anyway.
Currently, this patch just adds clutter.

8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
We do need a transport qualifier, as exemplified in comment 5) above, and for a complete clean model.
This is after renaming the function to rdma_is_ib_transport()...


Putting it all together
==================

We are left with the following helpers:
- rdma_is_ib_transport()
- rdma_is_iwarp_transport()
- rdma_is_usnic_transport()
- rdma_is_iboe()
- rdma_has_mad()
- rdma_has_smi()
- rdma_has_gsi() - complements smi; can be used by the mad code for clarity
- rdma_has_sa()
- rdma_has_cm()
- rdma_has_mcast()
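
To make the proposal concrete, a minimal userspace sketch of a few of these helpers follows. It is an illustration under assumptions, not code from the patch set: struct port_model and the LL_*/TR_* enums are hypothetical stand-ins for the link-layer and transport qualifiers described above, and the predicates encode only what is stated in this mail (IB and ROCE share the IBTA transport, ROCE has a GSI but no SA, iWARP has no multicast). Placing SMI on IB links only is my assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical qualifiers for one port: link layer and L4 transport. */
enum link_layer { LL_INFINIBAND, LL_ETHERNET };
enum transport  { TR_IBTA, TR_IWARP, TR_USNIC };

struct port_model {
	enum link_layer ll;
	enum transport tr;
};

/* Transport qualifiers: IB and ROCE both use the IBTA transport. */
static inline bool rdma_is_ib_transport(const struct port_model *p)
{
	return p->tr == TR_IBTA;
}

static inline bool rdma_is_iwarp_transport(const struct port_model *p)
{
	return p->tr == TR_IWARP;
}

/* IBoE (ROCE): IBTA transport carried over an Ethernet link. */
static inline bool rdma_is_iboe(const struct port_model *p)
{
	return rdma_is_ib_transport(p) && p->ll == LL_ETHERNET;
}

/* GSI is available wherever the IBTA transport is (ROCE included). */
static inline bool rdma_has_gsi(const struct port_model *p)
{
	return rdma_is_ib_transport(p);
}

/* Assumption: SMI only exists on native InfiniBand links. */
static inline bool rdma_has_smi(const struct port_model *p)
{
	return rdma_is_ib_transport(p) && p->ll == LL_INFINIBAND;
}

/* ROCE has no SA, so SA follows the native InfiniBand case. */
static inline bool rdma_has_sa(const struct port_model *p)
{
	return rdma_is_ib_transport(p) && p->ll == LL_INFINIBAND;
}

/* Assumption: multicast is modeled as IBTA-only, since iWARP lacks it. */
static inline bool rdma_has_mcast(const struct port_model *p)
{
	return rdma_is_ib_transport(p);
}
```

rdma_is_usnic_transport(), rdma_has_mad() and rdma_has_cm() are left out of the sketch; they would follow the same pattern.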


> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>
>
> Since v4:
> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> Roland, Ira and Steve :-) Please remind me if anything missed :-P
> * Fix logical issue inside 3#, 14#
> * Refine 3#, 4#, 5# with label 'free'
> * Rework 10# to stop using port 1 when port already assigned
>
> There are plenty of lengthy code to check the transport type of IB device, or
> the link layer type of it's port, but actually we are just speculating whether a
> particular management/feature is supported by the device/port.
>
> Thus instead of inferring, we should have our own mechanism for IB
> management capability/protocol/feature checking, several proposals below.
>
> This patch set will reform the method of getting transport type, we will now
> using query_transport() instead of inferring from transport and link layer
> respectively, also we defined the new transport type to make the concept
> more reasonable.
>
> Mapping List:
> node-type link-layer old-transport new-transport
> nes RNIC ETH IWARP IWARP
> amso1100 RNIC ETH IWARP IWARP
> cxgb3 RNIC ETH IWARP IWARP
> cxgb4 RNIC ETH IWARP IWARP
> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> ocrdma IB_CA ETH IB IBOE
> mlx4 IB_CA IB/ETH IB IB/IBOE
> mlx5 IB_CA IB IB IB
> ehca IB_CA IB IB IB
> ipath IB_CA IB IB IB
> mthca IB_CA IB IB IB
> qib IB_CA IB IB IB
>
> For example:
> if (transport == IB) && (link-layer == ETH) will now become:
> if (query_transport() == IBOE)
>
> Thus we will be able to get rid of the respective transport and link-layer
> checking, and it will help us to add new protocol/Technology (like OPA) more
> easier, also with the introduced management helpers, IB management logical
> will be more clear and easier for extending.
>
> Highlights:
> The patch set covered a wide range of IB stuff, thus for those who are
> familiar with the particular part, your suggestion would be invaluable ;-)
>
> Patch 1#~15# included all the logical reform, 16#~25# introduced the
> management helpers, 26#~27# do clean up.
>
> Patches haven't been tested yet, we appreciate if any one who have these
> HW willing to provide his Tested-by :-)
>
> Doug suggested the bitmask mechanism:
> https://www.mail-archive.com/linux-
> [email protected]/msg23765.html
> which could be the plan for future reforming, we prefer that to be another
> series which focus on semantic and performance.
>
> This patch-set is somewhat 'bloated' now and it may be a good timing for
> staging, I'd like to suggest we focus on improving existed helpers and push
> all the further reforms into next series ;-)
>
> Proposals:
> Sean:
> https://www.mail-archive.com/linux-
> [email protected]/msg23339.html
> Doug:
> https://www.mail-archive.com/linux-
> [email protected]/msg23418.html
> https://www.mail-archive.com/linux-
> [email protected]/msg23765.html
> Jason:
> https://www.mail-archive.com/linux-
> [email protected]/msg23425.html
>
> Michael Wang (27):
> IB/Verbs: Implement new callback query_transport()
> IB/Verbs: Implement raw management helpers
> IB/Verbs: Reform IB-core mad/agent/user_mad
> IB/Verbs: Reform IB-core cm
> IB/Verbs: Reform IB-core sa_query
> IB/Verbs: Reform IB-core multicast
> IB/Verbs: Reform IB-ulp ipoib
> IB/Verbs: Reform IB-ulp xprtrdma
> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> IB/Verbs: Reform cm related part in IB-core cma/ucm
> IB/Verbs: Reform route related part in IB-core cma
> IB/Verbs: Reform mcast related part in IB-core cma
> IB/Verbs: Reserve legacy transport type in 'dev_addr'
> IB/Verbs: Reform cma_acquire_dev()
> IB/Verbs: Reform rest part in IB-core cma
> IB/Verbs: Use management helper cap_ib_mad()
> IB/Verbs: Use management helper cap_ib_smi()
> IB/Verbs: Use management helper cap_ib_cm()
> IB/Verbs: Use management helper cap_iw_cm()
> IB/Verbs: Use management helper cap_ib_sa()
> IB/Verbs: Use management helper cap_ib_mcast()
> IB/Verbs: Use management helper cap_ipoib()
> IB/Verbs: Use management helper cap_read_multi_sge()
> IB/Verbs: Use management helper cap_af_ib()
> IB/Verbs: Use management helper cap_eth_ah()
> IB/Verbs: Clean up rdma_ib_or_iboe()
> IB/Verbs: Cleanup rdma_node_get_transport()
>
> ---
> drivers/infiniband/core/agent.c | 4
> drivers/infiniband/core/cm.c | 26 +-
> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
> drivers/infiniband/core/device.c | 1
> drivers/infiniband/core/mad.c | 51 ++--
> drivers/infiniband/core/multicast.c | 18 -
> drivers/infiniband/core/sa_query.c | 41 +--
> drivers/infiniband/core/sysfs.c | 8
> drivers/infiniband/core/ucm.c | 5
> drivers/infiniband/core/ucma.c | 27 --
> drivers/infiniband/core/user_mad.c | 32 +-
> drivers/infiniband/core/uverbs_cmd.c | 6
> drivers/infiniband/core/verbs.c | 33 --
> drivers/infiniband/hw/amso1100/c2_provider.c | 7
> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
> drivers/infiniband/hw/cxgb4/provider.c | 7
> drivers/infiniband/hw/ehca/ehca_hca.c | 6
> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
> drivers/infiniband/hw/ehca/ehca_main.c | 1
> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
> drivers/infiniband/hw/mlx4/main.c | 10
> drivers/infiniband/hw/mlx5/main.c | 7
> drivers/infiniband/hw/mthca/mthca_provider.c | 7
> drivers/infiniband/hw/nes/nes_verbs.c | 6
> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
> drivers/infiniband/hw/qib/qib_verbs.c | 7
> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
> include/rdma/ib_verbs.h | 204 +++++++++++++++-
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-22 00:05:04

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 01/27] IB/Verbs: Implement new callback query_transport()

On Mon, Apr 20, 2015 at 10:32:01AM +0200, Michael Wang wrote:
>
> Add new callback query_transport() and implement for each HW.
>
> Mapping List:
> node-type link-layer old-transport new-transport
> nes RNIC ETH IWARP IWARP
> amso1100 RNIC ETH IWARP IWARP
> cxgb3 RNIC ETH IWARP IWARP
> cxgb4 RNIC ETH IWARP IWARP
> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> ocrdma IB_CA ETH IB IBOE
> mlx4 IB_CA IB/ETH IB IB/IBOE
> mlx5 IB_CA IB IB IB
> ehca IB_CA IB IB IB
> ipath IB_CA IB IB IB
> mthca IB_CA IB IB IB
> qib IB_CA IB IB IB
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>
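
For readers following along, the shape of the change can be modeled in userspace roughly as below. This is a sketch under assumptions, not the patch itself: struct ib_device_model, port_is_eth[], MODEL_MAX_PORTS, dual_query_transport() and link_layer_from_transport() are hypothetical names, and the dual-mode callback only imitates a device whose ports may be IB or Ethernet (the IB/IBOE row in the mapping list). The enum values follow the names used in the patch.

```c
#include <assert.h>

enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IBOE,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
};

enum rdma_link_layer {
	IB_LINK_LAYER_UNSPECIFIED,
	IB_LINK_LAYER_INFINIBAND,
	IB_LINK_LAYER_ETHERNET,
};

#define MODEL_MAX_PORTS 2

/* Hypothetical stand-in for struct ib_device with only the new callback. */
struct ib_device_model {
	enum rdma_transport_type (*query_transport)(struct ib_device_model *dev,
						    unsigned char port_num);
	int port_is_eth[MODEL_MAX_PORTS + 1];	/* indexed by port number, 1-based */
};

/* Single-technology device: every port reports the same transport. */
static inline enum rdma_transport_type
iwarp_query_transport(struct ib_device_model *dev, unsigned char port_num)
{
	(void)dev;
	(void)port_num;
	return RDMA_TRANSPORT_IWARP;
}

/* Dual-mode device: the answer depends on the port's configuration. */
static inline enum rdma_transport_type
dual_query_transport(struct ib_device_model *dev, unsigned char port_num)
{
	return dev->port_is_eth[port_num] ? RDMA_TRANSPORT_IBOE
					  : RDMA_TRANSPORT_IB;
}

/* Derive the link layer from the per-port transport, as in the verbs.c hunk. */
static inline enum rdma_link_layer
link_layer_from_transport(struct ib_device_model *dev, unsigned char port_num)
{
	if (port_num < 1 || port_num > MODEL_MAX_PORTS)
		return IB_LINK_LAYER_UNSPECIFIED;

	switch (dev->query_transport(dev, port_num)) {
	case RDMA_TRANSPORT_IB:
		return IB_LINK_LAYER_INFINIBAND;
	case RDMA_TRANSPORT_IBOE:
	case RDMA_TRANSPORT_IWARP:
	case RDMA_TRANSPORT_USNIC:
	case RDMA_TRANSPORT_USNIC_UDP:
		return IB_LINK_LAYER_ETHERNET;
	default:
		return IB_LINK_LAYER_UNSPECIFIED;
	}
}
```

The model returns IB_LINK_LAYER_UNSPECIFIED for an unknown transport or out-of-range port, whereas the patch adds BUG() in the default case.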

> ---
> drivers/infiniband/core/device.c | 1 +
> drivers/infiniband/core/verbs.c | 4 +++-
> drivers/infiniband/hw/amso1100/c2_provider.c | 7 +++++++
> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7 +++++++
> drivers/infiniband/hw/cxgb4/provider.c | 7 +++++++
> drivers/infiniband/hw/ehca/ehca_hca.c | 6 ++++++
> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3 +++
> drivers/infiniband/hw/ehca/ehca_main.c | 1 +
> drivers/infiniband/hw/ipath/ipath_verbs.c | 7 +++++++
> drivers/infiniband/hw/mlx4/main.c | 10 ++++++++++
> drivers/infiniband/hw/mlx5/main.c | 7 +++++++
> drivers/infiniband/hw/mthca/mthca_provider.c | 7 +++++++
> drivers/infiniband/hw/nes/nes_verbs.c | 6 ++++++
> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 +
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6 ++++++
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3 +++
> drivers/infiniband/hw/qib/qib_verbs.c | 7 +++++++
> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 +
> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6 ++++++
> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2 ++
> include/rdma/ib_verbs.h | 7 ++++++-
> 21 files changed, 104 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 18c1ece..a9587c4 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -76,6 +76,7 @@ static int ib_device_check_mandatory(struct ib_device *device)
> } mandatory_table[] = {
> IB_MANDATORY_FUNC(query_device),
> IB_MANDATORY_FUNC(query_port),
> + IB_MANDATORY_FUNC(query_transport),
> IB_MANDATORY_FUNC(query_pkey),
> IB_MANDATORY_FUNC(query_gid),
> IB_MANDATORY_FUNC(alloc_pd),
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index f93eb8d..626c9cf 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -133,14 +133,16 @@ enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_
> if (device->get_link_layer)
> return device->get_link_layer(device, port_num);
>
> - switch (rdma_node_get_transport(device->node_type)) {
> + switch (device->query_transport(device, port_num)) {
> case RDMA_TRANSPORT_IB:
> return IB_LINK_LAYER_INFINIBAND;
> + case RDMA_TRANSPORT_IBOE:
> case RDMA_TRANSPORT_IWARP:
> case RDMA_TRANSPORT_USNIC:
> case RDMA_TRANSPORT_USNIC_UDP:
> return IB_LINK_LAYER_ETHERNET;
> default:
> + BUG();
> return IB_LINK_LAYER_UNSPECIFIED;
> }
> }
> diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c
> index bdf3507..d46bbb0 100644
> --- a/drivers/infiniband/hw/amso1100/c2_provider.c
> +++ b/drivers/infiniband/hw/amso1100/c2_provider.c
> @@ -99,6 +99,12 @@ static int c2_query_port(struct ib_device *ibdev,
> return 0;
> }
>
> +static enum rdma_transport_type
> +c2_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IWARP;
> +}
> +
> static int c2_query_pkey(struct ib_device *ibdev,
> u8 port, u16 index, u16 * pkey)
> {
> @@ -801,6 +807,7 @@ int c2_register_device(struct c2_dev *dev)
> dev->ibdev.dma_device = &dev->pcidev->dev;
> dev->ibdev.query_device = c2_query_device;
> dev->ibdev.query_port = c2_query_port;
> + dev->ibdev.query_transport = c2_query_transport;
> dev->ibdev.query_pkey = c2_query_pkey;
> dev->ibdev.query_gid = c2_query_gid;
> dev->ibdev.alloc_ucontext = c2_alloc_ucontext;
> diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
> index 811b24a..09682e9e 100644
> --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
> +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
> @@ -1232,6 +1232,12 @@ static int iwch_query_port(struct ib_device *ibdev,
> return 0;
> }
>
> +static enum rdma_transport_type
> +iwch_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IWARP;
> +}
> +
> static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
> char *buf)
> {
> @@ -1385,6 +1391,7 @@ int iwch_register_device(struct iwch_dev *dev)
> dev->ibdev.dma_device = &(dev->rdev.rnic_info.pdev->dev);
> dev->ibdev.query_device = iwch_query_device;
> dev->ibdev.query_port = iwch_query_port;
> + dev->ibdev.query_transport = iwch_query_transport;
> dev->ibdev.query_pkey = iwch_query_pkey;
> dev->ibdev.query_gid = iwch_query_gid;
> dev->ibdev.alloc_ucontext = iwch_alloc_ucontext;
> diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
> index 66bd6a2..a445e0d 100644
> --- a/drivers/infiniband/hw/cxgb4/provider.c
> +++ b/drivers/infiniband/hw/cxgb4/provider.c
> @@ -390,6 +390,12 @@ static int c4iw_query_port(struct ib_device *ibdev, u8 port,
> return 0;
> }
>
> +static enum rdma_transport_type
> +c4iw_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IWARP;
> +}
> +
> static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
> char *buf)
> {
> @@ -506,6 +512,7 @@ int c4iw_register_device(struct c4iw_dev *dev)
> dev->ibdev.dma_device = &(dev->rdev.lldi.pdev->dev);
> dev->ibdev.query_device = c4iw_query_device;
> dev->ibdev.query_port = c4iw_query_port;
> + dev->ibdev.query_transport = c4iw_query_transport;
> dev->ibdev.query_pkey = c4iw_query_pkey;
> dev->ibdev.query_gid = c4iw_query_gid;
> dev->ibdev.alloc_ucontext = c4iw_alloc_ucontext;
> diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
> index 9ed4d25..d5a34a6 100644
> --- a/drivers/infiniband/hw/ehca/ehca_hca.c
> +++ b/drivers/infiniband/hw/ehca/ehca_hca.c
> @@ -242,6 +242,12 @@ query_port1:
> return ret;
> }
>
> +enum rdma_transport_type
> +ehca_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> int ehca_query_sma_attr(struct ehca_shca *shca,
> u8 port, struct ehca_sma_attr *attr)
> {
> diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
> index 22f79af..cec945f 100644
> --- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
> +++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
> @@ -49,6 +49,9 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props);
> int ehca_query_port(struct ib_device *ibdev, u8 port,
> struct ib_port_attr *props);
>
> +enum rdma_transport_type
> +ehca_query_transport(struct ib_device *device, u8 port_num);
> +
> int ehca_query_sma_attr(struct ehca_shca *shca, u8 port,
> struct ehca_sma_attr *attr);
>
> diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
> index cd8d290..60e0a09 100644
> --- a/drivers/infiniband/hw/ehca/ehca_main.c
> +++ b/drivers/infiniband/hw/ehca/ehca_main.c
> @@ -467,6 +467,7 @@ static int ehca_init_device(struct ehca_shca *shca)
> shca->ib_device.dma_device = &shca->ofdev->dev;
> shca->ib_device.query_device = ehca_query_device;
> shca->ib_device.query_port = ehca_query_port;
> + shca->ib_device.query_transport = ehca_query_transport;
> shca->ib_device.query_gid = ehca_query_gid;
> shca->ib_device.query_pkey = ehca_query_pkey;
> /* shca->in_device.modify_device = ehca_modify_device */
> diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
> index 44ea939..58d36e3 100644
> --- a/drivers/infiniband/hw/ipath/ipath_verbs.c
> +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
> @@ -1638,6 +1638,12 @@ static int ipath_query_port(struct ib_device *ibdev,
> return 0;
> }
>
> +static enum rdma_transport_type
> +ipath_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> static int ipath_modify_device(struct ib_device *device,
> int device_modify_mask,
> struct ib_device_modify *device_modify)
> @@ -2140,6 +2146,7 @@ int ipath_register_ib_device(struct ipath_devdata *dd)
> dev->query_device = ipath_query_device;
> dev->modify_device = ipath_modify_device;
> dev->query_port = ipath_query_port;
> + dev->query_transport = ipath_query_transport;
> dev->modify_port = ipath_modify_port;
> dev->query_pkey = ipath_query_pkey;
> dev->query_gid = ipath_query_gid;
> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
> index b972c0b..e1424ad 100644
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -420,6 +420,15 @@ static int mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
> return __mlx4_ib_query_port(ibdev, port, props, 0);
> }
>
> +static enum rdma_transport_type
> +mlx4_ib_query_transport(struct ib_device *device, u8 port_num)
> +{
> + struct mlx4_dev *dev = to_mdev(device)->dev;
> +
> + return dev->caps.port_mask[port_num] == MLX4_PORT_TYPE_IB ?
> + RDMA_TRANSPORT_IB : RDMA_TRANSPORT_IBOE;
> +}
> +
> int __mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
> union ib_gid *gid, int netw_view)
> {
> @@ -2201,6 +2210,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>
> ibdev->ib_dev.query_device = mlx4_ib_query_device;
> ibdev->ib_dev.query_port = mlx4_ib_query_port;
> + ibdev->ib_dev.query_transport = mlx4_ib_query_transport;
> ibdev->ib_dev.get_link_layer = mlx4_ib_port_link_layer;
> ibdev->ib_dev.query_gid = mlx4_ib_query_gid;
> ibdev->ib_dev.query_pkey = mlx4_ib_query_pkey;
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index cc4ac1e..209c796 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -351,6 +351,12 @@ out:
> return err;
> }
>
> +static enum rdma_transport_type
> +mlx5_ib_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> static int mlx5_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
> union ib_gid *gid)
> {
> @@ -1336,6 +1342,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
>
> dev->ib_dev.query_device = mlx5_ib_query_device;
> dev->ib_dev.query_port = mlx5_ib_query_port;
> + dev->ib_dev.query_transport = mlx5_ib_query_transport;
> dev->ib_dev.query_gid = mlx5_ib_query_gid;
> dev->ib_dev.query_pkey = mlx5_ib_query_pkey;
> dev->ib_dev.modify_device = mlx5_ib_modify_device;
> diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
> index 415f8e1..67ac6a4 100644
> --- a/drivers/infiniband/hw/mthca/mthca_provider.c
> +++ b/drivers/infiniband/hw/mthca/mthca_provider.c
> @@ -179,6 +179,12 @@ static int mthca_query_port(struct ib_device *ibdev,
> return err;
> }
>
> +static enum rdma_transport_type
> +mthca_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> static int mthca_modify_device(struct ib_device *ibdev,
> int mask,
> struct ib_device_modify *props)
> @@ -1281,6 +1287,7 @@ int mthca_register_device(struct mthca_dev *dev)
> dev->ib_dev.dma_device = &dev->pdev->dev;
> dev->ib_dev.query_device = mthca_query_device;
> dev->ib_dev.query_port = mthca_query_port;
> + dev->ib_dev.query_transport = mthca_query_transport;
> dev->ib_dev.modify_device = mthca_modify_device;
> dev->ib_dev.modify_port = mthca_modify_port;
> dev->ib_dev.query_pkey = mthca_query_pkey;
> diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
> index c0d0296..8df5b61 100644
> --- a/drivers/infiniband/hw/nes/nes_verbs.c
> +++ b/drivers/infiniband/hw/nes/nes_verbs.c
> @@ -606,6 +606,11 @@ static int nes_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr
> return 0;
> }
>
> +static enum rdma_transport_type
> +nes_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IWARP;
> +}
>
> /**
> * nes_query_pkey
> @@ -3879,6 +3884,7 @@ struct nes_ib_device *nes_init_ofa_device(struct net_device *netdev)
> nesibdev->ibdev.dev.parent = &nesdev->pcidev->dev;
> nesibdev->ibdev.query_device = nes_query_device;
> nesibdev->ibdev.query_port = nes_query_port;
> + nesibdev->ibdev.query_transport = nes_query_transport;
> nesibdev->ibdev.query_pkey = nes_query_pkey;
> nesibdev->ibdev.query_gid = nes_query_gid;
> nesibdev->ibdev.alloc_ucontext = nes_alloc_ucontext;
> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> index 7a2b59a..9f4d182 100644
> --- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> @@ -244,6 +244,7 @@ static int ocrdma_register_device(struct ocrdma_dev *dev)
> /* mandatory verbs. */
> dev->ibdev.query_device = ocrdma_query_device;
> dev->ibdev.query_port = ocrdma_query_port;
> + dev->ibdev.query_transport = ocrdma_query_transport;
> dev->ibdev.modify_port = ocrdma_modify_port;
> dev->ibdev.query_gid = ocrdma_query_gid;
> dev->ibdev.get_link_layer = ocrdma_link_layer;
> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> index 8771755..73bace4 100644
> --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> @@ -187,6 +187,12 @@ int ocrdma_query_port(struct ib_device *ibdev,
> return 0;
> }
>
> +enum rdma_transport_type
> +ocrdma_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IBOE;
> +}
> +
> int ocrdma_modify_port(struct ib_device *ibdev, u8 port, int mask,
> struct ib_port_modify *props)
> {
> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> index b8f7853..4a81b63 100644
> --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> @@ -41,6 +41,9 @@ int ocrdma_query_port(struct ib_device *, u8 port, struct ib_port_attr *props);
> int ocrdma_modify_port(struct ib_device *, u8 port, int mask,
> struct ib_port_modify *props);
>
> +enum rdma_transport_type
> +ocrdma_query_transport(struct ib_device *device, u8 port_num);
> +
> void ocrdma_get_guid(struct ocrdma_dev *, u8 *guid);
> int ocrdma_query_gid(struct ib_device *, u8 port,
> int index, union ib_gid *gid);
> diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
> index 4a35998..caad665 100644
> --- a/drivers/infiniband/hw/qib/qib_verbs.c
> +++ b/drivers/infiniband/hw/qib/qib_verbs.c
> @@ -1650,6 +1650,12 @@ static int qib_query_port(struct ib_device *ibdev, u8 port,
> return 0;
> }
>
> +static enum rdma_transport_type
> +qib_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> static int qib_modify_device(struct ib_device *device,
> int device_modify_mask,
> struct ib_device_modify *device_modify)
> @@ -2184,6 +2190,7 @@ int qib_register_ib_device(struct qib_devdata *dd)
> ibdev->query_device = qib_query_device;
> ibdev->modify_device = qib_modify_device;
> ibdev->query_port = qib_query_port;
> + ibdev->query_transport = qib_query_transport;
> ibdev->modify_port = qib_modify_port;
> ibdev->query_pkey = qib_query_pkey;
> ibdev->query_gid = qib_query_gid;
> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
> index 0d0f986..03ea9f3 100644
> --- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
> +++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
> @@ -360,6 +360,7 @@ static void *usnic_ib_device_add(struct pci_dev *dev)
>
> us_ibdev->ib_dev.query_device = usnic_ib_query_device;
> us_ibdev->ib_dev.query_port = usnic_ib_query_port;
> + us_ibdev->ib_dev.query_transport = usnic_ib_query_transport;
> us_ibdev->ib_dev.query_pkey = usnic_ib_query_pkey;
> us_ibdev->ib_dev.query_gid = usnic_ib_query_gid;
> us_ibdev->ib_dev.get_link_layer = usnic_ib_port_link_layer;
> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> index 53bd6a2..ff9a5f7 100644
> --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> @@ -348,6 +348,12 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
> return 0;
> }
>
> +enum rdma_transport_type
> +usnic_ib_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_USNIC_UDP;
> +}
> +
> int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
> int qp_attr_mask,
> struct ib_qp_init_attr *qp_init_attr)
> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> index bb864f5..0b1633b 100644
> --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> @@ -27,6 +27,8 @@ int usnic_ib_query_device(struct ib_device *ibdev,
> struct ib_device_attr *props);
> int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
> struct ib_port_attr *props);
> +enum rdma_transport_type
> +usnic_ib_query_transport(struct ib_device *device, u8 port_num);
> int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
> int qp_attr_mask,
> struct ib_qp_init_attr *qp_init_attr);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 65994a1..d54f91e 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -75,10 +75,13 @@ enum rdma_node_type {
> };
>
> enum rdma_transport_type {
> + /* legacy for users */
> RDMA_TRANSPORT_IB,
> RDMA_TRANSPORT_IWARP,
> RDMA_TRANSPORT_USNIC,
> - RDMA_TRANSPORT_USNIC_UDP
> + RDMA_TRANSPORT_USNIC_UDP,
> + /* new transport */
> + RDMA_TRANSPORT_IBOE,
> };
>
> __attribute_const__ enum rdma_transport_type
> @@ -1501,6 +1504,8 @@ struct ib_device {
> int (*query_port)(struct ib_device *device,
> u8 port_num,
> struct ib_port_attr *port_attr);
> + enum rdma_transport_type (*query_transport)(struct ib_device *device,
> + u8 port_num);
> enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
> u8 port_num);
> int (*query_gid)(struct ib_device *device,
> --
> 2.1.0
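
For readers following the verbs.c hunk above, the fallback mapping from transport to link layer can be sketched in plain userspace C. This is only an illustration: the enum and function names below (transport, link_layer, link_layer_from_transport) are local stand-ins invented for the sketch, not the kernel definitions, and the sketch returns "unspecified" where the patch calls BUG().

```c
#include <assert.h>

/* Local stand-ins for the kernel enums; names are hypothetical. */
enum transport { T_IB, T_IWARP, T_USNIC, T_USNIC_UDP, T_IBOE };
enum link_layer { LL_UNSPECIFIED, LL_INFINIBAND, LL_ETHERNET };

/*
 * Fallback used when a driver supplies no get_link_layer() callback:
 * only native IB maps to the InfiniBand link layer, every other known
 * transport (including the new IBoE value) runs over Ethernet.
 */
static enum link_layer link_layer_from_transport(enum transport t)
{
	switch (t) {
	case T_IB:
		return LL_INFINIBAND;
	case T_IBOE:
	case T_IWARP:
	case T_USNIC:
	case T_USNIC_UDP:
		return LL_ETHERNET;
	}
	/* Unknown value: the patch treats this as a bug. */
	return LL_UNSPECIFIED;
}

/* Small self-check of the mapping listed in the cover letter. */
static void link_layer_self_check(void)
{
	assert(link_layer_from_transport(T_IB) == LL_INFINIBAND);
	assert(link_layer_from_transport(T_IBOE) == LL_ETHERNET);
}
```

Note that IBoE lands in the Ethernet group, which matches the cover letter's mapping list for ocrdma and for mlx4 ports in Ethernet mode.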

2015-04-22 00:05:20

by Ira Weiny

Subject: Re: [PATCH v5 02/27] IB/Verbs: Implement raw management helpers

On Mon, Apr 20, 2015 at 10:32:32AM +0200, Michael Wang wrote:
>
> Add raw helpers:
> rdma_tech_ib
> rdma_tech_iboe
> rdma_tech_iwarp
> rdma_ib_or_iboe (transition, clean up later)
> These help us detect which technology a port supports.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> include/rdma/ib_verbs.h | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index d54f91e..a12e876 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1748,6 +1748,31 @@ int ib_query_port(struct ib_device *device,
> enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device,
> u8 port_num);
>
> +static inline int rdma_tech_ib(struct ib_device *device, u8 port_num)
> +{
> + return device->query_transport(device, port_num)
> + == RDMA_TRANSPORT_IB;
> +}
> +
> +static inline int rdma_tech_iboe(struct ib_device *device, u8 port_num)
> +{
> + return device->query_transport(device, port_num)
> + == RDMA_TRANSPORT_IBOE;
> +}
> +
> +static inline int rdma_tech_iwarp(struct ib_device *device, u8 port_num)
> +{
> + return device->query_transport(device, port_num)
> + == RDMA_TRANSPORT_IWARP;
> +}
> +
> +static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
> +{
> + enum rdma_transport_type tp = device->query_transport(device, port_num);
> +
> + return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> --
> 2.1.0
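
As a rough illustration of how the raw helpers sit on top of the per-port query_transport() callback, here is a userspace sketch. Everything in it is an assumption made for the example: struct mock_device, mock_query_transport() and the tech_*() names are hypothetical stand-ins, and the dual-port layout (port 1 IB, port 2 IBoE) is made up to resemble what an mlx4-style device could report.

```c
#include <assert.h>

/* Local stand-in for enum rdma_transport_type; hypothetical names. */
enum transport { T_IB, T_IWARP, T_USNIC, T_USNIC_UDP, T_IBOE };

/* Minimal mock of a device exposing a per-port transport callback. */
struct mock_device {
	enum transport (*query_transport)(struct mock_device *dev,
					  unsigned char port_num);
};

/* Hypothetical dual-port device: port 1 is IB, any other port is IBoE. */
static enum transport mock_query_transport(struct mock_device *dev,
					    unsigned char port_num)
{
	(void)dev;
	return port_num == 1 ? T_IB : T_IBOE;
}

/* Each helper is a thin comparison against the callback's answer. */
static int tech_ib(struct mock_device *dev, unsigned char port_num)
{
	return dev->query_transport(dev, port_num) == T_IB;
}

static int tech_iboe(struct mock_device *dev, unsigned char port_num)
{
	return dev->query_transport(dev, port_num) == T_IBOE;
}

/* Transitional helper: query once, then test both values. */
static int ib_or_iboe(struct mock_device *dev, unsigned char port_num)
{
	enum transport tp = dev->query_transport(dev, port_num);

	return tp == T_IB || tp == T_IBOE;
}

static void helpers_self_check(void)
{
	struct mock_device dev = { mock_query_transport };

	assert(tech_ib(&dev, 1) && !tech_iboe(&dev, 1));
	assert(ib_or_iboe(&dev, 1) && ib_or_iboe(&dev, 2));
}
```

The point of the sketch is that the answer is per port, so callers pass the port number rather than deriving one answer from the device's node type.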

2015-04-22 00:05:31

by Ira Weiny

Subject: Re: [PATCH v5 03/27] IB/Verbs: Reform IB-core mad/agent/user_mad

On Mon, Apr 20, 2015 at 10:33:11AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-core mad/agent/user_mad.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/agent.c | 2 +-
> drivers/infiniband/core/mad.c | 43 +++++++++++++++++++-------------------
> drivers/infiniband/core/user_mad.c | 26 ++++++++++++++++-------
> 3 files changed, 41 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
> index f6d2961..ffdef4d 100644
> --- a/drivers/infiniband/core/agent.c
> +++ b/drivers/infiniband/core/agent.c
> @@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
> goto error1;
> }
>
> - if (rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND) {
> + if (rdma_tech_ib(device, port_num)) {
> /* Obtain send only MAD agent for SMI QP */
> port_priv->agent[0] = ib_register_mad_agent(device, port_num,
> IB_QPT_SMI, NULL, 0,
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 74c30f4..1822932 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
> init_mad_qp(port_priv, &port_priv->qp_info[1]);
>
> cq_size = mad_sendq_size + mad_recvq_size;
> - has_smi = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND;
> + has_smi = rdma_tech_ib(device, port_num);
> if (has_smi)
> cq_size *= 2;
>
> @@ -3057,9 +3057,6 @@ static void ib_mad_init_device(struct ib_device *device)
> {
> int start, end, i;
>
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> -
> if (device->node_type == RDMA_NODE_IB_SWITCH) {
> start = 0;
> end = 0;
> @@ -3069,6 +3066,9 @@ static void ib_mad_init_device(struct ib_device *device)
> }
>
> for (i = start; i <= end; i++) {
> + if (!rdma_ib_or_iboe(device, i))
> + continue;
> +
> if (ib_mad_port_open(device, i)) {
> dev_err(&device->dev, "Couldn't open port %d\n", i);
> goto error;
> @@ -3086,40 +3086,39 @@ error_agent:
> dev_err(&device->dev, "Couldn't close port %d\n", i);
>
> error:
> - i--;
> + while (--i >= start) {
> + if (!rdma_ib_or_iboe(device, i))
> + continue;
>
> - while (i >= start) {
> if (ib_agent_port_close(device, i))
> dev_err(&device->dev,
> "Couldn't close port %d for agents\n", i);
> if (ib_mad_port_close(device, i))
> dev_err(&device->dev, "Couldn't close port %d\n", i);
> - i--;
> }
> }
>
> static void ib_mad_remove_device(struct ib_device *device)
> {
> - int i, num_ports, cur_port;
> -
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int start, end, i;
>
> if (device->node_type == RDMA_NODE_IB_SWITCH) {
> - num_ports = 1;
> - cur_port = 0;
> + start = 0;
> + end = 0;
> } else {
> - num_ports = device->phys_port_cnt;
> - cur_port = 1;
> + start = 1;
> + end = device->phys_port_cnt;
> }
> - for (i = 0; i < num_ports; i++, cur_port++) {
> - if (ib_agent_port_close(device, cur_port))
> +
> + for (i = start; i <= end; i++) {
> + if (!rdma_ib_or_iboe(device, i))
> + continue;
> +
> + if (ib_agent_port_close(device, i))
> dev_err(&device->dev,
> - "Couldn't close port %d for agents\n",
> - cur_port);
> - if (ib_mad_port_close(device, cur_port))
> - dev_err(&device->dev, "Couldn't close port %d\n",
> - cur_port);
> + "Couldn't close port %d for agents\n", i);
> + if (ib_mad_port_close(device, i))
> + dev_err(&device->dev, "Couldn't close port %d\n", i);
> }
> }
>
> diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
> index 928cdd2..aa8b334 100644
> --- a/drivers/infiniband/core/user_mad.c
> +++ b/drivers/infiniband/core/user_mad.c
> @@ -1273,9 +1273,7 @@ static void ib_umad_add_one(struct ib_device *device)
> {
> struct ib_umad_device *umad_dev;
> int s, e, i;
> -
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;
>
> if (device->node_type == RDMA_NODE_IB_SWITCH)
> s = e = 0;
> @@ -1296,21 +1294,33 @@ static void ib_umad_add_one(struct ib_device *device)
> umad_dev->end_port = e;
>
> for (i = s; i <= e; ++i) {
> + if (!rdma_ib_or_iboe(device, i))
> + continue;
> +
> umad_dev->port[i - s].umad_dev = umad_dev;
>
> if (ib_umad_init_port(device, i, umad_dev,
> &umad_dev->port[i - s]))
> goto err;
> +
> + count++;
> }
>
> + if (!count)
> + goto free;
> +
> ib_set_client_data(device, &umad_client, umad_dev);
>
> return;
>
> err:
> - while (--i >= s)
> - ib_umad_kill_port(&umad_dev->port[i - s]);
> + while (--i >= s) {
> + if (!rdma_ib_or_iboe(device, i))
> + continue;
>
> + ib_umad_kill_port(&umad_dev->port[i - s]);
> + }
> +free:
> kobject_put(&umad_dev->kobj);
> }
>
> @@ -1322,8 +1332,10 @@ static void ib_umad_remove_one(struct ib_device *device)
> if (!umad_dev)
> return;
>
> - for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i)
> - ib_umad_kill_port(&umad_dev->port[i]);
> + for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
> + if (rdma_ib_or_iboe(device, i + umad_dev->start_port))
> + ib_umad_kill_port(&umad_dev->port[i]);
> + }
>
> kobject_put(&umad_dev->kobj);
> }
> --
> 2.1.0

2015-04-22 00:05:45

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 04/27] IB/Verbs: Reform IB-core cm

On Mon, Apr 20, 2015 at 10:33:45AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-core cm.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cm.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index e28a494..3c10b75 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -3761,9 +3761,7 @@ static void cm_add_one(struct ib_device *ib_device)
> unsigned long flags;
> int ret;
> u8 i;
> -
> - if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;
>
> cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
> ib_device->phys_port_cnt, GFP_KERNEL);
> @@ -3783,6 +3781,9 @@ static void cm_add_one(struct ib_device *ib_device)
>
> set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
> for (i = 1; i <= ib_device->phys_port_cnt; i++) {
> + if (!rdma_ib_or_iboe(ib_device, i))
> + continue;
> +
> port = kzalloc(sizeof *port, GFP_KERNEL);
> if (!port)
> goto error1;
> @@ -3809,7 +3810,13 @@ static void cm_add_one(struct ib_device *ib_device)
> ret = ib_modify_port(ib_device, i, 0, &port_modify);
> if (ret)
> goto error3;
> +
> + count++;
> }
> +
> + if (!count)
> + goto free;
> +
> ib_set_client_data(ib_device, &cm_client, cm_dev);
>
> write_lock_irqsave(&cm.device_lock, flags);
> @@ -3825,11 +3832,15 @@ error1:
> port_modify.set_port_cap_mask = 0;
> port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
> while (--i) {
> + if (!rdma_ib_or_iboe(ib_device, i))
> + continue;
> +
> port = cm_dev->port[i-1];
> ib_modify_port(ib_device, port->port_num, 0, &port_modify);
> ib_unregister_mad_agent(port->mad_agent);
> cm_remove_port_fs(port);
> }
> +free:
> device_unregister(cm_dev->device);
> kfree(cm_dev);
> }
> @@ -3853,6 +3864,9 @@ static void cm_remove_one(struct ib_device *ib_device)
> write_unlock_irqrestore(&cm.device_lock, flags);
>
> for (i = 1; i <= ib_device->phys_port_cnt; i++) {
> + if (!rdma_ib_or_iboe(ib_device, i))
> + continue;
> +
> port = cm_dev->port[i-1];
> ib_modify_port(ib_device, port->port_num, 0, &port_modify);
> ib_unregister_mad_agent(port->mad_agent);
> --
> 2.1.0

2015-04-22 00:06:14

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 05/27] IB/Verbs: Reform IB-core sa_query

On Mon, Apr 20, 2015 at 10:34:23AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-core sa_query.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/sa_query.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
> index c38f030..60dc7aa 100644
> --- a/drivers/infiniband/core/sa_query.c
> +++ b/drivers/infiniband/core/sa_query.c
> @@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
> struct ib_sa_port *port =
> &sa_dev->port[event->element.port_num - sa_dev->start_port];
>
> - if (rdma_port_get_link_layer(handler->device, port->port_num) != IB_LINK_LAYER_INFINIBAND)
> + if (WARN_ON(!rdma_tech_ib(handler->device, port->port_num)))
> return;
>
> spin_lock_irqsave(&port->ah_lock, flags);
> @@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
> ah_attr->port_num = port_num;
> ah_attr->static_rate = rec->rate;
>
> - force_grh = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_ETHERNET;
> + force_grh = rdma_tech_iboe(device, port_num);
>
> if (rec->hop_limit > 1 || force_grh) {
> ah_attr->ah_flags = IB_AH_GRH;
> @@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
> {
> struct ib_sa_device *sa_dev;
> int s, e, i;
> -
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;
>
> if (device->node_type == RDMA_NODE_IB_SWITCH)
> s = e = 0;
> @@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
>
> for (i = 0; i <= e - s; ++i) {
> spin_lock_init(&sa_dev->port[i].ah_lock);
> - if (rdma_port_get_link_layer(device, i + 1) != IB_LINK_LAYER_INFINIBAND)
> + if (!rdma_tech_ib(device, i + 1))
> continue;
>
> sa_dev->port[i].sm_ah = NULL;
> @@ -1189,8 +1187,13 @@ static void ib_sa_add_one(struct ib_device *device)
> goto err;
>
> INIT_WORK(&sa_dev->port[i].update_task, update_sm_ah);
> +
> + count++;
> }
>
> + if (!count)
> + goto free;
> +
> ib_set_client_data(device, &sa_client, sa_dev);
>
> /*
> @@ -1204,19 +1207,21 @@ static void ib_sa_add_one(struct ib_device *device)
> if (ib_register_event_handler(&sa_dev->event_handler))
> goto err;
>
> - for (i = 0; i <= e - s; ++i)
> - if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
> + for (i = 0; i <= e - s; ++i) {
> + if (rdma_tech_ib(device, i + 1))
> update_sm_ah(&sa_dev->port[i].update_task);
> + }
>
> return;
>
> err:
> - while (--i >= 0)
> - if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
> + while (--i >= 0) {
> + if (rdma_tech_ib(device, i + 1))
> ib_unregister_mad_agent(sa_dev->port[i].agent);
> + }
>
> +free:
> kfree(sa_dev);
> -
> return;
> }
>
> @@ -1233,7 +1238,7 @@ static void ib_sa_remove_one(struct ib_device *device)
> flush_workqueue(ib_wq);
>
> for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
> - if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND) {
> + if (rdma_tech_ib(device, i + 1)) {
> ib_unregister_mad_agent(sa_dev->port[i].agent);
> if (sa_dev->port[i].sm_ah)
> kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
> --
> 2.1.0

2015-04-22 00:06:21

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 06/27] IB/Verbs: Reform IB-core multicast

On Mon, Apr 20, 2015 at 10:34:48AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-core multicast.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/multicast.c | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
> index fa17b55..24d93f5 100644
> --- a/drivers/infiniband/core/multicast.c
> +++ b/drivers/infiniband/core/multicast.c
> @@ -780,8 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
> int index;
>
> dev = container_of(handler, struct mcast_device, event_handler);
> - if (rdma_port_get_link_layer(dev->device, event->element.port_num) !=
> - IB_LINK_LAYER_INFINIBAND)
> + if (WARN_ON(!rdma_tech_ib(dev->device, event->element.port_num)))
> return;
>
> index = event->element.port_num - dev->start_port;
> @@ -808,9 +807,6 @@ static void mcast_add_one(struct ib_device *device)
> int i;
> int count = 0;
>
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> -
> dev = kmalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
> GFP_KERNEL);
> if (!dev)
> @@ -824,8 +820,7 @@ static void mcast_add_one(struct ib_device *device)
> }
>
> for (i = 0; i <= dev->end_port - dev->start_port; i++) {
> - if (rdma_port_get_link_layer(device, dev->start_port + i) !=
> - IB_LINK_LAYER_INFINIBAND)
> + if (!rdma_tech_ib(device, dev->start_port + i))
> continue;
> port = &dev->port[i];
> port->dev = dev;
> @@ -863,8 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
> flush_workqueue(mcast_wq);
>
> for (i = 0; i <= dev->end_port - dev->start_port; i++) {
> - if (rdma_port_get_link_layer(device, dev->start_port + i) ==
> - IB_LINK_LAYER_INFINIBAND) {
> + if (rdma_tech_ib(device, dev->start_port + i)) {
> port = &dev->port[i];
> deref_port(port);
> wait_for_completion(&port->comp);
> --
> 2.1.0

2015-04-22 00:06:40

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 07/27] IB/Verbs: Reform IB-ulp ipoib

On Mon, Apr 20, 2015 at 10:35:15AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-ulp ipoib.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> index 58b5aa3..60b379d 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -1654,9 +1654,7 @@ static void ipoib_add_one(struct ib_device *device)
> struct net_device *dev;
> struct ipoib_dev_priv *priv;
> int s, e, p;
> -
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;
>
> dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
> if (!dev_list)
> @@ -1673,15 +1671,21 @@ static void ipoib_add_one(struct ib_device *device)
> }
>
> for (p = s; p <= e; ++p) {
> - if (rdma_port_get_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
> + if (!rdma_tech_ib(device, p))
> continue;
> dev = ipoib_add_port("ib%d", device, p);
> if (!IS_ERR(dev)) {
> priv = netdev_priv(dev);
> list_add_tail(&priv->list, dev_list);
> + count++;
> }
> }
>
> + if (!count) {
> + kfree(dev_list);
> + return;
> + }
> +
> ib_set_client_data(device, &ipoib_client, dev_list);
> }
>
> @@ -1690,9 +1694,6 @@ static void ipoib_remove_one(struct ib_device *device)
> struct ipoib_dev_priv *priv, *tmp;
> struct list_head *dev_list;
>
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> -
> dev_list = ib_get_client_data(device, &ipoib_client);
> if (!dev_list)
> return;
> --
> 2.1.0
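
The ipoib hunk above drops the device-wide transport test and instead counts the ports that pass the per-port check, freeing the list when none qualify. A minimal user-space sketch of that pattern follows. Every `mock_*` name and the `MOCK_*` constants are hypothetical stand-ins invented for illustration, not kernel APIs, and `malloc`/`free` take the place of `kmalloc`/`kfree`.

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_MAX_PORTS 4

/* Hypothetical stand-ins for the kernel types; not the real verbs API. */
enum mock_transport { MOCK_TP_IB, MOCK_TP_IBOE, MOCK_TP_IWARP };

struct mock_device {
	int port_cnt;                                /* ports are numbered 1..port_cnt */
	enum mock_transport port_tp[MOCK_MAX_PORTS]; /* port_tp[0] describes port 1 */
};

/* Per-port check, playing the role rdma_tech_ib() has in the patch. */
static int mock_tech_ib(const struct mock_device *dev, int port)
{
	assert(port >= 1 && port <= dev->port_cnt);
	return dev->port_tp[port - 1] == MOCK_TP_IB;
}

/*
 * Collect the ports that qualify. Returns a malloc'ed array holding *count
 * port numbers, or NULL with *count == 0 when no port qualifies, so that
 * nothing stays allocated for a device without a usable port.
 */
static int *mock_add_one(const struct mock_device *dev, int *count)
{
	int *ports;
	int p;

	assert(dev->port_cnt <= MOCK_MAX_PORTS);
	*count = 0;

	ports = malloc(sizeof(*ports) * MOCK_MAX_PORTS);
	if (!ports)
		return NULL;

	for (p = 1; p <= dev->port_cnt; ++p) {
		if (!mock_tech_ib(dev, p))
			continue;
		ports[(*count)++] = p;
	}

	if (!*count) {
		free(ports);
		return NULL;
	}
	return ports;
}
```

With a two-port device whose first port is IB and second is IBoE, the sketch claims only port 1. With no IB port at all it returns NULL, which mirrors the early `kfree(dev_list); return;` path in the hunk.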

2015-04-22 00:06:55

by Ira Weiny

Subject: Re: [PATCH v5 08/27] IB/Verbs: Reform IB-ulp xprtrdma

On Mon, Apr 20, 2015 at 10:35:47AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-ulp xprtrdma.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 +--
> net/sunrpc/xprtrdma/svc_rdma_transport.c | 45 +++++++++++++-------------------
> 2 files changed, 19 insertions(+), 29 deletions(-)
>
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index f9f13a3..a5bed5b 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -117,8 +117,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
>
> static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
> {
> - if (rdma_node_get_transport(xprt->sc_cm_id->device->node_type) ==
> - RDMA_TRANSPORT_IWARP)
> + if (rdma_tech_iwarp(xprt->sc_cm_id->device, xprt->sc_cm_id->port_num))
> return 1;
> else
> return min_t(int, sge_count, xprt->sc_max_sge);
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> index f609c1c..a09b7a1 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> @@ -851,7 +851,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
> struct ib_qp_init_attr qp_attr;
> struct ib_device_attr devattr;
> int uninitialized_var(dma_mr_acc);
> - int need_dma_mr;
> + int need_dma_mr = 0;
> int ret;
> int i;
>
> @@ -985,35 +985,26 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
> /*
> * Determine if a DMA MR is required and if so, what privs are required
> */
> - switch (rdma_node_get_transport(newxprt->sc_cm_id->device->node_type)) {
> - case RDMA_TRANSPORT_IWARP:
> - newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
> - if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
> - need_dma_mr = 1;
> - dma_mr_acc =
> - (IB_ACCESS_LOCAL_WRITE |
> - IB_ACCESS_REMOTE_WRITE);
> - } else if (!(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
> - need_dma_mr = 1;
> - dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
> - } else
> - need_dma_mr = 0;
> - break;
> - case RDMA_TRANSPORT_IB:
> - if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
> - need_dma_mr = 1;
> - dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
> - } else if (!(devattr.device_cap_flags &
> - IB_DEVICE_LOCAL_DMA_LKEY)) {
> - need_dma_mr = 1;
> - dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
> - } else
> - need_dma_mr = 0;
> - break;
> - default:
> + if (!rdma_tech_iwarp(newxprt->sc_cm_id->device,
> + newxprt->sc_cm_id->port_num) &&
> + !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
> + newxprt->sc_cm_id->port_num))
> goto errout;
> +
> + if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
> + !(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
> + need_dma_mr = 1;
> + dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
> + if (rdma_tech_iwarp(newxprt->sc_cm_id->device,
> + newxprt->sc_cm_id->port_num) &&
> + !(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG))
> + dma_mr_acc |= IB_ACCESS_REMOTE_WRITE;
> }
>
> + if (rdma_tech_iwarp(newxprt->sc_cm_id->device,
> + newxprt->sc_cm_id->port_num))
> + newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
> +
> /* Create the DMA MR if needed, otherwise, use the DMA LKEY */
> if (need_dma_mr) {
> /* Register all of physical memory */
> --
> 2.1.0
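
The svc_rdma_accept() hunk above folds a two-armed switch into a single condition plus an iWARP-only adjustment. One way to convince oneself that the two shapes agree is to model both as pure functions and compare them over every input combination. The sketch below does that under stated assumptions: the function names, the struct, and the `MOCK_ACC_*` flag values are made up for illustration and are not the kernel's `IB_ACCESS_*` constants. Only the iWARP and IB/IBoE cases are modelled, since any other transport jumps to `errout` in both versions.

```c
#include <assert.h>

/* Hypothetical flag values; the kernel's IB_ACCESS_* constants differ. */
enum { MOCK_ACC_LOCAL_WRITE = 1 << 0, MOCK_ACC_REMOTE_WRITE = 1 << 1 };

struct mr_decision {
	int need_dma_mr;
	int dma_mr_acc;
};

/* Shape of the removed switch: one arm per transport. */
static struct mr_decision mr_old(int is_iwarp, int fast_reg, int local_dma_lkey)
{
	struct mr_decision d = { 0, 0 };

	if (is_iwarp) {
		if (!fast_reg) {
			d.need_dma_mr = 1;
			d.dma_mr_acc = MOCK_ACC_LOCAL_WRITE | MOCK_ACC_REMOTE_WRITE;
		} else if (!local_dma_lkey) {
			d.need_dma_mr = 1;
			d.dma_mr_acc = MOCK_ACC_LOCAL_WRITE;
		}
	} else {
		if (!fast_reg || !local_dma_lkey) {
			d.need_dma_mr = 1;
			d.dma_mr_acc = MOCK_ACC_LOCAL_WRITE;
		}
	}
	return d;
}

/* Shape of the added code: shared condition, then the iWARP-only extra bit. */
static struct mr_decision mr_new(int is_iwarp, int fast_reg, int local_dma_lkey)
{
	struct mr_decision d = { 0, 0 };

	if (!fast_reg || !local_dma_lkey) {
		d.need_dma_mr = 1;
		d.dma_mr_acc = MOCK_ACC_LOCAL_WRITE;
		if (is_iwarp && !fast_reg)
			d.dma_mr_acc |= MOCK_ACC_REMOTE_WRITE;
	}
	return d;
}

/* Count the input combinations on which the two shapes disagree. */
static int mr_mismatches(void)
{
	int n = 0, w, f, l;

	for (w = 0; w <= 1; ++w)
		for (f = 0; f <= 1; ++f)
			for (l = 0; l <= 1; ++l) {
				struct mr_decision a = mr_old(w, f, l);
				struct mr_decision b = mr_new(w, f, l);

				if (a.need_dma_mr != b.need_dma_mr ||
				    a.dma_mr_acc != b.dma_mr_acc)
					n++;
			}
	return n;
}
```

In this model the two versions agree on all eight combinations, which is the property a reviewer would want from a pure refactoring.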

2015-04-22 00:07:15

by Ira Weiny

Subject: Re: [PATCH v5 10/27] IB/Verbs: Reform cm related part in IB-core cma/ucm

On Mon, Apr 20, 2015 at 10:36:36AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform cm related part in IB-core cma/ucm.
>
> A few checks depend on the device's cm type rather than on a port's
> capability. Passing port 1 directly works for now, but it cannot support
> devices that mix cm types in the future.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 81 +++++++++++++------------------------------
> drivers/infiniband/core/ucm.c | 3 +-
> 2 files changed, 26 insertions(+), 58 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index d570030..815e41b 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -735,8 +735,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
> int ret = 0;
>
> id_priv = container_of(id, struct rdma_id_private, id);
> - switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> + if (rdma_ib_or_iboe(id->device, id->port_num)) {
> if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
> ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
> else
> @@ -745,19 +744,15 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
>
> if (qp_attr->qp_state == IB_QPS_RTR)
> qp_attr->rq_psn = id_priv->seq_num;
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> if (!id_priv->cm_id.iw) {
> qp_attr->qp_access_flags = 0;
> *qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
> } else
> ret = iw_cm_init_qp_attr(id_priv->cm_id.iw, qp_attr,
> qp_attr_mask);
> - break;
> - default:
> + } else
> ret = -ENOSYS;
> - break;
> - }
>
> return ret;
> }
> @@ -1037,17 +1032,12 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> mutex_unlock(&id_priv->handler_mutex);
>
> if (id_priv->cma_dev) {
> - switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> + if (rdma_ib_or_iboe(id_priv->id.device, 1)) {
> if (id_priv->cm_id.ib)
> ib_destroy_cm_id(id_priv->cm_id.ib);
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + } else if (rdma_tech_iwarp(id_priv->id.device, 1)) {
> if (id_priv->cm_id.iw)
> iw_destroy_cm_id(id_priv->cm_id.iw);
> - break;
> - default:
> - break;
> }
> cma_leave_mc_groups(id_priv);
> cma_release_dev(id_priv);
> @@ -1626,7 +1616,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv,
> int ret;
>
> if (cma_family(id_priv) == AF_IB &&
> - rdma_node_get_transport(cma_dev->device->node_type) != RDMA_TRANSPORT_IB)
> + !rdma_ib_or_iboe(cma_dev->device, 1))
> return;
>
> id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
> @@ -2028,7 +2018,7 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
> mutex_lock(&lock);
> list_for_each_entry(cur_dev, &dev_list, list) {
> if (cma_family(id_priv) == AF_IB &&
> - rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
> + !rdma_ib_or_iboe(cur_dev->device, 1))
> continue;
>
> if (!cma_dev)
> @@ -2060,7 +2050,7 @@ port_found:
> goto out;
>
> id_priv->id.route.addr.dev_addr.dev_type =
> - (rdma_port_get_link_layer(cma_dev->device, p) == IB_LINK_LAYER_INFINIBAND) ?
> + (rdma_tech_ib(cma_dev->device, p)) ?
> ARPHRD_INFINIBAND : ARPHRD_ETHER;
>
> rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
> @@ -2537,18 +2527,15 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
>
> id_priv->backlog = backlog;
> if (id->device) {
> - switch (rdma_node_get_transport(id->device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> + if (rdma_ib_or_iboe(id->device, 1)) {
> ret = cma_ib_listen(id_priv);
> if (ret)
> goto err;
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + } else if (rdma_tech_iwarp(id->device, 1)) {
> ret = cma_iw_listen(id_priv, backlog);
> if (ret)
> goto err;
> - break;
> - default:
> + } else {
> ret = -ENOSYS;
> goto err;
> }
> @@ -2884,20 +2871,15 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> id_priv->srq = conn_param->srq;
> }
>
> - switch (rdma_node_get_transport(id->device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> + if (rdma_ib_or_iboe(id->device, id->port_num)) {
> if (id->qp_type == IB_QPT_UD)
> ret = cma_resolve_ib_udp(id_priv, conn_param);
> else
> ret = cma_connect_ib(id_priv, conn_param);
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + } else if (rdma_tech_iwarp(id->device, id->port_num))
> ret = cma_connect_iw(id_priv, conn_param);
> - break;
> - default:
> + else
> ret = -ENOSYS;
> - break;
> - }
> if (ret)
> goto err;
>
> @@ -3000,8 +2982,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> id_priv->srq = conn_param->srq;
> }
>
> - switch (rdma_node_get_transport(id->device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> + if (rdma_ib_or_iboe(id->device, id->port_num)) {
> if (id->qp_type == IB_QPT_UD) {
> if (conn_param)
> ret = cma_send_sidr_rep(id_priv, IB_SIDR_SUCCESS,
> @@ -3017,14 +2998,10 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> else
> ret = cma_rep_recv(id_priv);
> }
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + } else if (rdma_tech_iwarp(id->device, id->port_num))
> ret = cma_accept_iw(id_priv, conn_param);
> - break;
> - default:
> + else
> ret = -ENOSYS;
> - break;
> - }
>
> if (ret)
> goto reject;
> @@ -3068,8 +3045,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
> if (!id_priv->cm_id.ib)
> return -EINVAL;
>
> - switch (rdma_node_get_transport(id->device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> + if (rdma_ib_or_iboe(id->device, id->port_num)) {
> if (id->qp_type == IB_QPT_UD)
> ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
> private_data, private_data_len);
> @@ -3077,15 +3053,12 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
> ret = ib_send_cm_rej(id_priv->cm_id.ib,
> IB_CM_REJ_CONSUMER_DEFINED, NULL,
> 0, private_data, private_data_len);
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> ret = iw_cm_reject(id_priv->cm_id.iw,
> private_data, private_data_len);
> - break;
> - default:
> + } else
> ret = -ENOSYS;
> - break;
> - }
> +
> return ret;
> }
> EXPORT_SYMBOL(rdma_reject);
> @@ -3099,22 +3072,18 @@ int rdma_disconnect(struct rdma_cm_id *id)
> if (!id_priv->cm_id.ib)
> return -EINVAL;
>
> - switch (rdma_node_get_transport(id->device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> + if (rdma_ib_or_iboe(id->device, id->port_num)) {
> ret = cma_modify_qp_err(id_priv);
> if (ret)
> goto out;
> /* Initiate or respond to a disconnect. */
> if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
> ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
> - break;
> - default:
> + } else
> ret = -EINVAL;
> - break;
> - }
> +
> out:
> return ret;
> }
> diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
> index f2f6393..70e0ccb 100644
> --- a/drivers/infiniband/core/ucm.c
> +++ b/drivers/infiniband/core/ucm.c
> @@ -1253,8 +1253,7 @@ static void ib_ucm_add_one(struct ib_device *device)
> dev_t base;
> struct ib_ucm_device *ucm_dev;
>
> - if (!device->alloc_ucontext ||
> - rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> + if (!device->alloc_ucontext || !rdma_ib_or_iboe(device, 1))
> return;
>
> ucm_dev = kzalloc(sizeof *ucm_dev, GFP_KERNEL);
> --
> 2.1.0

2015-04-22 00:07:30

by Ira Weiny

Subject: Re: [PATCH v5 11/27] IB/Verbs: Reform route related part in IB-core cma

On Mon, Apr 20, 2015 at 10:37:13AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform route related part in IB-core cma.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 31 ++++++++-----------------------
> drivers/infiniband/core/ucma.c | 25 ++++++-------------------
> 2 files changed, 14 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 815e41b..fa69f34 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -923,13 +923,9 @@ static inline int cma_user_data_offset(struct rdma_id_private *id_priv)
>
> static void cma_cancel_route(struct rdma_id_private *id_priv)
> {
> - switch (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)) {
> - case IB_LINK_LAYER_INFINIBAND:
> + if (rdma_tech_ib(id_priv->id.device, id_priv->id.port_num)) {
> if (id_priv->query)
> ib_sa_cancel_query(id_priv->query_id, id_priv->query);
> - break;
> - default:
> - break;
> }
> }
>
> @@ -1957,26 +1953,15 @@ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms)
> return -EINVAL;
>
> atomic_inc(&id_priv->refcount);
> - switch (rdma_node_get_transport(id->device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> - switch (rdma_port_get_link_layer(id->device, id->port_num)) {
> - case IB_LINK_LAYER_INFINIBAND:
> - ret = cma_resolve_ib_route(id_priv, timeout_ms);
> - break;
> - case IB_LINK_LAYER_ETHERNET:
> - ret = cma_resolve_iboe_route(id_priv);
> - break;
> - default:
> - ret = -ENOSYS;
> - }
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + if (rdma_tech_ib(id->device, id->port_num))
> + ret = cma_resolve_ib_route(id_priv, timeout_ms);
> + else if (rdma_tech_iboe(id->device, id->port_num))
> + ret = cma_resolve_iboe_route(id_priv);
> + else if (rdma_tech_iwarp(id->device, id->port_num))
> ret = cma_resolve_iw_route(id_priv, timeout_ms);
> - break;
> - default:
> + else
> ret = -ENOSYS;
> - break;
> - }
> +
> if (ret)
> goto err;
>
> diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
> index 45d67e9..7331c6c 100644
> --- a/drivers/infiniband/core/ucma.c
> +++ b/drivers/infiniband/core/ucma.c
> @@ -722,26 +722,13 @@ static ssize_t ucma_query_route(struct ucma_file *file,
>
> resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
> resp.port_num = ctx->cm_id->port_num;
> - switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> - switch (rdma_port_get_link_layer(ctx->cm_id->device,
> - ctx->cm_id->port_num)) {
> - case IB_LINK_LAYER_INFINIBAND:
> - ucma_copy_ib_route(&resp, &ctx->cm_id->route);
> - break;
> - case IB_LINK_LAYER_ETHERNET:
> - ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
> - break;
> - default:
> - break;
> - }
> - break;
> - case RDMA_TRANSPORT_IWARP:
> +
> + if (rdma_tech_ib(ctx->cm_id->device, ctx->cm_id->port_num))
> + ucma_copy_ib_route(&resp, &ctx->cm_id->route);
> + else if (rdma_tech_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
> + ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
> + else if (rdma_tech_iwarp(ctx->cm_id->device, ctx->cm_id->port_num))
> ucma_copy_iw_route(&resp, &ctx->cm_id->route);
> - break;
> - default:
> - break;
> - }
>
> out:
> if (copy_to_user((void __user *)(unsigned long)cmd.response,
> --
> 2.1.0
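
The route hunks above replace a nested switch (node transport outside, link layer inside) with a flat chain of per-port checks. The sketch below models both dispatch shapes and the (old transport, link layer) to new transport mapping given in the cover letter. All enum and function names here are hypothetical, chosen for illustration only.

```c
#include <assert.h>

/* Hypothetical stand-ins, not the kernel enums. */
enum mock_old_tp { OLD_TP_IB, OLD_TP_IWARP, OLD_TP_OTHER };
enum mock_ll { LL_INFINIBAND, LL_ETHERNET, LL_UNSPECIFIED };
enum mock_new_tp { NEW_TP_IB, NEW_TP_IBOE, NEW_TP_IWARP, NEW_TP_OTHER };
enum mock_resolver { RES_IB, RES_IBOE, RES_IW, RES_NONE };

/* Old shape: switch on the node transport, then on the port link layer. */
static enum mock_resolver pick_old(enum mock_old_tp tp, enum mock_ll ll)
{
	enum mock_resolver r = RES_NONE;

	switch (tp) {
	case OLD_TP_IB:
		switch (ll) {
		case LL_INFINIBAND:
			r = RES_IB;
			break;
		case LL_ETHERNET:
			r = RES_IBOE;
			break;
		default:
			break;
		}
		break;
	case OLD_TP_IWARP:
		r = RES_IW;
		break;
	default:
		break;
	}
	return r;
}

/* Mapping from the cover letter: IB over an Ethernet link becomes IBOE. */
static enum mock_new_tp to_new(enum mock_old_tp tp, enum mock_ll ll)
{
	if (tp == OLD_TP_IB && ll == LL_INFINIBAND)
		return NEW_TP_IB;
	if (tp == OLD_TP_IB && ll == LL_ETHERNET)
		return NEW_TP_IBOE;
	if (tp == OLD_TP_IWARP)
		return NEW_TP_IWARP;
	return NEW_TP_OTHER;
}

/* New shape: a flat chain keyed on the single per-port transport value. */
static enum mock_resolver pick_new(enum mock_new_tp tp)
{
	if (tp == NEW_TP_IB)
		return RES_IB;
	else if (tp == NEW_TP_IBOE)
		return RES_IBOE;
	else if (tp == NEW_TP_IWARP)
		return RES_IW;
	else
		return RES_NONE;
}
```

In this model `pick_new(to_new(tp, ll))` matches `pick_old(tp, ll)` for every pair, including the fall-through cases that end in `-ENOSYS` in rdma_resolve_route().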

2015-04-22 00:07:44

by Ira Weiny

Subject: Re: [PATCH v5 12/27] IB/Verbs: Reform mcast related part in IB-core cma

On Mon, Apr 20, 2015 at 10:37:35AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform mcast related part in IB-core cma.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 56 ++++++++++++++-----------------------------
> 1 file changed, 18 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index fa69f34..a89c246 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -997,17 +997,12 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
> mc = container_of(id_priv->mc_list.next,
> struct cma_multicast, list);
> list_del(&mc->list);
> - switch (rdma_port_get_link_layer(id_priv->cma_dev->device, id_priv->id.port_num)) {
> - case IB_LINK_LAYER_INFINIBAND:
> + if (rdma_tech_ib(id_priv->cma_dev->device,
> + id_priv->id.port_num)) {
> ib_sa_free_multicast(mc->multicast.ib);
> kfree(mc);
> - break;
> - case IB_LINK_LAYER_ETHERNET:
> + } else
> kref_put(&mc->mcref, release_mc);
> - break;
> - default:
> - break;
> - }
> }
> }
>
> @@ -3314,24 +3309,13 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
> list_add(&mc->list, &id_priv->mc_list);
> spin_unlock(&id_priv->lock);
>
> - switch (rdma_node_get_transport(id->device->node_type)) {
> - case RDMA_TRANSPORT_IB:
> - switch (rdma_port_get_link_layer(id->device, id->port_num)) {
> - case IB_LINK_LAYER_INFINIBAND:
> - ret = cma_join_ib_multicast(id_priv, mc);
> - break;
> - case IB_LINK_LAYER_ETHERNET:
> - kref_init(&mc->mcref);
> - ret = cma_iboe_join_multicast(id_priv, mc);
> - break;
> - default:
> - ret = -EINVAL;
> - }
> - break;
> - default:
> + if (rdma_tech_iboe(id->device, id->port_num)) {
> + kref_init(&mc->mcref);
> + ret = cma_iboe_join_multicast(id_priv, mc);
> + } else if (rdma_tech_ib(id->device, id->port_num))
> + ret = cma_join_ib_multicast(id_priv, mc);
> + else
> ret = -ENOSYS;
> - break;
> - }
>
> if (ret) {
> spin_lock_irq(&id_priv->lock);
> @@ -3359,19 +3343,15 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
> ib_detach_mcast(id->qp,
> &mc->multicast.ib->rec.mgid,
> be16_to_cpu(mc->multicast.ib->rec.mlid));
> - if (rdma_node_get_transport(id_priv->cma_dev->device->node_type) == RDMA_TRANSPORT_IB) {
> - switch (rdma_port_get_link_layer(id->device, id->port_num)) {
> - case IB_LINK_LAYER_INFINIBAND:
> - ib_sa_free_multicast(mc->multicast.ib);
> - kfree(mc);
> - break;
> - case IB_LINK_LAYER_ETHERNET:
> - kref_put(&mc->mcref, release_mc);
> - break;
> - default:
> - break;
> - }
> - }
> +
> + BUG_ON(id_priv->cma_dev->device != id->device);
> +
> + if (rdma_tech_ib(id->device, id->port_num)) {
> + ib_sa_free_multicast(mc->multicast.ib);
> + kfree(mc);
> + } else if (rdma_tech_iboe(id->device, id->port_num))
> + kref_put(&mc->mcref, release_mc);
> +
> return;
> }
> }
> --
> 2.1.0

2015-04-22 00:08:16

by Ira Weiny

Subject: Re: [PATCH v5 13/27] IB/Verbs: Reserve legacy transport type in 'dev_addr'

On Mon, Apr 20, 2015 at 10:38:00AM +0200, Michael Wang wrote:
>
> Reserve the legacy transport type for the 'transport' member
> of 'struct rdma_dev_addr' until we make sure this is no
> longer needed.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index ebac646..6195bf6 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -244,14 +244,35 @@ static inline void cma_set_ip_ver(struct cma_hdr *hdr, u8 ip_ver)
> hdr->ip_version = (ip_ver << 4) | (hdr->ip_version & 0xF);
> }
>
> +static inline void cma_set_legacy_transport(struct rdma_cm_id *id)
> +{
> + switch (id->device->node_type) {
> + case RDMA_NODE_IB_CA:
> + case RDMA_NODE_IB_SWITCH:
> + case RDMA_NODE_IB_ROUTER:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IB;
> + break;
> + case RDMA_NODE_RNIC:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IWARP;
> + break;
> + case RDMA_NODE_USNIC:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_USNIC;
> + break;
> + case RDMA_NODE_USNIC_UDP:
> + id->route.addr.dev_addr.transport = RDMA_TRANSPORT_USNIC_UDP;
> + break;
> + default:
> + BUG();
> + }
> +}
> +
> static void cma_attach_to_dev(struct rdma_id_private *id_priv,
> struct cma_device *cma_dev)
> {
> atomic_inc(&cma_dev->refcount);
> id_priv->cma_dev = cma_dev;
> id_priv->id.device = cma_dev->device;
> - id_priv->id.route.addr.dev_addr.transport =
> - rdma_node_get_transport(cma_dev->device->node_type);
> + cma_set_legacy_transport(&id_priv->id);
> list_add_tail(&id_priv->list, &cma_dev->id_list);
> }
>
> --
> 2.1.0
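
The cma_set_legacy_transport() helper above keeps the old node-type-to-transport mapping alive for the `transport` member of `struct rdma_dev_addr`. A self-contained model of that mapping is sketched below. The `mock_*` enums and function are hypothetical names used only for illustration, and `assert(0)` stands in for the kernel's `BUG()`.

```c
#include <assert.h>

/* Hypothetical stand-ins for the RDMA_NODE_* and RDMA_TRANSPORT_* values. */
enum mock_node_type {
	MOCK_NODE_IB_CA = 1,
	MOCK_NODE_IB_SWITCH,
	MOCK_NODE_IB_ROUTER,
	MOCK_NODE_RNIC,
	MOCK_NODE_USNIC,
	MOCK_NODE_USNIC_UDP
};

enum mock_legacy_tp {
	MOCK_LEGACY_IB,
	MOCK_LEGACY_IWARP,
	MOCK_LEGACY_USNIC,
	MOCK_LEGACY_USNIC_UDP
};

/* Map a node type to the legacy transport, as the switch in the patch does. */
static enum mock_legacy_tp mock_legacy_transport(enum mock_node_type type)
{
	enum mock_legacy_tp tp = MOCK_LEGACY_IB;

	switch (type) {
	case MOCK_NODE_IB_CA:
	case MOCK_NODE_IB_SWITCH:
	case MOCK_NODE_IB_ROUTER:
		tp = MOCK_LEGACY_IB;
		break;
	case MOCK_NODE_RNIC:
		tp = MOCK_LEGACY_IWARP;
		break;
	case MOCK_NODE_USNIC:
		tp = MOCK_LEGACY_USNIC;
		break;
	case MOCK_NODE_USNIC_UDP:
		tp = MOCK_LEGACY_USNIC_UDP;
		break;
	default:
		assert(0 && "unknown node type"); /* stand-in for BUG() */
	}
	return tp;
}
```

Note that the legacy value depends only on the node type, so a device whose ports run over Ethernet still reports the IB legacy transport here. That is the behaviour the patch sets out to preserve.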

2015-04-22 00:08:45

by Ira Weiny

Subject: Re: [PATCH v5 14/27] IB/Verbs: Reform cma_acquire_dev()

On Mon, Apr 20, 2015 at 10:38:23AM +0200, Michael Wang wrote:
>
> Reform cma_acquire_dev() with management helpers, and introduce
> cma_validate_port() to make the code cleaner.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 68 +++++++++++++++++++++++++------------------
> 1 file changed, 40 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 6195bf6..44e7bb9 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -370,18 +370,35 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
> return ret;
> }
>
> +static inline int cma_validate_port(struct ib_device *device, u8 port,
> + union ib_gid *gid, int dev_type)
> +{
> + u8 found_port;
> + int ret = -ENODEV;
> +
> + if ((dev_type == ARPHRD_INFINIBAND) && !rdma_tech_ib(device, port))
> + return ret;
> +
> + if ((dev_type != ARPHRD_INFINIBAND) && rdma_tech_ib(device, port))
> + return ret;
> +
> + ret = ib_find_cached_gid(device, gid, &found_port, NULL);
> + if (port != found_port)
> + return -ENODEV;
> +
> + return ret;
> +}
> +
> static int cma_acquire_dev(struct rdma_id_private *id_priv,
> struct rdma_id_private *listen_id_priv)
> {
> struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
> struct cma_device *cma_dev;
> - union ib_gid gid, iboe_gid;
> + union ib_gid gid, iboe_gid, *gidp;
> int ret = -ENODEV;
> - u8 port, found_port;
> - enum rdma_link_layer dev_ll = dev_addr->dev_type == ARPHRD_INFINIBAND ?
> - IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
> + u8 port;
>
> - if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
> + if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
> id_priv->id.ps == RDMA_PS_IPOIB)
> return -EINVAL;
>
> @@ -391,41 +408,36 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,
>
> memcpy(&gid, dev_addr->src_dev_addr +
> rdma_addr_gid_offset(dev_addr), sizeof gid);
> - if (listen_id_priv &&
> - rdma_port_get_link_layer(listen_id_priv->id.device,
> - listen_id_priv->id.port_num) == dev_ll) {
> +
> + if (listen_id_priv) {
> cma_dev = listen_id_priv->cma_dev;
> port = listen_id_priv->id.port_num;
> - if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
> - rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
> - ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
> - &found_port, NULL);
> - else
> - ret = ib_find_cached_gid(cma_dev->device, &gid,
> - &found_port, NULL);
> + gidp = rdma_tech_iboe(cma_dev->device, port) ?
> + &iboe_gid : &gid;
>
> - if (!ret && (port == found_port)) {
> - id_priv->id.port_num = found_port;
> + ret = cma_validate_port(cma_dev->device, port, gidp,
> + dev_addr->dev_type);
> + if (!ret) {
> + id_priv->id.port_num = port;
> goto out;
> }
> }
> +
> list_for_each_entry(cma_dev, &dev_list, list) {
> for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port) {
> if (listen_id_priv &&
> listen_id_priv->cma_dev == cma_dev &&
> listen_id_priv->id.port_num == port)
> continue;
> - if (rdma_port_get_link_layer(cma_dev->device, port) == dev_ll) {
> - if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
> - rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
> - ret = ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
> - else
> - ret = ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
> -
> - if (!ret && (port == found_port)) {
> - id_priv->id.port_num = found_port;
> - goto out;
> - }
> +
> + gidp = rdma_tech_iboe(cma_dev->device, port) ?
> + &iboe_gid : &gid;
> +
> + ret = cma_validate_port(cma_dev->device, port, gidp,
> + dev_addr->dev_type);
> + if (!ret) {
> + id_priv->id.port_num = port;
> + goto out;
> }
> }
> }
> --
> 2.1.0
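
cma_validate_port() above bundles two checks: the port's technology must agree with the address's device type, and the GID must be found on that same port. The sketch below models this in user space. The `mock_*` names, the `MOCK_ARPHRD_*` constants, and the linear GID lookup are hypothetical illustrations, not kernel APIs. One deliberate choice of the sketch, which is an assumption rather than what the patch does, is that it returns the lookup error before comparing port numbers, so the found port is only read after a successful lookup.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins; not the real ARPHRD_* values or verbs types. */
#define MOCK_ARPHRD_ETHER      1
#define MOCK_ARPHRD_INFINIBAND 32
#define MOCK_GID_LEN           16

struct mock_port {
	int is_ib;                       /* nonzero when the port runs native IB */
	unsigned char gid[MOCK_GID_LEN]; /* single GID per port, for simplicity */
};

struct mock_device {
	int port_cnt;                    /* ports are numbered 1..port_cnt */
	struct mock_port port[2];
};

/* Linear search that plays the role of the cached GID lookup. */
static int mock_find_gid(const struct mock_device *dev,
			 const unsigned char *gid, int *found_port)
{
	int p;

	for (p = 1; p <= dev->port_cnt; ++p) {
		if (!memcmp(dev->port[p - 1].gid, gid, MOCK_GID_LEN)) {
			*found_port = p;
			return 0;
		}
	}
	return -ENOENT;
}

/* Return 0 only if the port matches the device type and owns the GID. */
static int mock_validate_port(const struct mock_device *dev, int port,
			      const unsigned char *gid, int dev_type)
{
	int want_ib = (dev_type == MOCK_ARPHRD_INFINIBAND);
	int found_port = 0;
	int ret;

	assert(port >= 1 && port <= dev->port_cnt);

	/* An IB address needs an IB port; any other address needs a non-IB port. */
	if (want_ib != !!dev->port[port - 1].is_ib)
		return -ENODEV;

	ret = mock_find_gid(dev, gid, &found_port);
	if (ret)
		return ret;

	return (port == found_port) ? 0 : -ENODEV;
}
```

A caller would walk devices and ports much as cma_acquire_dev() does, taking the first port for which the check returns 0.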

2015-04-22 00:08:56

by Ira Weiny

Subject: Re: [PATCH v5 15/27] IB/Verbs: Reform rest part in IB-core cma

On Mon, Apr 20, 2015 at 10:38:49AM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform the remaining parts of IB-core cma.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 20 +++++++++-----------
> 1 file changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 44e7bb9..ec64b97 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -468,10 +468,10 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
> pkey = ntohs(addr->sib_pkey);
>
> list_for_each_entry(cur_dev, &dev_list, list) {
> - if (rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
> - continue;
> -
> for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
> + if (!rdma_ib_or_iboe(cur_dev->device, p))
> + continue;
> +
> if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
> continue;
>
> @@ -666,10 +666,9 @@ static int cma_modify_qp_rtr(struct rdma_id_private *id_priv,
> if (ret)
> goto out;
>
> - if (rdma_node_get_transport(id_priv->cma_dev->device->node_type)
> - == RDMA_TRANSPORT_IB &&
> - rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)
> - == IB_LINK_LAYER_ETHERNET) {
> + BUG_ON(id_priv->cma_dev->device != id_priv->id.device);
> +
> + if (rdma_tech_iboe(id_priv->id.device, id_priv->id.port_num)) {
> ret = rdma_addr_find_smac_by_sgid(&sgid, qp_attr.smac, NULL);
>
> if (ret)
> @@ -733,11 +732,10 @@ static int cma_ib_init_qp_attr(struct rdma_id_private *id_priv,
> int ret;
> u16 pkey;
>
> - if (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num) ==
> - IB_LINK_LAYER_INFINIBAND)
> - pkey = ib_addr_get_pkey(dev_addr);
> - else
> + if (rdma_tech_iboe(id_priv->id.device, id_priv->id.port_num))
> pkey = 0xffff;
> + else
> + pkey = ib_addr_get_pkey(dev_addr);
>
> ret = ib_find_cached_pkey(id_priv->id.device, id_priv->id.port_num,
> pkey, &qp_attr->pkey_index);
> --
> 2.1.0

2015-04-22 00:09:09

by Ira Weiny

Subject: Re: [PATCH v5 16/27] IB/Verbs: Use management helper cap_ib_mad()

On Mon, Apr 20, 2015 at 10:39:12AM +0200, Michael Wang wrote:
>
> Introduce helper cap_ib_mad() to check whether a port of an
> IB device supports InfiniBand Management Datagrams.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/mad.c | 6 +++---
> drivers/infiniband/core/user_mad.c | 6 +++---
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 3 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 1822932..4315aeb 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -3066,7 +3066,7 @@ static void ib_mad_init_device(struct ib_device *device)
> }
>
> for (i = start; i <= end; i++) {
> - if (!rdma_ib_or_iboe(device, i))
> + if (!cap_ib_mad(device, i))
> continue;
>
> if (ib_mad_port_open(device, i)) {
> @@ -3087,7 +3087,7 @@ error_agent:
>
> error:
> while (--i >= start) {
> - if (!rdma_ib_or_iboe(device, i))
> + if (!cap_ib_mad(device, i))
> continue;
>
> if (ib_agent_port_close(device, i))
> @@ -3111,7 +3111,7 @@ static void ib_mad_remove_device(struct ib_device *device)
> }
>
> for (i = start; i <= end; i++) {
> - if (!rdma_ib_or_iboe(device, i))
> + if (!cap_ib_mad(device, i))
> continue;
>
> if (ib_agent_port_close(device, i))
> diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
> index 71fc8ba..b52884b 100644
> --- a/drivers/infiniband/core/user_mad.c
> +++ b/drivers/infiniband/core/user_mad.c
> @@ -1294,7 +1294,7 @@ static void ib_umad_add_one(struct ib_device *device)
> umad_dev->end_port = e;
>
> for (i = s; i <= e; ++i) {
> - if (!rdma_ib_or_iboe(device, i))
> + if (!cap_ib_mad(device, i))
> continue;
>
> umad_dev->port[i - s].umad_dev = umad_dev;
> @@ -1317,7 +1317,7 @@ static void ib_umad_add_one(struct ib_device *device)
>
> err:
> while (--i >= s) {
> - if (!rdma_ib_or_iboe(device, i))
> + if (!cap_ib_mad(device, i))
> continue;
>
> ib_umad_kill_port(&umad_dev->port[i - s]);
> @@ -1335,7 +1335,7 @@ static void ib_umad_remove_one(struct ib_device *device)
> return;
>
> for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
> - if (rdma_ib_or_iboe(device, i))
> + if (cap_ib_mad(device, i))
> ib_umad_kill_port(&umad_dev->port[i]);
> }
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index a12e876..624e963 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1773,6 +1773,21 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
> return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
> }
>
> +/**
> + * cap_ib_mad - Check whether a port of the device supports InfiniBand
> + * Management Datagrams.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when the port of the device doesn't support InfiniBand
> + * Management Datagrams.
> + */
> +static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
> +{
> + return rdma_ib_or_iboe(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> --
> 2.1.0
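
The cap_ib_mad() patch above layers a capability question ("does this port do MAD?") on top of the transport question ("is this port IB or IBoE?"), so callers stop encoding the transport rule themselves. A compact model of that layering follows. The `mock_*` names and the per-port transport array are hypothetical and exist only for this sketch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the new per-port transport values. */
enum mock_transport {
	MOCK_TP_IB,
	MOCK_TP_IBOE,
	MOCK_TP_IWARP,
	MOCK_TP_USNIC_UDP
};

struct mock_device {
	int port_cnt;                   /* ports are numbered 1..port_cnt */
	enum mock_transport port_tp[2]; /* port_tp[0] describes port 1 */
};

/* Plays the role of the per-port transport query. */
static enum mock_transport mock_query_transport(const struct mock_device *dev,
						int port)
{
	assert(port >= 1 && port <= dev->port_cnt);
	return dev->port_tp[port - 1];
}

/* Raw helper: a statement about the technology of the port. */
static int mock_ib_or_iboe(const struct mock_device *dev, int port)
{
	enum mock_transport tp = mock_query_transport(dev, port);

	return tp == MOCK_TP_IB || tp == MOCK_TP_IBOE;
}

/* Capability helper: the only place that ties MAD support to a transport. */
static int mock_cap_ib_mad(const struct mock_device *dev, int port)
{
	return mock_ib_or_iboe(dev, port);
}

/* Caller side: iterate 1-based port numbers and ask about the capability. */
static int mock_count_mad_ports(const struct mock_device *dev)
{
	int p, n = 0;

	for (p = 1; p <= dev->port_cnt; ++p)
		if (mock_cap_ib_mad(dev, p))
			n++;
	return n;
}
```

If a later technology supported MAD without being IB or IBoE, only `mock_cap_ib_mad()` would need to change in this model, while callers such as `mock_count_mad_ports()` stay untouched.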

2015-04-22 00:09:28

by Ira Weiny

Subject: Re: [PATCH v5 17/27] IB/Verbs: Use management helper cap_ib_smi()

On Mon, Apr 20, 2015 at 10:39:37AM +0200, Michael Wang wrote:
>
> Introduce helper cap_ib_smi() to check whether a port of an
> IB device supports the InfiniBand Subnet Management Interface.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/agent.c | 2 +-
> drivers/infiniband/core/mad.c | 2 +-
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
> index ffdef4d..61471ee 100644
> --- a/drivers/infiniband/core/agent.c
> +++ b/drivers/infiniband/core/agent.c
> @@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
> goto error1;
> }
>
> - if (rdma_tech_ib(device, port_num)) {
> + if (cap_ib_smi(device, port_num)) {
> /* Obtain send only MAD agent for SMI QP */
> port_priv->agent[0] = ib_register_mad_agent(device, port_num,
> IB_QPT_SMI, NULL, 0,
> diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
> index 4315aeb..ee3a05e 100644
> --- a/drivers/infiniband/core/mad.c
> +++ b/drivers/infiniband/core/mad.c
> @@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
> init_mad_qp(port_priv, &port_priv->qp_info[1]);
>
> cq_size = mad_sendq_size + mad_recvq_size;
> - has_smi = rdma_tech_ib(device, port_num);
> + has_smi = cap_ib_smi(device, port_num);
> if (has_smi)
> cq_size *= 2;
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 624e963..873b9a6 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1788,6 +1788,21 @@ static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
> return rdma_ib_or_iboe(device, port_num);
> }
>
> +/**
> + * cap_ib_smi - Check if the port of device has the capability Infiniband
> + * Subnet Management Interface.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support Infiniband
> + * Subnet Management Interface.
> + */
> +static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
> +{
> + return rdma_tech_ib(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> --
> 2.1.0
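
To make the difference between the two helpers seen so far concrete, here is a minimal userspace sketch, not part of the patch. It assumes rdma_tech_ib() is a plain comparison against RDMA_TRANSPORT_IB (by analogy with the rdma_tech_iwarp() shown in the series); struct mock_device and mock_query_transport() are hypothetical stand-ins for struct ib_device and its query_transport() callback.

```c
#include <assert.h>

/* Userspace model only: mock_device and mock_query_transport() are
 * hypothetical stand-ins for struct ib_device and query_transport(). */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IBOE,
	RDMA_TRANSPORT_IWARP,
};

#define MOCK_MAX_PORTS 2

struct mock_device {
	int phys_port_cnt;
	enum rdma_transport_type port_tp[MOCK_MAX_PORTS]; /* index is port_num - 1 */
};

static enum rdma_transport_type
mock_query_transport(const struct mock_device *dev, int port_num)
{
	assert(port_num >= 1 && port_num <= dev->phys_port_cnt);
	return dev->port_tp[port_num - 1];
}

/* MAD: IB and IBoE ports, following the quoted cap_ib_mad(). */
static int cap_ib_mad(const struct mock_device *dev, int port_num)
{
	enum rdma_transport_type tp = mock_query_transport(dev, port_num);

	return tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE;
}

/* SMI: IB ports only, following the quoted cap_ib_smi(). */
static int cap_ib_smi(const struct mock_device *dev, int port_num)
{
	return mock_query_transport(dev, port_num) == RDMA_TRANSPORT_IB;
}

/* Mirrors the CQ sizing hunk in ib_mad_port_open(). */
static int mad_cq_size(const struct mock_device *dev, int port_num,
		       int sendq_size, int recvq_size)
{
	int cq_size = sendq_size + recvq_size;

	if (cap_ib_smi(dev, port_num))
		cq_size *= 2;	/* doubled when the port also has an SMI QP */
	return cq_size;
}
```

On a device with one IB port and one IBoE port, both ports pass cap_ib_mad(), but only the IB port passes cap_ib_smi() and gets the doubled CQ size.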

2015-04-22 00:09:44

by Ira Weiny

Subject: Re: [PATCH v5 18/27] IB/Verbs: Use management helper cap_ib_cm()

On Mon, Apr 20, 2015 at 10:40:04AM +0200, Michael Wang wrote:
>
> Introduce the helper cap_ib_cm() to check whether a port of an
> IB device supports the Infiniband Communication Manager.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cm.c | 6 +++---
> drivers/infiniband/core/cma.c | 19 +++++++++----------
> drivers/infiniband/core/ucm.c | 2 +-
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 4 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index 3c10b75..eae4c9f 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -3781,7 +3781,7 @@ static void cm_add_one(struct ib_device *ib_device)
>
> set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
> for (i = 1; i <= ib_device->phys_port_cnt; i++) {
> - if (!rdma_ib_or_iboe(ib_device, i))
> + if (!cap_ib_cm(ib_device, i))
> continue;
>
> port = kzalloc(sizeof *port, GFP_KERNEL);
> @@ -3832,7 +3832,7 @@ error1:
> port_modify.set_port_cap_mask = 0;
> port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
> while (--i) {
> - if (!rdma_ib_or_iboe(ib_device, i))
> + if (!cap_ib_cm(ib_device, i))
> continue;
>
> port = cm_dev->port[i-1];
> @@ -3864,7 +3864,7 @@ static void cm_remove_one(struct ib_device *ib_device)
> write_unlock_irqrestore(&cm.device_lock, flags);
>
> for (i = 1; i <= ib_device->phys_port_cnt; i++) {
> - if (!rdma_ib_or_iboe(ib_device, i))
> + if (!cap_ib_cm(ib_device, i))
> continue;
>
> port = cm_dev->port[i-1];
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index ec64b97..ff59dbc 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -766,7 +766,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
> int ret = 0;
>
> id_priv = container_of(id, struct rdma_id_private, id);
> - if (rdma_ib_or_iboe(id->device, id->port_num)) {
> + if (cap_ib_cm(id->device, id->port_num)) {
> if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
> ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
> else
> @@ -1054,7 +1054,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> mutex_unlock(&id_priv->handler_mutex);
>
> if (id_priv->cma_dev) {
> - if (rdma_ib_or_iboe(id_priv->id.device, 1)) {
> + if (cap_ib_cm(id_priv->id.device, 1)) {
> if (id_priv->cm_id.ib)
> ib_destroy_cm_id(id_priv->cm_id.ib);
> } else if (rdma_tech_iwarp(id_priv->id.device, 1)) {
> @@ -1637,8 +1637,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv,
> struct rdma_cm_id *id;
> int ret;
>
> - if (cma_family(id_priv) == AF_IB &&
> - !rdma_ib_or_iboe(cma_dev->device, 1))
> + if (cma_family(id_priv) == AF_IB && !cap_ib_cm(cma_dev->device, 1))
> return;
>
> id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
> @@ -2029,7 +2028,7 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
> mutex_lock(&lock);
> list_for_each_entry(cur_dev, &dev_list, list) {
> if (cma_family(id_priv) == AF_IB &&
> - !rdma_ib_or_iboe(cur_dev->device, 1))
> + !cap_ib_cm(cur_dev->device, 1))
> continue;
>
> if (!cma_dev)
> @@ -2538,7 +2537,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
>
> id_priv->backlog = backlog;
> if (id->device) {
> - if (rdma_ib_or_iboe(id->device, 1)) {
> + if (cap_ib_cm(id->device, 1)) {
> ret = cma_ib_listen(id_priv);
> if (ret)
> goto err;
> @@ -2882,7 +2881,7 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> id_priv->srq = conn_param->srq;
> }
>
> - if (rdma_ib_or_iboe(id->device, id->port_num)) {
> + if (cap_ib_cm(id->device, id->port_num)) {
> if (id->qp_type == IB_QPT_UD)
> ret = cma_resolve_ib_udp(id_priv, conn_param);
> else
> @@ -2993,7 +2992,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> id_priv->srq = conn_param->srq;
> }
>
> - if (rdma_ib_or_iboe(id->device, id->port_num)) {
> + if (cap_ib_cm(id->device, id->port_num)) {
> if (id->qp_type == IB_QPT_UD) {
> if (conn_param)
> ret = cma_send_sidr_rep(id_priv, IB_SIDR_SUCCESS,
> @@ -3056,7 +3055,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
> if (!id_priv->cm_id.ib)
> return -EINVAL;
>
> - if (rdma_ib_or_iboe(id->device, id->port_num)) {
> + if (cap_ib_cm(id->device, id->port_num)) {
> if (id->qp_type == IB_QPT_UD)
> ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
> private_data, private_data_len);
> @@ -3083,7 +3082,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
> if (!id_priv->cm_id.ib)
> return -EINVAL;
>
> - if (rdma_ib_or_iboe(id->device, id->port_num)) {
> + if (cap_ib_cm(id->device, id->port_num)) {
> ret = cma_modify_qp_err(id_priv);
> if (ret)
> goto out;
> diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
> index 70e0ccb..f7290c8 100644
> --- a/drivers/infiniband/core/ucm.c
> +++ b/drivers/infiniband/core/ucm.c
> @@ -1253,7 +1253,7 @@ static void ib_ucm_add_one(struct ib_device *device)
> dev_t base;
> struct ib_ucm_device *ucm_dev;
>
> - if (!device->alloc_ucontext || !rdma_ib_or_iboe(device, 1))
> + if (!device->alloc_ucontext || !cap_ib_cm(device, 1))
> return;
>
> ucm_dev = kzalloc(sizeof *ucm_dev, GFP_KERNEL);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 873b9a6..6805e3e 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1803,6 +1803,21 @@ static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
> return rdma_tech_ib(device, port_num);
> }
>
> +/**
> + * cap_ib_cm - Check if the port of device has the capability Infiniband
> + * Communication Manager.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support Infiniband
> + * Communication Manager.
> + */
> +static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
> +{
> + return rdma_ib_or_iboe(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> --
> 2.1.0
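
The cm_add_one() hunks above rely on the same capability test in both the setup loop and the error unwind, so that only ports which were actually set up get torn down. A minimal userspace sketch of that pattern follows, not part of the patch; struct mock_device, the port_open[] flags and the fail_port parameter are hypothetical and exist only to drive the example.

```c
#include <assert.h>

enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IBOE,
	RDMA_TRANSPORT_IWARP,
};

#define MOCK_MAX_PORTS 4

/* Hypothetical stand-in for struct ib_device plus per-port CM state. */
struct mock_device {
	int phys_port_cnt;
	enum rdma_transport_type port_tp[MOCK_MAX_PORTS]; /* index is port_num - 1 */
	int port_open[MOCK_MAX_PORTS];
};

/* IB and IBoE ports, following the quoted cap_ib_cm(). */
static int cap_ib_cm(const struct mock_device *dev, int port_num)
{
	enum rdma_transport_type tp;

	assert(port_num >= 1 && port_num <= dev->phys_port_cnt);
	tp = dev->port_tp[port_num - 1];
	return tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE;
}

/* fail_port forces setup of that port to fail; 0 means no failure. */
static int mock_cm_add_one(struct mock_device *dev, int fail_port)
{
	int i;

	for (i = 1; i <= dev->phys_port_cnt; i++) {
		if (!cap_ib_cm(dev, i))
			continue;
		if (i == fail_port)
			goto error1;
		dev->port_open[i - 1] = 1;
	}
	return 0;

error1:
	/* Walk back over ports 1..i-1, skipping the ones never set up. */
	while (--i) {
		if (!cap_ib_cm(dev, i))
			continue;
		dev->port_open[i - 1] = 0;
	}
	return -1;
}
```

Because ports are numbered from 1, `while (--i)` stops exactly when the unwind has passed port 1. If the setup and unwind loops ever used different predicates, the unwind could touch a port slot that was never initialised.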

2015-04-22 00:10:31

by Ira Weiny

Subject: Re: [PATCH v5 19/27] IB/Verbs: Use management helper cap_iw_cm()

On Mon, Apr 20, 2015 at 10:40:27AM +0200, Michael Wang wrote:
>
> Introduce the helper cap_iw_cm() to check whether a port of an
> IB device supports the IWARP Communication Manager.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 14 +++++++-------
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 2 files changed, 22 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index ff59dbc..dd37b4a 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -775,7 +775,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
>
> if (qp_attr->qp_state == IB_QPS_RTR)
> qp_attr->rq_psn = id_priv->seq_num;
> - } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> + } else if (cap_iw_cm(id->device, id->port_num)) {
> if (!id_priv->cm_id.iw) {
> qp_attr->qp_access_flags = 0;
> *qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
> @@ -1057,7 +1057,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> if (cap_ib_cm(id_priv->id.device, 1)) {
> if (id_priv->cm_id.ib)
> ib_destroy_cm_id(id_priv->cm_id.ib);
> - } else if (rdma_tech_iwarp(id_priv->id.device, 1)) {
> + } else if (cap_iw_cm(id_priv->id.device, 1)) {
> if (id_priv->cm_id.iw)
> iw_destroy_cm_id(id_priv->cm_id.iw);
> }
> @@ -2541,7 +2541,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
> ret = cma_ib_listen(id_priv);
> if (ret)
> goto err;
> - } else if (rdma_tech_iwarp(id->device, 1)) {
> + } else if (cap_iw_cm(id->device, 1)) {
> ret = cma_iw_listen(id_priv, backlog);
> if (ret)
> goto err;
> @@ -2886,7 +2886,7 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> ret = cma_resolve_ib_udp(id_priv, conn_param);
> else
> ret = cma_connect_ib(id_priv, conn_param);
> - } else if (rdma_tech_iwarp(id->device, id->port_num))
> + } else if (cap_iw_cm(id->device, id->port_num))
> ret = cma_connect_iw(id_priv, conn_param);
> else
> ret = -ENOSYS;
> @@ -3008,7 +3008,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
> else
> ret = cma_rep_recv(id_priv);
> }
> - } else if (rdma_tech_iwarp(id->device, id->port_num))
> + } else if (cap_iw_cm(id->device, id->port_num))
> ret = cma_accept_iw(id_priv, conn_param);
> else
> ret = -ENOSYS;
> @@ -3063,7 +3063,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
> ret = ib_send_cm_rej(id_priv->cm_id.ib,
> IB_CM_REJ_CONSUMER_DEFINED, NULL,
> 0, private_data, private_data_len);
> - } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> + } else if (cap_iw_cm(id->device, id->port_num)) {
> ret = iw_cm_reject(id_priv->cm_id.iw,
> private_data, private_data_len);
> } else
> @@ -3089,7 +3089,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
> /* Initiate or respond to a disconnect. */
> if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
> ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
> - } else if (rdma_tech_iwarp(id->device, id->port_num)) {
> + } else if (cap_iw_cm(id->device, id->port_num)) {
> ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
> } else
> ret = -EINVAL;
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 6805e3e..e4999f6 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1818,6 +1818,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
> return rdma_ib_or_iboe(device, port_num);
> }
>
> +/**
> + * cap_iw_cm - Check if the port of device has the capability IWARP
> + * Communication Manager.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support IWARP
> + * Communication Manager.
> + */
> +static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
> +{
> + return rdma_tech_iwarp(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> --
> 2.1.0
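
With cap_ib_cm() and cap_iw_cm() both in place, the cma.c call sites reduce to a three-way dispatch: IB CM, iWARP CM, or an error. The sketch below models that shape in userspace and is not part of the patch. As a simplification the helpers take a transport type directly instead of a device and port; enum mock_cm and mock_connect() are hypothetical names, and RDMA_TRANSPORT_USNIC_UDP is assumed from the mapping list in the cover letter.

```c
#include <assert.h>
#include <errno.h>

enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IBOE,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC_UDP,	/* assumed name, from the cover letter table */
};

/* Simplified: the real helpers take (device, port_num). */
static int cap_ib_cm(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE;
}

static int cap_iw_cm(enum rdma_transport_type tp)
{
	return tp == RDMA_TRANSPORT_IWARP;
}

/* Hypothetical marker for which connection manager was selected. */
enum mock_cm { MOCK_CM_NONE, MOCK_CM_IB, MOCK_CM_IW };

/* Same branch structure as the rdma_connect() hunk. */
static int mock_connect(enum rdma_transport_type tp, enum mock_cm *used)
{
	assert(used);
	*used = MOCK_CM_NONE;

	if (cap_ib_cm(tp)) {
		*used = MOCK_CM_IB;	/* cma_connect_ib() or cma_resolve_ib_udp() */
		return 0;
	} else if (cap_iw_cm(tp)) {
		*used = MOCK_CM_IW;	/* cma_connect_iw() */
		return 0;
	}
	return -ENOSYS;			/* neither CM is available */
}
```

A port whose transport is neither IB, IBoE nor iWARP falls through to -ENOSYS, which is the case the final else branch in rdma_connect() and rdma_accept() covers.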

2015-04-22 00:10:52

by Ira Weiny

Subject: Re: [PATCH v5 20/27] IB/Verbs: Use management helper cap_ib_sa()

On Mon, Apr 20, 2015 at 10:40:50AM +0200, Michael Wang wrote:
>
> Introduce the helper cap_ib_sa() to check whether a port of an
> IB device supports Infiniband Subnet Administration.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 4 ++--
> drivers/infiniband/core/sa_query.c | 10 +++++-----
> drivers/infiniband/core/ucma.c | 2 +-
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 4 files changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index dd37b4a..b92f81b 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -954,7 +954,7 @@ static inline int cma_user_data_offset(struct rdma_id_private *id_priv)
>
> static void cma_cancel_route(struct rdma_id_private *id_priv)
> {
> - if (rdma_tech_ib(id_priv->id.device, id_priv->id.port_num)) {
> + if (cap_ib_sa(id_priv->id.device, id_priv->id.port_num)) {
> if (id_priv->query)
> ib_sa_cancel_query(id_priv->query_id, id_priv->query);
> }
> @@ -1978,7 +1978,7 @@ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms)
> return -EINVAL;
>
> atomic_inc(&id_priv->refcount);
> - if (rdma_tech_ib(id->device, id->port_num))
> + if (cap_ib_sa(id->device, id->port_num))
> ret = cma_resolve_ib_route(id_priv, timeout_ms);
> else if (rdma_tech_iboe(id->device, id->port_num))
> ret = cma_resolve_iboe_route(id_priv);
> diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
> index 60dc7aa..f14a66f 100644
> --- a/drivers/infiniband/core/sa_query.c
> +++ b/drivers/infiniband/core/sa_query.c
> @@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
> struct ib_sa_port *port =
> &sa_dev->port[event->element.port_num - sa_dev->start_port];
>
> - if (WARN_ON(!rdma_tech_ib(handler->device, port->port_num)))
> + if (WARN_ON(!cap_ib_sa(handler->device, port->port_num)))
> return;
>
> spin_lock_irqsave(&port->ah_lock, flags);
> @@ -1173,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
>
> for (i = 0; i <= e - s; ++i) {
> spin_lock_init(&sa_dev->port[i].ah_lock);
> - if (!rdma_tech_ib(device, i + 1))
> + if (!cap_ib_sa(device, i + 1))
> continue;
>
> sa_dev->port[i].sm_ah = NULL;
> @@ -1208,7 +1208,7 @@ static void ib_sa_add_one(struct ib_device *device)
> goto err;
>
> for (i = 0; i <= e - s; ++i) {
> - if (rdma_tech_ib(device, i + 1))
> + if (cap_ib_sa(device, i + 1))
> update_sm_ah(&sa_dev->port[i].update_task);
> }
>
> @@ -1216,7 +1216,7 @@ static void ib_sa_add_one(struct ib_device *device)
>
> err:
> while (--i >= 0) {
> - if (rdma_tech_ib(device, i + 1))
> + if (cap_ib_sa(device, i + 1))
> ib_unregister_mad_agent(sa_dev->port[i].agent);
> }
>
> @@ -1238,7 +1238,7 @@ static void ib_sa_remove_one(struct ib_device *device)
> flush_workqueue(ib_wq);
>
> for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
> - if (rdma_tech_ib(device, i + 1)) {
> + if (cap_ib_sa(device, i + 1)) {
> ib_unregister_mad_agent(sa_dev->port[i].agent);
> if (sa_dev->port[i].sm_ah)
> kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
> diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
> index 7331c6c..bed7957 100644
> --- a/drivers/infiniband/core/ucma.c
> +++ b/drivers/infiniband/core/ucma.c
> @@ -723,7 +723,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
> resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
> resp.port_num = ctx->cm_id->port_num;
>
> - if (rdma_tech_ib(ctx->cm_id->device, ctx->cm_id->port_num))
> + if (cap_ib_sa(ctx->cm_id->device, ctx->cm_id->port_num))
> ucma_copy_ib_route(&resp, &ctx->cm_id->route);
> else if (rdma_tech_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
> ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index e4999f6..de3a168 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1833,6 +1833,21 @@ static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
> return rdma_tech_iwarp(device, port_num);
> }
>
> +/**
> + * cap_ib_sa - Check if the port of device has the capability Infiniband
> + * Subnet Administration.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support Infiniband
> + * Subnet Administration.
> + */
> +static inline int cap_ib_sa(struct ib_device *device, u8 port_num)
> +{
> + return rdma_tech_ib(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> --
> 2.1.0

2015-04-22 00:11:05

by Ira Weiny

Subject: Re: [PATCH v5 21/27] IB/Verbs: Use management helper cap_ib_mcast()

On Mon, Apr 20, 2015 at 10:41:14AM +0200, Michael Wang wrote:
>
> Introduce the helper cap_ib_mcast() to check whether a port of an
> IB device supports Infiniband Multicast.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 6 +++---
> drivers/infiniband/core/multicast.c | 6 +++---
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 3 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 8484ae3..58ec946 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -1028,7 +1028,7 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
> mc = container_of(id_priv->mc_list.next,
> struct cma_multicast, list);
> list_del(&mc->list);
> - if (rdma_tech_ib(id_priv->cma_dev->device,
> + if (cap_ib_mcast(id_priv->cma_dev->device,
> id_priv->id.port_num)) {
> ib_sa_free_multicast(mc->multicast.ib);
> kfree(mc);
> @@ -3342,7 +3342,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
> if (rdma_tech_iboe(id->device, id->port_num)) {
> kref_init(&mc->mcref);
> ret = cma_iboe_join_multicast(id_priv, mc);
> - } else if (rdma_tech_ib(id->device, id->port_num))
> + } else if (cap_ib_mcast(id->device, id->port_num))
> ret = cma_join_ib_multicast(id_priv, mc);
> else
> ret = -ENOSYS;
> @@ -3376,7 +3376,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
>
> BUG_ON(id_priv->cma_dev->device != id->device);
>
> - if (rdma_tech_ib(id->device, id->port_num)) {
> + if (cap_ib_mcast(id->device, id->port_num)) {
> ib_sa_free_multicast(mc->multicast.ib);
> kfree(mc);
> } else if (rdma_tech_iboe(id->device, id->port_num))
> diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
> index 24d93f5..bdc1880 100644
> --- a/drivers/infiniband/core/multicast.c
> +++ b/drivers/infiniband/core/multicast.c
> @@ -780,7 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
> int index;
>
> dev = container_of(handler, struct mcast_device, event_handler);
> - if (WARN_ON(!rdma_tech_ib(dev->device, event->element.port_num)))
> + if (WARN_ON(!cap_ib_mcast(dev->device, event->element.port_num)))
> return;
>
> index = event->element.port_num - dev->start_port;
> @@ -820,7 +820,7 @@ static void mcast_add_one(struct ib_device *device)
> }
>
> for (i = 0; i <= dev->end_port - dev->start_port; i++) {
> - if (!rdma_tech_ib(device, dev->start_port + i))
> + if (!cap_ib_mcast(device, dev->start_port + i))
> continue;
> port = &dev->port[i];
> port->dev = dev;
> @@ -858,7 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
> flush_workqueue(mcast_wq);
>
> for (i = 0; i <= dev->end_port - dev->start_port; i++) {
> - if (rdma_tech_ib(device, dev->start_port + i)) {
> + if (cap_ib_mcast(device, dev->start_port + i)) {
> port = &dev->port[i];
> deref_port(port);
> wait_for_completion(&port->comp);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index de3a168..6e354df 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1848,6 +1848,21 @@ static inline int cap_ib_sa(struct ib_device *device, u8 port_num)
> return rdma_tech_ib(device, port_num);
> }
>
> +/**
> + * cap_ib_mcast - Check if the port of device has the capability Infiniband
> + * Multicast.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support Infiniband
> + * Multicast.
> + */
> +static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
> +{
> + return cap_ib_sa(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> --
> 2.1.0

2015-04-22 00:11:18

by Ira Weiny

Subject: Re: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()

On Mon, Apr 20, 2015 at 10:41:38AM +0200, Michael Wang wrote:
>
> Introduce the helper cap_ipoib() to check whether a port of an
> IB device supports IP over Infiniband.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +-
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> index 60b379d..a9812df 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -1671,7 +1671,7 @@ static void ipoib_add_one(struct ib_device *device)
> }
>
> for (p = s; p <= e; ++p) {
> - if (!rdma_tech_ib(device, p))
> + if (!cap_ipoib(device, p))
> continue;
> dev = ipoib_add_port("ib%d", device, p);
> if (!IS_ERR(dev)) {
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 6e354df..d0ae08e 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1863,6 +1863,21 @@ static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
> return cap_ib_sa(device, port_num);
> }
>
> +/**
> + * cap_ipoib - Check if the port of device has the capability
> + * IP over Infiniband.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support
> + * IP over Infiniband.
> + */
> +static inline int cap_ipoib(struct ib_device *device, u8 port_num)
> +{
> + return rdma_tech_ib(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> --
> 2.1.0

2015-04-22 00:11:33

by Ira Weiny

Subject: Re: [PATCH v5 23/27] IB/Verbs: Use management helper cap_read_multi_sge()

On Mon, Apr 20, 2015 at 10:42:07AM +0200, Michael Wang wrote:
>
> Introduce the helper cap_read_multi_sge() to check whether a port of an
> IB device supports RDMA Read with Multiple Scatter-Gather Entries.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 ++-
> 2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index d0ae08e..074f66d 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1878,6 +1878,21 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
> return rdma_tech_ib(device, port_num);
> }
>
> +/**
> + * cap_read_multi_sge - Check if the port of device has the capability
> + * RDMA Read Multiple Scatter-Gather Entries.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support
> + * RDMA Read Multiple Scatter-Gather Entries.
> + */
> +static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
> +{
> + return !rdma_tech_iwarp(device, port_num);
> +}
> +
> int ib_query_gid(struct ib_device *device,
> u8 port_num, int index, union ib_gid *gid);
>
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index a5bed5b..7711b7a 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -117,7 +117,8 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
>
> static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
> {
> - if (rdma_tech_iwarp(xprt->sc_cm_id->device, xprt->sc_cm_id->port_num))
> + if (!cap_read_multi_sge(xprt->sc_cm_id->device,
> + xprt->sc_cm_id->port_num))
> return 1;
> else
> return min_t(int, sge_count, xprt->sc_max_sge);
> --
> 2.1.0
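
Note that cap_read_multi_sge() is defined by negation, so the caller's test is inverted relative to the old rdma_tech_iwarp() check. A minimal userspace sketch of the resulting rdma_read_max_sge() logic follows; it is not part of the patch. The helper here takes a transport type directly as a simplification, and mock_read_max_sge() is a hypothetical name.

```c
#include <assert.h>

enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IBOE,
	RDMA_TRANSPORT_IWARP,
};

/* Simplified: every transport except iWARP may use several SGEs per read. */
static int cap_read_multi_sge(enum rdma_transport_type tp)
{
	return tp != RDMA_TRANSPORT_IWARP;
}

/* Same decision as the rdma_read_max_sge() hunk, with min_t() spelled out. */
static int mock_read_max_sge(enum rdma_transport_type tp,
			     int sge_count, int max_sge)
{
	assert(sge_count >= 0 && max_sge >= 0);

	if (!cap_read_multi_sge(tp))
		return 1;
	return sge_count < max_sge ? sge_count : max_sge;
}
```

For an iWARP port the result is always 1; for any other transport it is the smaller of the requested and the supported SGE count.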

2015-04-22 00:12:11

by Ira Weiny

Subject: Re: [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib()

On Mon, Apr 20, 2015 at 10:42:33AM +0200, Michael Wang wrote:
>
> Introduce the helper cap_af_ib() to check whether a port of an
> IB device supports Native Infiniband Address.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 2 +-
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 05d148e..9c1f5b72 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -469,7 +469,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
>
> list_for_each_entry(cur_dev, &dev_list, list) {
> for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
> - if (!rdma_ib_or_iboe(cur_dev->device, p))
> + if (!cap_af_ib(cur_dev->device, p))
> continue;
>
> if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 074f66d..9cfab09 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1879,6 +1879,21 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
> }
>
> /**
> + * cap_af_ib - Check if the port of device has the capability
> + * Native Infiniband Address.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support
> + * Native Infiniband Address.
> + */
> +static inline int cap_af_ib(struct ib_device *device, u8 port_num)
> +{
> + return rdma_ib_or_iboe(device, port_num);
> +}
> +
> +/**
> * cap_read_multi_sge - Check if the port of device has the capability
> * RDMA Read Multiple Scatter-Gather Entries.
> *
> --
> 2.1.0

2015-04-22 00:12:02

by Ira Weiny

Subject: Re: [PATCH v5 25/27] IB/Verbs: Use management helper cap_eth_ah()

On Mon, Apr 20, 2015 at 10:43:03AM +0200, Michael Wang wrote:
>
> Introduce the helper cap_eth_ah() to check whether a port of an
> IB device supports Ethernet Address Handler.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/cma.c | 2 +-
> drivers/infiniband/core/sa_query.c | 2 +-
> drivers/infiniband/core/verbs.c | 2 +-
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 4 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 9c1f5b72..b9f7ccc 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -732,7 +732,7 @@ static int cma_ib_init_qp_attr(struct rdma_id_private *id_priv,
> int ret;
> u16 pkey;
>
> - if (rdma_tech_iboe(id_priv->id.device, id_priv->id.port_num))
> + if (cap_eth_ah(id_priv->id.device, id_priv->id.port_num))
> pkey = 0xffff;
> else
> pkey = ib_addr_get_pkey(dev_addr);
> diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
> index f14a66f..063c17c 100644
> --- a/drivers/infiniband/core/sa_query.c
> +++ b/drivers/infiniband/core/sa_query.c
> @@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
> ah_attr->port_num = port_num;
> ah_attr->static_rate = rec->rate;
>
> - force_grh = rdma_tech_iboe(device, port_num);
> + force_grh = cap_eth_ah(device, port_num);
>
> if (rec->hop_limit > 1 || force_grh) {
> ah_attr->ah_flags = IB_AH_GRH;
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 7264860..ee4b5cb 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -202,7 +202,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
> int ret;
>
> memset(ah_attr, 0, sizeof *ah_attr);
> - if (rdma_tech_iboe(device, port_num)) {
> + if (cap_eth_ah(device, port_num)) {
> if (!(wc->wc_flags & IB_WC_GRH))
> return -EPROTOTYPE;
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 9cfab09..45050cb 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1894,6 +1894,21 @@ static inline int cap_af_ib(struct ib_device *device, u8 port_num)
> }
>
> /**
> + * cap_eth_ah - Check if the port of device has the capability
> + * Ethernet Address Handler.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support
> + * Ethernet Address Handler.
> + */
> +static inline int cap_eth_ah(struct ib_device *device, u8 port_num)
> +{
> + return rdma_tech_iboe(device, port_num);
> +}
> +
> +/**
> * cap_read_multi_sge - Check if the port of device has the capability
> * RDMA Read Multiple Scatter-Gather Entries.
> *
> --
> 2.1.0

2015-04-22 00:12:58

by Ira Weiny

Subject: Re: [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe()

On Mon, Apr 20, 2015 at 10:43:26AM +0200, Michael Wang wrote:
>
> We have finished introducing the cap_XX(), and raw helper rdma_ib_or_iboe()
> is no longer necessary, thus clean it up.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

>
> ---
> include/rdma/ib_verbs.h | 19 +++++++++----------
> net/sunrpc/xprtrdma/svc_rdma_transport.c | 6 ++++--
> 2 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 45050cb..0c0a4f0 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1766,13 +1766,6 @@ static inline int rdma_tech_iwarp(struct ib_device *device, u8 port_num)
> == RDMA_TRANSPORT_IWARP;
> }
>
> -static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
> -{
> - enum rdma_transport_type tp = device->query_transport(device, port_num);
> -
> - return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
> -}
> -
> /**
> * cap_ib_mad - Check if the port of device has the capability Infiniband
> * Management Datagrams.
> @@ -1785,7 +1778,9 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
> */
> static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
> {
> - return rdma_ib_or_iboe(device, port_num);
> + enum rdma_transport_type tp = device->query_transport(device, port_num);
> +
> + return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
> }
>
> /**
> @@ -1815,7 +1810,9 @@ static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
> */
> static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
> {
> - return rdma_ib_or_iboe(device, port_num);
> + enum rdma_transport_type tp = device->query_transport(device, port_num);
> +
> + return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
> }
>
> /**
> @@ -1890,7 +1887,9 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
> */
> static inline int cap_af_ib(struct ib_device *device, u8 port_num)
> {
> - return rdma_ib_or_iboe(device, port_num);
> + enum rdma_transport_type tp = device->query_transport(device, port_num);
> +
> + return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
> }
>
> /**
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> index a09b7a1..8af6f92 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> @@ -987,8 +987,10 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
> */
> if (!rdma_tech_iwarp(newxprt->sc_cm_id->device,
> newxprt->sc_cm_id->port_num) &&
> - !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
> - newxprt->sc_cm_id->port_num))
> + !rdma_tech_ib(newxprt->sc_cm_id->device,
> + newxprt->sc_cm_id->port_num) &&
> + !rdma_tech_iboe(newxprt->sc_cm_id->device,
> + newxprt->sc_cm_id->port_num))
> goto errout;
>
> if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
> --
> 2.1.0

2015-04-22 00:13:16

by Ira Weiny

Subject: Re: [PATCH v5 27/27] IB/Verbs: Cleanup rdma_node_get_transport()

On Mon, Apr 20, 2015 at 10:43:51AM +0200, Michael Wang wrote:
>
> We have get rid of all the scene using legacy rdma_node_get_transport(),
> now clean it up.
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>

Reviewed-by: Ira Weiny <[email protected]>

> ---
> drivers/infiniband/core/verbs.c | 21 ---------------------
> include/rdma/ib_verbs.h | 3 ---
> 2 files changed, 24 deletions(-)
>
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index ee4b5cb..bbea0c0 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -107,27 +107,6 @@ __attribute_const__ int ib_rate_to_mbps(enum ib_rate rate)
> }
> EXPORT_SYMBOL(ib_rate_to_mbps);
>
> -__attribute_const__ enum rdma_transport_type
> -rdma_node_get_transport(enum rdma_node_type node_type)
> -{
> - switch (node_type) {
> - case RDMA_NODE_IB_CA:
> - case RDMA_NODE_IB_SWITCH:
> - case RDMA_NODE_IB_ROUTER:
> - return RDMA_TRANSPORT_IB;
> - case RDMA_NODE_RNIC:
> - return RDMA_TRANSPORT_IWARP;
> - case RDMA_NODE_USNIC:
> - return RDMA_TRANSPORT_USNIC;
> - case RDMA_NODE_USNIC_UDP:
> - return RDMA_TRANSPORT_USNIC_UDP;
> - default:
> - BUG();
> - return 0;
> - }
> -}
> -EXPORT_SYMBOL(rdma_node_get_transport);
> -
> enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_num)
> {
> if (device->get_link_layer)
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 0c0a4f0..f2ea6e7 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -84,9 +84,6 @@ enum rdma_transport_type {
> RDMA_TRANSPORT_IBOE,
> };
>
> -__attribute_const__ enum rdma_transport_type
> -rdma_node_get_transport(enum rdma_node_type node_type);
> -
> enum rdma_link_layer {
> IB_LINK_LAYER_UNSPECIFIED,
> IB_LINK_LAYER_INFINIBAND,
> --
> 2.1.0

2015-04-22 00:29:07

by Ira Weiny

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Mon, Apr 20, 2015 at 10:28:57AM +0200, Michael Wang wrote:
>
> Since v4:
> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> Roland, Ira and Steve :-) Please remind me if anything missed :-P
> * Fix logical issue inside 3#, 14#
> * Refine 3#, 4#, 5# with label 'free'
> * Rework 10# to stop using port 1 when port already assigned
>
> There are plenty of lengthy code to check the transport type of IB device,
> or the link layer type of it's port, but actually we are just speculating
> whether a particular management/feature is supported by the device/port.
>
> Thus instead of inferring, we should have our own mechanism for IB management
> capability/protocol/feature checking, several proposals below.
>
> This patch set will reform the method of getting transport type, we will
> now using query_transport() instead of inferring from transport and link
> layer respectively, also we defined the new transport type to make the
> concept more reasonable.
>
> Mapping List:
> node-type link-layer old-transport new-transport
> nes RNIC ETH IWARP IWARP
> amso1100 RNIC ETH IWARP IWARP
> cxgb3 RNIC ETH IWARP IWARP
> cxgb4 RNIC ETH IWARP IWARP
> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> ocrdma IB_CA ETH IB IBOE
> mlx4 IB_CA IB/ETH IB IB/IBOE
> mlx5 IB_CA IB IB IB
> ehca IB_CA IB IB IB
> ipath IB_CA IB IB IB
> mthca IB_CA IB IB IB
> qib IB_CA IB IB IB
>
> For example:
> if (transport == IB) && (link-layer == ETH)
> will now become:
> if (query_transport() == IBOE)
>
> Thus we will be able to get rid of the respective transport and link-layer
> checking, and it will help us to add new protocol/Technology (like OPA) more
> easier, also with the introduced management helpers, IB management logical
> will be more clear and easier for extending.
>
> Highlights:
> The patch set covered a wide range of IB stuff, thus for those who are
> familiar with the particular part, your suggestion would be invaluable ;-)
>
> Patch 1#~15# included all the logical reform, 16#~25# introduced the
> management helpers, 26#~27# do clean up.
>
> Patches haven't been tested yet, we appreciate if any one who have these
> HW willing to provide his Tested-by :-)
>
> Doug suggested the bitmask mechanism:
> https://www.mail-archive.com/[email protected]/msg23765.html
> which could be the plan for future reforming, we prefer that to be another
> series which focus on semantic and performance.
>
> This patch-set is somewhat 'bloated' now and it may be a good timing for
> staging, I'd like to suggest we focus on improving existed helpers and push
> all the further reforms into next series ;-)
>

Series tested for IPoIB and MAD functionality on qib and mlx4 hardware.

Tested-by: Ira Weiny <[email protected]>

>
> Proposals:
> Sean:
> https://www.mail-archive.com/[email protected]/msg23339.html
> Doug:
> https://www.mail-archive.com/[email protected]/msg23418.html
> https://www.mail-archive.com/[email protected]/msg23765.html
> Jason:
> https://www.mail-archive.com/[email protected]/msg23425.html
>
> Michael Wang (27):
> IB/Verbs: Implement new callback query_transport()
> IB/Verbs: Implement raw management helpers
> IB/Verbs: Reform IB-core mad/agent/user_mad
> IB/Verbs: Reform IB-core cm
> IB/Verbs: Reform IB-core sa_query
> IB/Verbs: Reform IB-core multicast
> IB/Verbs: Reform IB-ulp ipoib
> IB/Verbs: Reform IB-ulp xprtrdma
> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> IB/Verbs: Reform cm related part in IB-core cma/ucm
> IB/Verbs: Reform route related part in IB-core cma
> IB/Verbs: Reform mcast related part in IB-core cma
> IB/Verbs: Reserve legacy transport type in 'dev_addr'
> IB/Verbs: Reform cma_acquire_dev()
> IB/Verbs: Reform rest part in IB-core cma
> IB/Verbs: Use management helper cap_ib_mad()
> IB/Verbs: Use management helper cap_ib_smi()
> IB/Verbs: Use management helper cap_ib_cm()
> IB/Verbs: Use management helper cap_iw_cm()
> IB/Verbs: Use management helper cap_ib_sa()
> IB/Verbs: Use management helper cap_ib_mcast()
> IB/Verbs: Use management helper cap_ipoib()
> IB/Verbs: Use management helper cap_read_multi_sge()
> IB/Verbs: Use management helper cap_af_ib()
> IB/Verbs: Use management helper cap_eth_ah()
> IB/Verbs: Clean up rdma_ib_or_iboe()
> IB/Verbs: Cleanup rdma_node_get_transport()
>
> ---
> drivers/infiniband/core/agent.c | 4
> drivers/infiniband/core/cm.c | 26 +-
> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
> drivers/infiniband/core/device.c | 1
> drivers/infiniband/core/mad.c | 51 ++--
> drivers/infiniband/core/multicast.c | 18 -
> drivers/infiniband/core/sa_query.c | 41 +--
> drivers/infiniband/core/sysfs.c | 8
> drivers/infiniband/core/ucm.c | 5
> drivers/infiniband/core/ucma.c | 27 --
> drivers/infiniband/core/user_mad.c | 32 +-
> drivers/infiniband/core/uverbs_cmd.c | 6
> drivers/infiniband/core/verbs.c | 33 --
> drivers/infiniband/hw/amso1100/c2_provider.c | 7
> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
> drivers/infiniband/hw/cxgb4/provider.c | 7
> drivers/infiniband/hw/ehca/ehca_hca.c | 6
> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
> drivers/infiniband/hw/ehca/ehca_main.c | 1
> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
> drivers/infiniband/hw/mlx4/main.c | 10
> drivers/infiniband/hw/mlx5/main.c | 7
> drivers/infiniband/hw/mthca/mthca_provider.c | 7
> drivers/infiniband/hw/nes/nes_verbs.c | 6
> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
> drivers/infiniband/hw/qib/qib_verbs.c | 7
> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
> include/rdma/ib_verbs.h | 204 +++++++++++++++-
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-22 02:41:43

by Ira Weiny

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Tue, Apr 21, 2015 at 11:36:40PM +0000, Liran Liss wrote:
> Hi Michael,
>
> The spirit of this patch-set is great, but I think that we need to clarify some concepts.
> Since this will affect the whole patch-set, I am laying out my concerns here instead.
>
> A suggestion for the resulting management helpers is given below.
> I believe the result would be much more coherent.
> --Liran
>
> In general
> ========
>
> An ib_dev (or a port of) should be distinguished by 3 qualifiers:
> - The link layer:
> -- Ethernet (shared by iWARP, USNIC, and ROCE)
> -- Infiniband
>
> - The transport (*)
> -- IBTA transport (shared by IB and ROCE)
> -- iWARP transport
> -- USNIC transport
>
> (*) Transport means both:
> - The L4 wire protocols (e.g., BTH+ headers of IBTA, optionally encapsulated by UDP in ROCEv2, or the iWARP stack)
> - The transport semantics (for example, there are slight semantic differences between IBTA and iWARP)
>
> - The node type (**)
> -- CA
> -- Switch
> -- Router
>
> (**) This has been extended to also encode the transport in the current code.
> At least for user-space visible APIs, we might chose to leave this for backward compatibility, but we can consider cleaning up the kernel code.
>
> So, I think that our "old-transport" below is just fine.
> No need to change it (and you aren't, since it is currently implemented as a function).

I think there is a need to change this. Encoding the transport into the node
type is not a good idea. Having different "transport semantics" while still
returning the same transport for the port is confusing.

The only thing which is clear currently is Link Layer.

But the use of "Link Layer" in the code is so convoluted that it is very
confusing.

>
> The "new-transport" does not really exist, but is broken into several capability checks of the L4 transport, optionally with conditions on the link type.
> I would remove the table below and tell what we really want to achieve:
> ==> move technology-specific feature-check logic out of the (multiple!) IB code components and various ULPs into per-feature helpers.
>
>
> Detailed remarks
> ==============
>
> 1) The introduction of cap_*_*() stuff should have been introduced directly in patch 02/27.
> This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and increases the number of patches in the patch-set.
> Do this and remove patches 16-24.

I think this is a result of the back and forth which has gone on. Some
squashing could be done, but the current series is pretty straightforward when
you look at the patches. Most are less than a page long at this point.

>
> 2)The name rdma_tech_* is lame.
> rdma_transport_*(), adhering to the above (*) remark, is much better.
> For example, both IB and ROCE *do* use the same transport.

Define Transport? There has been a lot of discussion over what a transport is
in Verbs.

>
> 3) The name cap_* as it is used above is not accurate.
> You use it to describe technology characteristics rather than extendable capabilities.
> I would suggest having a single convention for all helpers, such as rdma_has_*() and rdma_is_*().
> For example: cap_ib_smi() ==> rdma_has_smi().

rdma_has_smi is not sufficient for the new OPA technology. We discussed many
different names and cap_* was settled on. The use of cap_ib_* was for things
which are specific to IB Ports.

Frankly, when RoCE was added, functions like this should have been added for
clarity into the CM and multicast code. But at the time a simple "is link layer"
check was sufficient. Now we have so many different devices and layers that
this clean up is needed to support the future.

>
> 4) Remove all capabilities that do not introduce any distinction in the current code.
> We can add them as needed later.
> This means remove patches:
> - [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all IB devices support ipoib

Ah? What is the point of supporting IPoIB on RoCE? What do you mean by "IB
device"?

> - [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all IB devices support AF_IB.

But not a generic RDMA device... Which is what is being queried in the call.

>
> On the other hand:
> - rdma_has_multicast() makes sense, since iWARP doesn’t support it.
> - cap_ib_sa() might make sense to cut code even further in the CMA, since RoCE has a GSI but no SA.

As does cap_ib_mad, cap_ib_cm, etc.

>
> 5) Do no modify phys_state_show() in [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> It *is* the link layer!

I agree with this. When the Link Layer is directly being requested we should
report the link layer. However, the internal uses of Link Layer should be
minimal if not 0.

>
> 6) Remove cap_read_multi_sge
> It is not device/port feature, but a transport capability.
> Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in 'enum ib_device_cap_flags'.

This was already debated and we settled on cap_read_multi_sge. Checking the
transport does not allow for other verbs devices to support this unless they set
that same transport.

>
> 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper cap_eth_ah().
> Address handles that refer to Ethernet links always have Ethernet addressing.

But how does the upper level code _know_ that? That is the point of
cap_eth_ah.
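
To make the point concrete, here is a minimal user-space sketch of a caller
that asks the capability question directly. The enum and struct are simplified
stand-ins for the kernel definitions, uint8_t replaces u8, and the demo_*
names are hypothetical; only the cap_eth_ah() body and the 0xffff pkey choice
follow the hunks quoted earlier in the thread.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel definitions, for illustration only. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

struct ib_device {
	enum rdma_transport_type (*query_transport)(struct ib_device *device,
						    uint8_t port_num);
};

static inline int rdma_tech_iboe(struct ib_device *device, uint8_t port_num)
{
	return device->query_transport(device, port_num) == RDMA_TRANSPORT_IBOE;
}

/* Same body as the helper added by patch 25. */
static inline int cap_eth_ah(struct ib_device *device, uint8_t port_num)
{
	return rdma_tech_iboe(device, port_num);
}

/* Hypothetical caller, modelled on the cma_ib_init_qp_attr() hunk. */
static inline uint16_t demo_pick_pkey(struct ib_device *device,
				      uint8_t port_num, uint16_t addr_pkey)
{
	if (cap_eth_ah(device, port_num))
		return 0xffff;	/* value used for Ethernet AH ports in the hunk */
	return addr_pkey;	/* otherwise keep the pkey from the address */
}

/* Mock query_transport() callbacks standing in for real drivers. */
static inline enum rdma_transport_type demo_query_ib(struct ib_device *device,
						     uint8_t port_num)
{
	(void)device;
	(void)port_num;
	return RDMA_TRANSPORT_IB;
}

static inline enum rdma_transport_type demo_query_iboe(struct ib_device *device,
						       uint8_t port_num)
{
	(void)device;
	(void)port_num;
	return RDMA_TRANSPORT_IBOE;
}
```

The caller never names a transport or a link layer, so if another technology
later needs Ethernet address handles, only the body of cap_eth_ah() has to
change.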

>
> In the CMA code, using rdma_tech_iboe() is just fine. This is how you define cap_eth_ah() anyway.
> Currently, this patch just adds clutter.
>
> 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
> We do need a transport qualifier, as exemplified in comment 5) above, and for a complete clean model.

I'm confused, comment 5 was talking about Link Layer???

> This is after renaming the function to rdma_is_ib_transport()...
>
>
> Putting it all together
> ==================
>
> We are left with the following helpers:
> - rdma_is_ib_transport()
> - rdma_is_iwarp_transport()
> - rdma_is_usnic_transport()
> - rdma_is_iboe()
> - rdma_has_mad()

Not sufficient to distinguish OPA MADs from IB

> - rdma_has_smi()
> - rdma_has_gsi() - complements smi; can be used by the mad code for clarity
> - rdma_has_sa()
> - rdma_has_cm()

Not sufficient to distinguish between the IB CM and iWarp "CM".

> - rdma_has_mcast()

Not sufficient to distinguish between the IB Multicast vs IBoE Multicast.
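
As a rough illustration of the CM remark above, the sketch below shows why a
single rdma_has_cm() answer would not be enough for a caller that has to pick
between two different connection managers. The types are simplified
user-space stand-ins, the demo_* names are hypothetical, and the body of
cap_iw_cm() is an assumption (it is not quoted in this part of the thread);
cap_ib_cm() follows the patch 26 hunk.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel definitions, for illustration only. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

struct ib_device {
	enum rdma_transport_type (*query_transport)(struct ib_device *device,
						    uint8_t port_num);
	enum rdma_transport_type demo_transport;	/* mock driver state */
};

/* Mock driver callback: report whatever the test configured. */
static inline enum rdma_transport_type
demo_query_transport(struct ib_device *device, uint8_t port_num)
{
	(void)port_num;
	return device->demo_transport;
}

/* Same body as cap_ib_cm() after patch 26. */
static inline int cap_ib_cm(struct ib_device *device, uint8_t port_num)
{
	enum rdma_transport_type tp = device->query_transport(device, port_num);

	return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

/* Assumed body: true only for iWARP ports. */
static inline int cap_iw_cm(struct ib_device *device, uint8_t port_num)
{
	return device->query_transport(device, port_num) == RDMA_TRANSPORT_IWARP;
}

enum demo_cm_kind { DEMO_CM_NONE, DEMO_CM_IB, DEMO_CM_IW };

/* Hypothetical dispatcher: which connection manager serves this port? */
static inline enum demo_cm_kind demo_select_cm(struct ib_device *device,
					       uint8_t port_num)
{
	if (cap_ib_cm(device, port_num))
		return DEMO_CM_IB;
	if (cap_iw_cm(device, port_num))
		return DEMO_CM_IW;
	return DEMO_CM_NONE;	/* e.g. usnic: neither CM applies */
}
```

With one combined "has CM" helper, the first two branches could not be told
apart without going back to a transport comparison at the call site.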


In general I'm flexible on the function names. "cap" vs "rdma" does not really
matter to me. Likewise "has" vs "requires" vs "uses" does not matter.

Regardless we still need more granularity than "Transport" and "Link Layer" for many
of the code choices.

The result of this series is pretty explicit and much cleaner as to what the
upper layers are really checking for.

Furthermore we know that the implementations are going to change going forward.
The point of this series is to decouple the MAD, CM, SA, IPoIB, etc modules
from the knowledge of transport and link layer. The new interface is simply
using the old implementation as a stepping stone.
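
A small user-space sketch of the "stepping stone" idea: the same question can
be answered today from query_transport() and later from a per-device bitmask,
without touching callers. The bitmask layout, the demo_* names and the
per-device (rather than per-port) storage are all assumptions made for the
example; only the transport-based body matches cap_ib_mad() as of patch 26.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel definitions, for illustration only. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

/* Hypothetical capability bit; no bitmask layout is defined in this thread. */
#define DEMO_CAP_IB_MAD	(1u << 0)

struct ib_device {
	enum rdma_transport_type (*query_transport)(struct ib_device *device,
						    uint8_t port_num);
	enum rdma_transport_type demo_transport;	/* mock driver state */
	uint32_t demo_caps;				/* hypothetical bitmask */
};

static inline enum rdma_transport_type
demo_query_transport(struct ib_device *device, uint8_t port_num)
{
	(void)port_num;
	return device->demo_transport;
}

/* Current form: same body as cap_ib_mad() after patch 26. */
static inline int cap_ib_mad_transport(struct ib_device *device,
				       uint8_t port_num)
{
	enum rdma_transport_type tp = device->query_transport(device, port_num);

	return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

/* Possible later form: test a bit the driver filled in (assumed design). */
static inline int cap_ib_mad_bitmask(struct ib_device *device, uint8_t port_num)
{
	(void)port_num;
	return !!(device->demo_caps & DEMO_CAP_IB_MAD);
}

/* Hypothetical driver-side setup that derives the bit from the transport. */
static inline void demo_init_device(struct ib_device *device,
				    enum rdma_transport_type tp)
{
	device->query_transport = demo_query_transport;
	device->demo_transport = tp;
	device->demo_caps = (tp == RDMA_TRANSPORT_IB ||
			     tp == RDMA_TRANSPORT_IBOE) ? DEMO_CAP_IB_MAD : 0;
}

/* Returns 1 when both implementations give a caller the same answer. */
static inline int demo_impls_agree(enum rdma_transport_type tp)
{
	struct ib_device dev;

	demo_init_device(&dev, tp);
	return cap_ib_mad_transport(&dev, 1) == cap_ib_mad_bitmask(&dev, 1);
}
```

Because callers only see the helper name, swapping one body for the other is
a change local to the header.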

Ira

>
>
> > Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
> >
> >
> > Since v4:
> > * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> > Roland, Ira and Steve :-) Please remind me if anything missed :-P
> > * Fix logical issue inside 3#, 14#
> > * Refine 3#, 4#, 5# with label 'free'
> > * Rework 10# to stop using port 1 when port already assigned
> >
> > There are plenty of lengthy code to check the transport type of IB device, or
> > the link layer type of it's port, but actually we are just speculating whether a
> > particular management/feature is supported by the device/port.
> >
> > Thus instead of inferring, we should have our own mechanism for IB
> > management capability/protocol/feature checking, several proposals below.
> >
> > This patch set will reform the method of getting transport type, we will now
> > using query_transport() instead of inferring from transport and link layer
> > respectively, also we defined the new transport type to make the concept
> > more reasonable.
> >
> > Mapping List:
> > node-type link-layer old-transport new-transport
> > nes RNIC ETH IWARP IWARP
> > amso1100 RNIC ETH IWARP IWARP
> > cxgb3 RNIC ETH IWARP IWARP
> > cxgb4 RNIC ETH IWARP IWARP
> > usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> > ocrdma IB_CA ETH IB IBOE
> > mlx4 IB_CA IB/ETH IB IB/IBOE
> > mlx5 IB_CA IB IB IB
> > ehca IB_CA IB IB IB
> > ipath IB_CA IB IB IB
> > mthca IB_CA IB IB IB
> > qib IB_CA IB IB IB
> >
> > For example:
> > if (transport == IB) && (link-layer == ETH) will now become:
> > if (query_transport() == IBOE)
> >
> > Thus we will be able to get rid of the respective transport and link-layer
> > checking, and it will help us to add new protocol/Technology (like OPA) more
> > easier, also with the introduced management helpers, IB management logical
> > will be more clear and easier for extending.
> >
> > Highlights:
> > The patch set covered a wide range of IB stuff, thus for those who are
> > familiar with the particular part, your suggestion would be invaluable ;-)
> >
> > Patch 1#~15# included all the logical reform, 16#~25# introduced the
> > management helpers, 26#~27# do clean up.
> >
> > Patches haven't been tested yet, we appreciate if any one who have these
> > HW willing to provide his Tested-by :-)
> >
> > Doug suggested the bitmask mechanism:
> > https://www.mail-archive.com/linux-
> > [email protected]/msg23765.html
> > which could be the plan for future reforming, we prefer that to be another
> > series which focus on semantic and performance.
> >
> > This patch-set is somewhat 'bloated' now and it may be a good timing for
> > staging, I'd like to suggest we focus on improving existed helpers and push
> > all the further reforms into next series ;-)
> >
> > Proposals:
> > Sean:
> > https://www.mail-archive.com/linux-
> > [email protected]/msg23339.html
> > Doug:
> > https://www.mail-archive.com/linux-
> > [email protected]/msg23418.html
> > https://www.mail-archive.com/linux-
> > [email protected]/msg23765.html
> > Jason:
> > https://www.mail-archive.com/linux-
> > [email protected]/msg23425.html
> >
> > Michael Wang (27):
> > IB/Verbs: Implement new callback query_transport()
> > IB/Verbs: Implement raw management helpers
> > IB/Verbs: Reform IB-core mad/agent/user_mad
> > IB/Verbs: Reform IB-core cm
> > IB/Verbs: Reform IB-core sa_query
> > IB/Verbs: Reform IB-core multicast
> > IB/Verbs: Reform IB-ulp ipoib
> > IB/Verbs: Reform IB-ulp xprtrdma
> > IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> > IB/Verbs: Reform cm related part in IB-core cma/ucm
> > IB/Verbs: Reform route related part in IB-core cma
> > IB/Verbs: Reform mcast related part in IB-core cma
> > IB/Verbs: Reserve legacy transport type in 'dev_addr'
> > IB/Verbs: Reform cma_acquire_dev()
> > IB/Verbs: Reform rest part in IB-core cma
> > IB/Verbs: Use management helper cap_ib_mad()
> > IB/Verbs: Use management helper cap_ib_smi()
> > IB/Verbs: Use management helper cap_ib_cm()
> > IB/Verbs: Use management helper cap_iw_cm()
> > IB/Verbs: Use management helper cap_ib_sa()
> > IB/Verbs: Use management helper cap_ib_mcast()
> > IB/Verbs: Use management helper cap_ipoib()
> > IB/Verbs: Use management helper cap_read_multi_sge()
> > IB/Verbs: Use management helper cap_af_ib()
> > IB/Verbs: Use management helper cap_eth_ah()
> > IB/Verbs: Clean up rdma_ib_or_iboe()
> > IB/Verbs: Cleanup rdma_node_get_transport()
> >
> > ---
> > drivers/infiniband/core/agent.c | 4
> > drivers/infiniband/core/cm.c | 26 +-
> > drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
> > drivers/infiniband/core/device.c | 1
> > drivers/infiniband/core/mad.c | 51 ++--
> > drivers/infiniband/core/multicast.c | 18 -
> > drivers/infiniband/core/sa_query.c | 41 +--
> > drivers/infiniband/core/sysfs.c | 8
> > drivers/infiniband/core/ucm.c | 5
> > drivers/infiniband/core/ucma.c | 27 --
> > drivers/infiniband/core/user_mad.c | 32 +-
> > drivers/infiniband/core/uverbs_cmd.c | 6
> > drivers/infiniband/core/verbs.c | 33 --
> > drivers/infiniband/hw/amso1100/c2_provider.c | 7
> > drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
> > drivers/infiniband/hw/cxgb4/provider.c | 7
> > drivers/infiniband/hw/ehca/ehca_hca.c | 6
> > drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
> > drivers/infiniband/hw/ehca/ehca_main.c | 1
> > drivers/infiniband/hw/ipath/ipath_verbs.c | 7
> > drivers/infiniband/hw/mlx4/main.c | 10
> > drivers/infiniband/hw/mlx5/main.c | 7
> > drivers/infiniband/hw/mthca/mthca_provider.c | 7
> > drivers/infiniband/hw/nes/nes_verbs.c | 6
> > drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
> > drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
> > drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
> > drivers/infiniband/hw/qib/qib_verbs.c | 7
> > drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
> > drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
> > drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
> > drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
> > include/rdma/ib_verbs.h | 204 +++++++++++++++-
> > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
> > net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
> > 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-22 05:41:11

by Jason Gunthorpe

Subject: Re: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()

On Mon, Apr 20, 2015 at 10:41:38AM +0200, Michael Wang wrote:

> Introduce helper cap_ipoib() to help us check if the port of an
> IB device support IP over Infiniband.

I thought we were dropping this in favor of listing the actual
features the ULP required unconditionally? One of my messages had the
start of a list..

Jason

2015-04-22 07:39:12

by Michael Wang

Subject: Re: [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs

Hi, Ira

Thanks for the review :-)

On 04/22/2015 01:19 AM, ira.weiny wrote:
[snip]
>> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
>> index cbd0383..8570180 100644
>> --- a/drivers/infiniband/core/sysfs.c
>> +++ b/drivers/infiniband/core/sysfs.c
>> @@ -248,14 +248,10 @@ static ssize_t phys_state_show(struct ib_port *p, struct port_attribute *unused,
>> static ssize_t link_layer_show(struct ib_port *p, struct port_attribute *unused,
>> char *buf)
>> {
>> - switch (rdma_port_get_link_layer(p->ibdev, p->port_num)) {
>> - case IB_LINK_LAYER_INFINIBAND:
>> + if (rdma_tech_ib(p->ibdev, p->port_num))
>
> Is the final intention to remove Link Layer from the rdma stack entirely?
>
> I know that the use of link layer in userspace is just as convoluted as what we
> are trying to fix here in the kernel. So it would be nice if we can eventually
> get user space cleaned up to not use link layer as it currently does.
>
> However, standard networking tools can report the link layer. So while the
> current use of "link layer" via userspace software is wrong I don't think it is
> wrong to report this information _to_ userspace.
>
> So unless we intend to completely hide the link layer from userspace I don't
> think we should be removing the rdma_port_get_link_layer call. It is still
> valid information even though we don't want to use it in most places.

This series won't remove rdma_port_get_link_layer(), although
currently only mlx4 still uses it in the kernel...

link_layer_show() is supposed to report the same info to user
space as before, so user tools don't have to change anything :-)
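
A minimal user-space sketch of the mapping the reworked link_layer_show()
relies on, assuming every port is one of the transports in the Mapping List
from the cover letter. The demo_* names are hypothetical, and snprintf() is
used in place of the kernel's sprintf() on a sysfs buffer.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for the kernel enum, for illustration only. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

/* IB transport reports "InfiniBand"; everything else reports "Ethernet". */
static inline int demo_link_layer_show(enum rdma_transport_type tp,
				       char *buf, size_t len)
{
	if (tp == RDMA_TRANSPORT_IB)
		return snprintf(buf, len, "%s\n", "InfiniBand");
	return snprintf(buf, len, "%s\n", "Ethernet");
}

/* Returns 1 when the string produced for tp equals expected. */
static inline int demo_reports(enum rdma_transport_type tp,
			       const char *expected)
{
	char buf[32];

	demo_link_layer_show(tp, buf, sizeof(buf));
	return strcmp(buf, expected) == 0;
}
```

One difference from the old switch is visible here: there is no "Unknown"
branch any more, so the output only matches the old one for ports whose link
layer was InfiniBand or Ethernet to begin with.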

Regards,
Michael Wang

>
> Ira
>
>> return sprintf(buf, "%s\n", "InfiniBand");
>> - case IB_LINK_LAYER_ETHERNET:
>> + else
>> return sprintf(buf, "%s\n", "Ethernet");
>> - default:
>> - return sprintf(buf, "%s\n", "Unknown");
>> - }
>> }
>>
>> static PORT_ATTR_RO(state);
>> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
>> index a9f0489..5dc90aa 100644
>> --- a/drivers/infiniband/core/uverbs_cmd.c
>> +++ b/drivers/infiniband/core/uverbs_cmd.c
>> @@ -515,8 +515,10 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
>> resp.active_width = attr.active_width;
>> resp.active_speed = attr.active_speed;
>> resp.phys_state = attr.phys_state;
>> - resp.link_layer = rdma_port_get_link_layer(file->device->ib_dev,
>> - cmd.port_num);
>> + resp.link_layer = rdma_tech_ib(file->device->ib_dev,
>> + cmd.port_num) ?
>> + IB_LINK_LAYER_INFINIBAND :
>> + IB_LINK_LAYER_ETHERNET;
>>
>> if (copy_to_user((void __user *) (unsigned long) cmd.response,
>> &resp, sizeof resp))
>> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
>> index 626c9cf..7264860 100644
>> --- a/drivers/infiniband/core/verbs.c
>> +++ b/drivers/infiniband/core/verbs.c
>> @@ -200,11 +200,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
>> u32 flow_class;
>> u16 gid_index;
>> int ret;
>> - int is_eth = (rdma_port_get_link_layer(device, port_num) ==
>> - IB_LINK_LAYER_ETHERNET);
>>
>> memset(ah_attr, 0, sizeof *ah_attr);
>> - if (is_eth) {
>> + if (rdma_tech_iboe(device, port_num)) {
>> if (!(wc->wc_flags & IB_WC_GRH))
>> return -EPROTOTYPE;
>>
>> @@ -873,7 +871,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
>> union ib_gid sgid;
>>
>> if ((*qp_attr_mask & IB_QP_AV) &&
>> - (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == IB_LINK_LAYER_ETHERNET)) {
>> + (rdma_tech_iboe(qp->device, qp_attr->ah_attr.port_num))) {
>> ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
>> qp_attr->ah_attr.grh.sgid_index, &sgid);
>> if (ret)
>> --
>> 2.1.0

2015-04-22 07:45:06

by Michael Wang

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers



On 04/22/2015 02:28 AM, ira.weiny wrote:
[snip]
>>
>> Highlights:
>> The patch set covered a wide range of IB stuff, thus for those who are
>> familiar with the particular part, your suggestion would be invaluable ;-)
>>
>> Patch 1#~15# included all the logical reform, 16#~25# introduced the
>> management helpers, 26#~27# do clean up.
>>
>> Patches haven't been tested yet, we appreciate if any one who have these
>> HW willing to provide his Tested-by :-)
>>
>> Doug suggested the bitmask mechanism:
>> https://www.mail-archive.com/[email protected]/msg23765.html
>> which could be the plan for future reforming, we prefer that to be another
>> series which focus on semantic and performance.
>>
>> This patch-set is somewhat 'bloated' now and it may be a good timing for
>> staging, I'd like to suggest we focus on improving existed helpers and push
>> all the further reforms into next series ;-)
>>
>
> Series tested for IPoIB and MAD functionality on qib and mlx4 hardware.
>
> Tested-by: Ira Weiny <[email protected]>

Thanks for the testing :-)

Regards,
Michael Wang

>
>>
>> Proposals:
>> Sean:
>> https://www.mail-archive.com/[email protected]/msg23339.html
>> Doug:
>> https://www.mail-archive.com/[email protected]/msg23418.html
>> https://www.mail-archive.com/[email protected]/msg23765.html
>> Jason:
>> https://www.mail-archive.com/[email protected]/msg23425.html
>>
>> Michael Wang (27):
>> IB/Verbs: Implement new callback query_transport()
>> IB/Verbs: Implement raw management helpers
>> IB/Verbs: Reform IB-core mad/agent/user_mad
>> IB/Verbs: Reform IB-core cm
>> IB/Verbs: Reform IB-core sa_query
>> IB/Verbs: Reform IB-core multicast
>> IB/Verbs: Reform IB-ulp ipoib
>> IB/Verbs: Reform IB-ulp xprtrdma
>> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
>> IB/Verbs: Reform cm related part in IB-core cma/ucm
>> IB/Verbs: Reform route related part in IB-core cma
>> IB/Verbs: Reform mcast related part in IB-core cma
>> IB/Verbs: Reserve legacy transport type in 'dev_addr'
>> IB/Verbs: Reform cma_acquire_dev()
>> IB/Verbs: Reform rest part in IB-core cma
>> IB/Verbs: Use management helper cap_ib_mad()
>> IB/Verbs: Use management helper cap_ib_smi()
>> IB/Verbs: Use management helper cap_ib_cm()
>> IB/Verbs: Use management helper cap_iw_cm()
>> IB/Verbs: Use management helper cap_ib_sa()
>> IB/Verbs: Use management helper cap_ib_mcast()
>> IB/Verbs: Use management helper cap_ipoib()
>> IB/Verbs: Use management helper cap_read_multi_sge()
>> IB/Verbs: Use management helper cap_af_ib()
>> IB/Verbs: Use management helper cap_eth_ah()
>> IB/Verbs: Clean up rdma_ib_or_iboe()
>> IB/Verbs: Cleanup rdma_node_get_transport()
>>
>> ---
>> drivers/infiniband/core/agent.c | 4
>> drivers/infiniband/core/cm.c | 26 +-
>> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
>> drivers/infiniband/core/device.c | 1
>> drivers/infiniband/core/mad.c | 51 ++--
>> drivers/infiniband/core/multicast.c | 18 -
>> drivers/infiniband/core/sa_query.c | 41 +--
>> drivers/infiniband/core/sysfs.c | 8
>> drivers/infiniband/core/ucm.c | 5
>> drivers/infiniband/core/ucma.c | 27 --
>> drivers/infiniband/core/user_mad.c | 32 +-
>> drivers/infiniband/core/uverbs_cmd.c | 6
>> drivers/infiniband/core/verbs.c | 33 --
>> drivers/infiniband/hw/amso1100/c2_provider.c | 7
>> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
>> drivers/infiniband/hw/cxgb4/provider.c | 7
>> drivers/infiniband/hw/ehca/ehca_hca.c | 6
>> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
>> drivers/infiniband/hw/ehca/ehca_main.c | 1
>> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
>> drivers/infiniband/hw/mlx4/main.c | 10
>> drivers/infiniband/hw/mlx5/main.c | 7
>> drivers/infiniband/hw/mthca/mthca_provider.c | 7
>> drivers/infiniband/hw/nes/nes_verbs.c | 6
>> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
>> drivers/infiniband/hw/qib/qib_verbs.c | 7
>> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
>> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
>> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
>> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
>> include/rdma/ib_verbs.h | 204 +++++++++++++++-
>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
>> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
>> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-22 08:30:22

by Michael Wang

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

Hi, Liran

Thanks for the comment :-)

On 04/22/2015 01:36 AM, Liran Liss wrote:
[snip]
>
> (**) This has been extended to also encode the transport in the current code.
> At least for user-space visible APIs, we might choose to leave this for backward compatibility, but we can consider cleaning up the kernel code.
>
> So, I think that our "old-transport" below is just fine.
> No need to change it (and you aren't, since it is currently implemented as a function).
>
> The "new-transport" does not really exist, but is broken into several capability checks of the L4 transport, optionally with conditions on the link type.
> I would remove the table below and tell what we really want to achieve:
> ==> move technology-specific feature-check logic out of the (multiple!) IB code components and various ULPs into per-feature helpers.

Our purpose is to help the core layer do management more clearly, rather than
inferring it from the transport and link layer.

IMHO, from management's point of view, what we really care about is whether
a particular kind of management is required by the device or not, rather than
the details of the transport and link layer.
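
To make the contrast concrete, here is a minimal user-space sketch (not code from the series). The enums, struct port_info and needs_eth_ah_inferred() are hypothetical stand-ins; query_transport() and cap_eth_ah() borrow their names from the series, but the signatures are simplified, and cap_eth_ah() is assumed to reduce to an IBOE check as noted elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum link_layer { LL_INFINIBAND, LL_ETHERNET };
enum old_transport { OLD_IB, OLD_IWARP, OLD_USNIC_UDP };
enum new_transport { NEW_IB, NEW_IBOE, NEW_IWARP, NEW_USNIC_UDP };

struct port_info {
	enum old_transport old_transport; /* what the node type used to imply */
	enum link_layer link_layer;
};

/* Map (old transport, link layer) to the per-port transport of the Mapping List. */
static enum new_transport query_transport(const struct port_info *p)
{
	switch (p->old_transport) {
	case OLD_IB:
		return p->link_layer == LL_ETHERNET ? NEW_IBOE : NEW_IB;
	case OLD_IWARP:
		return NEW_IWARP;
	default:
		assert(p->old_transport == OLD_USNIC_UDP);
		return NEW_USNIC_UDP;
	}
}

/* Old style: every caller infers the answer from transport and link layer. */
static bool needs_eth_ah_inferred(const struct port_info *p)
{
	return p->old_transport == OLD_IB && p->link_layer == LL_ETHERNET;
}

/* Helper style: the caller asks the management question directly. */
static bool cap_eth_ah(const struct port_info *p)
{
	return query_transport(p) == NEW_IBOE;
}
```

Both functions give the same answer today; the point of the helper is that callers stop encoding the transport/link-layer combination themselves, so the implementation behind it can change later.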

This new transport is currently only understood by the core layer; for the
user layer we still keep the old transport. The next step is to use a bitmask
instead of the transport; at that time we can erase the new transport and
leave the transport stuff to the user layer only :-)

>
>
> Detailed remarks
> ==============
>
> 1) The introduction of cap_*_*() stuff should have been introduced directly in patch 02/27.
> This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and increases the number of patches in the patch-set.
> Do this and remove patches 16-24.

We have had some discussion about compressing the patch set. Merging the reform patches with
the patches that introduce the helpers would mix the concepts (as in the earlier versions);
IMHO it would increase the difficulty of review...

And since much of the review has already been done, it's not wise to change the whole structure
of the patch set, IMHO...

>
> 2) The name rdma_tech_* is lame.
> rdma_transport_*(), adhering to the above (*) remark, is much better.
> For example, both IB and ROCE *do* use the same transport.

We have had some discussion on that too; using 'transport' means going back...

>
> 3) The name cap_* as it is used above is not accurate.
> You use it to describe technology characteristics rather than extendable capabilities.
> I would suggest having a single convention for all helpers, such as rdma_has_*() and rdma_is_*().
> For example: cap_ib_smi() ==> rdma_has_smi().

That means going back too...

>
> 4) Remove all capabilities that do not introduce any distinction in the current code.
> We can add them as needed later.
> This means remove patches:
> - [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all IB devices support ipoib
> - [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all IB devices support AF_IB.
>
> On the other hand:
> - rdma_has_multicast() makes sense, since iWARP doesn’t support it.
> - cap_ib_sa() might make sense to cut code even further in the CMA, since RoCE has a GSI but no SA.

We discussed how to define these helpers previously. Again, the name is not really
a problem; I would rather see such changes in a following series, after this
one is working stably :-)

>
> 5) Do not modify phys_state_show() in [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> It *is* the link layer!

Actually nothing changed after the modification; the previous purpose was to eliminate the link layer helpers.

But now we are not going to remove the helpers any more, so let's drop this modification in the next version :-)

>
> 6) Remove cap_read_multi_sge
> It is not device/port feature, but a transport capability.
> Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in 'enum ib_device_cap_flags'.
>
> 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper cap_eth_ah().
> Address handles that refer to Ethernet links always have Ethernet addressing.
>
> In the CMA code, using rdma_tech_iboe() is just fine. This is how you define cap_eth_ah() anyway.
> Currently, this patch just adds clutter.

There was also some discussion on these helpers; dropping them means going back...

The tech helper is not enough to explain the management purpose, and this can
be the wrapper for the bitmask stuff too.

>
> 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
> We do need a transport qualifier, as exemplified in comment 5) above, and for a complete clean model.
> This is after renaming the function to rdma_is_ib_transport()...

This means going back again... rdma_is_ib_transport() has been used previously.

This helper is just there to make the review easier; we won't need it internally,
especially after the bitmask is introduced :-)

>
>
> Putting it all together
> ==================
>
> We are left with the following helpers:
> - rdma_is_ib_transport()
> - rdma_is_iwarp_transport()
> - rdma_is_usnic_transport()
> - rdma_is_iboe()
> - rdma_has_mad()
> - rdma_has_smi()
> - rdma_has_gsi() - complements smi; can be used by the mad code for clarity
> - rdma_has_sa()
> - rdma_has_cm()
> - rdma_has_mcast()

I think we can defer the discussion on names and new helpers to the future; for now
let's focus on these basic reforms and get them working stably ;-)

Regards,
Michael Wang


2015-04-22 08:32:44

by Michael Wang

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers



On 04/22/2015 04:41 AM, ira.weiny wrote:
[snip]
>
>>
>> 5) Do not modify phys_state_show() in [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
>> It *is* the link layer!
>
> I agree with this. When the Link Layer is directly being requested we should
> report the link layer. However, the internal uses of Link Layer should be
> minimal if not 0.

The reform of link_layer_show() and ib_uverbs_query_port() will be
dropped in the next version :-)

Regards,
Michael Wang

>
>>
>> 6) Remove cap_read_multi_sge
>> It is not device/port feature, but a transport capability.
>> Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in 'enum ib_device_cap_flags'.
>
> This was already debated and we settled on cap_read_multi_sge. Checking the
> transport does not allow for other verbs devices to support this unless they set
> that same transport.
>
>>
>> 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper cap_eth_ah().
>> Address handles that refer to Ethernet links always have Ethernet addressing.
>
> But how does the upper level code _know_ that? That is the point of
> cap_eth_ah.
>
>>
>> In the CMA code, using rdma_tech_iboe() is just fine. This is how you define cap_eth_ah() anyway.
>> Currently, this patch just adds clutter.
>>
>> 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
>> We do need a transport qualifier, as exemplified in comment 5) above, and for a complete clean model.
>
> I'm confused, comment 5 was talking about Link Layer???
>
>> This is after renaming the function to rdma_is_ib_transport()...
>>
>>
>> Putting it all together
>> ==================
>>
>> We are left with the following helpers:
>> - rdma_is_ib_transport()
>> - rdma_is_iwarp_transport()
>> - rdma_is_usnic_transport()
>> - rdma_is_iboe()
>> - rdma_has_mad()
>
> Not sufficient to distinguish OPA MADs from IB
>
>> - rdma_has_smi()
>> - rdma_has_gsi() - complements smi; can be used by the mad code for clarity
>> - rdma_has_sa()
>> - rdma_has_cm()
>
> Not sufficient to distinguish between the IB CM and iWarp "CM".
>
>> - rdma_has_mcast()
>
> Not sufficient to distinguish between the IB Multicast vs IBoE Multicast.
>
>
> In general I'm flexible on the function names. "cap" vs "rdma" does not really
> matter to me. Likewise "has" vs "requires" vs "uses" does not matter.
>
> Regardless we still need more granularity than "Transport" and "Link Layer" for many
> of the code choices.
>
> The result of this series is pretty explicit and much cleaner as to what the
> upper layers are really checking for.
>
> Furthermore we know that the implementations are going to change going forward.
> The point of this series is to decouple the MAD, CM, SA, IPoIB, etc modules
> from the knowledge of transport and link layer. The new interface is simply
> using the old implementation as a stepping stone.
>
> Ira

2015-04-22 08:50:04

by Michael Wang

Subject: Re: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()


On 04/22/2015 07:40 AM, Jason Gunthorpe wrote:
> On Mon, Apr 20, 2015 at 10:41:38AM +0200, Michael Wang wrote:
>
>> Introduce helper cap_ipoib() to help us check if the port of an
>> IB device support IP over Infiniband.
>
> I thought we were dropping this in favor of listing the actual
> features the ULP required unconditionally? One of my messages had the
> start of a list..

Shall we drop it now, or wait until the mechanism is introduced?

I'm just wondering whether the requirements of a ULP could be similar to the
requirements of management. If the device can tell
which ULPs it supports, then maybe a cap_XX() makes sense here?
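
For reference, the alternative of having a ULP list the features it requires could look roughly like the sketch below. This is only an illustration under assumptions: the FEAT_* bits, ULP_IPOIB_REQUIRED and port_supports_ulp() are hypothetical names, and the particular set of features listed for an IPoIB-like ULP is a guess, not something settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port feature bits; the names are illustrative only. */
#define FEAT_IB_MAD   (1u << 0)
#define FEAT_IB_SA    (1u << 1)
#define FEAT_IB_MCAST (1u << 2)
#define FEAT_IB_CM    (1u << 3)

/* Assumed requirement list for an IPoIB-like ULP (not taken from the series). */
#define ULP_IPOIB_REQUIRED (FEAT_IB_SA | FEAT_IB_MCAST | FEAT_IB_CM)

/* A port is usable by the ULP only if every required feature bit is set. */
static bool port_supports_ulp(uint32_t port_features, uint32_t required)
{
	assert(required != 0); /* a ULP with no requirements is a caller bug */
	return (port_features & required) == required;
}
```

With this shape there is no per-ULP cap_XX() helper at all: the ULP states what it needs once, and the check is the same for every ULP.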

Regards,
Michael Wang

>
> Jason
>

2015-04-22 15:00:00

by Doug Ledford

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Tue, 2015-04-21 at 23:36 +0000, Liran Liss wrote:
> Hi Michael,
>
> The spirit of this patch-set is great, but I think that we need to clarify some concepts.
> Since this will affect the whole patch-set, I am laying out my concerns here instead.
>
> A suggestion for the resulting management helpers is given below.
> I believe the result would be much more coherent.
> --Liran
>
> In general
> ========
>
> An ib_dev (or a port of) should be distinguished by 3 qualifiers:
> - The link layer:
> -- Ethernet (shared by iWARP, USNIC, and ROCE)
> -- Infiniband
>
> - The transport (*)
> -- IBTA transport (shared by IB and ROCE)
> -- iWARP transport
> -- USNIC transport
>
> (*) Transport means both:
> - The L4 wire protocols (e.g., BTH+ headers of IBTA, optionally encapsulated by UDP in ROCEv2, or the iWARP stack)
> - The transport semantics (for example, there are slight semantic differences between IBTA and iWARP)
>
> - The node type (**)
> -- CA
> -- Switch
> -- Router
>
> (**) This has been extended to also encode the transport in the current code.
> At least for user-space visible APIs, we might choose to leave this for backward compatibility, but we can consider cleaning up the kernel code.
>
> So, I think that our "old-transport" below is just fine.
> No need to change it (and you aren't, since it is currently implemented as a function).
>
> The "new-transport" does not really exist, but is broken into several capability checks of the L4 transport, optionally with conditions on the link type.
> I would remove the table below and tell what we really want to achieve:
> ==> move technology-specific feature-check logic out of the (multiple!) IB code components and various ULPs into per-feature helpers.
>
>
> Detailed remarks
> ==============
>
> 1) The introduction of cap_*_*() stuff should have been introduced directly in patch 02/27.
> This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and increases the number of patches in the patch-set.
> Do this and remove patches 16-24.
>
> 2) The name rdma_tech_* is lame.
> rdma_transport_*(), adhering to the above (*) remark, is much better.
> For example, both IB and ROCE *do* use the same transport.

I especially want to second this. I haven't really been happy with the
rdma_tech_* names at all.


--
Doug Ledford <[email protected]>
GPG KeyID: 0E572FDD




2015-04-22 15:03:27

by Doug Ledford

Subject: Re: [PATCH v5 01/27] IB/Verbs: Implement new callback query_transport()

On Mon, 2015-04-20 at 10:32 +0200, Michael Wang wrote:
> Add new callback query_transport() and implement for each HW.

The more I think about it, the more I think we need to eliminate this
patch entirely.

The problem here is that, if we follow my suggestion, then we are going
to eliminate the query as an API function and replace the information it
gives us with a static port attribute bitmap. If we do this patch, then
reform this patch to my idea later, we introduce a very short lived
API/ABI change in the kernel module interface that serves absolutely no
purpose. Instead, let's do the bitmap creation first, update the
drivers to properly set the bitmap, then do all of the remaining reforms
you have here using that bitmap and completely skip the
query_transport() API item that will no longer serve a purpose.
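
A rough user-space sketch of what such a static per-port bitmap might look like, purely as an illustration: struct sketch_device, the PORT_ATTR_* bits, port_has() and setup_mixed_device() are hypothetical, and the attributes assigned to the two ports are assumptions rather than the real driver settings. cap_ib_smi() borrows its name from the series, but its body here is only a guess at how it could wrap the bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PORTS 2

/* Hypothetical attribute bits; not the names or values of any real kernel API. */
enum port_attr {
	PORT_ATTR_IB_MAD = 1u << 0,
	PORT_ATTR_IB_SMI = 1u << 1,
	PORT_ATTR_IB_CM  = 1u << 2,
	PORT_ATTR_IW_CM  = 1u << 3,
	PORT_ATTR_ETH_AH = 1u << 4,
};

struct sketch_device {
	/* Filled in once by the driver at registration, then read-only. */
	uint32_t port_attrs[MAX_PORTS];
};

/* Core-side check: a plain bit test, no driver callback involved. */
static bool port_has(const struct sketch_device *dev, unsigned int port_num,
		     uint32_t attrs)
{
	assert(port_num >= 1 && port_num <= MAX_PORTS); /* ports assumed 1-based */
	return (dev->port_attrs[port_num - 1] & attrs) == attrs;
}

/* A management helper becomes a thin wrapper over the bitmap. */
static bool cap_ib_smi(const struct sketch_device *dev, unsigned int port_num)
{
	return port_has(dev, port_num, PORT_ATTR_IB_SMI);
}

/* Driver-side setup for an assumed device with one IB port and one Ethernet port. */
static void setup_mixed_device(struct sketch_device *dev)
{
	dev->port_attrs[0] = PORT_ATTR_IB_MAD | PORT_ATTR_IB_SMI | PORT_ATTR_IB_CM;
	dev->port_attrs[1] = PORT_ATTR_IB_MAD | PORT_ATTR_IB_CM | PORT_ATTR_ETH_AH;
}
```

In this shape the drivers only fill in the bitmap, and the remaining reforms can be expressed as helpers over it without adding a new mandatory callback.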

> Mapping List:
> node-type link-layer old-transport new-transport
> nes RNIC ETH IWARP IWARP
> amso1100 RNIC ETH IWARP IWARP
> cxgb3 RNIC ETH IWARP IWARP
> cxgb4 RNIC ETH IWARP IWARP
> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> ocrdma IB_CA ETH IB IBOE
> mlx4 IB_CA IB/ETH IB IB/IBOE
> mlx5 IB_CA IB IB IB
> ehca IB_CA IB IB IB
> ipath IB_CA IB IB IB
> mthca IB_CA IB IB IB
> qib IB_CA IB IB IB
>
> Cc: Hal Rosenstock <[email protected]>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/device.c | 1 +
> drivers/infiniband/core/verbs.c | 4 +++-
> drivers/infiniband/hw/amso1100/c2_provider.c | 7 +++++++
> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7 +++++++
> drivers/infiniband/hw/cxgb4/provider.c | 7 +++++++
> drivers/infiniband/hw/ehca/ehca_hca.c | 6 ++++++
> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3 +++
> drivers/infiniband/hw/ehca/ehca_main.c | 1 +
> drivers/infiniband/hw/ipath/ipath_verbs.c | 7 +++++++
> drivers/infiniband/hw/mlx4/main.c | 10 ++++++++++
> drivers/infiniband/hw/mlx5/main.c | 7 +++++++
> drivers/infiniband/hw/mthca/mthca_provider.c | 7 +++++++
> drivers/infiniband/hw/nes/nes_verbs.c | 6 ++++++
> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 +
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6 ++++++
> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3 +++
> drivers/infiniband/hw/qib/qib_verbs.c | 7 +++++++
> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 +
> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6 ++++++
> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2 ++
> include/rdma/ib_verbs.h | 7 ++++++-
> 21 files changed, 104 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 18c1ece..a9587c4 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -76,6 +76,7 @@ static int ib_device_check_mandatory(struct ib_device *device)
> } mandatory_table[] = {
> IB_MANDATORY_FUNC(query_device),
> IB_MANDATORY_FUNC(query_port),
> + IB_MANDATORY_FUNC(query_transport),
> IB_MANDATORY_FUNC(query_pkey),
> IB_MANDATORY_FUNC(query_gid),
> IB_MANDATORY_FUNC(alloc_pd),
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index f93eb8d..626c9cf 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -133,14 +133,16 @@ enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_
> if (device->get_link_layer)
> return device->get_link_layer(device, port_num);
>
> - switch (rdma_node_get_transport(device->node_type)) {
> + switch (device->query_transport(device, port_num)) {
> case RDMA_TRANSPORT_IB:
> return IB_LINK_LAYER_INFINIBAND;
> + case RDMA_TRANSPORT_IBOE:
> case RDMA_TRANSPORT_IWARP:
> case RDMA_TRANSPORT_USNIC:
> case RDMA_TRANSPORT_USNIC_UDP:
> return IB_LINK_LAYER_ETHERNET;
> default:
> + BUG();
> return IB_LINK_LAYER_UNSPECIFIED;
> }
> }
> diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c
> index bdf3507..d46bbb0 100644
> --- a/drivers/infiniband/hw/amso1100/c2_provider.c
> +++ b/drivers/infiniband/hw/amso1100/c2_provider.c
> @@ -99,6 +99,12 @@ static int c2_query_port(struct ib_device *ibdev,
> return 0;
> }
>
> +static enum rdma_transport_type
> +c2_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IWARP;
> +}
> +
> static int c2_query_pkey(struct ib_device *ibdev,
> u8 port, u16 index, u16 * pkey)
> {
> @@ -801,6 +807,7 @@ int c2_register_device(struct c2_dev *dev)
> dev->ibdev.dma_device = &dev->pcidev->dev;
> dev->ibdev.query_device = c2_query_device;
> dev->ibdev.query_port = c2_query_port;
> + dev->ibdev.query_transport = c2_query_transport;
> dev->ibdev.query_pkey = c2_query_pkey;
> dev->ibdev.query_gid = c2_query_gid;
> dev->ibdev.alloc_ucontext = c2_alloc_ucontext;
> diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
> index 811b24a..09682e9e 100644
> --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
> +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
> @@ -1232,6 +1232,12 @@ static int iwch_query_port(struct ib_device *ibdev,
> return 0;
> }
>
> +static enum rdma_transport_type
> +iwch_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IWARP;
> +}
> +
> static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
> char *buf)
> {
> @@ -1385,6 +1391,7 @@ int iwch_register_device(struct iwch_dev *dev)
> dev->ibdev.dma_device = &(dev->rdev.rnic_info.pdev->dev);
> dev->ibdev.query_device = iwch_query_device;
> dev->ibdev.query_port = iwch_query_port;
> + dev->ibdev.query_transport = iwch_query_transport;
> dev->ibdev.query_pkey = iwch_query_pkey;
> dev->ibdev.query_gid = iwch_query_gid;
> dev->ibdev.alloc_ucontext = iwch_alloc_ucontext;
> diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
> index 66bd6a2..a445e0d 100644
> --- a/drivers/infiniband/hw/cxgb4/provider.c
> +++ b/drivers/infiniband/hw/cxgb4/provider.c
> @@ -390,6 +390,12 @@ static int c4iw_query_port(struct ib_device *ibdev, u8 port,
> return 0;
> }
>
> +static enum rdma_transport_type
> +c4iw_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IWARP;
> +}
> +
> static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
> char *buf)
> {
> @@ -506,6 +512,7 @@ int c4iw_register_device(struct c4iw_dev *dev)
> dev->ibdev.dma_device = &(dev->rdev.lldi.pdev->dev);
> dev->ibdev.query_device = c4iw_query_device;
> dev->ibdev.query_port = c4iw_query_port;
> + dev->ibdev.query_transport = c4iw_query_transport;
> dev->ibdev.query_pkey = c4iw_query_pkey;
> dev->ibdev.query_gid = c4iw_query_gid;
> dev->ibdev.alloc_ucontext = c4iw_alloc_ucontext;
> diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
> index 9ed4d25..d5a34a6 100644
> --- a/drivers/infiniband/hw/ehca/ehca_hca.c
> +++ b/drivers/infiniband/hw/ehca/ehca_hca.c
> @@ -242,6 +242,12 @@ query_port1:
> return ret;
> }
>
> +enum rdma_transport_type
> +ehca_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> int ehca_query_sma_attr(struct ehca_shca *shca,
> u8 port, struct ehca_sma_attr *attr)
> {
> diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
> index 22f79af..cec945f 100644
> --- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
> +++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
> @@ -49,6 +49,9 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props);
> int ehca_query_port(struct ib_device *ibdev, u8 port,
> struct ib_port_attr *props);
>
> +enum rdma_transport_type
> +ehca_query_transport(struct ib_device *device, u8 port_num);
> +
> int ehca_query_sma_attr(struct ehca_shca *shca, u8 port,
> struct ehca_sma_attr *attr);
>
> diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
> index cd8d290..60e0a09 100644
> --- a/drivers/infiniband/hw/ehca/ehca_main.c
> +++ b/drivers/infiniband/hw/ehca/ehca_main.c
> @@ -467,6 +467,7 @@ static int ehca_init_device(struct ehca_shca *shca)
> shca->ib_device.dma_device = &shca->ofdev->dev;
> shca->ib_device.query_device = ehca_query_device;
> shca->ib_device.query_port = ehca_query_port;
> + shca->ib_device.query_transport = ehca_query_transport;
> shca->ib_device.query_gid = ehca_query_gid;
> shca->ib_device.query_pkey = ehca_query_pkey;
> /* shca->in_device.modify_device = ehca_modify_device */
> diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
> index 44ea939..58d36e3 100644
> --- a/drivers/infiniband/hw/ipath/ipath_verbs.c
> +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
> @@ -1638,6 +1638,12 @@ static int ipath_query_port(struct ib_device *ibdev,
> return 0;
> }
>
> +static enum rdma_transport_type
> +ipath_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> static int ipath_modify_device(struct ib_device *device,
> int device_modify_mask,
> struct ib_device_modify *device_modify)
> @@ -2140,6 +2146,7 @@ int ipath_register_ib_device(struct ipath_devdata *dd)
> dev->query_device = ipath_query_device;
> dev->modify_device = ipath_modify_device;
> dev->query_port = ipath_query_port;
> + dev->query_transport = ipath_query_transport;
> dev->modify_port = ipath_modify_port;
> dev->query_pkey = ipath_query_pkey;
> dev->query_gid = ipath_query_gid;
> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
> index b972c0b..e1424ad 100644
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -420,6 +420,15 @@ static int mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
> return __mlx4_ib_query_port(ibdev, port, props, 0);
> }
>
> +static enum rdma_transport_type
> +mlx4_ib_query_transport(struct ib_device *device, u8 port_num)
> +{
> + struct mlx4_dev *dev = to_mdev(device)->dev;
> +
> + return dev->caps.port_mask[port_num] == MLX4_PORT_TYPE_IB ?
> + RDMA_TRANSPORT_IB : RDMA_TRANSPORT_IBOE;
> +}
> +
> int __mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
> union ib_gid *gid, int netw_view)
> {
> @@ -2201,6 +2210,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>
> ibdev->ib_dev.query_device = mlx4_ib_query_device;
> ibdev->ib_dev.query_port = mlx4_ib_query_port;
> + ibdev->ib_dev.query_transport = mlx4_ib_query_transport;
> ibdev->ib_dev.get_link_layer = mlx4_ib_port_link_layer;
> ibdev->ib_dev.query_gid = mlx4_ib_query_gid;
> ibdev->ib_dev.query_pkey = mlx4_ib_query_pkey;
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index cc4ac1e..209c796 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -351,6 +351,12 @@ out:
> return err;
> }
>
> +static enum rdma_transport_type
> +mlx5_ib_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> static int mlx5_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
> union ib_gid *gid)
> {
> @@ -1336,6 +1342,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
>
> dev->ib_dev.query_device = mlx5_ib_query_device;
> dev->ib_dev.query_port = mlx5_ib_query_port;
> + dev->ib_dev.query_transport = mlx5_ib_query_transport;
> dev->ib_dev.query_gid = mlx5_ib_query_gid;
> dev->ib_dev.query_pkey = mlx5_ib_query_pkey;
> dev->ib_dev.modify_device = mlx5_ib_modify_device;
> diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
> index 415f8e1..67ac6a4 100644
> --- a/drivers/infiniband/hw/mthca/mthca_provider.c
> +++ b/drivers/infiniband/hw/mthca/mthca_provider.c
> @@ -179,6 +179,12 @@ static int mthca_query_port(struct ib_device *ibdev,
> return err;
> }
>
> +static enum rdma_transport_type
> +mthca_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> static int mthca_modify_device(struct ib_device *ibdev,
> int mask,
> struct ib_device_modify *props)
> @@ -1281,6 +1287,7 @@ int mthca_register_device(struct mthca_dev *dev)
> dev->ib_dev.dma_device = &dev->pdev->dev;
> dev->ib_dev.query_device = mthca_query_device;
> dev->ib_dev.query_port = mthca_query_port;
> + dev->ib_dev.query_transport = mthca_query_transport;
> dev->ib_dev.modify_device = mthca_modify_device;
> dev->ib_dev.modify_port = mthca_modify_port;
> dev->ib_dev.query_pkey = mthca_query_pkey;
> diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
> index c0d0296..8df5b61 100644
> --- a/drivers/infiniband/hw/nes/nes_verbs.c
> +++ b/drivers/infiniband/hw/nes/nes_verbs.c
> @@ -606,6 +606,11 @@ static int nes_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr
> return 0;
> }
>
> +static enum rdma_transport_type
> +nes_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IWARP;
> +}
>
> /**
> * nes_query_pkey
> @@ -3879,6 +3884,7 @@ struct nes_ib_device *nes_init_ofa_device(struct net_device *netdev)
> nesibdev->ibdev.dev.parent = &nesdev->pcidev->dev;
> nesibdev->ibdev.query_device = nes_query_device;
> nesibdev->ibdev.query_port = nes_query_port;
> + nesibdev->ibdev.query_transport = nes_query_transport;
> nesibdev->ibdev.query_pkey = nes_query_pkey;
> nesibdev->ibdev.query_gid = nes_query_gid;
> nesibdev->ibdev.alloc_ucontext = nes_alloc_ucontext;
> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> index 7a2b59a..9f4d182 100644
> --- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> @@ -244,6 +244,7 @@ static int ocrdma_register_device(struct ocrdma_dev *dev)
> /* mandatory verbs. */
> dev->ibdev.query_device = ocrdma_query_device;
> dev->ibdev.query_port = ocrdma_query_port;
> + dev->ibdev.query_transport = ocrdma_query_transport;
> dev->ibdev.modify_port = ocrdma_modify_port;
> dev->ibdev.query_gid = ocrdma_query_gid;
> dev->ibdev.get_link_layer = ocrdma_link_layer;
> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> index 8771755..73bace4 100644
> --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> @@ -187,6 +187,12 @@ int ocrdma_query_port(struct ib_device *ibdev,
> return 0;
> }
>
> +enum rdma_transport_type
> +ocrdma_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IBOE;
> +}
> +
> int ocrdma_modify_port(struct ib_device *ibdev, u8 port, int mask,
> struct ib_port_modify *props)
> {
> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> index b8f7853..4a81b63 100644
> --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> @@ -41,6 +41,9 @@ int ocrdma_query_port(struct ib_device *, u8 port, struct ib_port_attr *props);
> int ocrdma_modify_port(struct ib_device *, u8 port, int mask,
> struct ib_port_modify *props);
>
> +enum rdma_transport_type
> +ocrdma_query_transport(struct ib_device *device, u8 port_num);
> +
> void ocrdma_get_guid(struct ocrdma_dev *, u8 *guid);
> int ocrdma_query_gid(struct ib_device *, u8 port,
> int index, union ib_gid *gid);
> diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
> index 4a35998..caad665 100644
> --- a/drivers/infiniband/hw/qib/qib_verbs.c
> +++ b/drivers/infiniband/hw/qib/qib_verbs.c
> @@ -1650,6 +1650,12 @@ static int qib_query_port(struct ib_device *ibdev, u8 port,
> return 0;
> }
>
> +static enum rdma_transport_type
> +qib_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_IB;
> +}
> +
> static int qib_modify_device(struct ib_device *device,
> int device_modify_mask,
> struct ib_device_modify *device_modify)
> @@ -2184,6 +2190,7 @@ int qib_register_ib_device(struct qib_devdata *dd)
> ibdev->query_device = qib_query_device;
> ibdev->modify_device = qib_modify_device;
> ibdev->query_port = qib_query_port;
> + ibdev->query_transport = qib_query_transport;
> ibdev->modify_port = qib_modify_port;
> ibdev->query_pkey = qib_query_pkey;
> ibdev->query_gid = qib_query_gid;
> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
> index 0d0f986..03ea9f3 100644
> --- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
> +++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
> @@ -360,6 +360,7 @@ static void *usnic_ib_device_add(struct pci_dev *dev)
>
> us_ibdev->ib_dev.query_device = usnic_ib_query_device;
> us_ibdev->ib_dev.query_port = usnic_ib_query_port;
> + us_ibdev->ib_dev.query_transport = usnic_ib_query_transport;
> us_ibdev->ib_dev.query_pkey = usnic_ib_query_pkey;
> us_ibdev->ib_dev.query_gid = usnic_ib_query_gid;
> us_ibdev->ib_dev.get_link_layer = usnic_ib_port_link_layer;
> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> index 53bd6a2..ff9a5f7 100644
> --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> @@ -348,6 +348,12 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
> return 0;
> }
>
> +enum rdma_transport_type
> +usnic_ib_query_transport(struct ib_device *device, u8 port_num)
> +{
> + return RDMA_TRANSPORT_USNIC_UDP;
> +}
> +
> int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
> int qp_attr_mask,
> struct ib_qp_init_attr *qp_init_attr)
> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> index bb864f5..0b1633b 100644
> --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> @@ -27,6 +27,8 @@ int usnic_ib_query_device(struct ib_device *ibdev,
> struct ib_device_attr *props);
> int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
> struct ib_port_attr *props);
> +enum rdma_transport_type
> +usnic_ib_query_transport(struct ib_device *device, u8 port_num);
> int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
> int qp_attr_mask,
> struct ib_qp_init_attr *qp_init_attr);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 65994a1..d54f91e 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -75,10 +75,13 @@ enum rdma_node_type {
> };
>
> enum rdma_transport_type {
> + /* legacy for users */
> RDMA_TRANSPORT_IB,
> RDMA_TRANSPORT_IWARP,
> RDMA_TRANSPORT_USNIC,
> - RDMA_TRANSPORT_USNIC_UDP
> + RDMA_TRANSPORT_USNIC_UDP,
> + /* new transport */
> + RDMA_TRANSPORT_IBOE,
> };
>
> __attribute_const__ enum rdma_transport_type
> @@ -1501,6 +1504,8 @@ struct ib_device {
> int (*query_port)(struct ib_device *device,
> u8 port_num,
> struct ib_port_attr *port_attr);
> + enum rdma_transport_type (*query_transport)(struct ib_device *device,
> + u8 port_num);
> enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
> u8 port_num);
> int (*query_gid)(struct ib_device *device,
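
For readers following the thread, here is a minimal user-space sketch of what the patch above does: each driver supplies a per-port query_transport() callback, and the core derives the link layer from it when the driver has no get_link_layer(). The enum and callback names mirror the quoted diff, but the struct is a cut-down stand-in, and the priv field, the MOCK_NUM_PORTS constant and the mock_* functions are assumptions made for illustration only. The kernel's BUG() in the default case is left out because it has no user-space equivalent.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel definitions the patch touches. */
enum rdma_transport_type {
	RDMA_TRANSPORT_IB,
	RDMA_TRANSPORT_IWARP,
	RDMA_TRANSPORT_USNIC,
	RDMA_TRANSPORT_USNIC_UDP,
	RDMA_TRANSPORT_IBOE,
};

enum rdma_link_layer {
	IB_LINK_LAYER_UNSPECIFIED,
	IB_LINK_LAYER_INFINIBAND,
	IB_LINK_LAYER_ETHERNET,
};

struct ib_device {
	enum rdma_transport_type (*query_transport)(struct ib_device *device,
						    unsigned char port_num);
	enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
					       unsigned char port_num);
	void *priv;	/* hypothetical: driver-private per-port data */
};

#define MOCK_NUM_PORTS 2

/*
 * Hypothetical dual-personality driver, loosely modelled on the mlx4 row of
 * the mapping list: priv points at an int array indexed by 1-based port
 * number, where non-zero means the port runs native InfiniBand.
 */
static enum rdma_transport_type
mock_query_transport(struct ib_device *device, unsigned char port_num)
{
	const int *port_is_ib = device->priv;

	assert(port_num >= 1 && port_num <= MOCK_NUM_PORTS);
	return port_is_ib[port_num] ? RDMA_TRANSPORT_IB : RDMA_TRANSPORT_IBOE;
}

/* Same shape as the patched rdma_port_get_link_layer(), minus BUG(). */
static enum rdma_link_layer
mock_port_get_link_layer(struct ib_device *device, unsigned char port_num)
{
	/* A driver-provided override wins, as in the original function. */
	if (device->get_link_layer)
		return device->get_link_layer(device, port_num);

	switch (device->query_transport(device, port_num)) {
	case RDMA_TRANSPORT_IB:
		return IB_LINK_LAYER_INFINIBAND;
	case RDMA_TRANSPORT_IBOE:
	case RDMA_TRANSPORT_IWARP:
	case RDMA_TRANSPORT_USNIC:
	case RDMA_TRANSPORT_USNIC_UDP:
		return IB_LINK_LAYER_ETHERNET;
	default:
		return IB_LINK_LAYER_UNSPECIFIED;
	}
}
```

With a device whose port 1 is marked as InfiniBand and port 2 is not, mock_port_get_link_layer() reports IB_LINK_LAYER_INFINIBAND for port 1 and IB_LINK_LAYER_ETHERNET for port 2. In the sketch that per-port difference comes entirely from the callback.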


--
Doug Ledford <[email protected]>
GPG KeyID: 0E572FDD




2015-04-22 15:23:28

by Devesh Sharma

Subject: RE: [PATCH v5 01/27] IB/Verbs: Implement new callback query_transport()

> -----Original Message-----
> From: [email protected] [mailto:linux-rdma-
> [email protected]] On Behalf Of Doug Ledford
> Sent: Wednesday, April 22, 2015 8:33 PM
> To: Michael Wang
> Cc: Roland Dreier; Sean Hefty; [email protected]; linux-
> [email protected]; [email protected]; Tom Tucker; Steve Wise;
> Hoang-Nam Nguyen; Christoph Raisch; Mike Marciniszyn; Eli Cohen; Faisal
> Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran; Ira Weiny; Tom Talpey; Jason
> Gunthorpe
> Subject: Re: [PATCH v5 01/27] IB/Verbs: Implement new callback
> query_transport()
>
> On Mon, 2015-04-20 at 10:32 +0200, Michael Wang wrote:
> > Add new callback query_transport() and implement for each HW.
>
> The more I think about it, the more I think we need to eliminate this patch
> entirely.
>
> The problem here is that, if we follow my suggestion, then we are going to
> eliminate the query as an API function and replace the information it gives us
> with a static port attribute bitmap. If we do this patch, then reform this patch
> to my idea later, we introduce a very short lived API/ABI change in the kernel
> module interface that serves absolutely no purpose. Instead, let's do the
> bitmap creation first, update the drivers to properly set the bitmap, then do all
> of the remaining reforms you have here using that bitmap and completely skip
> the query_transport() API item that will no longer serve a purpose.

Any vendor device that registers with the IB stack already has capability flags. Are you referring to those flags as the bitmap?
Would it be possible to use them as the bitmap you have in mind? I am trying to understand your complete idea.
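
To make the question concrete, here is a rough sketch of what a static per-port attribute bitmap could look like, as opposed to a query callback. Nothing below comes from the patch set or from any existing kernel header: the MOCK_PORT_CAP_* bits, struct mock_device and mock_port_has_cap() are hypothetical names, and the bit values are arbitrary. The only point is that each port would carry its own mask, filled in once by the driver, and the core would test bits instead of comparing transport and link-layer types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits; the names are illustrative only. */
#define MOCK_PORT_CAP_IB_MAD	(1u << 0)
#define MOCK_PORT_CAP_IB_SA	(1u << 1)
#define MOCK_PORT_CAP_IB_CM	(1u << 2)
#define MOCK_PORT_CAP_IW_CM	(1u << 3)

#define MOCK_MAX_PORTS 2

/* Hypothetical device: one static mask per port, set at registration. */
struct mock_device {
	uint8_t num_ports;
	uint32_t port_caps[MOCK_MAX_PORTS + 1];	/* index 0 unused, ports are 1-based */
};

/* Returns non-zero only if every bit in caps is set for the given port. */
static int mock_port_has_cap(const struct mock_device *dev, uint8_t port_num,
			     uint32_t caps)
{
	assert(port_num >= 1 && port_num <= dev->num_ports);
	return (dev->port_caps[port_num] & caps) == caps;
}
```

A caller would then write something like mock_port_has_cap(dev, port, MOCK_PORT_CAP_IB_SA) where the current code checks the transport type and link layer. Whether the existing device-wide capability flags could carry this per-port information is exactly the open question above.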

>
> > Mapping List:
> > node-type link-layer old-transport new-transport
> > nes RNIC ETH IWARP IWARP
> > amso1100 RNIC ETH IWARP IWARP
> > cxgb3 RNIC ETH IWARP IWARP
> > cxgb4 RNIC ETH IWARP IWARP
> > usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> > ocrdma IB_CA ETH IB IBOE
> > mlx4 IB_CA IB/ETH IB IB/IBOE
> > mlx5 IB_CA IB IB IB
> > ehca IB_CA IB IB IB
> > ipath IB_CA IB IB IB
> > mthca IB_CA IB IB IB
> > qib IB_CA IB IB IB
> >
> > Cc: Hal Rosenstock <[email protected]>
> > Cc: Steve Wise <[email protected]>
> > Cc: Tom Talpey <[email protected]>
> > Cc: Jason Gunthorpe <[email protected]>
> > Cc: Doug Ledford <[email protected]>
> > Cc: Ira Weiny <[email protected]>
> > Cc: Sean Hefty <[email protected]>
> > Signed-off-by: Michael Wang <[email protected]>
> > ---
> > drivers/infiniband/core/device.c | 1 +
> > drivers/infiniband/core/verbs.c | 4 +++-
> > drivers/infiniband/hw/amso1100/c2_provider.c | 7 +++++++
> > drivers/infiniband/hw/cxgb3/iwch_provider.c | 7 +++++++
> > drivers/infiniband/hw/cxgb4/provider.c | 7 +++++++
> > drivers/infiniband/hw/ehca/ehca_hca.c | 6 ++++++
> > drivers/infiniband/hw/ehca/ehca_iverbs.h | 3 +++
> > drivers/infiniband/hw/ehca/ehca_main.c | 1 +
> > drivers/infiniband/hw/ipath/ipath_verbs.c | 7 +++++++
> > drivers/infiniband/hw/mlx4/main.c | 10 ++++++++++
> > drivers/infiniband/hw/mlx5/main.c | 7 +++++++
> > drivers/infiniband/hw/mthca/mthca_provider.c | 7 +++++++
> > drivers/infiniband/hw/nes/nes_verbs.c | 6 ++++++
> > drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 +
> > drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6 ++++++
> > drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3 +++
> > drivers/infiniband/hw/qib/qib_verbs.c | 7 +++++++
> > drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 +
> > drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6 ++++++
> > drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2 ++
> > include/rdma/ib_verbs.h | 7 ++++++-
> > 21 files changed, 104 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/device.c
> > b/drivers/infiniband/core/device.c
> > index 18c1ece..a9587c4 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -76,6 +76,7 @@ static int ib_device_check_mandatory(struct ib_device
> *device)
> > } mandatory_table[] = {
> > IB_MANDATORY_FUNC(query_device),
> > IB_MANDATORY_FUNC(query_port),
> > + IB_MANDATORY_FUNC(query_transport),
> > IB_MANDATORY_FUNC(query_pkey),
> > IB_MANDATORY_FUNC(query_gid),
> > IB_MANDATORY_FUNC(alloc_pd),
> > diff --git a/drivers/infiniband/core/verbs.c
> > b/drivers/infiniband/core/verbs.c index f93eb8d..626c9cf 100644
> > --- a/drivers/infiniband/core/verbs.c
> > +++ b/drivers/infiniband/core/verbs.c
> > @@ -133,14 +133,16 @@ enum rdma_link_layer
> rdma_port_get_link_layer(struct ib_device *device, u8 port_
> > if (device->get_link_layer)
> > return device->get_link_layer(device, port_num);
> >
> > - switch (rdma_node_get_transport(device->node_type)) {
> > + switch (device->query_transport(device, port_num)) {
> > case RDMA_TRANSPORT_IB:
> > return IB_LINK_LAYER_INFINIBAND;
> > + case RDMA_TRANSPORT_IBOE:
> > case RDMA_TRANSPORT_IWARP:
> > case RDMA_TRANSPORT_USNIC:
> > case RDMA_TRANSPORT_USNIC_UDP:
> > return IB_LINK_LAYER_ETHERNET;
> > default:
> > + BUG();
> > return IB_LINK_LAYER_UNSPECIFIED;
> > }
> > }
> > diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c
> > b/drivers/infiniband/hw/amso1100/c2_provider.c
> > index bdf3507..d46bbb0 100644
> > --- a/drivers/infiniband/hw/amso1100/c2_provider.c
> > +++ b/drivers/infiniband/hw/amso1100/c2_provider.c
> > @@ -99,6 +99,12 @@ static int c2_query_port(struct ib_device *ibdev,
> > return 0;
> > }
> >
> > +static enum rdma_transport_type
> > +c2_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IWARP;
> > +}
> > +
> > static int c2_query_pkey(struct ib_device *ibdev,
> > u8 port, u16 index, u16 * pkey)
> > {
> > @@ -801,6 +807,7 @@ int c2_register_device(struct c2_dev *dev)
> > dev->ibdev.dma_device = &dev->pcidev->dev;
> > dev->ibdev.query_device = c2_query_device;
> > dev->ibdev.query_port = c2_query_port;
> > + dev->ibdev.query_transport = c2_query_transport;
> > dev->ibdev.query_pkey = c2_query_pkey;
> > dev->ibdev.query_gid = c2_query_gid;
> > dev->ibdev.alloc_ucontext = c2_alloc_ucontext; diff --git
> > a/drivers/infiniband/hw/cxgb3/iwch_provider.c
> > b/drivers/infiniband/hw/cxgb3/iwch_provider.c
> > index 811b24a..09682e9e 100644
> > --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
> > +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
> > @@ -1232,6 +1232,12 @@ static int iwch_query_port(struct ib_device
> *ibdev,
> > return 0;
> > }
> >
> > +static enum rdma_transport_type
> > +iwch_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IWARP;
> > +}
> > +
> > static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
> > char *buf)
> > {
> > @@ -1385,6 +1391,7 @@ int iwch_register_device(struct iwch_dev *dev)
> > dev->ibdev.dma_device = &(dev->rdev.rnic_info.pdev->dev);
> > dev->ibdev.query_device = iwch_query_device;
> > dev->ibdev.query_port = iwch_query_port;
> > + dev->ibdev.query_transport = iwch_query_transport;
> > dev->ibdev.query_pkey = iwch_query_pkey;
> > dev->ibdev.query_gid = iwch_query_gid;
> > dev->ibdev.alloc_ucontext = iwch_alloc_ucontext; diff --git
> > a/drivers/infiniband/hw/cxgb4/provider.c
> > b/drivers/infiniband/hw/cxgb4/provider.c
> > index 66bd6a2..a445e0d 100644
> > --- a/drivers/infiniband/hw/cxgb4/provider.c
> > +++ b/drivers/infiniband/hw/cxgb4/provider.c
> > @@ -390,6 +390,12 @@ static int c4iw_query_port(struct ib_device *ibdev,
> u8 port,
> > return 0;
> > }
> >
> > +static enum rdma_transport_type
> > +c4iw_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IWARP;
> > +}
> > +
> > static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
> > char *buf)
> > {
> > @@ -506,6 +512,7 @@ int c4iw_register_device(struct c4iw_dev *dev)
> > dev->ibdev.dma_device = &(dev->rdev.lldi.pdev->dev);
> > dev->ibdev.query_device = c4iw_query_device;
> > dev->ibdev.query_port = c4iw_query_port;
> > + dev->ibdev.query_transport = c4iw_query_transport;
> > dev->ibdev.query_pkey = c4iw_query_pkey;
> > dev->ibdev.query_gid = c4iw_query_gid;
> > dev->ibdev.alloc_ucontext = c4iw_alloc_ucontext; diff --git
> > a/drivers/infiniband/hw/ehca/ehca_hca.c
> > b/drivers/infiniband/hw/ehca/ehca_hca.c
> > index 9ed4d25..d5a34a6 100644
> > --- a/drivers/infiniband/hw/ehca/ehca_hca.c
> > +++ b/drivers/infiniband/hw/ehca/ehca_hca.c
> > @@ -242,6 +242,12 @@ query_port1:
> > return ret;
> > }
> >
> > +enum rdma_transport_type
> > +ehca_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IB;
> > +}
> > +
> > int ehca_query_sma_attr(struct ehca_shca *shca,
> > u8 port, struct ehca_sma_attr *attr) { diff --git
> > a/drivers/infiniband/hw/ehca/ehca_iverbs.h
> > b/drivers/infiniband/hw/ehca/ehca_iverbs.h
> > index 22f79af..cec945f 100644
> > --- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
> > +++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
> > @@ -49,6 +49,9 @@ int ehca_query_device(struct ib_device *ibdev,
> > struct ib_device_attr *props); int ehca_query_port(struct ib_device *ibdev,
> u8 port,
> > struct ib_port_attr *props);
> >
> > +enum rdma_transport_type
> > +ehca_query_transport(struct ib_device *device, u8 port_num);
> > +
> > int ehca_query_sma_attr(struct ehca_shca *shca, u8 port,
> > struct ehca_sma_attr *attr);
> >
> > diff --git a/drivers/infiniband/hw/ehca/ehca_main.c
> > b/drivers/infiniband/hw/ehca/ehca_main.c
> > index cd8d290..60e0a09 100644
> > --- a/drivers/infiniband/hw/ehca/ehca_main.c
> > +++ b/drivers/infiniband/hw/ehca/ehca_main.c
> > @@ -467,6 +467,7 @@ static int ehca_init_device(struct ehca_shca *shca)
> > shca->ib_device.dma_device = &shca->ofdev->dev;
> > shca->ib_device.query_device = ehca_query_device;
> > shca->ib_device.query_port = ehca_query_port;
> > + shca->ib_device.query_transport = ehca_query_transport;
> > shca->ib_device.query_gid = ehca_query_gid;
> > shca->ib_device.query_pkey = ehca_query_pkey;
> > /* shca->in_device.modify_device = ehca_modify_device */
> > diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c
> > b/drivers/infiniband/hw/ipath/ipath_verbs.c
> > index 44ea939..58d36e3 100644
> > --- a/drivers/infiniband/hw/ipath/ipath_verbs.c
> > +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
> > @@ -1638,6 +1638,12 @@ static int ipath_query_port(struct ib_device
> *ibdev,
> > return 0;
> > }
> >
> > +static enum rdma_transport_type
> > +ipath_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IB;
> > +}
> > +
> > static int ipath_modify_device(struct ib_device *device,
> > int device_modify_mask,
> > struct ib_device_modify *device_modify) @@ -
> 2140,6 +2146,7
> > @@ int ipath_register_ib_device(struct ipath_devdata *dd)
> > dev->query_device = ipath_query_device;
> > dev->modify_device = ipath_modify_device;
> > dev->query_port = ipath_query_port;
> > + dev->query_transport = ipath_query_transport;
> > dev->modify_port = ipath_modify_port;
> > dev->query_pkey = ipath_query_pkey;
> > dev->query_gid = ipath_query_gid;
> > diff --git a/drivers/infiniband/hw/mlx4/main.c
> > b/drivers/infiniband/hw/mlx4/main.c
> > index b972c0b..e1424ad 100644
> > --- a/drivers/infiniband/hw/mlx4/main.c
> > +++ b/drivers/infiniband/hw/mlx4/main.c
> > @@ -420,6 +420,15 @@ static int mlx4_ib_query_port(struct ib_device
> *ibdev, u8 port,
> > return __mlx4_ib_query_port(ibdev, port, props, 0); }
> >
> > +static enum rdma_transport_type
> > +mlx4_ib_query_transport(struct ib_device *device, u8 port_num) {
> > + struct mlx4_dev *dev = to_mdev(device)->dev;
> > +
> > + return dev->caps.port_mask[port_num] == MLX4_PORT_TYPE_IB ?
> > + RDMA_TRANSPORT_IB : RDMA_TRANSPORT_IBOE; }
> > +
> > int __mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
> > union ib_gid *gid, int netw_view)
> > {
> > @@ -2201,6 +2210,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
> >
> > ibdev->ib_dev.query_device = mlx4_ib_query_device;
> > ibdev->ib_dev.query_port = mlx4_ib_query_port;
> > + ibdev->ib_dev.query_transport = mlx4_ib_query_transport;
> > ibdev->ib_dev.get_link_layer = mlx4_ib_port_link_layer;
> > ibdev->ib_dev.query_gid = mlx4_ib_query_gid;
> > ibdev->ib_dev.query_pkey = mlx4_ib_query_pkey;
> > diff --git a/drivers/infiniband/hw/mlx5/main.c
> > b/drivers/infiniband/hw/mlx5/main.c
> > index cc4ac1e..209c796 100644
> > --- a/drivers/infiniband/hw/mlx5/main.c
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -351,6 +351,12 @@ out:
> > return err;
> > }
> >
> > +static enum rdma_transport_type
> > +mlx5_ib_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IB;
> > +}
> > +
> > static int mlx5_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
> > union ib_gid *gid)
> > {
> > @@ -1336,6 +1342,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev
> > *mdev)
> >
> > dev->ib_dev.query_device = mlx5_ib_query_device;
> > dev->ib_dev.query_port = mlx5_ib_query_port;
> > + dev->ib_dev.query_transport = mlx5_ib_query_transport;
> > dev->ib_dev.query_gid = mlx5_ib_query_gid;
> > dev->ib_dev.query_pkey = mlx5_ib_query_pkey;
> > dev->ib_dev.modify_device = mlx5_ib_modify_device;
> > diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c
> > b/drivers/infiniband/hw/mthca/mthca_provider.c
> > index 415f8e1..67ac6a4 100644
> > --- a/drivers/infiniband/hw/mthca/mthca_provider.c
> > +++ b/drivers/infiniband/hw/mthca/mthca_provider.c
> > @@ -179,6 +179,12 @@ static int mthca_query_port(struct ib_device *ibdev,
> > return err;
> > }
> >
> > +static enum rdma_transport_type
> > +mthca_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IB;
> > +}
> > +
> > static int mthca_modify_device(struct ib_device *ibdev,
> > int mask,
> > struct ib_device_modify *props) @@ -1281,6
> +1287,7 @@ int
> > mthca_register_device(struct mthca_dev *dev)
> > dev->ib_dev.dma_device = &dev->pdev->dev;
> > dev->ib_dev.query_device = mthca_query_device;
> > dev->ib_dev.query_port = mthca_query_port;
> > + dev->ib_dev.query_transport = mthca_query_transport;
> > dev->ib_dev.modify_device = mthca_modify_device;
> > dev->ib_dev.modify_port = mthca_modify_port;
> > dev->ib_dev.query_pkey = mthca_query_pkey;
> > diff --git a/drivers/infiniband/hw/nes/nes_verbs.c
> > b/drivers/infiniband/hw/nes/nes_verbs.c
> > index c0d0296..8df5b61 100644
> > --- a/drivers/infiniband/hw/nes/nes_verbs.c
> > +++ b/drivers/infiniband/hw/nes/nes_verbs.c
> > @@ -606,6 +606,11 @@ static int nes_query_port(struct ib_device *ibdev, u8
> port, struct ib_port_attr
> > return 0;
> > }
> >
> > +static enum rdma_transport_type
> > +nes_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IWARP;
> > +}
> >
> > /**
> > * nes_query_pkey
> > @@ -3879,6 +3884,7 @@ struct nes_ib_device *nes_init_ofa_device(struct
> net_device *netdev)
> > nesibdev->ibdev.dev.parent = &nesdev->pcidev->dev;
> > nesibdev->ibdev.query_device = nes_query_device;
> > nesibdev->ibdev.query_port = nes_query_port;
> > + nesibdev->ibdev.query_transport = nes_query_transport;
> > nesibdev->ibdev.query_pkey = nes_query_pkey;
> > nesibdev->ibdev.query_gid = nes_query_gid;
> > nesibdev->ibdev.alloc_ucontext = nes_alloc_ucontext; diff --git
> > a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> > b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> > index 7a2b59a..9f4d182 100644
> > --- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> > +++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
> > @@ -244,6 +244,7 @@ static int ocrdma_register_device(struct ocrdma_dev
> *dev)
> > /* mandatory verbs. */
> > dev->ibdev.query_device = ocrdma_query_device;
> > dev->ibdev.query_port = ocrdma_query_port;
> > + dev->ibdev.query_transport = ocrdma_query_transport;
> > dev->ibdev.modify_port = ocrdma_modify_port;
> > dev->ibdev.query_gid = ocrdma_query_gid;
> > dev->ibdev.get_link_layer = ocrdma_link_layer; diff --git
> > a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> > b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> > index 8771755..73bace4 100644
> > --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> > +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
> > @@ -187,6 +187,12 @@ int ocrdma_query_port(struct ib_device *ibdev,
> > return 0;
> > }
> >
> > +enum rdma_transport_type
> > +ocrdma_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IBOE;
> > +}
> > +
> > int ocrdma_modify_port(struct ib_device *ibdev, u8 port, int mask,
> > struct ib_port_modify *props) { diff --git
> > a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> > b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> > index b8f7853..4a81b63 100644
> > --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> > +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
> > @@ -41,6 +41,9 @@ int ocrdma_query_port(struct ib_device *, u8 port,
> > struct ib_port_attr *props); int ocrdma_modify_port(struct ib_device *, u8
> port, int mask,
> > struct ib_port_modify *props);
> >
> > +enum rdma_transport_type
> > +ocrdma_query_transport(struct ib_device *device, u8 port_num);
> > +
> > void ocrdma_get_guid(struct ocrdma_dev *, u8 *guid); int
> > ocrdma_query_gid(struct ib_device *, u8 port,
> > int index, union ib_gid *gid); diff --git
> > a/drivers/infiniband/hw/qib/qib_verbs.c
> > b/drivers/infiniband/hw/qib/qib_verbs.c
> > index 4a35998..caad665 100644
> > --- a/drivers/infiniband/hw/qib/qib_verbs.c
> > +++ b/drivers/infiniband/hw/qib/qib_verbs.c
> > @@ -1650,6 +1650,12 @@ static int qib_query_port(struct ib_device *ibdev,
> u8 port,
> > return 0;
> > }
> >
> > +static enum rdma_transport_type
> > +qib_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_IB;
> > +}
> > +
> > static int qib_modify_device(struct ib_device *device,
> > int device_modify_mask,
> > struct ib_device_modify *device_modify) @@ -
> 2184,6 +2190,7 @@
> > int qib_register_ib_device(struct qib_devdata *dd)
> > ibdev->query_device = qib_query_device;
> > ibdev->modify_device = qib_modify_device;
> > ibdev->query_port = qib_query_port;
> > + ibdev->query_transport = qib_query_transport;
> > ibdev->modify_port = qib_modify_port;
> > ibdev->query_pkey = qib_query_pkey;
> > ibdev->query_gid = qib_query_gid;
> > diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c
> > b/drivers/infiniband/hw/usnic/usnic_ib_main.c
> > index 0d0f986..03ea9f3 100644
> > --- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
> > +++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
> > @@ -360,6 +360,7 @@ static void *usnic_ib_device_add(struct pci_dev
> > *dev)
> >
> > us_ibdev->ib_dev.query_device = usnic_ib_query_device;
> > us_ibdev->ib_dev.query_port = usnic_ib_query_port;
> > + us_ibdev->ib_dev.query_transport = usnic_ib_query_transport;
> > us_ibdev->ib_dev.query_pkey = usnic_ib_query_pkey;
> > us_ibdev->ib_dev.query_gid = usnic_ib_query_gid;
> > us_ibdev->ib_dev.get_link_layer = usnic_ib_port_link_layer; diff
> > --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> > b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> > index 53bd6a2..ff9a5f7 100644
> > --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> > +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
> > @@ -348,6 +348,12 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8
> port,
> > return 0;
> > }
> >
> > +enum rdma_transport_type
> > +usnic_ib_query_transport(struct ib_device *device, u8 port_num) {
> > + return RDMA_TRANSPORT_USNIC_UDP;
> > +}
> > +
> > int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
> > int qp_attr_mask,
> > struct ib_qp_init_attr *qp_init_attr) diff --git
> > a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> > b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> > index bb864f5..0b1633b 100644
> > --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> > +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
> > @@ -27,6 +27,8 @@ int usnic_ib_query_device(struct ib_device *ibdev,
> > struct ib_device_attr *props);
> > int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
> > struct ib_port_attr *props);
> > +enum rdma_transport_type
> > +usnic_ib_query_transport(struct ib_device *device, u8 port_num);
> > int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
> > int qp_attr_mask,
> > struct ib_qp_init_attr *qp_init_attr); diff --git
> > a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index
> > 65994a1..d54f91e 100644
> > --- a/include/rdma/ib_verbs.h
> > +++ b/include/rdma/ib_verbs.h
> > @@ -75,10 +75,13 @@ enum rdma_node_type { };
> >
> > enum rdma_transport_type {
> > + /* legacy for users */
> > RDMA_TRANSPORT_IB,
> > RDMA_TRANSPORT_IWARP,
> > RDMA_TRANSPORT_USNIC,
> > - RDMA_TRANSPORT_USNIC_UDP
> > + RDMA_TRANSPORT_USNIC_UDP,
> > + /* new transport */
> > + RDMA_TRANSPORT_IBOE,
> > };
> >
> > __attribute_const__ enum rdma_transport_type @@ -1501,6 +1504,8 @@
> > struct ib_device {
> > int (*query_port)(struct ib_device *device,
> > u8 port_num,
> > struct ib_port_attr
> *port_attr);
> > + enum rdma_transport_type (*query_transport)(struct ib_device
> *device,
> > + u8 port_num);
> > enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
> > u8 port_num);
> > int (*query_gid)(struct ib_device *device,
>
>
> --
> Doug Ledford <[email protected]>
> GPG KeyID: 0E572FDD
>


2015-04-22 16:16:49

by Liran Liss

Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

> From: Michael Wang [mailto:[email protected]]
>
> Hi, Liran
>
> Thanks for the comment :-)
>
> On 04/22/2015 01:36 AM, Liran Liss wrote:
> [snip]
> >
> > (**) This has been extended to also encode the transport in the current
> code.
> > At least for user-space visible APIs, we might chose to leave this for
> backward compatibility, but we can consider cleaning up the kernel code.
> >
> > So, I think that our "old-transport" below is just fine.
> > No need to change it (and you aren't, since it is currently implemented as a
> function).
> >
> > The "new-transport" does not really exist, but is broken into several
> capability checks of the L4 transport, optionally with conditions on the link
> type.
> > I would remove the table below and tell what we really want to achieve:
> > ==> move technology-specific feature-check logic out of the (multiple!) IB
> code components and various ULPs into per-feature helpers.
>
> Our purpose is to help the core layer do management more clearly, rather than
> inferring from transport and link layer.
>

Right.

> IMHO, from management's point of view, what we really care about is whether
> a particular management feature is required by the device or not, rather than the details of
> transport and link layer.
>

Depends on who is "we".
For ULPs, you are probably right.

However, core services (e.g., mad management, CM, SA) do care about various details.
In some cases, where it doesn't matter, this code will use management helpers.
In other cases, this code will inspect link, transport, and node attributes of rdma devices.

For example, the CM code has specific code paths for IB, RoCE, and iWARP.
There is no other CM code; there is no reason to abstract 'CM'. This code will have code
paths that depend on various specific details.

> This new transport is only understood by the core layer currently; for the user layer
> we still reserve the old transport. The next step is to use a bitmask instead
> of transport, and at that time we can erase the new transport and make the
> whole thing used by the user layer only :-)
>

I am not sure that we need a bit mask at all.
Your helpers already provide all the useful abstractions, which both core and ULPs call directly.
All the information is inferred directly from <link, transport, node> tuples.

Some of the user-space tools need *exactly* the same reasoning.
For example, management tools manage specific technologies and protocols, not some abstraction.

So, for user-space, we can think about exposing exactly the same helper framework, while providing
backward compatibility for the existing interfaces.
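
A minimal sketch of what "inferred directly from <link, transport, node> tuples" could look like, assuming stand-in types so it builds outside the kernel tree. Every sketch_* name is hypothetical and not part of the patch set; the three example ports are taken from the mapping list in the cover letter, and the predicates only encode statements made in this thread (RoCE is the IBTA transport on an Ethernet link, RoCE has a GSI but no SA, SMI belongs to IB links).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel enums, so the sketch builds on its own. */
enum sketch_link { SKETCH_LINK_IB, SKETCH_LINK_ETH };
enum sketch_transport {
	SKETCH_TRANSPORT_IB,
	SKETCH_TRANSPORT_IWARP,
	SKETCH_TRANSPORT_USNIC_UDP,
};
enum sketch_node { SKETCH_NODE_IB_CA, SKETCH_NODE_RNIC, SKETCH_NODE_USNIC_UDP };

/* The <link, transport, node> tuple describing one port. */
struct sketch_port {
	enum sketch_link link;
	enum sketch_transport transport;
	enum sketch_node node;
};

/* RoCE: IBTA transport carried over an Ethernet link. */
static bool sketch_is_iboe(const struct sketch_port *p)
{
	return p->transport == SKETCH_TRANSPORT_IB && p->link == SKETCH_LINK_ETH;
}

/* GSI: present wherever the IBTA transport is used (IB and RoCE). */
static bool sketch_has_gsi(const struct sketch_port *p)
{
	return p->transport == SKETCH_TRANSPORT_IB;
}

/* SMI: only on native InfiniBand links. */
static bool sketch_has_smi(const struct sketch_port *p)
{
	return p->transport == SKETCH_TRANSPORT_IB && p->link == SKETCH_LINK_IB;
}

/* SA: RoCE has a GSI but no SA, so this needs an InfiniBand link as well. */
static bool sketch_has_sa(const struct sketch_port *p)
{
	return p->transport == SKETCH_TRANSPORT_IB && p->link == SKETCH_LINK_IB;
}

/* Example rows from the mapping list: qib, ocrdma, cxgb4. */
static const struct sketch_port sketch_qib = {
	.link = SKETCH_LINK_IB, .transport = SKETCH_TRANSPORT_IB,
	.node = SKETCH_NODE_IB_CA,
};
static const struct sketch_port sketch_ocrdma = {
	.link = SKETCH_LINK_ETH, .transport = SKETCH_TRANSPORT_IB,
	.node = SKETCH_NODE_IB_CA,
};
static const struct sketch_port sketch_cxgb4 = {
	.link = SKETCH_LINK_ETH, .transport = SKETCH_TRANSPORT_IWARP,
	.node = SKETCH_NODE_RNIC,
};
```

With these definitions the ocrdma port reports a GSI but neither SMI nor SA, while the qib port reports all three; no extra per-port state is needed beyond the tuple.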

> >
> >
> > Detailed remarks
> > ==============
> >
> > 1) The cap_*_*() helpers should have been introduced directly
> > in patch 02/27.
> > This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and
> increases the number of patches in the patch-set.
> > Do this and remove patches 16-24.
>
> We had some discussion about compressing the patch set; merging the reform
> and introducing patches would mix the concepts (like the earlier version), and IMHO it
> would increase the difficulty of review...
>
> And now, since much review has already been done, it's not wise to change the
> whole structure of the patch set IMHO...
>

I think it is because you are conditioning code on one thing, and then conditioning
the same code on another thing.

This is confusing.

Once we get our abstractions correct (i.e., the right helper functions), you replace the
existing logic with the suitable helper up-front.

> >
> > 2)The name rdma_tech_* is lame.
> > rdma_transport_*(), adhering to the above (*) remark, is much better.
> > For example, both IB and ROCE *do* use the same transport.
>
> We have some discussion on that too, use transport means going back...
>

No.
The existing notion of transport was correct. It was the node type that wasn't.
And in any case the new helpers didn't use it.

We need the original meaning of transport - see my response to Ira.
I propose replacing rdma_node_get_transport() with the following helpers:
- rdma_get_transport()
- rdma_is_ib_transport()
- rdma_is_iwarp_transport()
- ...
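
A rough sketch of the helpers proposed above, assuming a stand-in device type so it compiles outside the kernel. The sketch_* names and the struct layout are hypothetical; only the helper names in the list above come from this proposal. The point it illustrates is that a RoCE device (ocrdma) and a native IB device (qib) answer the transport question identically.

```c
#include <stdbool.h>

/* Stand-in for the legacy values of enum rdma_transport_type. */
enum sketch_transport_type {
	SKETCH_TRANSPORT_IB,
	SKETCH_TRANSPORT_IWARP,
	SKETCH_TRANSPORT_USNIC,
	SKETCH_TRANSPORT_USNIC_UDP,
};

/* Hypothetical stand-in for struct ib_device: only the field we need. */
struct sketch_device {
	const char *name;
	enum sketch_transport_type transport;
};

/* Counterpart of the proposed rdma_get_transport(). */
static enum sketch_transport_type
sketch_get_transport(const struct sketch_device *dev)
{
	return dev->transport;
}

/* Counterpart of the proposed rdma_is_ib_transport(). */
static bool sketch_is_ib_transport(const struct sketch_device *dev)
{
	return sketch_get_transport(dev) == SKETCH_TRANSPORT_IB;
}

/* Counterpart of the proposed rdma_is_iwarp_transport(). */
static bool sketch_is_iwarp_transport(const struct sketch_device *dev)
{
	return sketch_get_transport(dev) == SKETCH_TRANSPORT_IWARP;
}

/* Example devices, with the "old-transport" column of the mapping list. */
static const struct sketch_device sketch_qib_dev = {
	.name = "qib", .transport = SKETCH_TRANSPORT_IB,
};
static const struct sketch_device sketch_ocrdma_dev = {
	.name = "ocrdma", .transport = SKETCH_TRANSPORT_IB,
};
static const struct sketch_device sketch_nes_dev = {
	.name = "nes", .transport = SKETCH_TRANSPORT_IWARP,
};
```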

> >
> > 3) The name cap_* as it is used above is not accurate.
> > You use it to describe technology characteristics rather than extendable
> capabilities.
> > I would suggest having a single convention for all helpers, such as
> rdma_has_*() and rdma_is_*().
> > For example: cap_ib_smi() ==> rdma_has_smi().
>
> That means going back too...

See response to Ira (https://lkml.org/lkml/2015/4/21/951).


>
> >
> > 4) Remove all capabilities that do not introduce any distinction in the
> current code.
> > We can add them as needed later.
> > This means remove patches:
> > - [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all
> > IB devices support ipoib
> > - [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all IB
> devices support AF_IB.
> >
> > On the other hand:
> > - rdma_has_multicast() makes sense, since iWARP doesn’t support it.
> > - cap_ib_sa() might make sense to cut code even further in the CMA, since
> RoCE has a GSI but no SA.
>
> We discussed defining these helpers previously; again, the name is not
> really a problem. I would rather see such changes in a following series
> after this one is working stably :-)
>

The names are not critical. This comment is about helpers that
do not introduce any new semantic notion in the current patch-set.

cap_ipoib(), for example, is brain-dead because only a single technology (as of now)
enables it: Infiniband.

> >
> > 5) Do not modify phys_state_show() in [PATCH v5 09/27] IB/Verbs: Reform
> > IB-core verbs/uverbs_cmd/sysfs It *is* the link layer!
>
> Actually nothing changes after the modification; the previous purpose was to eliminate
> the link layer helpers.
>
> But now we are not going to remove the helper any more, so let's drop this
> modification in the next version :-)
>

You don't add modifications just to drop them later.
Don't add them in the first place!

This patch-set will remain forever in the kernel commit log - we want it to be
as self-explaining and coherent as possible.

Remove this.

> >
> > 6) Remove cap_read_multi_sge
> > It is not device/port feature, but a transport capability.
> > Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in
> 'enum ib_device_cap_flags'.
> >
> > 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper
> cap_eth_ah().
> > Address handles that refer to Ethernet links always have Ethernet
> addressing.
> >
> > In the CMA code, using rdma_tech_iboe() is just fine. This is how you define
> cap_eth_ah() anyway.
> > Currently, this patch just adds clutter.
>
> There are also some discussion on these helpers, drop them means going
> back..
>

Back to where? Management helpers are a new concept. Let's get them right.

> The tech helper is not enough to explain the management purpose, and this
> can be the wrapper for bitmask stuff too.
>

As I said, I am not sure that we will need any bitmasks.
Also see response to Ira (https://lkml.org/lkml/2015/4/21/951).

> >
> > 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
> > We do need a transport qualifier, as exemplified in comment 5) above, and
> for a complete clean model.
> > This is after renaming the function to rdma_is_ib_transport()...
>
> This means going back again... rdma_is_ib_transport() has been used
> previously.
>
> This helper is just to make the review easier; we won't need it
> internally, not to mention after the bitmask is introduced :-)
>

The same...

> >
> >
> > Putting it all together
> > ==================
> >
> > We are left with the following helpers:
> > - rdma_is_ib_transport()
> > - rdma_is_iwarp_transport()
> > - rdma_is_usnic_transport()
> > - rdma_is_iboe()
> > - rdma_has_mad()
> > - rdma_has_smi()
> > - rdma_has_gsi() - complements smi; can be used by the mad code for
> > clarity
> > - rdma_has_sa()
> > - rdma_has_cm()
> > - rdma_has_mcast()
>
> I think we can defer the discussion on names and new helpers to the future;
> for now let's focus on these basic reforms and make them work stably ;-)

It's not just the names, it's their semantics.
Any problems with the names proposed above?

>
> Regards,
> Michael Wang
>
> >
> >
> >> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
> >>
> >>
> >> Since v4:
> >> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> >> Roland, Ira and Steve :-) Please remind me if anything missed :-P
> >> * Fix logical issue inside 3#, 14#
> >> * Refine 3#, 4#, 5# with label 'free'
> >> * Rework 10# to stop using port 1 when port already assigned
> >>
> >> There are plenty of lengthy code to check the transport type of IB
> >> device, or the link layer type of it's port, but actually we are just
> >> speculating whether a particular management/feature is supported by the
> device/port.
> >>
> >> Thus instead of inferring, we should have our own mechanism for IB
> >> management capability/protocol/feature checking, several proposals
> below.
> >>
> >> This patch set will reform the method of getting transport type, we
> >> will now using query_transport() instead of inferring from transport
> >> and link layer respectively, also we defined the new transport type
> >> to make the concept more reasonable.
> >>
> >> Mapping List:
> >> node-type link-layer old-transport new-transport
> >> nes RNIC ETH IWARP IWARP
> >> amso1100 RNIC ETH IWARP IWARP
> >> cxgb3 RNIC ETH IWARP IWARP
> >> cxgb4 RNIC ETH IWARP IWARP
> >> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> >> ocrdma IB_CA ETH IB IBOE
> >> mlx4 IB_CA IB/ETH IB IB/IBOE
> >> mlx5 IB_CA IB IB IB
> >> ehca IB_CA IB IB IB
> >> ipath IB_CA IB IB IB
> >> mthca IB_CA IB IB IB
> >> qib IB_CA IB IB IB
> >>
> >> For example:
> >> if (transport == IB) && (link-layer == ETH) will now become:
> >> if (query_transport() == IBOE)
> >>
> >> Thus we will be able to get rid of the respective transport and
> >> link-layer checking, and it will help us to add new
> >> protocol/Technology (like OPA) more easier, also with the introduced
> >> management helpers, IB management logical will be more clear and easier
> for extending.
> >>
> >> Highlights:
> >> The patch set covered a wide range of IB stuff, thus for those who are
> >> familiar with the particular part, your suggestion would be
> >> invaluable ;-)
> >>
> >> Patch 1#~15# included all the logical reform, 16#~25# introduced the
> >> management helpers, 26#~27# do clean up.
> >>
> >> Patches haven't been tested yet, we appreciate if any one who have
> these
> >> HW willing to provide his Tested-by :-)
> >>
> >> Doug suggested the bitmask mechanism:
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23765.html
> >> which could be the plan for future reforming, we prefer that to be
> another
> >> series which focus on semantic and performance.
> >>
> >> This patch-set is somewhat 'bloated' now and it may be a good timing
> for
> >> staging, I'd like to suggest we focus on improving existed helpers and
> push
> >> all the further reforms into next series ;-)
> >>
> >> Proposals:
> >> Sean:
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23339.html
> >> Doug:
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23418.html
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23765.html
> >> Jason:
> >> https://www.mail-archive.com/linux-
> >> [email protected]/msg23425.html
> >>
> >> Michael Wang (27):
> >> IB/Verbs: Implement new callback query_transport()
> >> IB/Verbs: Implement raw management helpers
> >> IB/Verbs: Reform IB-core mad/agent/user_mad
> >> IB/Verbs: Reform IB-core cm
> >> IB/Verbs: Reform IB-core sa_query
> >> IB/Verbs: Reform IB-core multicast
> >> IB/Verbs: Reform IB-ulp ipoib
> >> IB/Verbs: Reform IB-ulp xprtrdma
> >> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> >> IB/Verbs: Reform cm related part in IB-core cma/ucm
> >> IB/Verbs: Reform route related part in IB-core cma
> >> IB/Verbs: Reform mcast related part in IB-core cma
> >> IB/Verbs: Reserve legacy transport type in 'dev_addr'
> >> IB/Verbs: Reform cma_acquire_dev()
> >> IB/Verbs: Reform rest part in IB-core cma
> >> IB/Verbs: Use management helper cap_ib_mad()
> >> IB/Verbs: Use management helper cap_ib_smi()
> >> IB/Verbs: Use management helper cap_ib_cm()
> >> IB/Verbs: Use management helper cap_iw_cm()
> >> IB/Verbs: Use management helper cap_ib_sa()
> >> IB/Verbs: Use management helper cap_ib_mcast()
> >> IB/Verbs: Use management helper cap_ipoib()
> >> IB/Verbs: Use management helper cap_read_multi_sge()
> >> IB/Verbs: Use management helper cap_af_ib()
> >> IB/Verbs: Use management helper cap_eth_ah()
> >> IB/Verbs: Clean up rdma_ib_or_iboe()
> >> IB/Verbs: Cleanup rdma_node_get_transport()
> >>
> >> ---
> >> drivers/infiniband/core/agent.c | 4
> >> drivers/infiniband/core/cm.c | 26 +-
> >> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
> >> drivers/infiniband/core/device.c | 1
> >> drivers/infiniband/core/mad.c | 51 ++--
> >> drivers/infiniband/core/multicast.c | 18 -
> >> drivers/infiniband/core/sa_query.c | 41 +--
> >> drivers/infiniband/core/sysfs.c | 8
> >> drivers/infiniband/core/ucm.c | 5
> >> drivers/infiniband/core/ucma.c | 27 --
> >> drivers/infiniband/core/user_mad.c | 32 +-
> >> drivers/infiniband/core/uverbs_cmd.c | 6
> >> drivers/infiniband/core/verbs.c | 33 --
> >> drivers/infiniband/hw/amso1100/c2_provider.c | 7
> >> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
> >> drivers/infiniband/hw/cxgb4/provider.c | 7
> >> drivers/infiniband/hw/ehca/ehca_hca.c | 6
> >> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
> >> drivers/infiniband/hw/ehca/ehca_main.c | 1
> >> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
> >> drivers/infiniband/hw/mlx4/main.c | 10
> >> drivers/infiniband/hw/mlx5/main.c | 7
> >> drivers/infiniband/hw/mthca/mthca_provider.c | 7
> >> drivers/infiniband/hw/nes/nes_verbs.c | 6
> >> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
> >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
> >> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
> >> drivers/infiniband/hw/qib/qib_verbs.c | 7
> >> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
> >> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
> >> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
> >> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
> >> include/rdma/ib_verbs.h | 204 +++++++++++++++-
> >> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
> >> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
> >> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-22 16:22:58

by Ira Weiny

Subject: Re: [PATCH v5 01/27] IB/Verbs: Implement new callback query_transport()

On Wed, Apr 22, 2015 at 03:21:43PM +0000, Devesh Sharma wrote:
> > -----Original Message-----
> > From: [email protected] [mailto:linux-rdma-
> > [email protected]] On Behalf Of Doug Ledford
> > Sent: Wednesday, April 22, 2015 8:33 PM
> > To: Michael Wang
> > Cc: Roland Dreier; Sean Hefty; [email protected]; linux-
> > [email protected]; [email protected]; Tom Tucker; Steve Wise;
> > Hoang-Nam Nguyen; Christoph Raisch; Mike Marciniszyn; Eli Cohen; Faisal
> > Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran; Ira Weiny; Tom Talpey; Jason
> > Gunthorpe
> > Subject: Re: [PATCH v5 01/27] IB/Verbs: Implement new callback
> > query_transport()
> >
> > On Mon, 2015-04-20 at 10:32 +0200, Michael Wang wrote:
> > > Add new callback query_transport() and implement for each HW.
> >
> > The more I think about it, the more I think we need to eliminate this patch
> > entirely.
> >
> > The problem here is that, if we follow my suggestion, then we are going to
> > eliminate the query as an API function and replace the information it gives us
> > with a static port attribute bitmap. If we do this patch, then reform this patch
> > to my idea later, we introduce a very short lived API/ABI change in the kernel
> > module interface that serves absolutely no purpose. Instead, let's do the
> > bitmap creation first, update the drivers to properly set the bitmap, then do all
> > of the remaining reforms you have here using that bitmap and completely skip
> > the
> > query_transport() API item that will no longer serve a purpose.
>
> Any vendor device that registers with the IB stack already has capability flags; are those the bitmaps you are referring to?
> Is it possible to use them as the bitmaps you are referring to? I am trying to understand your complete idea.

The idea was to use additional bit maps.

https://www.mail-archive.com/[email protected]/msg23765.html

Ira

2015-04-22 16:29:08

by Ira Weiny

Subject: Re: [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs

On Wed, Apr 22, 2015 at 09:38:57AM +0200, Michael Wang wrote:
> Hi, Ira
>
> Thanks for the review :-)
>
> On 04/22/2015 01:19 AM, ira.weiny wrote:
> [snip]
> >> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> >> index cbd0383..8570180 100644
> >> --- a/drivers/infiniband/core/sysfs.c
> >> +++ b/drivers/infiniband/core/sysfs.c
> >> @@ -248,14 +248,10 @@ static ssize_t phys_state_show(struct ib_port *p, struct port_attribute *unused,
> >> static ssize_t link_layer_show(struct ib_port *p, struct port_attribute *unused,
> >> char *buf)
> >> {
> >> - switch (rdma_port_get_link_layer(p->ibdev, p->port_num)) {
> >> - case IB_LINK_LAYER_INFINIBAND:
> >> + if (rdma_tech_ib(p->ibdev, p->port_num))
> >
> > Is the final intention to remove Link Layer from the rdma stack entirely?
> >
> > I know that the use of link layer in userspace is just as convoluted as what we
> > are trying to fix here in the kernel. So it would be nice if we can eventually
> > get user space cleaned up to not use link layer as it currently does.
> >
> > However, standard networking tools can report the link layer. So while the
> > current use of "link layer" via userspace software is wrong I don't think it is
> > wrong to report this information _to_ userspace.
> >
> > So unless we intend to completely hide the link layer from userspace I don't
> > think we should be removing the rdma_port_get_link_layer call. It is still
> > valid information even though we don't want to use it in most places.
>
> This series won't erase rdma_port_get_link_layer(), although
> currently only mlx4 is still using it in the kernel...

But this patch does not return that information to the user. So we have the
drivers reporting a link layer which is no longer exposed to userspace.

I think we need to separate the rdma_tech_ib (or cap_tech_ib or whatever) from
reporting the link layer.

>
> link_layer_show() was supposed to report the same info to user
> space as usual, so user tools don't have to change anything :-)

We need to expose the "cap_*" functionality to userspace which can then convert
to this interface and stop relying on inferring support based on the link
layer. But that is a separate issue from correctly reporting the link layer.

The link layer should be reported correctly from the driver's "get_link_layer"
call.
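
To make the separation concrete, here is a minimal userspace sketch of a show routine that reports whatever link layer the driver returned, including the "Unknown" case that the two-way rdma_tech_ib() test drops. The enum and the sketch_* function names are stand-ins invented for this example; this is not the sysfs code itself.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for enum rdma_link_layer. */
enum sketch_link_layer {
	SKETCH_LINK_LAYER_UNSPECIFIED,
	SKETCH_LINK_LAYER_INFINIBAND,
	SKETCH_LINK_LAYER_ETHERNET,
};

/* Map the driver-reported link layer to the string shown to userspace. */
static const char *sketch_link_layer_name(enum sketch_link_layer ll)
{
	switch (ll) {
	case SKETCH_LINK_LAYER_INFINIBAND:
		return "InfiniBand";
	case SKETCH_LINK_LAYER_ETHERNET:
		return "Ethernet";
	default:
		/* Anything else stays visible as "Unknown", not "Ethernet". */
		return "Unknown";
	}
}

/* Format the attribute; returns the number of characters written. */
static int sketch_link_layer_show(enum sketch_link_layer ll,
				  char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", sketch_link_layer_name(ll));
}
```

Capability checks such as cap_* can then be exposed separately, without this attribute depending on them.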

Ira

>
> Regards,
> Michael Wang
>
> >
> > Ira
> >
> >> return sprintf(buf, "%s\n", "InfiniBand");
> >> - case IB_LINK_LAYER_ETHERNET:
> >> + else
> >> return sprintf(buf, "%s\n", "Ethernet");
> >> - default:
> >> - return sprintf(buf, "%s\n", "Unknown");
> >> - }
> >> }
> >>
> >> static PORT_ATTR_RO(state);
> >> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> >> index a9f0489..5dc90aa 100644
> >> --- a/drivers/infiniband/core/uverbs_cmd.c
> >> +++ b/drivers/infiniband/core/uverbs_cmd.c
> >> @@ -515,8 +515,10 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
> >> resp.active_width = attr.active_width;
> >> resp.active_speed = attr.active_speed;
> >> resp.phys_state = attr.phys_state;
> >> - resp.link_layer = rdma_port_get_link_layer(file->device->ib_dev,
> >> - cmd.port_num);
> >> + resp.link_layer = rdma_tech_ib(file->device->ib_dev,
> >> + cmd.port_num) ?
> >> + IB_LINK_LAYER_INFINIBAND :
> >> + IB_LINK_LAYER_ETHERNET;
> >>
> >> if (copy_to_user((void __user *) (unsigned long) cmd.response,
> >> &resp, sizeof resp))
> >> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> >> index 626c9cf..7264860 100644
> >> --- a/drivers/infiniband/core/verbs.c
> >> +++ b/drivers/infiniband/core/verbs.c
> >> @@ -200,11 +200,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
> >> u32 flow_class;
> >> u16 gid_index;
> >> int ret;
> >> - int is_eth = (rdma_port_get_link_layer(device, port_num) ==
> >> - IB_LINK_LAYER_ETHERNET);
> >>
> >> memset(ah_attr, 0, sizeof *ah_attr);
> >> - if (is_eth) {
> >> + if (rdma_tech_iboe(device, port_num)) {
> >> if (!(wc->wc_flags & IB_WC_GRH))
> >> return -EPROTOTYPE;
> >>
> >> @@ -873,7 +871,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
> >> union ib_gid sgid;
> >>
> >> if ((*qp_attr_mask & IB_QP_AV) &&
> >> - (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == IB_LINK_LAYER_ETHERNET)) {
> >> + (rdma_tech_iboe(qp->device, qp_attr->ah_attr.port_num))) {
> >> ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
> >> qp_attr->ah_attr.grh.sgid_index, &sgid);
> >> if (ret)
> >> --
> >> 2.1.0

2015-04-22 16:41:05

by Jason Gunthorpe

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Wed, Apr 22, 2015 at 11:38:34AM +0000, Liran Liss wrote:

> This is redundant. All IB ports have SMI, so if you know that you
> are using an IB device, you know you have an SMI.

You should really go back and read the whole thread, this has already
been discussed.

The patch set was developed from the 'bottom up' - all the points that
did 'if is iwarp/rocee/ib/foo' were examined, Michael figured out what
*difference* that code actually required and made a dedicated test for it.

It turns out, one of those differences is SMI, !SMI.

The tests are an inventory of all the spec differences the code cares
about. This is the entire point.

Yes, of course, an abstract notion like <link-type, transport,
node-type> can describe the same state space, but then the call sites
lose the insight into *WHY* the code cares, and *WHAT* the difference
is.

This is bad:
if (rdma_standard_ib() || rdma_standard_rocee() ||
rdma_standard_opa())

This is worse:
if (!rdma_standard_iwarp())

This is better:
if (rdma_cap_mad())

Going forward, we are growing more tests, and worse, they are for
standards that are not public. We need to stop open-coding 'is
standard' type code and actually start documenting these differences.
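
As a small sketch of the difference, assuming a made-up enum of standards and hypothetical sketch_* names (none of this is the patch set's code): the knowledge of which standards carry MADs lives in one documented helper, and the call site only states what it needs.

```c
#include <stdbool.h>

/* Hypothetical list of standards, for illustration only. */
enum sketch_standard {
	SKETCH_STD_IB,
	SKETCH_STD_ROCE,
	SKETCH_STD_OPA,
	SKETCH_STD_IWARP,
};

/*
 * The one place that records which standards use MADs.
 * Adding a new standard means touching this switch, not every call site.
 */
static bool sketch_cap_mad(enum sketch_standard std)
{
	switch (std) {
	case SKETCH_STD_IB:
	case SKETCH_STD_ROCE:
	case SKETCH_STD_OPA:
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical call site: it says "I need MAD support" and nothing
 * about which standard the port implements.
 * Returns 0 on success, -1 when the port has no MAD support.
 */
static int sketch_register_mad_agent(enum sketch_standard std)
{
	if (!sketch_cap_mad(std))
		return -1;
	return 0;
}
```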

Jason

2015-04-22 16:40:59

by Hefty, Sean

Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

> > So, I think that our "old-transport" below is just fine.
> > No need to change it (and you aren't, since it is currently implemented
> as a function).
>
> I think there is a need to change this. Encoding the transport into the
> node
> type is not a good idea. Having different "transport semantics" while
> still
> returning the same transport for the port is confusing.
>
> The only thing which is clear currently is Link Layer.
>
> But the use of "Link Layer" in the code is so convoluted that it is very
> confusing.

I agree.

One could implement software iWarp or IBoUDP (RoCEv2) protocols that could run over any link layer and interoperate with existing HW solutions. The stack shouldn't be dealing with the link level at all, with the exception of user space compatibility.

> Define Transport? There has been a lot of discussion over what a
> transport is
> in Verbs.

IMO, we should replace using the word 'transport' with just 'rdma_protocol'. And even then I'm not convinced that anything should care, beyond user space compatibility. The caps are what matter.

- Sean


2015-04-22 16:42:21

by Doug Ledford

Subject: Re: [PATCH v5 01/27] IB/Verbs: Implement new callback query_transport()

On Wed, 2015-04-22 at 15:21 +0000, Devesh Sharma wrote:
> > -----Original Message-----
> > From: [email protected] [mailto:linux-rdma-
> > [email protected]] On Behalf Of Doug Ledford
> > Sent: Wednesday, April 22, 2015 8:33 PM
> > To: Michael Wang
> > Cc: Roland Dreier; Sean Hefty; [email protected]; linux-
> > [email protected]; [email protected]; Tom Tucker; Steve Wise;
> > Hoang-Nam Nguyen; Christoph Raisch; Mike Marciniszyn; Eli Cohen; Faisal
> > Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran; Ira Weiny; Tom Talpey; Jason
> > Gunthorpe
> > Subject: Re: [PATCH v5 01/27] IB/Verbs: Implement new callback
> > query_transport()
> >
> > On Mon, 2015-04-20 at 10:32 +0200, Michael Wang wrote:
> > > Add new callback query_transport() and implement for each HW.
> >
> > The more I think about it, the more I think we need to eliminate this patch
> > entirely.
> >
> > The problem here is that, if we follow my suggestion, then we are going to
> > eliminate the query as an API function and replace the information it gives us
> > with a static port attribute bitmap. If we do this patch, then reform this patch
> > to my idea later, we introduce a very short lived API/ABI change in the kernel
> > module interface that serves absolutely no purpose. Instead, let's do the
> > bitmap creation first, update the drivers to properly set the bitmap, then do all
> > of the remaining reforms you have here using that bitmap and completely skip
> > the
> > query_transport() API item that will no longer serve a purpose.
>
> Any vendor device that registers with the IB stack already has capability flags; are those the bitmaps you are referring to?
> Is it possible to use them as the bitmaps you are referring to? I am trying to understand your complete idea.

There are two capability flags right now, a device caps flag set and a
port caps flag set (not counting possible driver internal flags).
Regardless of whether or not we wanted to use one or the other, they are
both too full to be used for our purposes. We can't get enough bits.
We would get to drop the node_type, but that's only a u8 and so it
doesn't have enough bits either. In addition, node_type is set per
device, and it would really be best if our new bitmap were per port.

That then raises the issue that right now, the core code doesn't have
direct access to per-port information, everything is done via a
combination of direct access to the per device node check and per port
driver callbacks. We may have no choice but to continue with that for
now, but I find that inefficient. And if we make the bitmap per port,
then even our node check becomes a callback.

So, what we need to do, is define a specific bitmap just for the
node/transport/link bits and the capability bits that explicitly go with
that tuple *and* define a way to access it without necessarily requiring
a callback.

Michael's patch set has gone through a lot of revisions and I think we
are getting close to the set of things we want to know, so it shouldn't
be that hard now to create the proper bitmap that makes knowing those
things quick and efficient.

The harder part will be making it accessible. Since the current struct
ib_device doesn't include direct access to the ports (it has a list head
for port_list, but that's just for the sysfs entries for this device, it
doesn't hold anything we can use for this) we are limited to either A)
adding a new callback or B) changing node_type to port_type, making it a
larger size, and making it a port indexed array.
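
Option B might look roughly like the following userspace sketch. Everything here is an assumption for illustration: the sketch_* names, the bit assignments, the fixed two-port limit, and the 1-based port numbering. The example device resembles a dual-port adapter with an IB port and a RoCE port.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits, named after the cap_*() helpers. */
#define SKETCH_CAP_IB_MAD (1u << 0)
#define SKETCH_CAP_IB_SMI (1u << 1)
#define SKETCH_CAP_IB_CM  (1u << 2)
#define SKETCH_CAP_IW_CM  (1u << 3)
#define SKETCH_CAP_IB_SA  (1u << 4)

#define SKETCH_MAX_PORTS 2

/* Stand-in for struct ib_device with a port-indexed capability array. */
struct sketch_device {
	uint8_t phys_port_cnt;
	uint32_t port_caps[SKETCH_MAX_PORTS];	/* slot 0 holds port 1 */
};

/* True when every bit in 'caps' is set for the given (1-based) port. */
static bool sketch_port_has(const struct sketch_device *dev,
			    uint8_t port_num, uint32_t caps)
{
	if (port_num < 1 || port_num > dev->phys_port_cnt)
		return false;
	return (dev->port_caps[port_num - 1] & caps) == caps;
}

/* Example: port 1 is InfiniBand, port 2 is RoCE (no SMI, no SA). */
static const struct sketch_device sketch_dual_port = {
	.phys_port_cnt = 2,
	.port_caps = {
		SKETCH_CAP_IB_MAD | SKETCH_CAP_IB_SMI |
		SKETCH_CAP_IB_CM | SKETCH_CAP_IB_SA,
		SKETCH_CAP_IB_MAD | SKETCH_CAP_IB_CM,
	},
};
```

The lookup is a plain array read, so the core code would not need a driver callback to answer a per-port capability question.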

--
Doug Ledford <[email protected]>
GPG KeyID: 0E572FDD




2015-04-22 16:45:39

by Ira Weiny

Subject: Re: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()

On Wed, Apr 22, 2015 at 10:49:44AM +0200, Michael Wang wrote:
>
> On 04/22/2015 07:40 AM, Jason Gunthorpe wrote:
> > On Mon, Apr 20, 2015 at 10:41:38AM +0200, Michael Wang wrote:
> >
> >> Introduce helper cap_ipoib() to help us check if the port of an
> >> IB device supports IP over InfiniBand.
> >
> > I thought we were dropping this in favor of listing the actual
> > features the ULP required unconditionally? One of my messages had the
> > start of a list..

??? I forget. I was arguing that we should not have it. But I thought others
disagreed with me so it was left in.

V4 of this patch had no responses.

https://www.mail-archive.com/[email protected]/msg24040.html

Jason, I can't find the email where you mentioned a list?

Ira

2015-04-22 16:56:18

by Dave Goodell

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Apr 21, 2015, at 6:36 PM, Liran Liss <[email protected]> wrote:

> An ib_dev (or a port of) should be distinguished by 3 qualifiers:
> - The link layer:
> -- Ethernet (shared by iWARP, USNIC, and ROCE)
> -- Infiniband
>
> - The transport (*)
> -- IBTA transport (shared by IB and ROCE)
> -- iWARP transport
> -- USNIC transport

I haven't been following this discussion as closely as I could have, but I want to clarify something about usNIC. There are two "transports" used by usNIC:

1. The legacy RDMA_TRANSPORT_USNIC type, which indicates usNIC traffic will be Ethernet frames with Ethertype==0x8915 (like RoCE) but containing a custom, non-IBTA-sanctioned header format instead of a full GRH. This "transport" is still supported by the usnic_verbs kernel driver but in practice is no longer in use. For this "transport" there isn't really any clear L3/L4 header to point to.

2. The current RDMA_TRANSPORT_USNIC_UDP type, which indicates usNIC traffic will be standard UDP/IP/Ethernet packets.

> (*) Transport means both:
> - The L4 wire protocols (e.g., BTH+ headers of IBTA, optionally encapsulated by UDP in ROCEv2, or the iWARP stack)
> - The transport semantics (for example, there are slight semantic differences between IBTA and iWARP)

No usNIC hardware or software currently performs any hardware offload of RDMA features (only UD is supported), so there is no usNIC equivalent of the BTH+ to discuss right now.

-Dave

2015-04-22 16:54:57

by Hefty, Sean

Subject: RE: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()

> > > I thought we were dropping this in favor of listing the actual
> > > features the ULP required unconditionally? One of my messages had the
> > > start of a list..
>
> ??? I forget. I was arguing that we should not have it. But I thought
> others
> disagreed with me so it was left in.

I don't remember, but I agree with Jason. The ULPs should check for the features that they need.

2015-04-22 16:57:38

by Jason Gunthorpe

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Wed, Apr 22, 2015 at 10:59:52AM -0400, Doug Ledford wrote:

> > 2)The name rdma_tech_* is lame.
> > rdma_transport_*(), adhering to the above (*) remark, is much better.
> > For example, both IB and ROCE *do* use the same transport.
>
> I especially want to second this. I haven't really been happy with the
> rdma_tech_* names at all.

I'm not excited about the names either..

cap_ is bad because it pollutes the global namespace.

rdma_tech_ .. as used, this is selecting the standard the port
implements. The word 'standard' is a better choice than 'transport',
and 'technology' is often synonymous with 'standard'. Meh.

I've said it already, but this patch set has probably gotten too
big. If we could just do the cap conversion without messing with other
stuff, or adding rdma_tech, that would really be the best.

Nobody seems to like the rdma_tech parts of this series.

I'd also drop '[PATCH v5 09/27] IB/Verbs: Reform IB-core
verbs/uverbs_cmd/sysfs' - that is UAPI stuff, it could be done as a
followup someday, not worth the risk right now.

Jason

2015-04-22 17:11:04

by Ira Weiny

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Wed, Apr 22, 2015 at 10:59:52AM -0400, Doug Ledford wrote:
> On Tue, 2015-04-21 at 23:36 +0000, Liran Liss wrote:

[snip]

> >
> > 2)The name rdma_tech_* is lame.
> > rdma_transport_*(), adhering to the above (*) remark, is much better.
> > For example, both IB and ROCE *do* use the same transport.
>
> I especially want to second this. I haven't really been happy with the
> rdma_tech_* names at all.
>

I am sure Michael is open to alternative names. I know I am. The problem is
that we can't figure out what "IBoE" is. It is not a transport, even though
query_transport is now returning it as one. :-P

I think the idea behind the "tech" name was that it is a technology "family".
I can't think of a better name.
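
As a rough illustration of why "IBoE" reads more like a family than a
transport, the sketch below derives it from the old <transport, link
layer> pair, following the mapping list in the cover letter. The
model_* names are hypothetical stand-ins, not the kernel enums, and the
legacy USNIC transport is left out for brevity.

```c
#include <assert.h>

/* Hypothetical model types; the real enums live in include/rdma/ib_verbs.h. */
enum model_transport {
	MODEL_TRANSPORT_IB,
	MODEL_TRANSPORT_IWARP,
	MODEL_TRANSPORT_USNIC_UDP,
};

enum model_link_layer {
	MODEL_LINK_IB,
	MODEL_LINK_ETH,
};

/* The "technology family" that query_transport() reports in this series. */
enum model_tech {
	MODEL_TECH_IB,
	MODEL_TECH_IBOE,
	MODEL_TECH_IWARP,
	MODEL_TECH_USNIC_UDP,
};

static enum model_tech model_tech_of(enum model_transport transport,
				     enum model_link_layer link)
{
	switch (transport) {
	case MODEL_TRANSPORT_IB:
		/* Same IB transport; only the link layer tells IB from IBoE. */
		return link == MODEL_LINK_ETH ? MODEL_TECH_IBOE : MODEL_TECH_IB;
	case MODEL_TRANSPORT_IWARP:
		assert(link == MODEL_LINK_ETH);	/* iWARP rows are all ETH */
		return MODEL_TECH_IWARP;
	case MODEL_TRANSPORT_USNIC_UDP:
		assert(link == MODEL_LINK_ETH);	/* usnic row is ETH */
		return MODEL_TECH_USNIC_UDP;
	}
	assert(0 && "unknown transport");
	return MODEL_TECH_IB;
}
```

The IB and IBoE cases share one transport value and differ only in the
link layer, which is the naming problem in a nutshell.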

Ira

2015-04-22 17:25:06

by Jason Gunthorpe

Subject: Re: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()

On Wed, Apr 22, 2015 at 10:49:44AM +0200, Michael Wang wrote:
>
> On 04/22/2015 07:40 AM, Jason Gunthorpe wrote:
> > On Mon, Apr 20, 2015 at 10:41:38AM +0200, Michael Wang wrote:
> >
> >> Introduce helper cap_ipoib() to help us check if the port of an
> >> IB device support IP over Infiniband.
> >
> > I thought we were dropping this in favor of listing the actual
> > features the ULP required unconditionally? One of my messages had the
> > start of a list..
>
> Shall we drop it now or wait until the mechanism introduced?
>
> Just wondering the requirement of ULP could be similar to the
> requirement of management, isn't it? if the device can tell
> which ULP it support, then may be a cap_XX() make sense in here?

You have to audit the ipoib driver and see what core functions it
calls that have cap requirements themselves.

At least SA, multicast and CM. It also requires cap_ib_ah() or
whatever we called that.
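
Roughly, and only as a sketch: the ULP would list the features it
needs and test them all, instead of asking one "does this port do
ipoib" question. The struct and field names below (model_port, ib_sa,
ib_mcast, ib_cm, ib_ah) are made up for illustration and are not
helpers from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port feature flags, one per core service. */
struct model_port {
	bool ib_sa;	/* subnet administrator queries */
	bool ib_mcast;	/* multicast join/leave */
	bool ib_cm;	/* IB connection manager */
	bool ib_ah;	/* IB-style address handles */
};

/* The ULP checks every feature it depends on, nothing more abstract. */
static bool model_ipoib_usable(const struct model_port *port)
{
	assert(port != NULL);
	return port->ib_sa && port->ib_mcast && port->ib_cm && port->ib_ah;
}
```

If the audit turns up another dependency, it becomes one more term in
that expression rather than a new per-ULP capability.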

Jason

2015-04-23 07:13:59

by Michael Wang

Subject: Re: [PATCH v5 01/27] IB/Verbs: Implement new callback query_transport()



On 04/22/2015 05:02 PM, Doug Ledford wrote:
> On Mon, 2015-04-20 at 10:32 +0200, Michael Wang wrote:
>> Add new callback query_transport() and implement for each HW.
>
> The more I think about it, the more I think we need to eliminate this
> patch entirely.
>
> The problem here is that, if we follow my suggestion, then we are going
> to eliminate the query as an API function and replace the information it
> gives us with a static port attribute bitmap. If we do this patch, then
> reform this patch to my idea later, we introduce a very short lived
> API/ABI change in the kernel module interface that serves absolutely no
> purpose. Instead, let's do the bitmap creation first, update the
> drivers to properly set the bitmap, then do all of the remaining reforms
> you have here using that bitmap and completely skip the
> query_transport() API item that will no longer serve a purpose.

I'd really prefer to see the bitmask mechanism come along with the bits
defined; at least we are not going to use transport anymore, correct?

I think the bitmask will need more discussion, not only about the
definition of each bit but also about when to initialize it (that
should happen before each HW driver registers the device).

That's really a different topic, which we can't settle within a few
versions IMHO, and I'd really like to stage our progress at this point.

With this foundation, it would be really easy to expand the topic
further and let more folks join from the beginning, but if we introduce
the bitmask at this moment, the topic will be mixed with different
purposes and become confusing...

We can do whatever reform/discussion is needed in the next series, where
the topic would be clean and clear ;-)
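
On the timing point, a minimal sketch of what "initialize before
register" could mean, assuming a hypothetical per-port bitmap: the
driver fills the bits first, and registration refuses a device that
left any port empty, in the spirit of ib_device_check_mandatory().
struct model_device and model_register_device() are illustrative names
only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_MAX_PORTS 2

/* Userspace stand-in for a device carrying a per-port bitmap. */
struct model_device {
	uint8_t  num_ports;
	uint32_t port_caps[MODEL_MAX_PORTS];	/* set by the driver */
	int      registered;
};

/* Reject registration unless every port already has its bits set. */
static int model_register_device(struct model_device *dev)
{
	uint8_t i;

	assert(dev->num_ports <= MODEL_MAX_PORTS);
	for (i = 0; i < dev->num_ports; i++)
		if (!dev->port_caps[i])
			return -EINVAL;

	dev->registered = 1;
	return 0;
}
```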

Regards,
Michael Wang

>
>> Mapping List:
>> node-type link-layer old-transport new-transport
>> nes RNIC ETH IWARP IWARP
>> amso1100 RNIC ETH IWARP IWARP
>> cxgb3 RNIC ETH IWARP IWARP
>> cxgb4 RNIC ETH IWARP IWARP
>> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
>> ocrdma IB_CA ETH IB IBOE
>> mlx4 IB_CA IB/ETH IB IB/IBOE
>> mlx5 IB_CA IB IB IB
>> ehca IB_CA IB IB IB
>> ipath IB_CA IB IB IB
>> mthca IB_CA IB IB IB
>> qib IB_CA IB IB IB
>>
>> Cc: Hal Rosenstock <[email protected]>
>> Cc: Steve Wise <[email protected]>
>> Cc: Tom Talpey <[email protected]>
>> Cc: Jason Gunthorpe <[email protected]>
>> Cc: Doug Ledford <[email protected]>
>> Cc: Ira Weiny <[email protected]>
>> Cc: Sean Hefty <[email protected]>
>> Signed-off-by: Michael Wang <[email protected]>
>> ---
>> drivers/infiniband/core/device.c | 1 +
>> drivers/infiniband/core/verbs.c | 4 +++-
>> drivers/infiniband/hw/amso1100/c2_provider.c | 7 +++++++
>> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7 +++++++
>> drivers/infiniband/hw/cxgb4/provider.c | 7 +++++++
>> drivers/infiniband/hw/ehca/ehca_hca.c | 6 ++++++
>> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3 +++
>> drivers/infiniband/hw/ehca/ehca_main.c | 1 +
>> drivers/infiniband/hw/ipath/ipath_verbs.c | 7 +++++++
>> drivers/infiniband/hw/mlx4/main.c | 10 ++++++++++
>> drivers/infiniband/hw/mlx5/main.c | 7 +++++++
>> drivers/infiniband/hw/mthca/mthca_provider.c | 7 +++++++
>> drivers/infiniband/hw/nes/nes_verbs.c | 6 ++++++
>> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 +
>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6 ++++++
>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3 +++
>> drivers/infiniband/hw/qib/qib_verbs.c | 7 +++++++
>> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 +
>> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6 ++++++
>> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2 ++
>> include/rdma/ib_verbs.h | 7 ++++++-
>> 21 files changed, 104 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
>> index 18c1ece..a9587c4 100644
>> --- a/drivers/infiniband/core/device.c
>> +++ b/drivers/infiniband/core/device.c
>> @@ -76,6 +76,7 @@ static int ib_device_check_mandatory(struct ib_device *device)
>> } mandatory_table[] = {
>> IB_MANDATORY_FUNC(query_device),
>> IB_MANDATORY_FUNC(query_port),
>> + IB_MANDATORY_FUNC(query_transport),
>> IB_MANDATORY_FUNC(query_pkey),
>> IB_MANDATORY_FUNC(query_gid),
>> IB_MANDATORY_FUNC(alloc_pd),
>> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
>> index f93eb8d..626c9cf 100644
>> --- a/drivers/infiniband/core/verbs.c
>> +++ b/drivers/infiniband/core/verbs.c
>> @@ -133,14 +133,16 @@ enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_
>> if (device->get_link_layer)
>> return device->get_link_layer(device, port_num);
>>
>> - switch (rdma_node_get_transport(device->node_type)) {
>> + switch (device->query_transport(device, port_num)) {
>> case RDMA_TRANSPORT_IB:
>> return IB_LINK_LAYER_INFINIBAND;
>> + case RDMA_TRANSPORT_IBOE:
>> case RDMA_TRANSPORT_IWARP:
>> case RDMA_TRANSPORT_USNIC:
>> case RDMA_TRANSPORT_USNIC_UDP:
>> return IB_LINK_LAYER_ETHERNET;
>> default:
>> + BUG();
>> return IB_LINK_LAYER_UNSPECIFIED;
>> }
>> }
>> diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c
>> index bdf3507..d46bbb0 100644
>> --- a/drivers/infiniband/hw/amso1100/c2_provider.c
>> +++ b/drivers/infiniband/hw/amso1100/c2_provider.c
>> @@ -99,6 +99,12 @@ static int c2_query_port(struct ib_device *ibdev,
>> return 0;
>> }
>>
>> +static enum rdma_transport_type
>> +c2_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IWARP;
>> +}
>> +
>> static int c2_query_pkey(struct ib_device *ibdev,
>> u8 port, u16 index, u16 * pkey)
>> {
>> @@ -801,6 +807,7 @@ int c2_register_device(struct c2_dev *dev)
>> dev->ibdev.dma_device = &dev->pcidev->dev;
>> dev->ibdev.query_device = c2_query_device;
>> dev->ibdev.query_port = c2_query_port;
>> + dev->ibdev.query_transport = c2_query_transport;
>> dev->ibdev.query_pkey = c2_query_pkey;
>> dev->ibdev.query_gid = c2_query_gid;
>> dev->ibdev.alloc_ucontext = c2_alloc_ucontext;
>> diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
>> index 811b24a..09682e9e 100644
>> --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
>> +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
>> @@ -1232,6 +1232,12 @@ static int iwch_query_port(struct ib_device *ibdev,
>> return 0;
>> }
>>
>> +static enum rdma_transport_type
>> +iwch_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IWARP;
>> +}
>> +
>> static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
>> char *buf)
>> {
>> @@ -1385,6 +1391,7 @@ int iwch_register_device(struct iwch_dev *dev)
>> dev->ibdev.dma_device = &(dev->rdev.rnic_info.pdev->dev);
>> dev->ibdev.query_device = iwch_query_device;
>> dev->ibdev.query_port = iwch_query_port;
>> + dev->ibdev.query_transport = iwch_query_transport;
>> dev->ibdev.query_pkey = iwch_query_pkey;
>> dev->ibdev.query_gid = iwch_query_gid;
>> dev->ibdev.alloc_ucontext = iwch_alloc_ucontext;
>> diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
>> index 66bd6a2..a445e0d 100644
>> --- a/drivers/infiniband/hw/cxgb4/provider.c
>> +++ b/drivers/infiniband/hw/cxgb4/provider.c
>> @@ -390,6 +390,12 @@ static int c4iw_query_port(struct ib_device *ibdev, u8 port,
>> return 0;
>> }
>>
>> +static enum rdma_transport_type
>> +c4iw_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IWARP;
>> +}
>> +
>> static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
>> char *buf)
>> {
>> @@ -506,6 +512,7 @@ int c4iw_register_device(struct c4iw_dev *dev)
>> dev->ibdev.dma_device = &(dev->rdev.lldi.pdev->dev);
>> dev->ibdev.query_device = c4iw_query_device;
>> dev->ibdev.query_port = c4iw_query_port;
>> + dev->ibdev.query_transport = c4iw_query_transport;
>> dev->ibdev.query_pkey = c4iw_query_pkey;
>> dev->ibdev.query_gid = c4iw_query_gid;
>> dev->ibdev.alloc_ucontext = c4iw_alloc_ucontext;
>> diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
>> index 9ed4d25..d5a34a6 100644
>> --- a/drivers/infiniband/hw/ehca/ehca_hca.c
>> +++ b/drivers/infiniband/hw/ehca/ehca_hca.c
>> @@ -242,6 +242,12 @@ query_port1:
>> return ret;
>> }
>>
>> +enum rdma_transport_type
>> +ehca_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IB;
>> +}
>> +
>> int ehca_query_sma_attr(struct ehca_shca *shca,
>> u8 port, struct ehca_sma_attr *attr)
>> {
>> diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
>> index 22f79af..cec945f 100644
>> --- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
>> +++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
>> @@ -49,6 +49,9 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props);
>> int ehca_query_port(struct ib_device *ibdev, u8 port,
>> struct ib_port_attr *props);
>>
>> +enum rdma_transport_type
>> +ehca_query_transport(struct ib_device *device, u8 port_num);
>> +
>> int ehca_query_sma_attr(struct ehca_shca *shca, u8 port,
>> struct ehca_sma_attr *attr);
>>
>> diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
>> index cd8d290..60e0a09 100644
>> --- a/drivers/infiniband/hw/ehca/ehca_main.c
>> +++ b/drivers/infiniband/hw/ehca/ehca_main.c
>> @@ -467,6 +467,7 @@ static int ehca_init_device(struct ehca_shca *shca)
>> shca->ib_device.dma_device = &shca->ofdev->dev;
>> shca->ib_device.query_device = ehca_query_device;
>> shca->ib_device.query_port = ehca_query_port;
>> + shca->ib_device.query_transport = ehca_query_transport;
>> shca->ib_device.query_gid = ehca_query_gid;
>> shca->ib_device.query_pkey = ehca_query_pkey;
>> /* shca->in_device.modify_device = ehca_modify_device */
>> diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
>> index 44ea939..58d36e3 100644
>> --- a/drivers/infiniband/hw/ipath/ipath_verbs.c
>> +++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
>> @@ -1638,6 +1638,12 @@ static int ipath_query_port(struct ib_device *ibdev,
>> return 0;
>> }
>>
>> +static enum rdma_transport_type
>> +ipath_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IB;
>> +}
>> +
>> static int ipath_modify_device(struct ib_device *device,
>> int device_modify_mask,
>> struct ib_device_modify *device_modify)
>> @@ -2140,6 +2146,7 @@ int ipath_register_ib_device(struct ipath_devdata *dd)
>> dev->query_device = ipath_query_device;
>> dev->modify_device = ipath_modify_device;
>> dev->query_port = ipath_query_port;
>> + dev->query_transport = ipath_query_transport;
>> dev->modify_port = ipath_modify_port;
>> dev->query_pkey = ipath_query_pkey;
>> dev->query_gid = ipath_query_gid;
>> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
>> index b972c0b..e1424ad 100644
>> --- a/drivers/infiniband/hw/mlx4/main.c
>> +++ b/drivers/infiniband/hw/mlx4/main.c
>> @@ -420,6 +420,15 @@ static int mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
>> return __mlx4_ib_query_port(ibdev, port, props, 0);
>> }
>>
>> +static enum rdma_transport_type
>> +mlx4_ib_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + struct mlx4_dev *dev = to_mdev(device)->dev;
>> +
>> + return dev->caps.port_mask[port_num] == MLX4_PORT_TYPE_IB ?
>> + RDMA_TRANSPORT_IB : RDMA_TRANSPORT_IBOE;
>> +}
>> +
>> int __mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
>> union ib_gid *gid, int netw_view)
>> {
>> @@ -2201,6 +2210,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
>>
>> ibdev->ib_dev.query_device = mlx4_ib_query_device;
>> ibdev->ib_dev.query_port = mlx4_ib_query_port;
>> + ibdev->ib_dev.query_transport = mlx4_ib_query_transport;
>> ibdev->ib_dev.get_link_layer = mlx4_ib_port_link_layer;
>> ibdev->ib_dev.query_gid = mlx4_ib_query_gid;
>> ibdev->ib_dev.query_pkey = mlx4_ib_query_pkey;
>> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
>> index cc4ac1e..209c796 100644
>> --- a/drivers/infiniband/hw/mlx5/main.c
>> +++ b/drivers/infiniband/hw/mlx5/main.c
>> @@ -351,6 +351,12 @@ out:
>> return err;
>> }
>>
>> +static enum rdma_transport_type
>> +mlx5_ib_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IB;
>> +}
>> +
>> static int mlx5_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
>> union ib_gid *gid)
>> {
>> @@ -1336,6 +1342,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
>>
>> dev->ib_dev.query_device = mlx5_ib_query_device;
>> dev->ib_dev.query_port = mlx5_ib_query_port;
>> + dev->ib_dev.query_transport = mlx5_ib_query_transport;
>> dev->ib_dev.query_gid = mlx5_ib_query_gid;
>> dev->ib_dev.query_pkey = mlx5_ib_query_pkey;
>> dev->ib_dev.modify_device = mlx5_ib_modify_device;
>> diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
>> index 415f8e1..67ac6a4 100644
>> --- a/drivers/infiniband/hw/mthca/mthca_provider.c
>> +++ b/drivers/infiniband/hw/mthca/mthca_provider.c
>> @@ -179,6 +179,12 @@ static int mthca_query_port(struct ib_device *ibdev,
>> return err;
>> }
>>
>> +static enum rdma_transport_type
>> +mthca_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IB;
>> +}
>> +
>> static int mthca_modify_device(struct ib_device *ibdev,
>> int mask,
>> struct ib_device_modify *props)
>> @@ -1281,6 +1287,7 @@ int mthca_register_device(struct mthca_dev *dev)
>> dev->ib_dev.dma_device = &dev->pdev->dev;
>> dev->ib_dev.query_device = mthca_query_device;
>> dev->ib_dev.query_port = mthca_query_port;
>> + dev->ib_dev.query_transport = mthca_query_transport;
>> dev->ib_dev.modify_device = mthca_modify_device;
>> dev->ib_dev.modify_port = mthca_modify_port;
>> dev->ib_dev.query_pkey = mthca_query_pkey;
>> diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
>> index c0d0296..8df5b61 100644
>> --- a/drivers/infiniband/hw/nes/nes_verbs.c
>> +++ b/drivers/infiniband/hw/nes/nes_verbs.c
>> @@ -606,6 +606,11 @@ static int nes_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr
>> return 0;
>> }
>>
>> +static enum rdma_transport_type
>> +nes_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IWARP;
>> +}
>>
>> /**
>> * nes_query_pkey
>> @@ -3879,6 +3884,7 @@ struct nes_ib_device *nes_init_ofa_device(struct net_device *netdev)
>> nesibdev->ibdev.dev.parent = &nesdev->pcidev->dev;
>> nesibdev->ibdev.query_device = nes_query_device;
>> nesibdev->ibdev.query_port = nes_query_port;
>> + nesibdev->ibdev.query_transport = nes_query_transport;
>> nesibdev->ibdev.query_pkey = nes_query_pkey;
>> nesibdev->ibdev.query_gid = nes_query_gid;
>> nesibdev->ibdev.alloc_ucontext = nes_alloc_ucontext;
>> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
>> index 7a2b59a..9f4d182 100644
>> --- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
>> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
>> @@ -244,6 +244,7 @@ static int ocrdma_register_device(struct ocrdma_dev *dev)
>> /* mandatory verbs. */
>> dev->ibdev.query_device = ocrdma_query_device;
>> dev->ibdev.query_port = ocrdma_query_port;
>> + dev->ibdev.query_transport = ocrdma_query_transport;
>> dev->ibdev.modify_port = ocrdma_modify_port;
>> dev->ibdev.query_gid = ocrdma_query_gid;
>> dev->ibdev.get_link_layer = ocrdma_link_layer;
>> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
>> index 8771755..73bace4 100644
>> --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
>> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
>> @@ -187,6 +187,12 @@ int ocrdma_query_port(struct ib_device *ibdev,
>> return 0;
>> }
>>
>> +enum rdma_transport_type
>> +ocrdma_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IBOE;
>> +}
>> +
>> int ocrdma_modify_port(struct ib_device *ibdev, u8 port, int mask,
>> struct ib_port_modify *props)
>> {
>> diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
>> index b8f7853..4a81b63 100644
>> --- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
>> +++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
>> @@ -41,6 +41,9 @@ int ocrdma_query_port(struct ib_device *, u8 port, struct ib_port_attr *props);
>> int ocrdma_modify_port(struct ib_device *, u8 port, int mask,
>> struct ib_port_modify *props);
>>
>> +enum rdma_transport_type
>> +ocrdma_query_transport(struct ib_device *device, u8 port_num);
>> +
>> void ocrdma_get_guid(struct ocrdma_dev *, u8 *guid);
>> int ocrdma_query_gid(struct ib_device *, u8 port,
>> int index, union ib_gid *gid);
>> diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
>> index 4a35998..caad665 100644
>> --- a/drivers/infiniband/hw/qib/qib_verbs.c
>> +++ b/drivers/infiniband/hw/qib/qib_verbs.c
>> @@ -1650,6 +1650,12 @@ static int qib_query_port(struct ib_device *ibdev, u8 port,
>> return 0;
>> }
>>
>> +static enum rdma_transport_type
>> +qib_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_IB;
>> +}
>> +
>> static int qib_modify_device(struct ib_device *device,
>> int device_modify_mask,
>> struct ib_device_modify *device_modify)
>> @@ -2184,6 +2190,7 @@ int qib_register_ib_device(struct qib_devdata *dd)
>> ibdev->query_device = qib_query_device;
>> ibdev->modify_device = qib_modify_device;
>> ibdev->query_port = qib_query_port;
>> + ibdev->query_transport = qib_query_transport;
>> ibdev->modify_port = qib_modify_port;
>> ibdev->query_pkey = qib_query_pkey;
>> ibdev->query_gid = qib_query_gid;
>> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
>> index 0d0f986..03ea9f3 100644
>> --- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
>> +++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
>> @@ -360,6 +360,7 @@ static void *usnic_ib_device_add(struct pci_dev *dev)
>>
>> us_ibdev->ib_dev.query_device = usnic_ib_query_device;
>> us_ibdev->ib_dev.query_port = usnic_ib_query_port;
>> + us_ibdev->ib_dev.query_transport = usnic_ib_query_transport;
>> us_ibdev->ib_dev.query_pkey = usnic_ib_query_pkey;
>> us_ibdev->ib_dev.query_gid = usnic_ib_query_gid;
>> us_ibdev->ib_dev.get_link_layer = usnic_ib_port_link_layer;
>> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
>> index 53bd6a2..ff9a5f7 100644
>> --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
>> +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
>> @@ -348,6 +348,12 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
>> return 0;
>> }
>>
>> +enum rdma_transport_type
>> +usnic_ib_query_transport(struct ib_device *device, u8 port_num)
>> +{
>> + return RDMA_TRANSPORT_USNIC_UDP;
>> +}
>> +
>> int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
>> int qp_attr_mask,
>> struct ib_qp_init_attr *qp_init_attr)
>> diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
>> index bb864f5..0b1633b 100644
>> --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
>> +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
>> @@ -27,6 +27,8 @@ int usnic_ib_query_device(struct ib_device *ibdev,
>> struct ib_device_attr *props);
>> int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
>> struct ib_port_attr *props);
>> +enum rdma_transport_type
>> +usnic_ib_query_transport(struct ib_device *device, u8 port_num);
>> int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
>> int qp_attr_mask,
>> struct ib_qp_init_attr *qp_init_attr);
>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>> index 65994a1..d54f91e 100644
>> --- a/include/rdma/ib_verbs.h
>> +++ b/include/rdma/ib_verbs.h
>> @@ -75,10 +75,13 @@ enum rdma_node_type {
>> };
>>
>> enum rdma_transport_type {
>> + /* legacy for users */
>> RDMA_TRANSPORT_IB,
>> RDMA_TRANSPORT_IWARP,
>> RDMA_TRANSPORT_USNIC,
>> - RDMA_TRANSPORT_USNIC_UDP
>> + RDMA_TRANSPORT_USNIC_UDP,
>> + /* new transport */
>> + RDMA_TRANSPORT_IBOE,
>> };
>>
>> __attribute_const__ enum rdma_transport_type
>> @@ -1501,6 +1504,8 @@ struct ib_device {
>> int (*query_port)(struct ib_device *device,
>> u8 port_num,
>> struct ib_port_attr *port_attr);
>> + enum rdma_transport_type (*query_transport)(struct ib_device *device,
>> + u8 port_num);
>> enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
>> u8 port_num);
>> int (*query_gid)(struct ib_device *device,
>
>

2015-04-23 07:33:14

by Michael Wang

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers



On 04/22/2015 06:16 PM, Liran Liss wrote:
[snip]
>
> Depends on who is "we".
> For ULPs, you are probably right.
>
> However, core services (e.g., mad management, CM, SA) do care about various details.
> In some cases, where it doesn't matter, this code will use management helpers.
> In other cases, this code will inspect link, transport, and node attributes of rdma devices.
>
> For example, the CM code has specific code paths for IB, RoCE, and iWARP.
> There is no other CM code; there is no reason to abstract 'CM'. This code will have code
> paths that depend on various specific details.

That's exactly what we want to stop; we have now classified the CM into IB and iWARP :-)

>
>> This new transport is only understand by core-layer currently, for user-layer
>> we still reserve the old transport for them, next step is to use bitmask instead
>> of transport, at that time we can erase the new transport and make the
>> whole stuff used by user-layer only :-)
>>
>
> I am not sure that we need a bit mask at all.
> Your helpers already provide all the useful abstractions, which both core and ULPs call directly.
> All the information is inferred directly from <link, transport, node> tuples.
>
> Some of the user-space tools need *exactly* the same reasoning.
> For example, management tools manage specific technologies and protocols, not some abstraction.
>
> So, For user-space, we can think about exposing exactly the same helper framework, while providing
> backward compatibility for the existing interfaces.

I'd really like to move the bitmask and user-app reform topics into a
different thread...

The bitmask should be the next topic; there has been a lot of discussion
already, but I can imagine far more to come. The user reform should be
the last step, after everything in the kernel has settled down :-)

>
>>>
>>>
>>> Detailed remarks
>>> ==============
>>>
>>> 1) The introduction of cap_*_*() stuff should have been introduced directly
>> in patch 02/27.
>>> This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and
>> increases the number of patches in the patch-set.
>>> Do this and remove patches 16-24.
>>
>> We have some discussion about compress the patch set, merge the reform
>> and introducing patch will mix the concept (like the earlier version), IMHO it
>> will increase the difficulty of review...
>>
>> And now since many review already been done, it's not wise to change the
>> whole structure of patch set IMHO...
>>
>
> I think it is because you are conditioning code on one thing, and then conditioning
> the same code on another thing.
>
> This is confusing.
>
> Once we get our abstractions correct (i.e., the right helper functions), you replace the
> existing logic with the suitable helper up-front.

We need to classify the concepts and integrate them into the mgmt
helpers, which would be very helpful for further reform. Reform followed
by integration doesn't sound that bad, correct?

>
>>>
>>> 2)The name rdma_tech_* is lame.
>>> rdma_transport_*(), adhering to the above (*) remark, is much better.
>>> For example, both IB and ROCE *do* use the same transport.
>>
>> We have some discussion on that too, use transport means going back...
>>
>
> No.
> The existing notion of transport was correct. It was the node type that wasn't.
> And in any case the new helpers didn't use it.
>
> We need the original meaning of transport - see my response to Ira.
> I propose replacing rdma_node_get_transport() with the following helpers:
> - rdma_get_transport()
> - rdma_is_ib_transport()
> - rdma_is_iwarp_transport()

We can change the name at any time: tech/transport/protocol/standard. A
single later patch can easily change it and open the naming topic. Any of
these names will leave someone unsatisfied AFAIK, so I'd like to suggest we
treat this one as a temporary marker and focus on the logic issues.

> - ...
>
>>>
>>> 3) The name cap_* as it is used above is not accurate.
>>> You use it to describe technology characteristics rather than extendable
>> capabilities.
>>> I would suggest having a single convention for all helpers, such as
>> rdma_has_*() and rdma_is_*().
>>> For example: cap_ib_smi() ==> rdma_has_smi().
>>
>> That means going back too...
>
> See response to Ira (https://lkml.org/lkml/2015/4/21/951).
>
>
>>
>>>
>>> 4) Remove all capabilities that do not introduce any distinction in the
>> current code.
>>> We can add them as needed later.
>>> This means remove patches:
>>> - [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all
>>> IB devices support ipoib
>>> - [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all IB
>> devices support AF_IB.
>>>
>>> On the other hand:
>>> - rdma_has_multicast() makes sense, since iWARP doesn’t support it.
>>> - cap_ib_sa() might make sense to cut code even further in the CMA, since
>> RoCE has a GSI but no SA.
>>
>> We have discussion on define these helpers previously, again, name is not
>> really a problem, I would rather to see such changes in the following series
>> after this one working stably :-)
>>
>
> The names are not critical. This comment is about introducing helpers that are
> do not introduce any new semantic notion in the current patch-set.
>
> cap_ipoib(), for example, is brain-dead because only a single technology (as of now)
> enables it: Infiniband.

This will be dropped in the next version :-)

>
>>>
>>> 5) Do no modify phys_state_show() in [PATCH v5 09/27] IB/Verbs: Reform
>>> IB-core verbs/uverbs_cmd/sysfs It *is* the link layer!
>>
>> Actually nothing changed after the modify, the prev purpose it to eliminate
>> the link layer helpers.
>>
>> But now we are not going to remove the helper any more, so let's drop this
>> modification in next version :-)
>>
>
> You don't add modifications just to drop them later.
> Don't add them in the first place!
>
> This patch-set will remain forever in the kernel commit log - we want it to be
> as self-explaining and coherent as possible.
>
> Remove this.

What I mean is that this will be removed in v6...

>
>>>
>>> 6) Remove cap_read_multi_sge
>>> It is not a device/port feature, but a transport capability.
>>> Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in
>>> 'enum ib_device_cap_flags'.
>>>
>>> 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper
>>> cap_eth_ah().
>>> Address handles that refer to Ethernet links always have Ethernet
>>> addressing.
>>>
>>> In the CMA code, using rdma_tech_iboe() is just fine. This is how you define
>>> cap_eth_ah() anyway.
>>> Currently, this patch just adds clutter.
>>
>> There was also some discussion on these helpers; dropping them means going
>> back..
>>
>
> Back to where? Management helpers are a new concept. Let's get them right.

Back to one point during v1~v5.

>
>> The tech helper is not enough to explain the management purpose, and this
>> can be the wrapper for bitmask stuff too.
>>
>
> As I said, I am not sure that we will need any bitmasks.
> Also see response to Ira (https://lkml.org/lkml/2015/4/21/951).

Better discussed in another thread.

>
>>>
>>> 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
>>> We do need a transport qualifier, as exemplified in comment 5) above, and
>>> for a complete clean model.
>>> This is after renaming the function to rdma_is_ib_transport()...
>>
>> This means going back again... rdma_is_ib_transport() has been used
>> previously.
>>
>> This helper is just to make the review easier; we won't need it
>> internally, especially once the bitmask is introduced :-)
>>
>
> The same...
>
>>>
>>>
>>> Putting it all together
>>> ==================
>>>
>>> We are left with the following helpers:
>>> - rdma_is_ib_transport()
>>> - rdma_is_iwarp_transport()
>>> - rdma_is_usnic_transport()
>>> - rdma_is_iboe()
>>> - rdma_has_mad()
>>> - rdma_has_smi()
>>> - rdma_has_gsi() - complements smi; can be used by the mad code for
>>> clarity
>>> - rdma_has_sa()
>>> - rdma_has_cm()
>>> - rdma_has_mcast()
>>
>> I think we can defer the discussion on names and new helpers; for now,
>> let's focus on these basic reforms and get them working stably ;-)
>
> It's not just the names, it's their semantics.
> Any problems with the names proposed above?

These were once used in an old version. Again, no name can satisfy everyone
at this moment, and I'd like to discuss this after the logic is right.
I really don't want folks to focus on this issue, since it won't break
anything and can be easily changed once we have agreement.

Regards,
Michael Wang

>
>>
>> Regards,
>> Michael Wang
>>
>>>
>>>
>>>> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>>>>
>>>>
>>>> Since v4:
>>>> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
>>>> Roland, Ira and Steve :-) Please remind me if anything missed :-P
>>>> * Fix logical issue inside 3#, 14#
>>>> * Refine 3#, 4#, 5# with label 'free'
>>>> * Rework 10# to stop using port 1 when port already assigned
>>>>
>>>> There are plenty of lengthy code to check the transport type of IB
>>>> device, or the link layer type of it's port, but actually we are just
>>>> speculating whether a particular management/feature is supported by the
>> device/port.
>>>>
>>>> Thus instead of inferring, we should have our own mechanism for IB
>>>> management capability/protocol/feature checking, several proposals
>> below.
>>>>
>>>> This patch set will reform the method of getting transport type, we
>>>> will now using query_transport() instead of inferring from transport
>>>> and link layer respectively, also we defined the new transport type
>>>> to make the concept more reasonable.
>>>>
>>>> Mapping List:
>>>> node-type link-layer old-transport new-transport
>>>> nes RNIC ETH IWARP IWARP
>>>> amso1100 RNIC ETH IWARP IWARP
>>>> cxgb3 RNIC ETH IWARP IWARP
>>>> cxgb4 RNIC ETH IWARP IWARP
>>>> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
>>>> ocrdma IB_CA ETH IB IBOE
>>>> mlx4 IB_CA IB/ETH IB IB/IBOE
>>>> mlx5 IB_CA IB IB IB
>>>> ehca IB_CA IB IB IB
>>>> ipath IB_CA IB IB IB
>>>> mthca IB_CA IB IB IB
>>>> qib IB_CA IB IB IB
>>>>
>>>> For example:
>>>> if (transport == IB) && (link-layer == ETH) will now become:
>>>> if (query_transport() == IBOE)
>>>>
>>>> Thus we will be able to get rid of the respective transport and
>>>> link-layer checking, and it will help us to add new
>>>> protocol/Technology (like OPA) more easier, also with the introduced
>>>> management helpers, IB management logical will be more clear and easier
>> for extending.
>>>>
>>>> Highlights:
>>>> The patch set covered a wide range of IB stuff, thus for those who are
>>>> familiar with the particular part, your suggestion would be
>>>> invaluable ;-)
>>>>
>>>> Patch 1#~15# included all the logical reform, 16#~25# introduced the
>>>> management helpers, 26#~27# do clean up.
>>>>
>>>> Patches haven't been tested yet, we appreciate if any one who have
>> these
>>>> HW willing to provide his Tested-by :-)
>>>>
>>>> Doug suggested the bitmask mechanism:
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23765.html
>>>> which could be the plan for future reforming, we prefer that to be
>> another
>>>> series which focus on semantic and performance.
>>>>
>>>> This patch-set is somewhat 'bloated' now and it may be a good timing
>> for
>>>> staging, I'd like to suggest we focus on improving existed helpers and
>> push
>>>> all the further reforms into next series ;-)
>>>>
>>>> Proposals:
>>>> Sean:
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23339.html
>>>> Doug:
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23418.html
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23765.html
>>>> Jason:
>>>> https://www.mail-archive.com/linux-
>>>> [email protected]/msg23425.html
>>>>
>>>> Michael Wang (27):
>>>> IB/Verbs: Implement new callback query_transport()
>>>> IB/Verbs: Implement raw management helpers
>>>> IB/Verbs: Reform IB-core mad/agent/user_mad
>>>> IB/Verbs: Reform IB-core cm
>>>> IB/Verbs: Reform IB-core sa_query
>>>> IB/Verbs: Reform IB-core multicast
>>>> IB/Verbs: Reform IB-ulp ipoib
>>>> IB/Verbs: Reform IB-ulp xprtrdma
>>>> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
>>>> IB/Verbs: Reform cm related part in IB-core cma/ucm
>>>> IB/Verbs: Reform route related part in IB-core cma
>>>> IB/Verbs: Reform mcast related part in IB-core cma
>>>> IB/Verbs: Reserve legacy transport type in 'dev_addr'
>>>> IB/Verbs: Reform cma_acquire_dev()
>>>> IB/Verbs: Reform rest part in IB-core cma
>>>> IB/Verbs: Use management helper cap_ib_mad()
>>>> IB/Verbs: Use management helper cap_ib_smi()
>>>> IB/Verbs: Use management helper cap_ib_cm()
>>>> IB/Verbs: Use management helper cap_iw_cm()
>>>> IB/Verbs: Use management helper cap_ib_sa()
>>>> IB/Verbs: Use management helper cap_ib_mcast()
>>>> IB/Verbs: Use management helper cap_ipoib()
>>>> IB/Verbs: Use management helper cap_read_multi_sge()
>>>> IB/Verbs: Use management helper cap_af_ib()
>>>> IB/Verbs: Use management helper cap_eth_ah()
>>>> IB/Verbs: Clean up rdma_ib_or_iboe()
>>>> IB/Verbs: Cleanup rdma_node_get_transport()
>>>>
>>>> ---
>>>> drivers/infiniband/core/agent.c | 4
>>>> drivers/infiniband/core/cm.c | 26 +-
>>>> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
>>>> drivers/infiniband/core/device.c | 1
>>>> drivers/infiniband/core/mad.c | 51 ++--
>>>> drivers/infiniband/core/multicast.c | 18 -
>>>> drivers/infiniband/core/sa_query.c | 41 +--
>>>> drivers/infiniband/core/sysfs.c | 8
>>>> drivers/infiniband/core/ucm.c | 5
>>>> drivers/infiniband/core/ucma.c | 27 --
>>>> drivers/infiniband/core/user_mad.c | 32 +-
>>>> drivers/infiniband/core/uverbs_cmd.c | 6
>>>> drivers/infiniband/core/verbs.c | 33 --
>>>> drivers/infiniband/hw/amso1100/c2_provider.c | 7
>>>> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
>>>> drivers/infiniband/hw/cxgb4/provider.c | 7
>>>> drivers/infiniband/hw/ehca/ehca_hca.c | 6
>>>> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
>>>> drivers/infiniband/hw/ehca/ehca_main.c | 1
>>>> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
>>>> drivers/infiniband/hw/mlx4/main.c | 10
>>>> drivers/infiniband/hw/mlx5/main.c | 7
>>>> drivers/infiniband/hw/mthca/mthca_provider.c | 7
>>>> drivers/infiniband/hw/nes/nes_verbs.c | 6
>>>> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
>>>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
>>>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
>>>> drivers/infiniband/hw/qib/qib_verbs.c | 7
>>>> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
>>>> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
>>>> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
>>>> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
>>>> include/rdma/ib_verbs.h | 204 +++++++++++++++-
>>>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
>>>> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
>>>> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-23 07:53:14

by Michael Wang

Subject: Re: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()



On 04/22/2015 07:24 PM, Jason Gunthorpe wrote:
> On Wed, Apr 22, 2015 at 10:49:44AM +0200, Michael Wang wrote:
>>
>> On 04/22/2015 07:40 AM, Jason Gunthorpe wrote:
>>> On Mon, Apr 20, 2015 at 10:41:38AM +0200, Michael Wang wrote:
>>>
>>>> Introduce helper cap_ipoib() to help us check if the port of an
>>>> IB device support IP over Infiniband.
>>>
>>> I thought we were dropping this in favor of listing the actual
>>> features the ULP required unconditionally? One of my messages had the
>>> start of a list..
>>
>> Shall we drop it now or wait until the mechanism introduced?
>>
>> Just wondering the requirement of ULP could be similar to the
>> requirement of management, isn't it? if the device can tell
>> which ULP it support, then may be a cap_XX() make sense in here?
>
> You have to audit the ipoib driver and see what core functions it
> calls that have cap requirements themselves.
>
> At least SA, multicast and CM. It also requires cap_ib_ah() or
> whatever we called that.

I get your point :-) I'd like to suggest we put these in different threads:

1. bitmask reform
2. ulp check mechanism
3. naming (I think it'll be a really big discussion :-P)

Separating them will help us focus on one topic at a time, and the
purpose of the patches will be clearer ;-)

Regards,
Michael Wang

>
> Jason
>

2015-04-23 08:37:24

by Michael Wang

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers



On 04/22/2015 06:57 PM, Jason Gunthorpe wrote:
> On Wed, Apr 22, 2015 at 10:59:52AM -0400, Doug Ledford wrote:
>
>>> 2)The name rdma_tech_* is lame.
>>> rdma_transport_*(), adhering to the above (*) remark, is much better.
>>> For example, both IB and ROCE *do* use the same transport.
>>
>> I especially want to second this. I haven't really been happy with the
>> rdma_tech_* names at all.
>
> I'm not excited about the names either..
>
> cap_ is bad because it pollutes the global namespace.
>
> rdma_tech_ .. as used, this is selecting the standard the port
> implements. The word 'standard' is a better choice than 'transport',
> and 'technology' is often synonymous with 'standard'. Meh.
>
> I've said it already, but this patch set has probably gotten too
> big. If we could just do the cap conversion without messing with other
> stuff, or adding rdma_tech, that would really be the best.
>
> Nobody seems to like the rdma_tech parts of this series.
>
> I'd also drop '[PATCH v5 09/27] IB/Verbs: Reform IB-core
> verbs/uverbs_cmd/sysfs' - that is UAPI stuff, it could be done as a
> followup someday, not worth the risk right now.

There won't be any risk... the logic clearly returns the same result, but
I'll drop the modifications to link_layer_show() and ib_uverbs_query_port()
anyway, since they just try to get the link-layer type and we are no longer
going to remove that helper.
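
As a minimal sketch of why the result stays the same, the mapping list from
the cover letter can be written as a table and checked: deriving the link
layer from the new transport (IB gives InfiniBand, anything else gives
Ethernet) matches what each driver reports today. The enum names, the struct
and the functions below are hypothetical stand-ins for illustration, not the
kernel's definitions, and mlx4 is listed twice on the assumption that each of
its ports is either IB or Ethernet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel enums; illustration only. */
enum mock_link_layer { MOCK_LL_INFINIBAND, MOCK_LL_ETHERNET };
enum mock_transport { MOCK_T_IB, MOCK_T_IBOE, MOCK_T_IWARP, MOCK_T_USNIC_UDP };

struct mock_port_row {
	const char *driver;
	enum mock_link_layer link_layer;   /* link layer reported by the driver */
	enum mock_transport new_transport; /* value query_transport() would return */
};

/* Rows copied from the mapping list in the cover letter. */
static const struct mock_port_row mock_rows[] = {
	{ "nes",      MOCK_LL_ETHERNET,   MOCK_T_IWARP },
	{ "amso1100", MOCK_LL_ETHERNET,   MOCK_T_IWARP },
	{ "cxgb3",    MOCK_LL_ETHERNET,   MOCK_T_IWARP },
	{ "cxgb4",    MOCK_LL_ETHERNET,   MOCK_T_IWARP },
	{ "usnic",    MOCK_LL_ETHERNET,   MOCK_T_USNIC_UDP },
	{ "ocrdma",   MOCK_LL_ETHERNET,   MOCK_T_IBOE },
	{ "mlx4",     MOCK_LL_INFINIBAND, MOCK_T_IB },
	{ "mlx4",     MOCK_LL_ETHERNET,   MOCK_T_IBOE },
	{ "mlx5",     MOCK_LL_INFINIBAND, MOCK_T_IB },
	{ "ehca",     MOCK_LL_INFINIBAND, MOCK_T_IB },
	{ "ipath",    MOCK_LL_INFINIBAND, MOCK_T_IB },
	{ "mthca",    MOCK_LL_INFINIBAND, MOCK_T_IB },
	{ "qib",      MOCK_LL_INFINIBAND, MOCK_T_IB },
};

/* Mirrors the v5 change: IB maps to InfiniBand, everything else to Ethernet. */
enum mock_link_layer mock_link_layer_from_transport(enum mock_transport t)
{
	return t == MOCK_T_IB ? MOCK_LL_INFINIBAND : MOCK_LL_ETHERNET;
}

/* Counts rows where the derived link layer differs from the reported one. */
int mock_count_mismatches(void)
{
	size_t i;
	int mismatches = 0;

	for (i = 0; i < sizeof(mock_rows) / sizeof(mock_rows[0]); i++)
		if (mock_link_layer_from_transport(mock_rows[i].new_transport) !=
		    mock_rows[i].link_layer)
			mismatches++;
	return mismatches;
}
```

The sketch covers only the rows in the mapping list; a port with an
unrecognized link layer, which the old code reported as "Unknown", is not
modelled here.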

Regards,
Michael Wang

>
> Jason
>

2015-04-23 09:25:36

by Michael Wang

Subject: Re: [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs



On 04/22/2015 06:28 PM, ira.weiny wrote:
[snip]
>
>>
>> link_layer_show() was supposed to report the same info to user
>> space as usual, so user tool don't have to change anything :-)
>
> We need to expose the "cap_*" functionality to userspace which can then convert
> to this interface and stop relying on inferring support based on the link
> layer. But that is a separate issue from correctly reporting the link layer.
>
> The link layer should be reported correctly from the drivers "get_link_layer"
> call.

I get your point :-) link_layer_show() and ib_uverbs_query_port() only
need the link layer type rather than a mgmt check; the modifications to
these two will be dropped in the next version.

Regards,
Michael Wang

>
> Ira
>
>>
>> Regards,
>> Michael Wang
>>
>>>
>>> Ira
>>>
>>>> return sprintf(buf, "%s\n", "InfiniBand");
>>>> - case IB_LINK_LAYER_ETHERNET:
>>>> + else
>>>> return sprintf(buf, "%s\n", "Ethernet");
>>>> - default:
>>>> - return sprintf(buf, "%s\n", "Unknown");
>>>> - }
>>>> }
>>>>
>>>> static PORT_ATTR_RO(state);
>>>> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
>>>> index a9f0489..5dc90aa 100644
>>>> --- a/drivers/infiniband/core/uverbs_cmd.c
>>>> +++ b/drivers/infiniband/core/uverbs_cmd.c
>>>> @@ -515,8 +515,10 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
>>>> resp.active_width = attr.active_width;
>>>> resp.active_speed = attr.active_speed;
>>>> resp.phys_state = attr.phys_state;
>>>> - resp.link_layer = rdma_port_get_link_layer(file->device->ib_dev,
>>>> - cmd.port_num);
>>>> + resp.link_layer = rdma_tech_ib(file->device->ib_dev,
>>>> + cmd.port_num) ?
>>>> + IB_LINK_LAYER_INFINIBAND :
>>>> + IB_LINK_LAYER_ETHERNET;
>>>>
>>>> if (copy_to_user((void __user *) (unsigned long) cmd.response,
>>>> &resp, sizeof resp))
>>>> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
>>>> index 626c9cf..7264860 100644
>>>> --- a/drivers/infiniband/core/verbs.c
>>>> +++ b/drivers/infiniband/core/verbs.c
>>>> @@ -200,11 +200,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
>>>> u32 flow_class;
>>>> u16 gid_index;
>>>> int ret;
>>>> - int is_eth = (rdma_port_get_link_layer(device, port_num) ==
>>>> - IB_LINK_LAYER_ETHERNET);
>>>>
>>>> memset(ah_attr, 0, sizeof *ah_attr);
>>>> - if (is_eth) {
>>>> + if (rdma_tech_iboe(device, port_num)) {
>>>> if (!(wc->wc_flags & IB_WC_GRH))
>>>> return -EPROTOTYPE;
>>>>
>>>> @@ -873,7 +871,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
>>>> union ib_gid sgid;
>>>>
>>>> if ((*qp_attr_mask & IB_QP_AV) &&
>>>> - (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == IB_LINK_LAYER_ETHERNET)) {
>>>> + (rdma_tech_iboe(qp->device, qp_attr->ah_attr.port_num))) {
>>>> ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
>>>> qp_attr->ah_attr.grh.sgid_index, &sgid);
>>>> if (ret)
>>>> --
>>>> 2.1.0

2015-04-23 09:31:19

by Michael Wang

Subject: Re: [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib()



On 04/22/2015 06:45 PM, ira.weiny wrote:
> On Wed, Apr 22, 2015 at 10:49:44AM +0200, Michael Wang wrote:
>>
>> On 04/22/2015 07:40 AM, Jason Gunthorpe wrote:
>>> On Mon, Apr 20, 2015 at 10:41:38AM +0200, Michael Wang wrote:
>>>
>>>> Introduce helper cap_ipoib() to help us check if the port of an
>>>> IB device support IP over Infiniband.
>>>
>>> I thought we were dropping this in favor of listing the actual
>>> features the ULP required unconditionally? One of my messages had the
>>> start of a list..
>
> ??? I forget. I was arguing that we should not have it. But I thought others
> disagreed with me so it was left in.
>
> V4 of this patch had no responses.
>
> https://www.mail-archive.com/[email protected]/msg24040.html
>
> Jason, I can't find the email where you mentioned a list?

I don't see a list either :-P but I agree with the ULP reform idea; the check
should look like this:

ipoib_init()
{
	if (device doesn't support SA or IB CM or ...)
		return error;
	...
}

So I'll put this reform into that series.
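
To make the idea a bit more concrete, here is a rough sketch of such a
requirement check, assuming a per-port capability bitmask along the lines of
the bitmask proposal. Every name in it (the CAP bits, struct mock_port, the
functions) is hypothetical and only for illustration; none of it is existing
kernel API. The required set follows the list mentioned in this thread (SA,
multicast and CM).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-port capability bits; not kernel definitions. */
enum {
	MOCK_CAP_IB_SA    = 1u << 0,
	MOCK_CAP_IB_CM    = 1u << 1,
	MOCK_CAP_IB_MCAST = 1u << 2,
};

/* Hypothetical stand-in for whatever carries the per-port capabilities. */
struct mock_port {
	unsigned int caps;
};

/* Everything the ULP needs, listed unconditionally in one place. */
#define MOCK_IPOIB_REQUIRED_CAPS \
	(MOCK_CAP_IB_SA | MOCK_CAP_IB_CM | MOCK_CAP_IB_MCAST)

/* True only when every required bit is set on the port. */
bool mock_port_has_caps(const struct mock_port *port, unsigned int required)
{
	return (port->caps & required) == required;
}

/* Returns 0 when the port can host the ULP, -EOPNOTSUPP otherwise. */
int mock_ipoib_check_port(const struct mock_port *port)
{
	if (!mock_port_has_caps(port, MOCK_IPOIB_REQUIRED_CAPS))
		return -EOPNOTSUPP;
	return 0;
}
```

With this shape, a ULP states its requirements once and no per-ULP helper
such as cap_ipoib() is needed.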

Regards,
Michael Wang

>
> Ira
>

2015-04-23 09:36:15

by Michael Wang

Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers



On 04/22/2015 07:10 PM, ira.weiny wrote:
> On Wed, Apr 22, 2015 at 10:59:52AM -0400, Doug Ledford wrote:
>> On Tue, 2015-04-21 at 23:36 +0000, Liran Liss wrote:
>
> [snip]
>
>>>
>>> 2)The name rdma_tech_* is lame.
>>> rdma_transport_*(), adhering to the above (*) remark, is much better.
>>> For example, both IB and ROCE *do* use the same transport.
>>
>> I especially want to second this. I haven't really been happy with the
>> rdma_tech_* names at all.
>>
>
> I am sure Michael is open to alternative names. I know I am. The problem is
> that we can't figure out what "IBoE" is. It is not a transport, even though
> query_transport is now returning it as one. :-P

Exactly :-P

>
> I think the idea behind the "tech" name was that it is a technology "family".
> I can't think of a better name.

IMHO a single patch that changes the name, with all the naming
discussion collected in that thread, would be better.

The discussion in this thread is already starting to scatter. For example, the
bitmask discussion would be better in a thread dedicated to that purpose; otherwise
we will have to collect all these scattered discussions again when the bitmask series
comes out...

Regards,
Michael Wang


>
> Ira
>

2015-04-24 14:44:41

by Liran Liss

Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

> From: Michael Wang [mailto:[email protected]]


> [snip]
> >
> > Depends on who is "we".
> > For ULPs, you are probably right.
> >
> > However, core services (e.g., mad management, CM, SA) do care about
> various details.
> > In some cases, where it doesn't matter, this code will use management
> helpers.
> > In other cases, this code will inspect link, transport, and node attributes of
> rdma devices.
> >
> > For example, the CM code has specific code paths for IB, RoCE, and iWARP.
> > There is no other CM code; there is no reason to abstract 'CM'. This
> > code will have code paths that depend on various specific details.
>
> That's exactly what we want to stop, we have classified the CM to IB and
> IWARP now :-)
>

We don't want to stop code branches that are not abstractions but rather depend
on the specific technology!
There is no generic "iWARP CM" - only one.
There is no generic "ROCE CM" - only one.
There is no generic "IB CM" - only one.

At the CM high-level (i.e., whether an ib_dev port registers an IB client), you could consider
an rdma_has_cm() call, but this is the only place in the code where this check would be called!
Hence, no need for a generic check.

You want to stop abstract code that uses IB core infrastructure.

> >
> >> This new transport is only understand by core-layer currently, for
> >> user-layer we still reserve the old transport for them, next step is
> >> to use bitmask instead of transport, at that time we can erase the
> >> new transport and make the whole stuff used by user-layer only :-)
> >>
> >
> > I am not sure that we need a bit mask at all.
> > Your helpers already provide all the useful abstractions, which both core
> and ULPs call directly.
> > All the information is inferred directly from <link, transport, node> tuples.
> >
> > Some of the user-space tools need *exactly* the same reasoning.
> > For example, management tools manage specific technologies and
> protocols, not some abstraction.
> >
> > So, For user-space, we can think about exposing exactly the same
> > helper framework, while providing backward compatibility for the existing
> interfaces.
>
> I'd really like to put the topic on bitmask and user app reform into different
> thread...
>
> bitmask should be next topic, there are many discussion already, but I could
> imaging far more discussion there, the user reform should be the last step,
> after every thing in kernel settled down :-)
>

OK

> >
> >>>
> >>>
> >>> Detailed remarks
> >>> ==============
> >>>
> >>> 1) The introduction of cap_*_*() stuff should have been introduced
> >>> directly
> >> in patch 02/27.
> >>> This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing
> >>> and
> >> increases the number of patches in the patch-set.
> >>> Do this and remove patches 16-24.
> >>
> >> We have some discussion about compress the patch set, merge the
> >> reform and introducing patch will mix the concept (like the earlier
> >> version), IMHO it will increase the difficulty of review...
> >>
> >> And now since many review already been done, it's not wise to change
> >> the whole structure of patch set IMHO...
> >>
> >
> > I think it is because you are conditioning code on one thing, and then
> > conditioning the same code on another thing.
> >
> > This is confusing.
> >
> > Once we get our abstractions correct (i.e., the right helper
> > functions), you replace the existing logic with the suitable helper up-front.
>
> We need to classify and integrate the concept into mgmt helper, that would
> be very helpful for further reform, reform followed by integration sounds not
> that bad, correct?
>

The problem is that it is hard to follow the reasoning in the first consumer
code when the helper framework is still incomplete.

> >
> >>>
> >>> 2)The name rdma_tech_* is lame.
> >>> rdma_transport_*(), adhering to the above (*) remark, is much better.
> >>> For example, both IB and ROCE *do* use the same transport.
> >>
> >> We have some discussion on that too, use transport means going back...
> >>
> >
> > No.
> > The existing notion of transport was correct. It was the node type that
> wasn't.
> > And in any case the new helpers didn't use it.
> >
> > We need the original meaning of transport - see my response to Ira.
> > I propose replacing rdma_node_get_transport() with the following helpers:
> > - rdma_get_transport()
> > - rdma_is_ib_transport()
> > - rdma_is_iwarp_transport()
>
> We can change the name at anytime, tech/transport/protocol/standard, just
> one patch later can easily change it and start the topic of naming, any of
> these name will unsatisfied someone AFAIK, I'd like to suggest we consider
> this as a mark temporarily and focus on the logical issue.

Sure.

The logical issue is:

1. We need the existing notion of transport, meaning "a bunch of L4+ headers + semantics presented to apps".
2. We might need an *additional* notion of "rdma_protocol", which designates a complete wire format: L2 through L4+ inclusive.
This could later be a bitmask, a management helper, whatever.
Currently, I don't see anything in the existing code that would call such helpers.
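
The two notions can be sketched side by side. In the rough example below all
names are hypothetical and only for illustration: a "protocol" identifies the
complete wire format, a "transport" only the L4+ part, and IB and IBOE map to
the same transport, as noted earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: the transport as presented to apps (L4+ headers and semantics). */
enum mock_transport {
	MOCK_TRANSPORT_IB,
	MOCK_TRANSPORT_IWARP,
	MOCK_TRANSPORT_USNIC_UDP,
};

/* Hypothetical: the complete wire format, L2 through L4+. */
enum mock_protocol {
	MOCK_PROTO_IB,
	MOCK_PROTO_IBOE,
	MOCK_PROTO_IWARP,
	MOCK_PROTO_USNIC_UDP,
};

/* Several protocols may share one transport; IB and IBOE both use IB. */
enum mock_transport mock_protocol_transport(enum mock_protocol p)
{
	switch (p) {
	case MOCK_PROTO_IB:
	case MOCK_PROTO_IBOE:
		return MOCK_TRANSPORT_IB;
	case MOCK_PROTO_IWARP:
		return MOCK_TRANSPORT_IWARP;
	case MOCK_PROTO_USNIC_UDP:
		return MOCK_TRANSPORT_USNIC_UDP;
	}
	return MOCK_TRANSPORT_IB; /* not reached for valid enum values */
}

/* Transport qualifier in the spirit of the proposed rdma_is_ib_transport(). */
bool mock_is_ib_transport(enum mock_protocol p)
{
	return mock_protocol_transport(p) == MOCK_TRANSPORT_IB;
}
```

Code that only cares about transport semantics would use the qualifier, while
code that depends on the full wire format would look at the protocol.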

>
> > - ...
> >
> >>>
> >>> 3) The name cap_* as it is used above is not accurate.
> >>> You use it to describe technology characteristics rather than
> >>> extendable
> >> capabilities.
> >>> I would suggest having a single convention for all helpers, such as
> >> rdma_has_*() and rdma_is_*().
> >>> For example: cap_ib_smi() ==> rdma_has_smi().
> >>
> >> That means going back too...
> >
> > See response to Ira (https://lkml.org/lkml/2015/4/21/951).
> >
> >
> >>
> >>>
> >>> 4) Remove all capabilities that do not introduce any distinction in
> >>> the
> >> current code.
> >>> We can add them as needed later.
> >>> This means remove patches:
> >>> - [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all
> >>> IB devices support ipoib
> >>> - [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all
> >>> IB
> >> devices support AF_IB.
> >>>
> >>> On the other hand:
> >>> - rdma_has_multicast() makes sense, since iWARP doesn’t support it.
> >>> - cap_ib_sa() might make sense to cut code even further in the CMA,
> >>> since
> >> RoCE has a GSI but no SA.
> >>
> >> We have discussion on define these helpers previously, again, name is
> >> not really a problem, I would rather to see such changes in the
> >> following series after this one working stably :-)
> >>
> >
> > The names are not critical. This comment is about introducing helpers
> > that do not introduce any new semantic notion in the current patch-set.
> >
> > cap_ipoib(), for example, is brain-dead because only a single
> > technology (as of now) enables it: Infiniband.
>
> This will be dropped in next version :-)
>
> >
> >>>
> >>> 5) Do not modify phys_state_show() in [PATCH v5 09/27] IB/Verbs:
> >>> Reform IB-core verbs/uverbs_cmd/sysfs. It *is* the link layer!
> >>
> >> Actually nothing changed after the modify, the prev purpose it to
> >> eliminate the link layer helpers.
> >>
> >> But now we are not going to remove the helper any more, so let's drop
> >> this modification in next version :-)
> >>
> >
> > You don't add modifications just to drop them later.
> > Don't add them in the first place!
> >
> > This patch-set will remain forever in the kernel commit log - we want
> > it to be as self-explaining and coherent as possible.
> >
> > Remove this.
>
> What I mean is that this will be removed in v6...
>
> >
> >>>
> >>> 6) Remove cap_read_multi_sge
> >>> It is not device/port feature, but a transport capability.
> >>> Use rdma_is_iwarp_transport() instead, or introduce a new transport
> >>> flag in
> >> 'enum ib_device_cap_flags'.
> >>>
> >>> 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper
> >> cap_eth_ah().
> >>> Address handles that refer to Ethernet links always have Ethernet
> >> addressing.
> >>>
> >>> In the CMA code, using rdma_tech_iboe() is just fine. This is how
> >>> you define
> >> cap_eth_ah() anyway.
> >>> Currently, this patch just adds clutter.
> >>
> >> There are also some discussion on these helpers, drop them means
> >> going back..
> >>
> >
> > Back to where? Management helpers are a new concept. Let's get them
> right.
>
> Back to one point during v1~v5.
>
> >
> >> The tech helper is not enough to explain the management purpose, and
> >> this can be the wrapper for bitmask stuff too.
> >>
> >
> > As I said, I am not sure that we will need any bitmasks.
> > Also see response to Ira (https://lkml.org/lkml/2015/4/21/951).
>
> Better discussed in another thread.
>
> >
> >>>
> >>> 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
> >>> We do need a transport qualifier, as exemplified in comment 5)
> >>> above, and
> >> for a complete clean model.
> >>> This is after renaming the function to rdma_is_ib_transport()...
> >>
> >> This means going back again... rdma_is_ib_transport() has been used
> >> previously.
> >>
> >> This helper is just to make the review more easier, we won't need it
> >> internally, not to mention after bitmask was introduced :-)
> >>
> >
> > The same...
> >
> >>>
> >>>
> >>> Putting it all together
> >>> ==================
> >>>
> >>> We are left with the following helpers:
> >>> - rdma_is_ib_transport()
> >>> - rdma_is_iwarp_transport()
> >>> - rdma_is_usnic_transport()
> >>> - rdma_is_iboe()
> >>> - rdma_has_mad()
> >>> - rdma_has_smi()
> >>> - rdma_has_gsi() - complements smi; can be used by the mad code for
> >>> clarity
> >>> - rdma_has_sa()
> >>> - rdma_has_cm()
> >>> - rdma_has_mcast()
> >>
> >> I think we can put the discussion on name and new helpers in future,
> >> currently let's focus on these basic reform and make them working
> >> stably ;-)
> >
> > It's not just the names, it's their semantics.
> > Any problems with the names proposed above?
>
> These were once used in old version, again, name can't satisfied anyone at
> this moment and I'd like to discuss this after the logical was right, I really
> don't want folks to focus on this issue since it won't broken anything and can
> be easily changed once we have the agreement.
>
> Regards,
> Michael Wang
>

OK

> >
> >>
> >> Regards,
> >> Michael Wang
> >>
> >>>
> >>>
> >>>> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
> >>>>
> >>>>
> >>>> Since v4:
> >>>> * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
> >>>> Roland, Ira and Steve :-) Please remind me if anything missed :-P
> >>>> * Fix logical issue inside 3#, 14#
> >>>> * Refine 3#, 4#, 5# with label 'free'
> >>>> * Rework 10# to stop using port 1 when port already assigned
> >>>>
> >>>> There are plenty of lengthy code to check the transport type of IB
> >>>> device, or the link layer type of it's port, but actually we are
> >>>> just speculating whether a particular management/feature is
> >>>> supported by the
> >> device/port.
> >>>>
> >>>> Thus instead of inferring, we should have our own mechanism for IB
> >>>> management capability/protocol/feature checking, several proposals
> >> below.
> >>>>
> >>>> This patch set will reform the method of getting transport type, we
> >>>> will now using query_transport() instead of inferring from
> >>>> transport and link layer respectively, also we defined the new
> >>>> transport type to make the concept more reasonable.
> >>>>
> >>>> Mapping List:
> >>>> node-type link-layer old-transport new-transport
> >>>> nes RNIC ETH IWARP IWARP
> >>>> amso1100 RNIC ETH IWARP IWARP
> >>>> cxgb3 RNIC ETH IWARP IWARP
> >>>> cxgb4 RNIC ETH IWARP IWARP
> >>>> usnic USNIC_UDP ETH USNIC_UDP USNIC_UDP
> >>>> ocrdma IB_CA ETH IB IBOE
> >>>> mlx4 IB_CA IB/ETH IB IB/IBOE
> >>>> mlx5 IB_CA IB IB IB
> >>>> ehca IB_CA IB IB IB
> >>>> ipath IB_CA IB IB IB
> >>>> mthca IB_CA IB IB IB
> >>>> qib IB_CA IB IB IB
> >>>>
> >>>> For example:
> >>>> if (transport == IB) && (link-layer == ETH) will now become:
> >>>> if (query_transport() == IBOE)
> >>>>
> >>>> Thus we will be able to get rid of the respective transport and
> >>>> link-layer checking, and it will help us to add new
> >>>> protocol/Technology (like OPA) more easier, also with the
> >>>> introduced management helpers, IB management logical will be more
> >>>> clear and easier
> >> for extending.
> >>>>
> >>>> Highlights:
> >>>> The patch set covered a wide range of IB stuff, thus for those who are
> >>>> familiar with the particular part, your suggestion would be
> >>>> invaluable ;-)
> >>>>
> >>>> Patch 1#~15# included all the logical reform, 16#~25# introduced the
> >>>> management helpers, 26#~27# do clean up.
> >>>>
> >>>> Patches haven't been tested yet, we appreciate if any one who
> >>>> have
> >> these
> >>>> HW willing to provide his Tested-by :-)
> >>>>
> >>>> Doug suggested the bitmask mechanism:
> >>>> https://www.mail-archive.com/linux-
> >>>> [email protected]/msg23765.html
> >>>> which could be the plan for future reforming, we prefer that to
> >>>> be
> >> another
> >>>> series which focus on semantic and performance.
> >>>>
> >>>> This patch-set is somewhat 'bloated' now and it may be a good
> >>>> timing
> >> for
> >>>> staging, I'd like to suggest we focus on improving existed
> >>>> helpers and
> >> push
> >>>> all the further reforms into next series ;-)
> >>>>
> >>>> Proposals:
> >>>> Sean:
> >>>> https://www.mail-archive.com/linux-
> >>>> [email protected]/msg23339.html
> >>>> Doug:
> >>>> https://www.mail-archive.com/linux-
> >>>> [email protected]/msg23418.html
> >>>> https://www.mail-archive.com/linux-
> >>>> [email protected]/msg23765.html
> >>>> Jason:
> >>>> https://www.mail-archive.com/linux-
> >>>> [email protected]/msg23425.html
> >>>>
> >>>> Michael Wang (27):
> >>>> IB/Verbs: Implement new callback query_transport()
> >>>> IB/Verbs: Implement raw management helpers
> >>>> IB/Verbs: Reform IB-core mad/agent/user_mad
> >>>> IB/Verbs: Reform IB-core cm
> >>>> IB/Verbs: Reform IB-core sa_query
> >>>> IB/Verbs: Reform IB-core multicast
> >>>> IB/Verbs: Reform IB-ulp ipoib
> >>>> IB/Verbs: Reform IB-ulp xprtrdma
> >>>> IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
> >>>> IB/Verbs: Reform cm related part in IB-core cma/ucm
> >>>> IB/Verbs: Reform route related part in IB-core cma
> >>>> IB/Verbs: Reform mcast related part in IB-core cma
> >>>> IB/Verbs: Reserve legacy transport type in 'dev_addr'
> >>>> IB/Verbs: Reform cma_acquire_dev()
> >>>> IB/Verbs: Reform rest part in IB-core cma
> >>>> IB/Verbs: Use management helper cap_ib_mad()
> >>>> IB/Verbs: Use management helper cap_ib_smi()
> >>>> IB/Verbs: Use management helper cap_ib_cm()
> >>>> IB/Verbs: Use management helper cap_iw_cm()
> >>>> IB/Verbs: Use management helper cap_ib_sa()
> >>>> IB/Verbs: Use management helper cap_ib_mcast()
> >>>> IB/Verbs: Use management helper cap_ipoib()
> >>>> IB/Verbs: Use management helper cap_read_multi_sge()
> >>>> IB/Verbs: Use management helper cap_af_ib()
> >>>> IB/Verbs: Use management helper cap_eth_ah()
> >>>> IB/Verbs: Clean up rdma_ib_or_iboe()
> >>>> IB/Verbs: Cleanup rdma_node_get_transport()
> >>>>
> >>>> ---
> >>>> drivers/infiniband/core/agent.c | 4
> >>>> drivers/infiniband/core/cm.c | 26 +-
> >>>> drivers/infiniband/core/cma.c | 328 ++++++++++++---------------
> >>>> drivers/infiniband/core/device.c | 1
> >>>> drivers/infiniband/core/mad.c | 51 ++--
> >>>> drivers/infiniband/core/multicast.c | 18 -
> >>>> drivers/infiniband/core/sa_query.c | 41 +--
> >>>> drivers/infiniband/core/sysfs.c | 8
> >>>> drivers/infiniband/core/ucm.c | 5
> >>>> drivers/infiniband/core/ucma.c | 27 --
> >>>> drivers/infiniband/core/user_mad.c | 32 +-
> >>>> drivers/infiniband/core/uverbs_cmd.c | 6
> >>>> drivers/infiniband/core/verbs.c | 33 --
> >>>> drivers/infiniband/hw/amso1100/c2_provider.c | 7
> >>>> drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
> >>>> drivers/infiniband/hw/cxgb4/provider.c | 7
> >>>> drivers/infiniband/hw/ehca/ehca_hca.c | 6
> >>>> drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
> >>>> drivers/infiniband/hw/ehca/ehca_main.c | 1
> >>>> drivers/infiniband/hw/ipath/ipath_verbs.c | 7
> >>>> drivers/infiniband/hw/mlx4/main.c | 10
> >>>> drivers/infiniband/hw/mlx5/main.c | 7
> >>>> drivers/infiniband/hw/mthca/mthca_provider.c | 7
> >>>> drivers/infiniband/hw/nes/nes_verbs.c | 6
> >>>> drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
> >>>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
> >>>> drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
> >>>> drivers/infiniband/hw/qib/qib_verbs.c | 7
> >>>> drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
> >>>> drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
> >>>> drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
> >>>> drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
> >>>> include/rdma/ib_verbs.h | 204 +++++++++++++++-
> >>>> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
> >>>> net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
> >>>> 35 files changed, 584 insertions(+), 368 deletions(-)

2015-04-24 14:49:56

by Liran Liss

[permalink] [raw]
Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

> From: Hefty, Sean [mailto:[email protected]]

[snip]
> > > So, I think that our "old-transport" below is just fine.
> > > No need to change it (and you aren't, since it is currently
> > > implemented
> > as a function).
> >
> > I think there is a need to change this. Encoding the transport into
> > the node type is not a good idea. Having different "transport
> > semantics" while still returning the same transport for the port is
> > confusing.
> >
> > The only thing which is clear currently is Link Layer.
> >
> > But the use of "Link Layer" in the code is so convoluted that it is
> > very confusing.
>
> I agree.
>
> One could implement software iWarp or IBoUDP (RoCEv2) protocols that
> could run over any link layer and interoperate with existing HW solutions.
> The stack shouldn't be dealing with the link level at all, with the exception of
> user space compatibility.
>
> > Define Transport? There has been a lot of discussion over what a
> > transport is in Verbs.
>
> IMO, we should replace using the word 'transport' with just 'rdma_protocol'.
> And even then I'm not convinced that anything should care, beyond user
> space compatibility. The caps are what matter.
>
> - Sean

I completely agree.
If we ever see a need for representing a set or subset of cross-layer protocols (at any level, L2-L4, various encapsulations), we will add the proper management helpers.
For example:
- rdma_protocol_roce() /* both v1 and v2 */
- rdma_protocol_roce_v1()
- rdma_protocol_roce_v2()
- rdma_protocol_usnic()
- rdma_protocol_usnic_udp()
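
As a rough illustration of the helper family listed above, here is a minimal userspace sketch. It assumes each port carries a bitmask of the wire protocols it speaks; the `example_*` types and the helper signatures are hypothetical and not kernel API, only the helper names come from the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port protocol bits; these names are not kernel API. */
enum example_rdma_protocol {
	EXAMPLE_PROTO_ROCE_V1   = 1 << 0,
	EXAMPLE_PROTO_ROCE_V2   = 1 << 1,
	EXAMPLE_PROTO_USNIC     = 1 << 2,
	EXAMPLE_PROTO_USNIC_UDP = 1 << 3,
};

/* Hypothetical port descriptor: an OR of the bits above. */
struct example_port {
	unsigned int protocols;
};

static inline bool example_port_has(const struct example_port *port,
				    unsigned int mask)
{
	assert(port != NULL);
	return (port->protocols & mask) != 0;
}

static inline bool rdma_protocol_roce_v1(const struct example_port *port)
{
	return example_port_has(port, EXAMPLE_PROTO_ROCE_V1);
}

static inline bool rdma_protocol_roce_v2(const struct example_port *port)
{
	return example_port_has(port, EXAMPLE_PROTO_ROCE_V2);
}

/* True for either RoCE version, matching the "both v1 and v2" note. */
static inline bool rdma_protocol_roce(const struct example_port *port)
{
	return example_port_has(port,
				EXAMPLE_PROTO_ROCE_V1 | EXAMPLE_PROTO_ROCE_V2);
}

static inline bool rdma_protocol_usnic(const struct example_port *port)
{
	return example_port_has(port, EXAMPLE_PROTO_USNIC);
}

static inline bool rdma_protocol_usnic_udp(const struct example_port *port)
{
	return example_port_has(port, EXAMPLE_PROTO_USNIC_UDP);
}
```

With this layout the version-specific and the combined checks share one mask test, so adding a protocol means adding one bit and one wrapper.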


2015-04-24 15:07:39

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers



On 04/24/2015 04:44 PM, Liran Liss wrote:
>> From: Michael Wang [mailto:[email protected]]
>
>
>> [snip]
[snip]
>>
>
> We don't want to stop code branches that are not abstractions but rather depend
> on the specific technology!
> There is no generic "iWARP CM" - only one.
> There is no generic "ROCE CM" - only one.
> There is no generic "IB CM" - only one.
>
> At the CM high-level (i.e., whether an ib_dev port registers an IB client), you could consider
> an rdma_has_cm() call, but this the only place in the code that this check will be called!
> Hence, no need for a generic check.

There are plenty of places in cma that check for the IB or iWARP CM; those are handled by 18# and 19#.

And the purpose is to make sure the core layer checks the right thing rather than the
'tech' or 'transport'. Even if there is only one use case, as long as it is meaningful a
helper makes sense IMHO, so that later, when other cases come along, we can use the
helpers rather than a meaningless tech check.

>
> You want to stop abstract code that uses IB core infrastructure.

We want the management checks to make more sense :-)

>
>>>
>>>> This new transport is only understand by core-layer currently, for
>>>> user-layer we still reserve the old transport for them, next step is
>>>> to use bitmask instead of transport, at that time we can erase the
>>>> new transport and make the whole stuff used by user-layer only :-)
[snip]
>> We need to classify and integrate the concept into mgmt helper, that would
>> be very helpful for further reform, reform followed by integration sounds not
>> that bad, correct?
>>
>
> The problem is that it is hard to follow the reasoning for the first use consumer
> code with the in-complete helper frame-work.

The problem is that, with the current implementation in the core layer, expanding
to a new tech would be painful: the reform would go into the core directly,
which makes the logic even more confusing.

This patch set aims to collect all the places where a management check is
necessary and integrate them. This is the reform of the core layer, so that
later we don't have to change the core but only replace the internal
implementation of the helpers, e.g. with the bitmask idea.
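
To make the "replace the internal implementation" point concrete, here is a minimal userspace sketch of one helper with two interchangeable bodies. Everything prefixed `example_`/`EXAMPLE_` is hypothetical, and the assumption that the IB CM applies to both IB and IBOE ports is made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the new transport values. */
enum example_transport {
	EXAMPLE_IB,
	EXAMPLE_IBOE,
	EXAMPLE_IWARP,
	EXAMPLE_USNIC_UDP,
};

/* Hypothetical capability bits a driver could set per port. */
enum example_cap {
	EXAMPLE_CAP_IB_CM = 1 << 0,
	EXAMPLE_CAP_IW_CM = 1 << 1,
};

struct example_port {
	enum example_transport transport; /* what query_transport() would report */
	unsigned int caps;                /* used only by the bitmask variant */
};

/* Variant 1: infer the capability from the transport (assumed mapping). */
static inline bool cap_ib_cm_by_transport(const struct example_port *port)
{
	assert(port != NULL);
	return port->transport == EXAMPLE_IB || port->transport == EXAMPLE_IBOE;
}

/* Variant 2: read a capability bit set by the driver. */
static inline bool cap_ib_cm_by_bitmask(const struct example_port *port)
{
	assert(port != NULL);
	return (port->caps & EXAMPLE_CAP_IB_CM) != 0;
}

/* Callers only ever use this name; the body can change underneath them. */
static inline bool cap_ib_cm(const struct example_port *port)
{
#ifdef EXAMPLE_USE_BITMASK
	return cap_ib_cm_by_bitmask(port);
#else
	return cap_ib_cm_by_transport(port);
#endif
}
```

Core code that calls `cap_ib_cm()` would not need to change when the body switches from the transport test to the bitmask test.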

>
>>>
>>>>>
>>>>> 2)The name rdma_tech_* is lame.
>>>>> rdma_transport_*(), adhering to the above (*) remark, is much better.
>>>>> For example, both IB and ROCE *do* use the same transport.
>>>>
>>>> We have some discussion on that too, use transport means going back...
>>>>
>>>
>>> No.
>>> The existing notion of transport was correct. It was the node type that
>> wasn't.
>>> And in any case the new helpers didn't use it.
>>>
>>> We need the original meaning of transport - see my response to Ira.
>>> I propose replacing rdma_node_get_transport() with the following helpers:
>>> - rdma_get_transport()
>>> - rdma_is_ib_transport()
>>> - rdma_is_iwarp_transport()
>>
>> We can change the name at anytime, tech/transport/protocol/standard, just
>> one patch later can easily change it and start the topic of naming, any of
>> these name will unsatisfied someone AFAIK, I'd like to suggest we consider
>> this as a mark temporarily and focus on the logical issue.
>
> Sure.
>
> The logical issue is:
>
> 1. We need the existing notion of transport, meaning "a bunch of L4+headers + semantics presented to apps".
> 2. We might need an *additional* notion of "rdma_protocol", which designates a complete wire-format: L2-L4+ including.
> This could be later a bitmask, a management helper, whatever.
> Currently, I don't see anything in the existing code that would call such helpers.

My thinking is mostly focused on the core layer, so the names won't satisfy
someone focused on hw or user apps. But as long as the reform doesn't break
anything, I prefer to leave the naming to those who are more expert in
that particular part, after this first step :-)

Regards,
Michael Wang

>
>>
>>> - ...
>>>
>>>>>
>>>>> 3) The name cap_* as it is used above is not accurate.
>>>>> You use it to describe technology characteristics rather than
>>>>> extendable
>>>> capabilities.
>>>>> I would suggest having a single convention for all helpers, such as
>>>> rdma_has_*() and rdma_is_*().
>>>>> For example: cap_ib_smi() ==> rdma_has_smi().
>>>>
>>>> That means going back too...
>>>
>>> See response to Ira (https://lkml.org/lkml/2015/4/21/951).
>>>
>>>
>>>>
>>>>>
>>>>> 4) Remove all capabilities that do not introduce any distinction in
>>>>> the
>>>> current code.
>>>>> We can add them as needed later.
>>>>> This means remove patches:
>>>>> - [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all
>>>>> IB devices support ipoib
>>>>> - [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all
>>>>> IB
>>>> devices support AF_IB.
>>>>>
>>>>> On the other hand:
>>>>> - rdma_has_multicast() makes sense, since iWARP doesn’t support it.
>>>>> - cap_ib_sa() might make sense to cut code even further in the CMA,
>>>>> since
>>>> RoCE has a GSI but no SA.
>>>>
>>>> We have discussion on define these helpers previously, again, name is
>>>> not really a problem, I would rather to see such changes in the
>>>> following series after this one working stably :-)
>>>>
>>>
>>> The names are not critical. This comment is about introducing helpers
>>> that are do not introduce any new semantic notion in the current patch-set.
>>>
>>> cap_ipoib(), for example, is brain-dead because only a single
>>> technology (as of now) enables it: Infiniband.
>>
>> This will be dropped in next version :-)
>>
>>>
>>>>>
>>>>> 5) Do no modify phys_state_show() in [PATCH v5 09/27] IB/Verbs:
>>>>> Reform IB-core verbs/uverbs_cmd/sysfs It *is* the link layer!
>>>>
>>>> Actually nothing changed after the modify, the prev purpose it to
>>>> eliminate the link layer helpers.
>>>>
>>>> But now we are not going to remove the helper any more, so let's drop
>>>> this modification in next version :-)
>>>>
>>>
>>> You don't add modifications just to drop them later.
>>> Don't add them in the first place!
>>>
>>> This patch-set will remain forever in the kernel commit log - we want
>>> it to be as self-explaining and coherent as possible.
>>>
>>> Remove this.
>>
>> What i mean is this will be removed in v6...
>>
>>>
>>>>>
>>>>> 6) Remove cap_read_multi_sge
>>>>> It is not device/port feature, but a transport capability.
>>>>> Use rdma_is_iwarp_transport() instead, or introduce a new transport
>>>>> flag in
>>>> 'enum ib_device_cap_flags'.
>>>>>
>>>>> 7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper
>>>> cap_eth_ah().
>>>>> Address handles that refer to Ethernet links always have Ethernet
>>>> addressing.
>>>>>
>>>>> In the CMA code, using rdma_tech_iboe() is just fine. This is how
>>>>> you define
>>>> cap_eth_ah() anyway.
>>>>> Currently, this patch just adds clutter.
>>>>
>>>> There are also some discussion on these helpers, drop them means
>>>> going back..
>>>>
>>>
>>> Back to where? Management helpers are a new concept. Let's get them
>> right.
>>
>> Back to one point during v1~v5.
>>
>>>
>>>> The tech helper is not enough to explain the management purpose, and
>>>> this can be the wrapper for bitmask stuff too.
>>>>
>>>
>>> As I said, I am not sure that we will need any bitmasks.
>>> Also see response to Ira (https://lkml.org/lkml/2015/4/21/951).
>>
>> Better discussed in another thread.
>>
>>>
>>>>>
>>>>> 8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
>>>>> We do need a transport qualifier, as exemplified in comment 5)
>>>>> above, and
>>>> for a complete clean model.
>>>>> This is after renaming the function to rdma_is_ib_transport()...
>>>>
>>>> This means going back again... rdma_is_ib_transport() has been used
>>>> previously.
>>>>
>>>> This helper is just to make the review more easier, we won't need it
>>>> internally, not to mention after bitmask was introduced :-)
>>>>
>>>
>>> The same...
>>>
>>>>>
>>>>>
>>>>> Putting it all together
>>>>> ==================
>>>>>
>>>>> We are left with the following helpers:
>>>>> - rdma_is_ib_transport()
>>>>> - rdma_is_iwarp_transport()
>>>>> - rdma_is_usnic_transport()
>>>>> - rdma_is_iboe()
>>>>> - rdma_has_mad()
>>>>> - rdma_has_smi()
>>>>> - rdma_has_gsi() - complements smi; can be used by the mad code for
>>>>> clarity
>>>>> - rdma_has_sa()
>>>>> - rdma_has_cm()
>>>>> - rdma_has_mcast()
>>>>
>>>> I think we can put the discussion on name and new helpers in future,
>>>> currently let's focus on these basic reform and make them working
>>>> stably ;-)
>>>
>>> It's not just the names, it's their semantics.
>>> Any problems with the names proposed above?
>>
>> These were once used in old version, again, name can't satisfied anyone at
>> this moment and I'd like to discuss this after the logical was right, I really
>> don't want folks to focus on this issue since it won't broken anything and can
>> be easily changed once we have the agreement.
>>
>> Regards,
>> Michael Wang
>>
>
> OK
>
>>>
>>>>
>>>> Regards,
>>>> Michael Wang
>>>>

2015-04-24 16:42:33

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Fri, Apr 24, 2015 at 03:00:15PM +0000, Liran Liss wrote:

> Currently, the only code in the kernel that has an SMI interface is IB.
> When OPA is introduced, add the proper helper.

We already have tests checking whether SMI is supported so that QP0 can be
created; this is to support RoCEE.

> All I am saying is that there will always be code paths that are
> technology- and standards-specific. For example, the low-level MAD
> processing code *must* do stuff like:

> if (rdma_is_transport_ib())
> /* IB-spec compliant stuff */
> else if (rdma_is_transport_opa())
> /* OPA stuff */

Why should we open code that? It is back to what I said - that doesn't
help the reader. Which of the few differences between OPA and IB MADs
is that code trying to deal with?

Heck, what are the differences? Do you know? Do I know?

If you don't know what the differences are, you can't realistically
work on the MAD layer anymore, because you might break OPA.

Whereas, if I see:

    if (cap_2k_mad())
        /* Special handling for OPA 2k mad support */
    if (cap_opa_mad_space() && mad->baseVersion == ... )
        /* Decode OPA mads */
    if (cap_ib_mad_space() && mad->baseVersion == ... )
        /* Decode IB mads */

Then I *know* what to look for when writing new code.

That is the problem we are trying to address here. iWarp has already
created it, we addressed it using 'rdma_is_transport_iwarp' and I
don't think those results were very satisfying.
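
A minimal userspace sketch of that capability-driven decode follows. The `example_*` structures are hypothetical, the helper signatures are assumed, and the two base-version constants are placeholders chosen for the sketch, since the values are left elided above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only; the thread does not give them. */
#define EXAMPLE_IB_BASE_VERSION  0x01
#define EXAMPLE_OPA_BASE_VERSION 0x80

enum example_mad_space {
	EXAMPLE_MAD_UNKNOWN,
	EXAMPLE_MAD_IB,
	EXAMPLE_MAD_OPA,
};

/* Hypothetical port: which MAD spaces it understands. */
struct example_port {
	bool ib_mad_space;
	bool opa_mad_space;
};

/* Hypothetical MAD header reduced to the one field the sketch needs. */
struct example_mad {
	uint8_t base_version;
};

static inline bool cap_ib_mad_space(const struct example_port *port)
{
	assert(port != NULL);
	return port->ib_mad_space;
}

static inline bool cap_opa_mad_space(const struct example_port *port)
{
	assert(port != NULL);
	return port->opa_mad_space;
}

/* Pick a decoder from what the port supports plus what the MAD claims to be. */
static inline enum example_mad_space
example_classify_mad(const struct example_port *port,
		     const struct example_mad *mad)
{
	assert(mad != NULL);
	if (cap_opa_mad_space(port) &&
	    mad->base_version == EXAMPLE_OPA_BASE_VERSION)
		return EXAMPLE_MAD_OPA;
	if (cap_ib_mad_space(port) &&
	    mad->base_version == EXAMPLE_IB_BASE_VERSION)
		return EXAMPLE_MAD_IB;
	return EXAMPLE_MAD_UNKNOWN;
}
```

Each branch names the exact difference it handles, which is the property being argued for: a reader does not need to know everything that separates two technologies to follow one check.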

Jason

2015-04-27 19:10:46

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Fri, Apr 24, 2015 at 10:42:26AM -0600, Jason Gunthorpe wrote:
> On Fri, Apr 24, 2015 at 03:00:15PM +0000, Liran Liss wrote:
>
> > Currently, the only code in the kernel that has an SMI interface is IB.
> > When OPA is introduced, add the proper helper.
>
> We already have tests checking for SMI is supported so QP0 can be
> created, this is to support ROCEE
>
> > All I am saying is that there will always be code paths that are
> > technology- and standards-specific. For example, the low-level MAD
> > processing code *must* do stuff like:
>
> > if (rdma_is_transport_ib())
> > /* IB-spec compliant stuff */
> > else if (rdma_is_transport_opa())
> > /* OPA stuff */

The issue is that OPA is _not_ a new "transport". It is just like RoCEE, which
supports the IB transport with some differences.

We need a way to explain what those differences are while keeping each section
of code as clean and clear as possible. Many of us have spent a lot of time
trying to figure out what each section of the current code is doing when they
call "get_transport" and/or "get_link_layer".

>
> Why should we open code that? It is back to what I said - that doesn't
> help the reader. Which of the few differences between OPA and IB MADs
> is that code trying to deal with?
>
> Heck, what are the differences? Do you know? Do I know?
>
> If you don't know what the differences are, you can't realistically
> work on the MAD layer anymore, because you might break OPA.
>
> Whereas, If I see:
>
> if (cap_2k_mad())
> /* Special handling for OPA 2k mad support */

FWIW we decided not to special case 2K and simply provide the max MAD size
which a driver supports. This is much more flexible. I think the semantics
are equivalent to your example here but I don't think we need a discussion
around a "cap_2k_mad" helper.
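
A minimal sketch of the "max MAD size" approach, assuming the driver reports one size per port. The structure and function names are hypothetical, and the sizes used in the note below are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical port attribute: largest MAD the driver can handle, in bytes. */
struct example_port {
	size_t max_mad_size;
};

/* A size comparison replaces a yes/no "2k" capability. */
static inline bool example_mad_fits(const struct example_port *port,
				    size_t mad_len)
{
	assert(port != NULL);
	return mad_len <= port->max_mad_size;
}
```

A port reporting 256 bytes would reject a 2048-byte MAD, while a port reporting 2048 bytes would accept both, and a later size would need no new helper.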

> if (cap_opa_mad_space() && mad->baseVersion == ... )
> /* Decode OPA mads */
> if (cap_ib_mad_space() && mad->baseVersion == ... )
> /* Decode IB mads */

Agreed.

>
> The I *know* what to look for when writing new code.
>
> That is the problem we are trying to address here. iWarp has already
> created it, we addressed it using 'rdma_is_transport_iwarp' and I
> don't think those results were very satisfying.

No, they are not, and it is getting more complicated.

Ira

2015-04-27 19:22:57

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Fri, Apr 24, 2015 at 02:44:29PM +0000, Liran Liss wrote:
> > From: Michael Wang [mailto:[email protected]]
>
>
> > [snip]
> > >
> > > Depends on who is "we".
> > > For ULPs, you are probably right.
> > >
> > > However, core services (e.g., mad management, CM, SA) do care about
> > various details.
> > > In some cases, where it doesn't matter, this code will use management
> > helpers.
> > > In other cases, this code will inspect link, transport, and node attributes of
> > rdma devices.
> > >
> > > For example, the CM code has specific code paths for IB, RoCE, and iWARP.
> > > There is no other CM code; there is no reason to abstract 'CM'. This
> > > code will have code paths that depend on various specific details.
> >
> > That's exactly what we want to stop, we have classified the CM to IB and
> > IWARP now :-)
> >
>
> We don't want to stop code branches that are not abstractions but rather depend
> on the specific technology!
> There is no generic "iWARP CM" - only one.
> There is no generic "ROCE CM" - only one.
> There is no generic "IB CM" - only one.

How can you say this? Or perhaps I don't understand what you mean.

While conceptually one could say that each technology has its own "CM", we are
trying to have the same module (and code) implement them all (i.e., a generic CM
for a node). Therefore, the CM code _is_ generic. As is the MAD code. This
is the reason we have this problem. We are trying to reuse those modules for
multiple technologies.

>
> At the CM high-level (i.e., whether an ib_dev port registers an IB client), you could consider
> an rdma_has_cm() call, but this the only place in the code that this check will be called!
> Hence, no need for a generic check.
>
> You want to stop abstract code that uses IB core infrastructure.

Not sure what you mean here?

Ira

2015-04-27 20:58:43

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

On Fri, Apr 24, 2015 at 02:53:37PM +0000, Liran Liss wrote:
> > From: ira.weiny [mailto:[email protected]]
> > [snip]
> >
> > > >
> > > > 2)The name rdma_tech_* is lame.
> > > > rdma_transport_*(), adhering to the above (*) remark, is much better.
> > > > For example, both IB and ROCE *do* use the same transport.
> > >
> > > I especially want to second this. I haven't really been happy with
> > > the
> > > rdma_tech_* names at all.
> > >
> >
> > I am sure Michael is open to alternative names. I know I am. The problem is
> > that we can't figure out what "IBoE" is. It is not a transport, even though
> > query_transport is now returning it as one. :-P
>
> IBOE is used in part of the kernel symbols that refer to RoCE. That's it.
> ROCE HCA links have the following characteristics:
> - Ethernet link layer
> - IB transport
> - CA node type
>
> And, if needed in the future:
> - ROCEv1 and ROCEv2 protocol stacks

Let me rephrase that[*].

We don't know what to "call" "IBoE" vs "IBoIB" vs "iWarp Verbs over iWarp" vs
"IBoOPA", etc.

The problem with your argument (no matter what name we use: "transport",
"tech", "protocol", etc.) is that support for various features is implied rather
than explicit.

Your argument that RoCE is just "IB transport" over Ethernet (while technically
correct) is not relevant for the problem we are trying to solve.

The ib_mad, ib_cm, rdma_cm, and ib_sa modules are attempting to support many
different device types (on a per-port basis). The functionality they provide is
dependent on so much more than the notion of "transport" and "link layer".

The purpose of this patch series was to create the specific helper functions
based on the support the core modules need to explicitly determine what they
should do for a port. Furthermore, the implementation of those features was to
be based on the current implementation (which happens to use the transport and
link layer) to ensure we don't break everything.

Once we could agree on items like the feature set and the names of the helper
calls we could then alter the implementation to remove the implied nature of
this support.
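
The two-step plan described here can be sketched in userspace C: first derive the answer from the existing transport and link layer (following the mapping list in the cover letter), then expose it through an explicit helper whose body can be replaced later. All `example_*` names are hypothetical, and the rule that only plain IB ports have an SA is an assumption taken from the "RoCE has a GSI but no SA" remark earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

enum example_legacy_transport {
	EXAMPLE_LEGACY_IB,
	EXAMPLE_LEGACY_IWARP,
	EXAMPLE_LEGACY_USNIC_UDP,
};

enum example_link_layer {
	EXAMPLE_LL_IB,
	EXAMPLE_LL_ETH,
};

enum example_new_transport {
	EXAMPLE_NEW_IB,
	EXAMPLE_NEW_IBOE,
	EXAMPLE_NEW_IWARP,
	EXAMPLE_NEW_USNIC_UDP,
};

/* Step 1: keep today's inference, but in exactly one place. */
static inline enum example_new_transport
example_query_transport(enum example_legacy_transport legacy,
			enum example_link_layer ll)
{
	switch (legacy) {
	case EXAMPLE_LEGACY_IWARP:
		return EXAMPLE_NEW_IWARP;
	case EXAMPLE_LEGACY_USNIC_UDP:
		return EXAMPLE_NEW_USNIC_UDP;
	case EXAMPLE_LEGACY_IB:
		/* (transport == IB) && (link-layer == ETH) becomes IBOE */
		return ll == EXAMPLE_LL_ETH ? EXAMPLE_NEW_IBOE : EXAMPLE_NEW_IB;
	}
	assert(0 && "unknown legacy transport");
	return EXAMPLE_NEW_IB;
}

/* Step 2: core code asks the explicit question instead of inferring it. */
static inline bool example_cap_ib_sa(enum example_new_transport transport)
{
	return transport == EXAMPLE_NEW_IB; /* assumed: no SA on IBOE */
}
```

Once callers depend only on the helper, its body could be changed to something explicit, such as a per-port flag, without touching them.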

>
> >
> > I think the idea behind the "tech" name was that it is a technology "family".
> > I can't think of a better name.

I see "protocol" and "specification" being discussed. Those are probably
decent.

Ira

[*] We all know what RoCE (IBoE) _is_...