2015-04-13 12:20:08

by Michael Wang

Subject: [PATCH v3 00/28] IB/Verbs: IB Management Helpers


Since v2:
* Applied suggestions from Doug, Ira, Jason, Tom, and Steve. Thanks for the
  comments :-) Please remind me if I missed anything :-P

There is plenty of lengthy code that checks the transport type of an IB
device, or the link-layer type of its port, when what it actually wants to
know is whether a particular management capability or feature is supported
by the device/port.

Instead of inferring this, we should have a dedicated mechanism for checking
IB management capabilities, protocols, and features; several proposals are
linked below.

This patch set reforms how the transport type is obtained: we now use
query_transport() instead of inferring it from the transport and the link
layer separately, and we define a new transport type to make the concept
more reasonable.

Mapping List:
driver      node-type   link-layer  old-transport   new-transport
nes         RNIC        ETH         IWARP           IWARP
amso1100    RNIC        ETH         IWARP           IWARP
cxgb3       RNIC        ETH         IWARP           IWARP
cxgb4       RNIC        ETH         IWARP           IWARP
usnic       USNIC_UDP   ETH         USNIC_UDP       USNIC_UDP
ocrdma      IB_CA       ETH         IB              IBOE
mlx4        IB_CA       IB/ETH      IB              IB/IBOE
mlx5        IB_CA       IB          IB              IB
ehca        IB_CA       IB          IB              IB
ipath       IB_CA       IB          IB              IB
mthca       IB_CA       IB          IB              IB
qib         IB_CA       IB          IB              IB

For example:
	if (transport == IB && link-layer == ETH)
will now become:
	if (query_transport() == IBOE)

Thus we will be able to get rid of the separate transport and link-layer
checks, which makes it easier to add a new protocol/technology (like OPA).
With the management helpers introduced here, the IB management logic also
becomes clearer and easier to extend.
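
As a rough illustration of the change described above, here is a minimal
user-space sketch, not kernel code. The enum names and the map_transport()
helper are hypothetical stand-ins invented for this example; only the
IB + ETH -> IBOE rule itself comes from the mapping list.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel enums; values are illustrative. */
enum transport { T_IB, T_IWARP, T_USNIC, T_USNIC_UDP, T_IBOE };
enum link_layer { LL_UNSPECIFIED, LL_INFINIBAND, LL_ETHERNET };

/* Old style: infer "IB over Ethernet" from two separate properties. */
static bool is_iboe_old(enum transport old_transport, enum link_layer ll)
{
    return old_transport == T_IB && ll == LL_ETHERNET;
}

/* New style: the port reports its transport directly. */
static bool is_iboe_new(enum transport new_transport)
{
    return new_transport == T_IBOE;
}

/*
 * Translate an (old transport, link layer) pair into the new transport,
 * following the mapping list: only IB on an Ethernet link changes.
 */
static enum transport map_transport(enum transport old_transport,
                                    enum link_layer ll)
{
    if (old_transport == T_IB && ll == LL_ETHERNET)
        return T_IBOE;
    return old_transport;
}
```

Under these assumptions both checks agree for every row of the mapping
list; the new form simply needs one query instead of two.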

Highlights:
The patch set covers a wide range of IB code, so suggestions from those who
are familiar with a particular part would be invaluable ;-)

The patches haven't been tested yet; we would appreciate it if anyone who
has this HW could provide a Tested-by :-)

Doug suggested the bitmask mechanism:
https://www.mail-archive.com/[email protected]/msg23765.html
which could be the plan for a future reform; we prefer that to be a separate
series that focuses on semantics and performance.

This patch set is somewhat 'bloated' now and it may be a good time for
staging. I'd like to suggest we focus on improving the existing helpers and
push all further reforms into the next series ;-)

Proposals:
Sean:
https://www.mail-archive.com/[email protected]/msg23339.html
Doug:
https://www.mail-archive.com/[email protected]/msg23418.html
https://www.mail-archive.com/[email protected]/msg23765.html
Jason:
https://www.mail-archive.com/[email protected]/msg23425.html

Michael Wang (28):
[PATCH v3 01/28] IB/Verbs: Implement new callback query_transport()
[PATCH v3 02/28] IB/Verbs: Implement raw management helpers
[PATCH v3 03/28] IB/Verbs: Reform IB-core mad/agent/user_mad
[PATCH v3 04/28] IB/Verbs: Reform IB-core cm
[PATCH v3 05/28] IB/Verbs: Reform IB-core sa_query
[PATCH v3 06/28] IB/Verbs: Reform IB-core multicast
[PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib
[PATCH v3 08/28] IB/Verbs: Reform IB-ulp xprtrdma
[PATCH v3 09/28] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
[PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma
[PATCH v3 11/28] IB/Verbs: Reform route related part in IB-core cma
[PATCH v3 12/28] IB/Verbs: Reform mcast related part in IB-core cma
[PATCH v3 13/28] IB/Verbs: Reserve legacy transport type in 'dev_addr'
[PATCH v3 14/28] IB/Verbs: Reform cma_acquire_dev()
[PATCH v3 15/28] IB/Verbs: Reform rest part in IB-core cma
[PATCH v3 16/28] IB/Verbs: Use management helper cap_ib_mad()
[PATCH v3 17/28] IB/Verbs: Use management helper cap_ib_smi()
[PATCH v3 18/28] IB/Verbs: Use management helper cap_ib_cm()
[PATCH v3 19/28] IB/Verbs: Use management helper cap_iw_cm()
[PATCH v3 20/28] IB/Verbs: Use management helper cap_ib_sa()
[PATCH v3 21/28] IB/Verbs: Use management helper cap_ib_mcast()
[PATCH v3 22/28] IB/Verbs: Use management helper cap_ipoib()
[PATCH v3 23/28] IB/Verbs: Use management helper cap_read_multi_sge()
[PATCH v3 24/28] IB/Verbs: Use management helper cap_ib_cm_dev()
[PATCH v3 25/28] IB/Verbs: Use management helper cap_af_ib()
[PATCH v3 26/28] IB/Verbs: Use management helper cap_eth_ah()
[PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()
[PATCH v3 28/28] IB/Verbs: Cleanup rdma_node_get_transport()

---
drivers/infiniband/core/agent.c | 4
drivers/infiniband/core/cm.c | 28 +-
drivers/infiniband/core/cma.c | 331 ++++++++++++---------------
drivers/infiniband/core/device.c | 1
drivers/infiniband/core/mad.c | 28 +-
drivers/infiniband/core/multicast.c | 18 -
drivers/infiniband/core/sa_query.c | 41 +--
drivers/infiniband/core/sysfs.c | 8
drivers/infiniband/core/ucm.c | 3
drivers/infiniband/core/ucma.c | 27 --
drivers/infiniband/core/user_mad.c | 32 +-
drivers/infiniband/core/uverbs_cmd.c | 6
drivers/infiniband/core/verbs.c | 33 --
drivers/infiniband/hw/amso1100/c2_provider.c | 7
drivers/infiniband/hw/cxgb3/iwch_provider.c | 7
drivers/infiniband/hw/cxgb4/provider.c | 7
drivers/infiniband/hw/ehca/ehca_hca.c | 6
drivers/infiniband/hw/ehca/ehca_iverbs.h | 3
drivers/infiniband/hw/ehca/ehca_main.c | 1
drivers/infiniband/hw/ipath/ipath_verbs.c | 7
drivers/infiniband/hw/mlx4/main.c | 10
drivers/infiniband/hw/mlx5/main.c | 7
drivers/infiniband/hw/mthca/mthca_provider.c | 7
drivers/infiniband/hw/nes/nes_verbs.c | 6
drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6
drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3
drivers/infiniband/hw/qib/qib_verbs.c | 7
drivers/infiniband/hw/usnic/usnic_ib_main.c | 1
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6
drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2
drivers/infiniband/ulp/ipoib/ipoib_main.c | 17 -
include/rdma/ib_verbs.h | 224 +++++++++++++++++-
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 6
net/sunrpc/xprtrdma/svc_rdma_transport.c | 51 +---
35 files changed, 599 insertions(+), 353 deletions(-)


2015-04-13 12:22:25

by Michael Wang

Subject: [PATCH v3 01/28] IB/Verbs: Implement new callback query_transport()


Add a new callback, query_transport(), and implement it for each HW driver.

Mapping List:
driver      node-type   link-layer  old-transport   new-transport
nes         RNIC        ETH         IWARP           IWARP
amso1100    RNIC        ETH         IWARP           IWARP
cxgb3       RNIC        ETH         IWARP           IWARP
cxgb4       RNIC        ETH         IWARP           IWARP
usnic       USNIC_UDP   ETH         USNIC_UDP       USNIC_UDP
ocrdma      IB_CA       ETH         IB              IBOE
mlx4        IB_CA       IB/ETH      IB              IB/IBOE
mlx5        IB_CA       IB          IB              IB
ehca        IB_CA       IB          IB              IB
ipath       IB_CA       IB          IB              IB
mthca       IB_CA       IB          IB              IB
qib         IB_CA       IB          IB              IB
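
The shape of the callback can be sketched in user space as follows. This is
only a mock under stated assumptions: struct mock_device, its port_is_eth
field, and mock_register() are hypothetical names made up for illustration;
the enum mirrors the one this patch extends in include/rdma/ib_verbs.h.

```c
#include <stddef.h>

/* Mirrors the enum as extended by this patch. */
enum rdma_transport_type {
    RDMA_TRANSPORT_IB,
    RDMA_TRANSPORT_IWARP,
    RDMA_TRANSPORT_USNIC,
    RDMA_TRANSPORT_USNIC_UDP,
    RDMA_TRANSPORT_IBOE,
};

/* Hypothetical, heavily reduced stand-in for struct ib_device. */
struct mock_device {
    enum rdma_transport_type (*query_transport)(struct mock_device *dev,
                                                unsigned char port_num);
    const int *port_is_eth; /* hypothetical flag, indexed by port number */
};

/* Single-transport hardware always returns the same value. */
static enum rdma_transport_type
iwarp_query_transport(struct mock_device *dev, unsigned char port_num)
{
    (void)dev;
    (void)port_num;
    return RDMA_TRANSPORT_IWARP;
}

/* Dual-mode hardware (the mlx4 row) has to answer per port. */
static enum rdma_transport_type
dual_query_transport(struct mock_device *dev, unsigned char port_num)
{
    return dev->port_is_eth[port_num] ? RDMA_TRANSPORT_IBOE
                                      : RDMA_TRANSPORT_IB;
}

/* Like the mandatory-function table: refuse devices without the callback. */
static int mock_register(const struct mock_device *dev)
{
    return dev->query_transport ? 0 : -1;
}
```

Taking port_num as an argument is what lets one device report IB on one
port and IBOE on another, which a per-device node type cannot express.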

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/device.c | 1 +
drivers/infiniband/core/verbs.c | 4 +++-
drivers/infiniband/hw/amso1100/c2_provider.c | 7 +++++++
drivers/infiniband/hw/cxgb3/iwch_provider.c | 7 +++++++
drivers/infiniband/hw/cxgb4/provider.c | 7 +++++++
drivers/infiniband/hw/ehca/ehca_hca.c | 6 ++++++
drivers/infiniband/hw/ehca/ehca_iverbs.h | 3 +++
drivers/infiniband/hw/ehca/ehca_main.c | 1 +
drivers/infiniband/hw/ipath/ipath_verbs.c | 7 +++++++
drivers/infiniband/hw/mlx4/main.c | 10 ++++++++++
drivers/infiniband/hw/mlx5/main.c | 7 +++++++
drivers/infiniband/hw/mthca/mthca_provider.c | 7 +++++++
drivers/infiniband/hw/nes/nes_verbs.c | 6 ++++++
drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 +
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 6 ++++++
drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 3 +++
drivers/infiniband/hw/qib/qib_verbs.c | 7 +++++++
drivers/infiniband/hw/usnic/usnic_ib_main.c | 1 +
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 6 ++++++
drivers/infiniband/hw/usnic/usnic_ib_verbs.h | 2 ++
include/rdma/ib_verbs.h | 7 ++++++-
21 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 18c1ece..a9587c4 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -76,6 +76,7 @@ static int ib_device_check_mandatory(struct ib_device *device)
} mandatory_table[] = {
IB_MANDATORY_FUNC(query_device),
IB_MANDATORY_FUNC(query_port),
+ IB_MANDATORY_FUNC(query_transport),
IB_MANDATORY_FUNC(query_pkey),
IB_MANDATORY_FUNC(query_gid),
IB_MANDATORY_FUNC(alloc_pd),
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index f93eb8d..626c9cf 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -133,14 +133,16 @@ enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_
if (device->get_link_layer)
return device->get_link_layer(device, port_num);

- switch (rdma_node_get_transport(device->node_type)) {
+ switch (device->query_transport(device, port_num)) {
case RDMA_TRANSPORT_IB:
return IB_LINK_LAYER_INFINIBAND;
+ case RDMA_TRANSPORT_IBOE:
case RDMA_TRANSPORT_IWARP:
case RDMA_TRANSPORT_USNIC:
case RDMA_TRANSPORT_USNIC_UDP:
return IB_LINK_LAYER_ETHERNET;
default:
+ BUG();
return IB_LINK_LAYER_UNSPECIFIED;
}
}
diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c
index bdf3507..d46bbb0 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -99,6 +99,12 @@ static int c2_query_port(struct ib_device *ibdev,
return 0;
}

+static enum rdma_transport_type
+c2_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IWARP;
+}
+
static int c2_query_pkey(struct ib_device *ibdev,
u8 port, u16 index, u16 * pkey)
{
@@ -801,6 +807,7 @@ int c2_register_device(struct c2_dev *dev)
dev->ibdev.dma_device = &dev->pcidev->dev;
dev->ibdev.query_device = c2_query_device;
dev->ibdev.query_port = c2_query_port;
+ dev->ibdev.query_transport = c2_query_transport;
dev->ibdev.query_pkey = c2_query_pkey;
dev->ibdev.query_gid = c2_query_gid;
dev->ibdev.alloc_ucontext = c2_alloc_ucontext;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 811b24a..09682e9e 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1232,6 +1232,12 @@ static int iwch_query_port(struct ib_device *ibdev,
return 0;
}

+static enum rdma_transport_type
+iwch_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IWARP;
+}
+
static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
char *buf)
{
@@ -1385,6 +1391,7 @@ int iwch_register_device(struct iwch_dev *dev)
dev->ibdev.dma_device = &(dev->rdev.rnic_info.pdev->dev);
dev->ibdev.query_device = iwch_query_device;
dev->ibdev.query_port = iwch_query_port;
+ dev->ibdev.query_transport = iwch_query_transport;
dev->ibdev.query_pkey = iwch_query_pkey;
dev->ibdev.query_gid = iwch_query_gid;
dev->ibdev.alloc_ucontext = iwch_alloc_ucontext;
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index 66bd6a2..a445e0d 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -390,6 +390,12 @@ static int c4iw_query_port(struct ib_device *ibdev, u8 port,
return 0;
}

+static enum rdma_transport_type
+c4iw_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IWARP;
+}
+
static ssize_t show_rev(struct device *dev, struct device_attribute *attr,
char *buf)
{
@@ -506,6 +512,7 @@ int c4iw_register_device(struct c4iw_dev *dev)
dev->ibdev.dma_device = &(dev->rdev.lldi.pdev->dev);
dev->ibdev.query_device = c4iw_query_device;
dev->ibdev.query_port = c4iw_query_port;
+ dev->ibdev.query_transport = c4iw_query_transport;
dev->ibdev.query_pkey = c4iw_query_pkey;
dev->ibdev.query_gid = c4iw_query_gid;
dev->ibdev.alloc_ucontext = c4iw_alloc_ucontext;
diff --git a/drivers/infiniband/hw/ehca/ehca_hca.c b/drivers/infiniband/hw/ehca/ehca_hca.c
index 9ed4d25..d5a34a6 100644
--- a/drivers/infiniband/hw/ehca/ehca_hca.c
+++ b/drivers/infiniband/hw/ehca/ehca_hca.c
@@ -242,6 +242,12 @@ query_port1:
return ret;
}

+enum rdma_transport_type
+ehca_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
int ehca_query_sma_attr(struct ehca_shca *shca,
u8 port, struct ehca_sma_attr *attr)
{
diff --git a/drivers/infiniband/hw/ehca/ehca_iverbs.h b/drivers/infiniband/hw/ehca/ehca_iverbs.h
index 22f79af..cec945f 100644
--- a/drivers/infiniband/hw/ehca/ehca_iverbs.h
+++ b/drivers/infiniband/hw/ehca/ehca_iverbs.h
@@ -49,6 +49,9 @@ int ehca_query_device(struct ib_device *ibdev, struct ib_device_attr *props);
int ehca_query_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props);

+enum rdma_transport_type
+ehca_query_transport(struct ib_device *device, u8 port_num);
+
int ehca_query_sma_attr(struct ehca_shca *shca, u8 port,
struct ehca_sma_attr *attr);

diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c
index cd8d290..60e0a09 100644
--- a/drivers/infiniband/hw/ehca/ehca_main.c
+++ b/drivers/infiniband/hw/ehca/ehca_main.c
@@ -467,6 +467,7 @@ static int ehca_init_device(struct ehca_shca *shca)
shca->ib_device.dma_device = &shca->ofdev->dev;
shca->ib_device.query_device = ehca_query_device;
shca->ib_device.query_port = ehca_query_port;
+ shca->ib_device.query_transport = ehca_query_transport;
shca->ib_device.query_gid = ehca_query_gid;
shca->ib_device.query_pkey = ehca_query_pkey;
/* shca->in_device.modify_device = ehca_modify_device */
diff --git a/drivers/infiniband/hw/ipath/ipath_verbs.c b/drivers/infiniband/hw/ipath/ipath_verbs.c
index 44ea939..58d36e3 100644
--- a/drivers/infiniband/hw/ipath/ipath_verbs.c
+++ b/drivers/infiniband/hw/ipath/ipath_verbs.c
@@ -1638,6 +1638,12 @@ static int ipath_query_port(struct ib_device *ibdev,
return 0;
}

+static enum rdma_transport_type
+ipath_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
static int ipath_modify_device(struct ib_device *device,
int device_modify_mask,
struct ib_device_modify *device_modify)
@@ -2140,6 +2146,7 @@ int ipath_register_ib_device(struct ipath_devdata *dd)
dev->query_device = ipath_query_device;
dev->modify_device = ipath_modify_device;
dev->query_port = ipath_query_port;
+ dev->query_transport = ipath_query_transport;
dev->modify_port = ipath_modify_port;
dev->query_pkey = ipath_query_pkey;
dev->query_gid = ipath_query_gid;
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index b972c0b..e1424ad 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -420,6 +420,15 @@ static int mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
return __mlx4_ib_query_port(ibdev, port, props, 0);
}

+static enum rdma_transport_type
+mlx4_ib_query_transport(struct ib_device *device, u8 port_num)
+{
+ struct mlx4_dev *dev = to_mdev(device)->dev;
+
+ return dev->caps.port_mask[port_num] == MLX4_PORT_TYPE_IB ?
+ RDMA_TRANSPORT_IB : RDMA_TRANSPORT_IBOE;
+}
+
int __mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
union ib_gid *gid, int netw_view)
{
@@ -2201,6 +2210,7 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)

ibdev->ib_dev.query_device = mlx4_ib_query_device;
ibdev->ib_dev.query_port = mlx4_ib_query_port;
+ ibdev->ib_dev.query_transport = mlx4_ib_query_transport;
ibdev->ib_dev.get_link_layer = mlx4_ib_port_link_layer;
ibdev->ib_dev.query_gid = mlx4_ib_query_gid;
ibdev->ib_dev.query_pkey = mlx4_ib_query_pkey;
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index cc4ac1e..209c796 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -351,6 +351,12 @@ out:
return err;
}

+static enum rdma_transport_type
+mlx5_ib_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
static int mlx5_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
union ib_gid *gid)
{
@@ -1336,6 +1342,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)

dev->ib_dev.query_device = mlx5_ib_query_device;
dev->ib_dev.query_port = mlx5_ib_query_port;
+ dev->ib_dev.query_transport = mlx5_ib_query_transport;
dev->ib_dev.query_gid = mlx5_ib_query_gid;
dev->ib_dev.query_pkey = mlx5_ib_query_pkey;
dev->ib_dev.modify_device = mlx5_ib_modify_device;
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 415f8e1..67ac6a4 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -179,6 +179,12 @@ static int mthca_query_port(struct ib_device *ibdev,
return err;
}

+static enum rdma_transport_type
+mthca_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
static int mthca_modify_device(struct ib_device *ibdev,
int mask,
struct ib_device_modify *props)
@@ -1281,6 +1287,7 @@ int mthca_register_device(struct mthca_dev *dev)
dev->ib_dev.dma_device = &dev->pdev->dev;
dev->ib_dev.query_device = mthca_query_device;
dev->ib_dev.query_port = mthca_query_port;
+ dev->ib_dev.query_transport = mthca_query_transport;
dev->ib_dev.modify_device = mthca_modify_device;
dev->ib_dev.modify_port = mthca_modify_port;
dev->ib_dev.query_pkey = mthca_query_pkey;
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index c0d0296..8df5b61 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -606,6 +606,11 @@ static int nes_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr
return 0;
}

+static enum rdma_transport_type
+nes_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IWARP;
+}

/**
* nes_query_pkey
@@ -3879,6 +3884,7 @@ struct nes_ib_device *nes_init_ofa_device(struct net_device *netdev)
nesibdev->ibdev.dev.parent = &nesdev->pcidev->dev;
nesibdev->ibdev.query_device = nes_query_device;
nesibdev->ibdev.query_port = nes_query_port;
+ nesibdev->ibdev.query_transport = nes_query_transport;
nesibdev->ibdev.query_pkey = nes_query_pkey;
nesibdev->ibdev.query_gid = nes_query_gid;
nesibdev->ibdev.alloc_ucontext = nes_alloc_ucontext;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index 7a2b59a..9f4d182 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -244,6 +244,7 @@ static int ocrdma_register_device(struct ocrdma_dev *dev)
/* mandatory verbs. */
dev->ibdev.query_device = ocrdma_query_device;
dev->ibdev.query_port = ocrdma_query_port;
+ dev->ibdev.query_transport = ocrdma_query_transport;
dev->ibdev.modify_port = ocrdma_modify_port;
dev->ibdev.query_gid = ocrdma_query_gid;
dev->ibdev.get_link_layer = ocrdma_link_layer;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 8771755..73bace4 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -187,6 +187,12 @@ int ocrdma_query_port(struct ib_device *ibdev,
return 0;
}

+enum rdma_transport_type
+ocrdma_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IBOE;
+}
+
int ocrdma_modify_port(struct ib_device *ibdev, u8 port, int mask,
struct ib_port_modify *props)
{
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
index b8f7853..4a81b63 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
@@ -41,6 +41,9 @@ int ocrdma_query_port(struct ib_device *, u8 port, struct ib_port_attr *props);
int ocrdma_modify_port(struct ib_device *, u8 port, int mask,
struct ib_port_modify *props);

+enum rdma_transport_type
+ocrdma_query_transport(struct ib_device *device, u8 port_num);
+
void ocrdma_get_guid(struct ocrdma_dev *, u8 *guid);
int ocrdma_query_gid(struct ib_device *, u8 port,
int index, union ib_gid *gid);
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index 4a35998..caad665 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1650,6 +1650,12 @@ static int qib_query_port(struct ib_device *ibdev, u8 port,
return 0;
}

+static enum rdma_transport_type
+qib_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_IB;
+}
+
static int qib_modify_device(struct ib_device *device,
int device_modify_mask,
struct ib_device_modify *device_modify)
@@ -2184,6 +2190,7 @@ int qib_register_ib_device(struct qib_devdata *dd)
ibdev->query_device = qib_query_device;
ibdev->modify_device = qib_modify_device;
ibdev->query_port = qib_query_port;
+ ibdev->query_transport = qib_query_transport;
ibdev->modify_port = qib_modify_port;
ibdev->query_pkey = qib_query_pkey;
ibdev->query_gid = qib_query_gid;
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
index 0d0f986..03ea9f3 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
@@ -360,6 +360,7 @@ static void *usnic_ib_device_add(struct pci_dev *dev)

us_ibdev->ib_dev.query_device = usnic_ib_query_device;
us_ibdev->ib_dev.query_port = usnic_ib_query_port;
+ us_ibdev->ib_dev.query_transport = usnic_ib_query_transport;
us_ibdev->ib_dev.query_pkey = usnic_ib_query_pkey;
us_ibdev->ib_dev.query_gid = usnic_ib_query_gid;
us_ibdev->ib_dev.get_link_layer = usnic_ib_port_link_layer;
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 53bd6a2..ff9a5f7 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -348,6 +348,12 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
return 0;
}

+enum rdma_transport_type
+usnic_ib_query_transport(struct ib_device *device, u8 port_num)
+{
+ return RDMA_TRANSPORT_USNIC_UDP;
+}
+
int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr)
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
index bb864f5..0b1633b 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
@@ -27,6 +27,8 @@ int usnic_ib_query_device(struct ib_device *ibdev,
struct ib_device_attr *props);
int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props);
+enum rdma_transport_type
+usnic_ib_query_transport(struct ib_device *device, u8 port_num);
int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask,
struct ib_qp_init_attr *qp_init_attr);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 65994a1..d54f91e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -75,10 +75,13 @@ enum rdma_node_type {
};

enum rdma_transport_type {
+ /* legacy for users */
RDMA_TRANSPORT_IB,
RDMA_TRANSPORT_IWARP,
RDMA_TRANSPORT_USNIC,
- RDMA_TRANSPORT_USNIC_UDP
+ RDMA_TRANSPORT_USNIC_UDP,
+ /* new transport */
+ RDMA_TRANSPORT_IBOE,
};

__attribute_const__ enum rdma_transport_type
@@ -1501,6 +1504,8 @@ struct ib_device {
int (*query_port)(struct ib_device *device,
u8 port_num,
struct ib_port_attr *port_attr);
+ enum rdma_transport_type (*query_transport)(struct ib_device *device,
+ u8 port_num);
enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
u8 port_num);
int (*query_gid)(struct ib_device *device,
--
2.1.0

2015-04-13 12:22:43

by Michael Wang

Subject: [PATCH v3 02/28] IB/Verbs: Implement raw management helpers


Add raw helpers:
	rdma_tech_ib
	rdma_tech_iboe
	rdma_tech_iwarp
	rdma_ib_or_iboe (transitional, to be cleaned up later)
to help us detect which technology a port supports.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
include/rdma/ib_verbs.h | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index d54f91e..a12e876 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1748,6 +1748,31 @@ int ib_query_port(struct ib_device *device,
enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device,
u8 port_num);

+static inline int rdma_tech_ib(struct ib_device *device, u8 port_num)
+{
+ return device->query_transport(device, port_num)
+ == RDMA_TRANSPORT_IB;
+}
+
+static inline int rdma_tech_iboe(struct ib_device *device, u8 port_num)
+{
+ return device->query_transport(device, port_num)
+ == RDMA_TRANSPORT_IBOE;
+}
+
+static inline int rdma_tech_iwarp(struct ib_device *device, u8 port_num)
+{
+ return device->query_transport(device, port_num)
+ == RDMA_TRANSPORT_IWARP;
+}
+
+static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
+{
+ enum rdma_transport_type tp = device->query_transport(device, port_num);
+
+ return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-13 12:23:25

by Michael Wang

Subject: [PATCH v3 03/28] IB/Verbs: Reform IB-core mad/agent/user_mad


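Use the raw management helpers to reform IB-core mad/agent/user_mad. The
device-wide transport check is replaced by a per-port check, so ports that
are neither IB nor IBoE are skipped on init, on error unwind, and on removal.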
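
The per-port pattern used in this patch (skip unsupported ports, count the
ones that were set up, and unwind only those on failure) can be sketched in
user space like this. Everything here is a hypothetical mock written for
illustration: struct mock_dev, port_open(), port_close(), and init_device()
are not kernel APIs.

```c
#include <stdbool.h>

#define MAX_PORTS 8

/* Hypothetical mock device, indexed directly by port number. */
struct mock_dev {
    int start, end;                   /* inclusive port range */
    bool mad_capable[MAX_PORTS + 1];  /* stands in for rdma_ib_or_iboe() */
    bool opened[MAX_PORTS + 1];
    int fail_port;                    /* port whose open fails, -1 for none */
};

static int port_open(struct mock_dev *d, int port)
{
    if (port == d->fail_port)
        return -1;
    d->opened[port] = true;
    return 0;
}

static void port_close(struct mock_dev *d, int port)
{
    d->opened[port] = false;
}

/* Returns the number of ports opened, or -1 after unwinding on failure. */
static int init_device(struct mock_dev *d)
{
    int i, count = 0;

    for (i = d->start; i <= d->end; i++) {
        if (!d->mad_capable[i])
            continue;           /* skip ports without IB management */
        if (port_open(d, i))
            goto error;
        count++;
    }
    return count;

error:
    /* Walk back over earlier ports, closing only those that were opened. */
    while (--i >= d->start) {
        if (!d->mad_capable[i])
            continue;
        port_close(d, i);
    }
    return -1;
}
```

The unwind loop applies the same capability test as the setup loop, so a
skipped port is never closed. The predecrement in the loop condition also
keeps `continue` from spinning forever, which the old trailing `i--` form
would have done.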

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/agent.c | 2 +-
drivers/infiniband/core/mad.c | 20 ++++++++++----------
drivers/infiniband/core/user_mad.c | 26 ++++++++++++++++++++------
3 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index f6d2961..ffdef4d 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
goto error1;
}

- if (rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND) {
+ if (rdma_tech_ib(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 74c30f4..d451a47 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);

cq_size = mad_sendq_size + mad_recvq_size;
- has_smi = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND;
+ has_smi = rdma_tech_ib(device, port_num);
if (has_smi)
cq_size *= 2;

@@ -3057,9 +3057,6 @@ static void ib_mad_init_device(struct ib_device *device)
{
int start, end, i;

- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
-
if (device->node_type == RDMA_NODE_IB_SWITCH) {
start = 0;
end = 0;
@@ -3069,6 +3066,9 @@ static void ib_mad_init_device(struct ib_device *device)
}

for (i = start; i <= end; i++) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;
+
if (ib_mad_port_open(device, i)) {
dev_err(&device->dev, "Couldn't open port %d\n", i);
goto error;
@@ -3086,15 +3086,15 @@ error_agent:
dev_err(&device->dev, "Couldn't close port %d\n", i);

error:
- i--;
+ while (--i >= start) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;

- while (i >= start) {
if (ib_agent_port_close(device, i))
dev_err(&device->dev,
"Couldn't close port %d for agents\n", i);
if (ib_mad_port_close(device, i))
dev_err(&device->dev, "Couldn't close port %d\n", i);
- i--;
}
}

@@ -3102,9 +3102,6 @@ static void ib_mad_remove_device(struct ib_device *device)
{
int i, num_ports, cur_port;

- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
-
if (device->node_type == RDMA_NODE_IB_SWITCH) {
num_ports = 1;
cur_port = 0;
@@ -3113,6 +3110,9 @@ static void ib_mad_remove_device(struct ib_device *device)
cur_port = 1;
}
for (i = 0; i < num_ports; i++, cur_port++) {
+ if (!rdma_ib_or_iboe(device, cur_port))
+ continue;
+
if (ib_agent_port_close(device, cur_port))
dev_err(&device->dev,
"Couldn't close port %d for agents\n",
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 928cdd2..71fc8ba 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1273,9 +1273,7 @@ static void ib_umad_add_one(struct ib_device *device)
{
struct ib_umad_device *umad_dev;
int s, e, i;
-
- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int count = 0;

if (device->node_type == RDMA_NODE_IB_SWITCH)
s = e = 0;
@@ -1296,11 +1294,21 @@ static void ib_umad_add_one(struct ib_device *device)
umad_dev->end_port = e;

for (i = s; i <= e; ++i) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;
+
umad_dev->port[i - s].umad_dev = umad_dev;

if (ib_umad_init_port(device, i, umad_dev,
&umad_dev->port[i - s]))
goto err;
+
+ count++;
+ }
+
+ if (!count) {
+ kobject_put(&umad_dev->kobj);
+ return;
}

ib_set_client_data(device, &umad_client, umad_dev);
@@ -1308,8 +1316,12 @@ static void ib_umad_add_one(struct ib_device *device)
return;

err:
- while (--i >= s)
+ while (--i >= s) {
+ if (!rdma_ib_or_iboe(device, i))
+ continue;
+
ib_umad_kill_port(&umad_dev->port[i - s]);
+ }

kobject_put(&umad_dev->kobj);
}
@@ -1322,8 +1334,10 @@ static void ib_umad_remove_one(struct ib_device *device)
if (!umad_dev)
return;

- for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i)
- ib_umad_kill_port(&umad_dev->port[i]);
+ for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
+ if (rdma_ib_or_iboe(device, i + umad_dev->start_port))
+ ib_umad_kill_port(&umad_dev->port[i]);
+ }

kobject_put(&umad_dev->kobj);
}
--
2.1.0

2015-04-13 12:23:58

by Michael Wang

Subject: [PATCH v3 04/28] IB/Verbs: Reform IB-core cm


Use raw management helpers to reform IB-core cm.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cm.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..50321fe 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3761,9 +3761,7 @@ static void cm_add_one(struct ib_device *ib_device)
unsigned long flags;
int ret;
u8 i;
-
- if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int count = 0;

cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
ib_device->phys_port_cnt, GFP_KERNEL);
@@ -3783,6 +3781,9 @@ static void cm_add_one(struct ib_device *ib_device)

set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+ if (!rdma_ib_or_iboe(ib_device, i))
+ continue;
+
port = kzalloc(sizeof *port, GFP_KERNEL);
if (!port)
goto error1;
@@ -3809,7 +3810,16 @@ static void cm_add_one(struct ib_device *ib_device)
ret = ib_modify_port(ib_device, i, 0, &port_modify);
if (ret)
goto error3;
+
+ count++;
}
+
+ if (!count) {
+ device_unregister(cm_dev->device);
+ kfree(cm_dev);
+ return;
+ }
+
ib_set_client_data(ib_device, &cm_client, cm_dev);

write_lock_irqsave(&cm.device_lock, flags);
@@ -3825,6 +3835,9 @@ error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
while (--i) {
+ if (!rdma_ib_or_iboe(ib_device, i))
+ continue;
+
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
@@ -3853,6 +3866,9 @@ static void cm_remove_one(struct ib_device *ib_device)
write_unlock_irqrestore(&cm.device_lock, flags);

for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+ if (!rdma_ib_or_iboe(ib_device, i))
+ continue;
+
port = cm_dev->port[i-1];
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
--
2.1.0
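
For readers following along, the per-port pattern this patch introduces (skip the ports that fail the helper check, count the ones that were set up, bail out when the count is zero) can be sketched in plain userspace C. This is only a minimal model under stated assumptions: struct mock_device, mock_ib_or_iboe() and mock_add_one() are made-up names standing in for struct ib_device, rdma_ib_or_iboe() and cm_add_one(), and no MAD agent registration is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not the kernel's struct ib_device. */
struct mock_device {
	unsigned int phys_port_cnt;
	const bool *port_is_ib_or_iboe;	/* indexed by port number - 1 */
};

/* Stand-in for rdma_ib_or_iboe(); ports are numbered from 1. */
static bool mock_ib_or_iboe(const struct mock_device *dev, unsigned int port)
{
	return dev->port_is_ib_or_iboe[port - 1];
}

/*
 * Walk all ports, mark only the qualifying ones as registered, and
 * return how many were marked. The caller treats 0 as "nothing to
 * register for this device".
 */
static int mock_add_one(const struct mock_device *dev, bool *registered)
{
	int count = 0;
	unsigned int i;

	for (i = 1; i <= dev->phys_port_cnt; i++) {
		registered[i - 1] = false;
		if (!mock_ib_or_iboe(dev, i))
			continue;	/* same shape as the added "continue" */
		registered[i - 1] = true;
		count++;
	}
	return count;
}
```

A return value of 0 corresponds to the new "if (!count)" branch, where the patch frees cm_dev and returns without setting the client data.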

2015-04-13 12:24:24

by Michael Wang

Subject: [PATCH v3 05/28] IB/Verbs: Reform IB-core sa_query


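Use the raw management helpers to reform IB-core sa_query: replace the
link-layer checks with rdma_tech_ib() and rdma_tech_iboe(), and free
the SA device when none of its ports is IB.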

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/sa_query.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index c38f030..803ccf7 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - sa_dev->start_port];

- if (rdma_port_get_link_layer(handler->device, port->port_num) != IB_LINK_LAYER_INFINIBAND)
+ if (WARN_ON(!rdma_tech_ib(handler->device, port->port_num)))
return;

spin_lock_irqsave(&port->ah_lock, flags);
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;

- force_grh = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_ETHERNET;
+ force_grh = rdma_tech_iboe(device, port_num);

if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
@@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
{
struct ib_sa_device *sa_dev;
int s, e, i;
-
- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int count = 0;

if (device->node_type == RDMA_NODE_IB_SWITCH)
s = e = 0;
@@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)

for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
- if (rdma_port_get_link_layer(device, i + 1) != IB_LINK_LAYER_INFINIBAND)
+ if (!rdma_tech_ib(device, i + 1))
continue;

sa_dev->port[i].sm_ah = NULL;
@@ -1189,6 +1187,13 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;

INIT_WORK(&sa_dev->port[i].update_task, update_sm_ah);
+
+ count++;
+ }
+
+ if (!count) {
+ kfree(sa_dev);
+ return;
}

ib_set_client_data(device, &sa_client, sa_dev);
@@ -1204,16 +1209,18 @@ static void ib_sa_add_one(struct ib_device *device)
if (ib_register_event_handler(&sa_dev->event_handler))
goto err;

- for (i = 0; i <= e - s; ++i)
- if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+ for (i = 0; i <= e - s; ++i) {
+ if (rdma_tech_ib(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
+ }

return;

err:
- while (--i >= 0)
- if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND)
+ while (--i >= 0) {
+ if (rdma_tech_ib(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
+ }

kfree(sa_dev);

@@ -1233,7 +1240,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);

for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
- if (rdma_port_get_link_layer(device, i + 1) == IB_LINK_LAYER_INFINIBAND) {
+ if (rdma_tech_ib(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
--
2.1.0

2015-04-13 12:24:52

by Michael Wang

Subject: [PATCH v3 06/28] IB/Verbs: Reform IB-core multicast


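Use the raw management helpers to reform IB-core multicast: replace the
per-port link-layer checks with rdma_tech_ib() and drop the device-wide
transport check.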

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/multicast.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index fa17b55..24d93f5 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,8 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
int index;

dev = container_of(handler, struct mcast_device, event_handler);
- if (rdma_port_get_link_layer(dev->device, event->element.port_num) !=
- IB_LINK_LAYER_INFINIBAND)
+ if (WARN_ON(!rdma_tech_ib(dev->device, event->element.port_num)))
return;

index = event->element.port_num - dev->start_port;
@@ -808,9 +807,6 @@ static void mcast_add_one(struct ib_device *device)
int i;
int count = 0;

- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
-
dev = kmalloc(sizeof *dev + device->phys_port_cnt * sizeof *port,
GFP_KERNEL);
if (!dev)
@@ -824,8 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}

for (i = 0; i <= dev->end_port - dev->start_port; i++) {
- if (rdma_port_get_link_layer(device, dev->start_port + i) !=
- IB_LINK_LAYER_INFINIBAND)
+ if (!rdma_tech_ib(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -863,8 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);

for (i = 0; i <= dev->end_port - dev->start_port; i++) {
- if (rdma_port_get_link_layer(device, dev->start_port + i) ==
- IB_LINK_LAYER_INFINIBAND) {
+ if (rdma_tech_ib(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
--
2.1.0

2015-04-13 12:25:24

by Michael Wang

Subject: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib


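Use the raw management helpers to reform IB-ulp ipoib: replace the
per-port link-layer check with rdma_tech_ib(), and skip setting the
client data when the device has no IB port.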

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 58b5aa3..97372b1 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1654,9 +1654,7 @@ static void ipoib_add_one(struct ib_device *device)
struct net_device *dev;
struct ipoib_dev_priv *priv;
int s, e, p;
-
- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
+ int count = 0;

dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
if (!dev_list)
@@ -1673,13 +1671,19 @@ static void ipoib_add_one(struct ib_device *device)
}

for (p = s; p <= e; ++p) {
- if (rdma_port_get_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
+ if (!rdma_tech_ib(device, p))
continue;
dev = ipoib_add_port("ib%d", device, p);
if (!IS_ERR(dev)) {
priv = netdev_priv(dev);
list_add_tail(&priv->list, dev_list);
}
+ count++;
+ }
+
+ if (!count) {
+ kfree(dev_list);
+ return;
}

ib_set_client_data(device, &ipoib_client, dev_list);
@@ -1690,9 +1694,6 @@ static void ipoib_remove_one(struct ib_device *device)
struct ipoib_dev_priv *priv, *tmp;
struct list_head *dev_list;

- if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
- return;
-
dev_list = ib_get_client_data(device, &ipoib_client);
if (!dev_list)
return;
--
2.1.0

2015-04-13 12:25:52

by Michael Wang

Subject: [PATCH v3 08/28] IB/Verbs: Reform IB-ulp xprtrdma


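Use the raw management helpers to reform IB-ulp xprtrdma: replace the
node-type transport checks with rdma_tech_iwarp() and rdma_ib_or_iboe(),
and fold the duplicated DMA MR logic of the two transports into one
branch.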

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 +--
net/sunrpc/xprtrdma/svc_rdma_transport.c | 45 +++++++++++++-------------------
2 files changed, 19 insertions(+), 29 deletions(-)

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index f9f13a3..a5bed5b 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,8 +117,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,

static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
{
- if (rdma_node_get_transport(xprt->sc_cm_id->device->node_type) ==
- RDMA_TRANSPORT_IWARP)
+ if (rdma_tech_iwarp(xprt->sc_cm_id->device, xprt->sc_cm_id->port_num))
return 1;
else
return min_t(int, sge_count, xprt->sc_max_sge);
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index f609c1c..a09b7a1 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -851,7 +851,7 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
struct ib_qp_init_attr qp_attr;
struct ib_device_attr devattr;
int uninitialized_var(dma_mr_acc);
- int need_dma_mr;
+ int need_dma_mr = 0;
int ret;
int i;

@@ -985,35 +985,26 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
/*
* Determine if a DMA MR is required and if so, what privs are required
*/
- switch (rdma_node_get_transport(newxprt->sc_cm_id->device->node_type)) {
- case RDMA_TRANSPORT_IWARP:
- newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
- if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
- need_dma_mr = 1;
- dma_mr_acc =
- (IB_ACCESS_LOCAL_WRITE |
- IB_ACCESS_REMOTE_WRITE);
- } else if (!(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
- need_dma_mr = 1;
- dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
- } else
- need_dma_mr = 0;
- break;
- case RDMA_TRANSPORT_IB:
- if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG)) {
- need_dma_mr = 1;
- dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
- } else if (!(devattr.device_cap_flags &
- IB_DEVICE_LOCAL_DMA_LKEY)) {
- need_dma_mr = 1;
- dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
- } else
- need_dma_mr = 0;
- break;
- default:
+ if (!rdma_tech_iwarp(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num) &&
+ !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num))
goto errout;
+
+ if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
+ !(devattr.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)) {
+ need_dma_mr = 1;
+ dma_mr_acc = IB_ACCESS_LOCAL_WRITE;
+ if (rdma_tech_iwarp(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num) &&
+ !(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG))
+ dma_mr_acc |= IB_ACCESS_REMOTE_WRITE;
}

+ if (rdma_tech_iwarp(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num))
+ newxprt->sc_dev_caps |= SVCRDMA_DEVCAP_READ_W_INV;
+
/* Create the DMA MR if needed, otherwise, use the DMA LKEY */
if (need_dma_mr) {
/* Register all of physical memory */
--
2.1.0
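
Since the svc_rdma_accept() hunk folds two switch cases into one condition, it may help to check that the result is unchanged. The sketch below is a userspace model, not kernel code: old_logic() mirrors the removed switch, new_logic() mirrors the added lines, and count_mismatches() compares them over every combination of "is iWARP", "has FAST_REG" and "has LOCAL_DMA_LKEY". The ACC_* values and all function names are placeholders chosen for illustration, and the "goto errout" path for unsupported transports is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define ACC_LOCAL_WRITE  0x1	/* placeholder values, not the kernel's */
#define ACC_REMOTE_WRITE 0x2

struct mr_decision {
	bool need_dma_mr;
	int dma_mr_acc;		/* only meaningful when need_dma_mr is true */
};

/* Model of the removed switch: one branch per transport. */
static struct mr_decision old_logic(bool iwarp, bool fast_reg, bool local_lkey)
{
	struct mr_decision d = { false, 0 };

	if (iwarp) {
		if (!fast_reg) {
			d.need_dma_mr = true;
			d.dma_mr_acc = ACC_LOCAL_WRITE | ACC_REMOTE_WRITE;
		} else if (!local_lkey) {
			d.need_dma_mr = true;
			d.dma_mr_acc = ACC_LOCAL_WRITE;
		}
	} else {
		if (!fast_reg) {
			d.need_dma_mr = true;
			d.dma_mr_acc = ACC_LOCAL_WRITE;
		} else if (!local_lkey) {
			d.need_dma_mr = true;
			d.dma_mr_acc = ACC_LOCAL_WRITE;
		}
	}
	return d;
}

/* Model of the added lines: one shared branch, iWARP adds REMOTE_WRITE. */
static struct mr_decision new_logic(bool iwarp, bool fast_reg, bool local_lkey)
{
	struct mr_decision d = { false, 0 };

	if (!fast_reg || !local_lkey) {
		d.need_dma_mr = true;
		d.dma_mr_acc = ACC_LOCAL_WRITE;
		if (iwarp && !fast_reg)
			d.dma_mr_acc |= ACC_REMOTE_WRITE;
	}
	return d;
}

/* Number of input combinations on which the two models disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;
	int i;

	for (i = 0; i < 8; i++) {
		bool iwarp = (i & 1) != 0;
		bool fast_reg = (i & 2) != 0;
		bool local_lkey = (i & 4) != 0;
		struct mr_decision a = old_logic(iwarp, fast_reg, local_lkey);
		struct mr_decision b = new_logic(iwarp, fast_reg, local_lkey);

		if (a.need_dma_mr != b.need_dma_mr)
			mismatches++;
		else if (a.need_dma_mr && a.dma_mr_acc != b.dma_mr_acc)
			mismatches++;
	}
	return mismatches;
}
```

The model only covers the need_dma_mr and dma_mr_acc decision; setting SVCRDMA_DEVCAP_READ_W_INV for iWARP is independent of it, which is why the patch can move that assignment below the shared branch.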

2015-04-13 12:26:29

by Michael Wang

Subject: [PATCH v3 09/28] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs


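Use the raw management helpers to reform IB-core verbs/uverbs_cmd/sysfs:
replace the link-layer checks with rdma_tech_ib() and rdma_tech_iboe().
A port that is not IB is now reported as Ethernet, so the "Unknown"
case in link_layer_show() goes away.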

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/sysfs.c | 8 ++------
drivers/infiniband/core/uverbs_cmd.c | 6 ++++--
drivers/infiniband/core/verbs.c | 6 ++----
3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index cbd0383..8570180 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -248,14 +248,10 @@ static ssize_t phys_state_show(struct ib_port *p, struct port_attribute *unused,
static ssize_t link_layer_show(struct ib_port *p, struct port_attribute *unused,
char *buf)
{
- switch (rdma_port_get_link_layer(p->ibdev, p->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
+ if (rdma_tech_ib(p->ibdev, p->port_num))
return sprintf(buf, "%s\n", "InfiniBand");
- case IB_LINK_LAYER_ETHERNET:
+ else
return sprintf(buf, "%s\n", "Ethernet");
- default:
- return sprintf(buf, "%s\n", "Unknown");
- }
}

static PORT_ATTR_RO(state);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index a9f0489..5dc90aa 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -515,8 +515,10 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
resp.active_width = attr.active_width;
resp.active_speed = attr.active_speed;
resp.phys_state = attr.phys_state;
- resp.link_layer = rdma_port_get_link_layer(file->device->ib_dev,
- cmd.port_num);
+ resp.link_layer = rdma_tech_ib(file->device->ib_dev,
+ cmd.port_num) ?
+ IB_LINK_LAYER_INFINIBAND :
+ IB_LINK_LAYER_ETHERNET;

if (copy_to_user((void __user *) (unsigned long) cmd.response,
&resp, sizeof resp))
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 626c9cf..6b5fd9d 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -200,11 +200,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
u32 flow_class;
u16 gid_index;
int ret;
- int is_eth = (rdma_port_get_link_layer(device, port_num) ==
- IB_LINK_LAYER_ETHERNET);

memset(ah_attr, 0, sizeof *ah_attr);
- if (is_eth) {
+ if (rdma_tech_iboe(device, port_num)) {
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;

@@ -873,7 +871,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
union ib_gid sgid;

if ((*qp_attr_mask & IB_QP_AV) &&
- (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == IB_LINK_LAYER_ETHERNET)) {
+ (rdma_tech_iboe(qp->device, qp_attr->ah_attr.port_num))) {
ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
qp_attr->ah_attr.grh.sgid_index, &sgid);
if (ret)
--
2.1.0

2015-04-13 12:27:01

by Michael Wang

Subject: [PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma


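Use the raw management helpers to reform the cm related part of IB-core
cma: replace each switch on the node-type transport with
rdma_ib_or_iboe() and rdma_tech_iwarp() checks on the id's port.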

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 79 ++++++++++++++-----------------------------
1 file changed, 25 insertions(+), 54 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..8ba5553 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -735,8 +735,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
int ret = 0;

id_priv = container_of(id, struct rdma_id_private, id);
- switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num)) {
if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
else
@@ -745,19 +744,16 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,

if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id_priv->id.device,
+ id_priv->id.port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
} else
ret = iw_cm_init_qp_attr(id_priv->cm_id.iw, qp_attr,
qp_attr_mask);
- break;
- default:
+ } else
ret = -ENOSYS;
- break;
- }

return ret;
}
@@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
mutex_unlock(&id_priv->handler_mutex);

if (id_priv->cma_dev) {
- switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id_priv->id.device,
+ id_priv->id.port_num)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
- break;
- default:
- break;
}
cma_leave_mc_groups(id_priv);
cma_release_dev(id_priv);
@@ -2060,7 +2052,7 @@ port_found:
goto out;

id_priv->id.route.addr.dev_addr.dev_type =
- (rdma_port_get_link_layer(cma_dev->device, p) == IB_LINK_LAYER_INFINIBAND) ?
+ (rdma_tech_ib(cma_dev->device, p)) ?
ARPHRD_INFINIBAND : ARPHRD_ETHER;

rdma_addr_set_sgid(&id_priv->id.route.addr.dev_addr, &gid);
@@ -2537,18 +2529,15 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)

id_priv->backlog = backlog;
if (id->device) {
- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num)) {
ret = cma_iw_listen(id_priv, backlog);
if (ret)
goto err;
- break;
- default:
+ } else {
ret = -ENOSYS;
goto err;
}
@@ -2884,20 +2873,15 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
ret = cma_connect_ib(id_priv, conn_param);
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num))
ret = cma_connect_iw(id_priv, conn_param);
- break;
- default:
+ else
ret = -ENOSYS;
- break;
- }
if (ret)
goto err;

@@ -3000,8 +2984,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD) {
if (conn_param)
ret = cma_send_sidr_rep(id_priv, IB_SIDR_SUCCESS,
@@ -3017,14 +3000,10 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
else
ret = cma_rep_recv(id_priv);
}
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num))
ret = cma_accept_iw(id_priv, conn_param);
- break;
- default:
+ else
ret = -ENOSYS;
- break;
- }

if (ret)
goto reject;
@@ -3068,8 +3047,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
if (!id_priv->cm_id.ib)
return -EINVAL;

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
private_data, private_data_len);
@@ -3077,15 +3055,12 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
ret = ib_send_cm_rej(id_priv->cm_id.ib,
IB_CM_REJ_CONSUMER_DEFINED, NULL,
0, private_data, private_data_len);
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num)) {
ret = iw_cm_reject(id_priv->cm_id.iw,
private_data, private_data_len);
- break;
- default:
+ } else
ret = -ENOSYS;
- break;
- }
+
return ret;
}
EXPORT_SYMBOL(rdma_reject);
@@ -3099,22 +3074,18 @@ int rdma_disconnect(struct rdma_cm_id *id)
if (!id_priv->cm_id.ib)
return -EINVAL;

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
+ if (rdma_ib_or_iboe(id->device, id->port_num)) {
ret = cma_modify_qp_err(id_priv);
if (ret)
goto out;
/* Initiate or respond to a disconnect. */
if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
- break;
- case RDMA_TRANSPORT_IWARP:
+ } else if (rdma_tech_iwarp(id->device, id->port_num)) {
ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
- break;
- default:
+ } else
ret = -EINVAL;
- break;
- }
+
out:
return ret;
}
--
2.1.0

2015-04-13 12:28:00

by Michael Wang

Subject: [PATCH v3 11/28] IB/Verbs: Reform route related part in IB-core cma


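Use the raw management helpers to reform the route related part of
IB-core cma and ucma: replace the nested transport/link-layer switches
with flat rdma_tech_ib(), rdma_tech_iboe() and rdma_tech_iwarp() checks.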

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 31 ++++++++-----------------------
drivers/infiniband/core/ucma.c | 25 ++++++-------------------
2 files changed, 14 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8ba5553..8c41b3f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -924,13 +924,9 @@ static inline int cma_user_data_offset(struct rdma_id_private *id_priv)

static void cma_cancel_route(struct rdma_id_private *id_priv)
{
- switch (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
+ if (rdma_tech_ib(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
- break;
- default:
- break;
}
}

@@ -1959,26 +1955,15 @@ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms)
return -EINVAL;

atomic_inc(&id_priv->refcount);
- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
- switch (rdma_port_get_link_layer(id->device, id->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
- ret = cma_resolve_ib_route(id_priv, timeout_ms);
- break;
- case IB_LINK_LAYER_ETHERNET:
- ret = cma_resolve_iboe_route(id_priv);
- break;
- default:
- ret = -ENOSYS;
- }
- break;
- case RDMA_TRANSPORT_IWARP:
+ if (rdma_tech_ib(id->device, id->port_num))
+ ret = cma_resolve_ib_route(id_priv, timeout_ms);
+ else if (rdma_tech_iboe(id->device, id->port_num))
+ ret = cma_resolve_iboe_route(id_priv);
+ else if (rdma_tech_iwarp(id->device, id->port_num))
ret = cma_resolve_iw_route(id_priv, timeout_ms);
- break;
- default:
+ else
ret = -ENOSYS;
- break;
- }
+
if (ret)
goto err;

diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 45d67e9..7331c6c 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -722,26 +722,13 @@ static ssize_t ucma_query_route(struct ucma_file *file,

resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;
- switch (rdma_node_get_transport(ctx->cm_id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
- switch (rdma_port_get_link_layer(ctx->cm_id->device,
- ctx->cm_id->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
- ucma_copy_ib_route(&resp, &ctx->cm_id->route);
- break;
- case IB_LINK_LAYER_ETHERNET:
- ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
- break;
- default:
- break;
- }
- break;
- case RDMA_TRANSPORT_IWARP:
+
+ if (rdma_tech_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+ ucma_copy_ib_route(&resp, &ctx->cm_id->route);
+ else if (rdma_tech_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
+ ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
+ else if (rdma_tech_iwarp(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iw_route(&resp, &ctx->cm_id->route);
- break;
- default:
- break;
- }

out:
if (copy_to_user((void __user *)(unsigned long)cmd.response,
--
2.1.0
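
To illustrate why the flat if/else chain in rdma_resolve_route() can replace the nested switch, here is a small userspace model. It assumes the mapping from the cover letter (old transport IB on an Ethernet link layer becomes IBOE, on an InfiniBand link layer stays IB, iWARP stays iWARP). All enum and function names below are hypothetical and exist only for this sketch; they are not the kernel's definitions.

```c
#include <assert.h>

enum legacy_transport { LEGACY_IB, LEGACY_IWARP, LEGACY_OTHER };
enum link_layer { LL_INFINIBAND, LL_ETHERNET, LL_UNSPECIFIED };
enum new_transport { T_IB, T_IBOE, T_IWARP, T_OTHER };
enum route_handler { ROUTE_IB, ROUTE_IBOE, ROUTE_IW, ROUTE_NONE };

/* Assumed mapping, following the table in the cover letter. */
static enum new_transport to_new_transport(enum legacy_transport t,
					   enum link_layer ll)
{
	if (t == LEGACY_IB && ll == LL_INFINIBAND)
		return T_IB;
	if (t == LEGACY_IB && ll == LL_ETHERNET)
		return T_IBOE;
	if (t == LEGACY_IWARP)
		return T_IWARP;
	return T_OTHER;
}

/* Shape of the removed code: switch on transport, then on link layer. */
static enum route_handler nested_dispatch(enum legacy_transport t,
					  enum link_layer ll)
{
	switch (t) {
	case LEGACY_IB:
		switch (ll) {
		case LL_INFINIBAND:
			return ROUTE_IB;
		case LL_ETHERNET:
			return ROUTE_IBOE;
		default:
			return ROUTE_NONE;	/* -ENOSYS in the patch */
		}
	case LEGACY_IWARP:
		return ROUTE_IW;
	default:
		return ROUTE_NONE;		/* -ENOSYS in the patch */
	}
}

/* Shape of the added code: one flat chain over the new transport. */
static enum route_handler flat_dispatch(enum new_transport nt)
{
	if (nt == T_IB)
		return ROUTE_IB;
	else if (nt == T_IBOE)
		return ROUTE_IBOE;
	else if (nt == T_IWARP)
		return ROUTE_IW;
	else
		return ROUTE_NONE;
}
```

Under the assumed mapping both functions pick the same handler for every (transport, link layer) pair, which is the property the patch relies on.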

2015-04-13 12:28:29

by Michael Wang

Subject: [PATCH v3 12/28] IB/Verbs: Reform mcast related part in IB-core cma


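Use the raw management helpers to reform the mcast related part of
IB-core cma: replace the nested transport/link-layer switches with
rdma_tech_ib() and rdma_tech_iboe() checks.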

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 57 +++++++++++++++----------------------------
1 file changed, 19 insertions(+), 38 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 8c41b3f..0a36a42 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -998,17 +998,12 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
mc = container_of(id_priv->mc_list.next,
struct cma_multicast, list);
list_del(&mc->list);
- switch (rdma_port_get_link_layer(id_priv->cma_dev->device, id_priv->id.port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
+ if (rdma_tech_ib(id_priv->cma_dev->device,
+ id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
- break;
- case IB_LINK_LAYER_ETHERNET:
+ } else
kref_put(&mc->mcref, release_mc);
- break;
- default:
- break;
- }
}
}

@@ -3316,24 +3311,13 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
list_add(&mc->list, &id_priv->mc_list);
spin_unlock(&id_priv->lock);

- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
- switch (rdma_port_get_link_layer(id->device, id->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
- ret = cma_join_ib_multicast(id_priv, mc);
- break;
- case IB_LINK_LAYER_ETHERNET:
- kref_init(&mc->mcref);
- ret = cma_iboe_join_multicast(id_priv, mc);
- break;
- default:
- ret = -EINVAL;
- }
- break;
- default:
+ if (rdma_tech_iboe(id->device, id->port_num)) {
+ kref_init(&mc->mcref);
+ ret = cma_iboe_join_multicast(id_priv, mc);
+ } else if (rdma_tech_ib(id->device, id->port_num))
+ ret = cma_join_ib_multicast(id_priv, mc);
+ else
ret = -ENOSYS;
- break;
- }

if (ret) {
spin_lock_irq(&id_priv->lock);
@@ -3361,19 +3345,16 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)
ib_detach_mcast(id->qp,
&mc->multicast.ib->rec.mgid,
be16_to_cpu(mc->multicast.ib->rec.mlid));
- if (rdma_node_get_transport(id_priv->cma_dev->device->node_type) == RDMA_TRANSPORT_IB) {
- switch (rdma_port_get_link_layer(id->device, id->port_num)) {
- case IB_LINK_LAYER_INFINIBAND:
- ib_sa_free_multicast(mc->multicast.ib);
- kfree(mc);
- break;
- case IB_LINK_LAYER_ETHERNET:
- kref_put(&mc->mcref, release_mc);
- break;
- default:
- break;
- }
- }
+
+ BUG_ON(id_priv->cma_dev->device != id->device);
+
+ if (rdma_tech_ib(id->device, id->port_num)) {
+ ib_sa_free_multicast(mc->multicast.ib);
+ kfree(mc);
+ } else if (rdma_tech_iboe(id->device,
+ id->port_num))
+ kref_put(&mc->mcref, release_mc);
+
return;
}
}
--
2.1.0

2015-04-13 12:29:01

by Michael Wang

Subject: [PATCH v3 13/28] IB/Verbs: Reserve legacy transport type in 'dev_addr'


Keep filling the 'transport' member of 'struct rdma_dev_addr' with the
legacy transport type, now derived from the node type by
cma_set_legacy_transport(), until we are sure it is no longer needed.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 0a36a42..d2052a4 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -244,14 +244,35 @@ static inline void cma_set_ip_ver(struct cma_hdr *hdr, u8 ip_ver)
hdr->ip_version = (ip_ver << 4) | (hdr->ip_version & 0xF);
}

+static inline void cma_set_legacy_transport(struct rdma_cm_id *id)
+{
+ switch (id->device->node_type) {
+ case RDMA_NODE_IB_CA:
+ case RDMA_NODE_IB_SWITCH:
+ case RDMA_NODE_IB_ROUTER:
+ id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IB;
+ break;
+ case RDMA_NODE_RNIC:
+ id->route.addr.dev_addr.transport = RDMA_TRANSPORT_IWARP;
+ break;
+ case RDMA_NODE_USNIC:
+ id->route.addr.dev_addr.transport = RDMA_TRANSPORT_USNIC;
+ break;
+ case RDMA_NODE_USNIC_UDP:
+ id->route.addr.dev_addr.transport = RDMA_TRANSPORT_USNIC_UDP;
+ break;
+ default:
+ BUG();
+ }
+}
+
static void cma_attach_to_dev(struct rdma_id_private *id_priv,
struct cma_device *cma_dev)
{
atomic_inc(&cma_dev->refcount);
id_priv->cma_dev = cma_dev;
id_priv->id.device = cma_dev->device;
- id_priv->id.route.addr.dev_addr.transport =
- rdma_node_get_transport(cma_dev->device->node_type);
+ cma_set_legacy_transport(&id_priv->id);
list_add_tail(&id_priv->list, &cma_dev->id_list);
}

--
2.1.0

2015-04-13 12:29:30

by Michael Wang

Subject: [PATCH v3 14/28] IB/Verbs: Reform cma_acquire_dev()


Reform cma_acquire_dev() with the management helpers, and introduce
cma_validate_port() to make the code cleaner.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 69 +++++++++++++++++++++++++------------------
1 file changed, 41 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d2052a4..c528d17 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -370,18 +370,36 @@ static int cma_translate_addr(struct sockaddr *addr, struct rdma_dev_addr *dev_a
return ret;
}

+static inline int cma_validate_port(struct ib_device *device, u8 port,
+ union ib_gid *gid, int dev_type)
+{
+ u8 found_port;
+ int ret = -ENODEV;
+
+ if ((dev_type == ARPHRD_INFINIBAND) && !rdma_tech_ib(device, port))
+ return ret;
+
+ if ((dev_type != ARPHRD_INFINIBAND) && rdma_tech_ib(device, port))
+ return ret;
+
+ ret = ib_find_cached_gid(device, gid, &found_port, NULL);
+
+ if (!ret && (port != found_port))
+ return -ENODEV;
+
+ return ret;
+}
+
static int cma_acquire_dev(struct rdma_id_private *id_priv,
struct rdma_id_private *listen_id_priv)
{
struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
struct cma_device *cma_dev;
- union ib_gid gid, iboe_gid;
+ union ib_gid gid, iboe_gid, *gidp;
int ret = -ENODEV;
- u8 port, found_port;
- enum rdma_link_layer dev_ll = dev_addr->dev_type == ARPHRD_INFINIBAND ?
- IB_LINK_LAYER_INFINIBAND : IB_LINK_LAYER_ETHERNET;
+ u8 port;

- if (dev_ll != IB_LINK_LAYER_INFINIBAND &&
+ if (dev_addr->dev_type != ARPHRD_INFINIBAND &&
id_priv->id.ps == RDMA_PS_IPOIB)
return -EINVAL;

@@ -391,41 +409,36 @@ static int cma_acquire_dev(struct rdma_id_private *id_priv,

memcpy(&gid, dev_addr->src_dev_addr +
rdma_addr_gid_offset(dev_addr), sizeof gid);
- if (listen_id_priv &&
- rdma_port_get_link_layer(listen_id_priv->id.device,
- listen_id_priv->id.port_num) == dev_ll) {
+
+ if (listen_id_priv) {
cma_dev = listen_id_priv->cma_dev;
port = listen_id_priv->id.port_num;
- if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
- rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
- ret = ib_find_cached_gid(cma_dev->device, &iboe_gid,
- &found_port, NULL);
- else
- ret = ib_find_cached_gid(cma_dev->device, &gid,
- &found_port, NULL);
+ gidp = rdma_tech_iboe(cma_dev->device, port) ?
+ &iboe_gid : &gid;

- if (!ret && (port == found_port)) {
- id_priv->id.port_num = found_port;
+ ret = cma_validate_port(cma_dev->device, port, gidp,
+ dev_addr->dev_type);
+ if (!ret) {
+ id_priv->id.port_num = port;
goto out;
}
}
+
list_for_each_entry(cma_dev, &dev_list, list) {
for (port = 1; port <= cma_dev->device->phys_port_cnt; ++port) {
if (listen_id_priv &&
listen_id_priv->cma_dev == cma_dev &&
listen_id_priv->id.port_num == port)
continue;
- if (rdma_port_get_link_layer(cma_dev->device, port) == dev_ll) {
- if (rdma_node_get_transport(cma_dev->device->node_type) == RDMA_TRANSPORT_IB &&
- rdma_port_get_link_layer(cma_dev->device, port) == IB_LINK_LAYER_ETHERNET)
- ret = ib_find_cached_gid(cma_dev->device, &iboe_gid, &found_port, NULL);
- else
- ret = ib_find_cached_gid(cma_dev->device, &gid, &found_port, NULL);
-
- if (!ret && (port == found_port)) {
- id_priv->id.port_num = found_port;
- goto out;
- }
+
+ gidp = rdma_tech_iboe(cma_dev->device, port) ?
+ &iboe_gid : &gid;
+
+ ret = cma_validate_port(cma_dev->device, port, gidp,
+ dev_addr->dev_type);
+ if (!ret) {
+ id_priv->id.port_num = port;
+ goto out;
}
}
}
--
2.1.0

2015-04-13 12:29:55

by Michael Wang

Subject: [PATCH v3 15/28] IB/Verbs: Reform rest part in IB-core cma


Use the raw management helpers to reform the remaining parts of IB-core cma.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c528d17..33d080f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -469,10 +469,10 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
pkey = ntohs(addr->sib_pkey);

list_for_each_entry(cur_dev, &dev_list, list) {
- if (rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
- continue;
-
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
+ if (!rdma_ib_or_iboe(cur_dev->device, p))
+ continue;
+
if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
continue;

@@ -667,10 +667,9 @@ static int cma_modify_qp_rtr(struct rdma_id_private *id_priv,
if (ret)
goto out;

- if (rdma_node_get_transport(id_priv->cma_dev->device->node_type)
- == RDMA_TRANSPORT_IB &&
- rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num)
- == IB_LINK_LAYER_ETHERNET) {
+ BUG_ON(id_priv->cma_dev->device != id_priv->id.device);
+
+ if (rdma_tech_iboe(id_priv->id.device, id_priv->id.port_num)) {
ret = rdma_addr_find_smac_by_sgid(&sgid, qp_attr.smac, NULL);

if (ret)
@@ -734,8 +733,7 @@ static int cma_ib_init_qp_attr(struct rdma_id_private *id_priv,
int ret;
u16 pkey;

- if (rdma_port_get_link_layer(id_priv->id.device, id_priv->id.port_num) ==
- IB_LINK_LAYER_INFINIBAND)
+ if (rdma_tech_ib(id_priv->id.device, id_priv->id.port_num))
pkey = ib_addr_get_pkey(dev_addr);
else
pkey = 0xffff;
--
2.1.0

2015-04-13 12:30:26

by Michael Wang

Subject: [PATCH v3 16/28] IB/Verbs: Use management helper cap_ib_mad()


Introduce the helper cap_ib_mad() to check whether a port of an
IB device supports InfiniBand Management Datagrams (MAD).

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
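Note (not part of the commit message): a minimal userspace sketch of how
the capability check gates the per-port loops touched below. The mock_
types and functions are hypothetical stand-ins for struct ib_device,
query_transport() and rdma_ib_or_iboe(); they are assumptions made for
illustration, not the kernel definitions.

```c
/* Hypothetical userspace stand-ins; not the kernel definitions. */
enum mock_transport { MOCK_IB, MOCK_IBOE, MOCK_IWARP, MOCK_USNIC_UDP };

struct mock_device {
	int phys_port_cnt;
	enum mock_transport port_tp[4];	/* index 0 holds port 1 */
};

/* Stand-in for the per-port query_transport() lookup. */
static enum mock_transport mock_query_transport(const struct mock_device *dev,
						int port_num)
{
	return dev->port_tp[port_num - 1];
}

/* Stand-in for rdma_ib_or_iboe(). */
static int mock_ib_or_iboe(const struct mock_device *dev, int port_num)
{
	enum mock_transport tp = mock_query_transport(dev, port_num);

	return tp == MOCK_IB || tp == MOCK_IBOE;
}

/* Same shape as cap_ib_mad(): the capability is defined by the transport. */
static int mock_cap_ib_mad(const struct mock_device *dev, int port_num)
{
	return mock_ib_or_iboe(dev, port_num);
}

/* Count the ports a MAD init loop would open, skipping the others. */
static int count_mad_ports(const struct mock_device *dev)
{
	int i, n = 0;

	for (i = 1; i <= dev->phys_port_cnt; i++) {
		if (!mock_cap_ib_mad(dev, i))
			continue;
		n++;
	}
	return n;
}
```

For a three-port mock device with IB, iWARP and IBOE ports,
count_mad_ports() counts the two ports that pass mock_cap_ib_mad() and
skips the iWARP one.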
drivers/infiniband/core/mad.c | 6 +++---
drivers/infiniband/core/user_mad.c | 6 +++---
include/rdma/ib_verbs.h | 15 +++++++++++++++
3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index d451a47..750ad3e 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -3066,7 +3066,7 @@ static void ib_mad_init_device(struct ib_device *device)
}

for (i = start; i <= end; i++) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

if (ib_mad_port_open(device, i)) {
@@ -3087,7 +3087,7 @@ error_agent:

error:
while (--i >= start) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

if (ib_agent_port_close(device, i))
@@ -3110,7 +3110,7 @@ static void ib_mad_remove_device(struct ib_device *device)
cur_port = 1;
}
for (i = 0; i < num_ports; i++, cur_port++) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, cur_port))
continue;

if (ib_agent_port_close(device, cur_port))
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 71fc8ba..b52884b 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1294,7 +1294,7 @@ static void ib_umad_add_one(struct ib_device *device)
umad_dev->end_port = e;

for (i = s; i <= e; ++i) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

umad_dev->port[i - s].umad_dev = umad_dev;
@@ -1317,7 +1317,7 @@ static void ib_umad_add_one(struct ib_device *device)

err:
while (--i >= s) {
- if (!rdma_ib_or_iboe(device, i))
+ if (!cap_ib_mad(device, i))
continue;

ib_umad_kill_port(&umad_dev->port[i - s]);
@@ -1335,7 +1335,7 @@ static void ib_umad_remove_one(struct ib_device *device)
return;

for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) {
- if (rdma_ib_or_iboe(device, i))
+ if (cap_ib_mad(device, i + umad_dev->start_port))
ib_umad_kill_port(&umad_dev->port[i]);
}

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index a12e876..624e963 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1773,6 +1773,21 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

+/**
+ * cap_ib_mad - Check if the port of device has the capability Infiniband
+ * Management Datagrams.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Management Datagrams.
+ */
+static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
+{
+ return rdma_ib_or_iboe(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-13 12:30:59

by Michael Wang

Subject: [PATCH v3 17/28] IB/Verbs: Use management helper cap_ib_smi()


Introduce the helper cap_ib_smi() to check whether a port of an
IB device supports the InfiniBand Subnet Management Interface (SMI).

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
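Note (not part of the commit message): a small userspace sketch of the
CQ sizing decision that has_smi drives in ib_mad_port_open(). The
mock_ names and the queue sizes passed in are assumptions for
illustration only; the real values come from the mad_sendq_size and
mad_recvq_size parameters.

```c
/* Hypothetical stand-in for the per-port transport type. */
enum mock_transport { MOCK_IB, MOCK_IBOE, MOCK_IWARP };

/* Same rule as cap_ib_smi() in this patch: only native IB ports qualify. */
static int mock_cap_ib_smi(enum mock_transport tp)
{
	return tp == MOCK_IB;
}

/*
 * Same arithmetic as the hunk in ib_mad_port_open(): the base size covers
 * one send queue plus one receive queue, and is doubled when the port
 * also has an SMI.
 */
static int mock_mad_cq_size(enum mock_transport tp, int sendq_size,
			    int recvq_size)
{
	int cq_size = sendq_size + recvq_size;

	if (mock_cap_ib_smi(tp))
		cq_size *= 2;

	return cq_size;
}
```

With the same queue sizes, an IB port ends up with a CQ twice as large
as an IBOE port, because only the IB port passes the SMI check.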
drivers/infiniband/core/agent.c | 2 +-
drivers/infiniband/core/mad.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/agent.c b/drivers/infiniband/core/agent.c
index ffdef4d..61471ee 100644
--- a/drivers/infiniband/core/agent.c
+++ b/drivers/infiniband/core/agent.c
@@ -156,7 +156,7 @@ int ib_agent_port_open(struct ib_device *device, int port_num)
goto error1;
}

- if (rdma_tech_ib(device, port_num)) {
+ if (cap_ib_smi(device, port_num)) {
/* Obtain send only MAD agent for SMI QP */
port_priv->agent[0] = ib_register_mad_agent(device, port_num,
IB_QPT_SMI, NULL, 0,
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 750ad3e..2668d4e 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2938,7 +2938,7 @@ static int ib_mad_port_open(struct ib_device *device,
init_mad_qp(port_priv, &port_priv->qp_info[1]);

cq_size = mad_sendq_size + mad_recvq_size;
- has_smi = rdma_tech_ib(device, port_num);
+ has_smi = cap_ib_smi(device, port_num);
if (has_smi)
cq_size *= 2;

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 624e963..873b9a6 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1788,6 +1788,21 @@ static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
return rdma_ib_or_iboe(device, port_num);
}

+/**
+ * cap_ib_smi - Check if the port of device has the capability Infiniband
+ * Subnet Management Interface.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Subnet Management Interface.
+ */
+static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_ib(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-13 12:31:35

by Michael Wang

Subject: [PATCH v3 18/28] IB/Verbs: Use management helper cap_ib_cm()


Introduce the helper cap_ib_cm() to check whether a port of an
IB device supports the InfiniBand Communication Manager (CM).

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cm.c | 6 +++---
drivers/infiniband/core/cma.c | 14 +++++++-------
include/rdma/ib_verbs.h | 15 +++++++++++++++
3 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 50321fe..63418ee 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3781,7 +3781,7 @@ static void cm_add_one(struct ib_device *ib_device)

set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
for (i = 1; i <= ib_device->phys_port_cnt; i++) {
- if (!rdma_ib_or_iboe(ib_device, i))
+ if (!cap_ib_cm(ib_device, i))
continue;

port = kzalloc(sizeof *port, GFP_KERNEL);
@@ -3835,7 +3835,7 @@ error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
while (--i) {
- if (!rdma_ib_or_iboe(ib_device, i))
+ if (!cap_ib_cm(ib_device, i))
continue;

port = cm_dev->port[i-1];
@@ -3866,7 +3866,7 @@ static void cm_remove_one(struct ib_device *ib_device)
write_unlock_irqrestore(&cm.device_lock, flags);

for (i = 1; i <= ib_device->phys_port_cnt; i++) {
- if (!rdma_ib_or_iboe(ib_device, i))
+ if (!cap_ib_cm(ib_device, i))
continue;

port = cm_dev->port[i-1];
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 33d080f..ccd319a 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -767,7 +767,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,
int ret = 0;

id_priv = container_of(id, struct rdma_id_private, id);
- if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num)) {
+ if (cap_ib_cm(id_priv->id.device, id_priv->id.port_num)) {
if (!id_priv->cm_id.ib || (id_priv->id.qp_type == IB_QPT_UD))
ret = cma_ib_init_qp_attr(id_priv, qp_attr, qp_attr_mask);
else
@@ -1056,7 +1056,7 @@ void rdma_destroy_id(struct rdma_cm_id *id)
mutex_unlock(&id_priv->handler_mutex);

if (id_priv->cma_dev) {
- if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num)) {
+ if (cap_ib_cm(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
} else if (rdma_tech_iwarp(id_priv->id.device,
@@ -2541,7 +2541,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)

id_priv->backlog = backlog;
if (id->device) {
- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
@@ -2885,7 +2885,7 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}

- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
@@ -2996,7 +2996,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
id_priv->srq = conn_param->srq;
}

- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD) {
if (conn_param)
ret = cma_send_sidr_rep(id_priv, IB_SIDR_SUCCESS,
@@ -3059,7 +3059,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
if (!id_priv->cm_id.ib)
return -EINVAL;

- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
if (id->qp_type == IB_QPT_UD)
ret = cma_send_sidr_rep(id_priv, IB_SIDR_REJECT, 0,
private_data, private_data_len);
@@ -3086,7 +3086,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
if (!id_priv->cm_id.ib)
return -EINVAL;

- if (rdma_ib_or_iboe(id->device, id->port_num)) {
+ if (cap_ib_cm(id->device, id->port_num)) {
ret = cma_modify_qp_err(id_priv);
if (ret)
goto out;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 873b9a6..6805e3e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1803,6 +1803,21 @@ static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
return rdma_tech_ib(device, port_num);
}

+/**
+ * cap_ib_cm - Check if the port of device has the capability Infiniband
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Communication Manager.
+ */
+static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
+{
+ return rdma_ib_or_iboe(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-13 12:46:20

by Michael Wang

Subject: [PATCH v3 19/28] IB/Verbs: Use management helper cap_iw_cm()


Introduce the helper cap_iw_cm() to check whether a port of an
IB device supports the iWARP Communication Manager.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
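Note (not part of the commit message): a userspace sketch of the
if / else-if dispatch that rdma_connect() and its siblings follow once
both cap_ib_cm() and cap_iw_cm() are in place. Everything prefixed with
mock_ or MOCK_ is a hypothetical stand-in, not a kernel symbol.

```c
#include <errno.h>

/* Hypothetical stand-in for the per-port transport type. */
enum mock_transport { MOCK_IB, MOCK_IBOE, MOCK_IWARP, MOCK_USNIC_UDP };

/* Which connection manager a request was routed to. */
enum mock_cm { MOCK_CM_IB = 1, MOCK_CM_IW = 2 };

/* Same rule as cap_ib_cm(): IB and IBOE ports use the IB CM. */
static int mock_cap_ib_cm(enum mock_transport tp)
{
	return tp == MOCK_IB || tp == MOCK_IBOE;
}

/* Same rule as cap_iw_cm(): iWARP ports use the iWARP CM. */
static int mock_cap_iw_cm(enum mock_transport tp)
{
	return tp == MOCK_IWARP;
}

/* Route a connect request by capability; unsupported ports get -ENOSYS. */
static int mock_connect(enum mock_transport tp)
{
	if (mock_cap_ib_cm(tp))
		return MOCK_CM_IB;
	else if (mock_cap_iw_cm(tp))
		return MOCK_CM_IW;

	return -ENOSYS;
}
```

The caller no longer names a transport or link layer; a port that has
neither capability (usNIC in this mock) falls through to -ENOSYS.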
drivers/infiniband/core/cma.c | 17 ++++++++---------
include/rdma/ib_verbs.h | 15 +++++++++++++++
2 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index ccd319a..4f4a420 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -776,8 +776,7 @@ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ib_qp_attr *qp_attr,

if (qp_attr->qp_state == IB_QPS_RTR)
qp_attr->rq_psn = id_priv->seq_num;
- } else if (rdma_tech_iwarp(id_priv->id.device,
- id_priv->id.port_num)) {
+ } else if (cap_iw_cm(id_priv->id.device, id_priv->id.port_num)) {
if (!id_priv->cm_id.iw) {
qp_attr->qp_access_flags = 0;
*qp_attr_mask = IB_QP_STATE | IB_QP_ACCESS_FLAGS;
@@ -1059,8 +1058,8 @@ void rdma_destroy_id(struct rdma_cm_id *id)
if (cap_ib_cm(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->cm_id.ib)
ib_destroy_cm_id(id_priv->cm_id.ib);
- } else if (rdma_tech_iwarp(id_priv->id.device,
- id_priv->id.port_num)) {
+ } else if (cap_iw_cm(id_priv->id.device,
+ id_priv->id.port_num)) {
if (id_priv->cm_id.iw)
iw_destroy_cm_id(id_priv->cm_id.iw);
}
@@ -2545,7 +2544,7 @@ int rdma_listen(struct rdma_cm_id *id, int backlog)
ret = cma_ib_listen(id_priv);
if (ret)
goto err;
- } else if (rdma_tech_iwarp(id->device, id->port_num)) {
+ } else if (cap_iw_cm(id->device, id->port_num)) {
ret = cma_iw_listen(id_priv, backlog);
if (ret)
goto err;
@@ -2890,7 +2889,7 @@ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
ret = cma_resolve_ib_udp(id_priv, conn_param);
else
ret = cma_connect_ib(id_priv, conn_param);
- } else if (rdma_tech_iwarp(id->device, id->port_num))
+ } else if (cap_iw_cm(id->device, id->port_num))
ret = cma_connect_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -3012,7 +3011,7 @@ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
else
ret = cma_rep_recv(id_priv);
}
- } else if (rdma_tech_iwarp(id->device, id->port_num))
+ } else if (cap_iw_cm(id->device, id->port_num))
ret = cma_accept_iw(id_priv, conn_param);
else
ret = -ENOSYS;
@@ -3067,7 +3066,7 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
ret = ib_send_cm_rej(id_priv->cm_id.ib,
IB_CM_REJ_CONSUMER_DEFINED, NULL,
0, private_data, private_data_len);
- } else if (rdma_tech_iwarp(id->device, id->port_num)) {
+ } else if (cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_reject(id_priv->cm_id.iw,
private_data, private_data_len);
} else
@@ -3093,7 +3092,7 @@ int rdma_disconnect(struct rdma_cm_id *id)
/* Initiate or respond to a disconnect. */
if (ib_send_cm_dreq(id_priv->cm_id.ib, NULL, 0))
ib_send_cm_drep(id_priv->cm_id.ib, NULL, 0);
- } else if (rdma_tech_iwarp(id->device, id->port_num)) {
+ } else if (cap_iw_cm(id->device, id->port_num)) {
ret = iw_cm_disconnect(id_priv->cm_id.iw, 0);
} else
ret = -EINVAL;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6805e3e..e4999f6 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1818,6 +1818,21 @@ static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
return rdma_ib_or_iboe(device, port_num);
}

+/**
+ * cap_iw_cm - Check if the port of device has the capability IWARP
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support IWARP
+ * Communication Manager.
+ */
+static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_iwarp(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-13 12:42:27

by Michael Wang

Subject: [PATCH v3 20/28] IB/Verbs: Use management helper cap_ib_sa()


Introduce the helper cap_ib_sa() to check whether a port of an
IB device supports the InfiniBand Subnet Administrator (SA).

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 4 ++--
drivers/infiniband/core/sa_query.c | 10 +++++-----
drivers/infiniband/core/ucma.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
4 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 4f4a420..013930c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -955,7 +955,7 @@ static inline int cma_user_data_offset(struct rdma_id_private *id_priv)

static void cma_cancel_route(struct rdma_id_private *id_priv)
{
- if (rdma_tech_ib(id_priv->id.device, id_priv->id.port_num)) {
+ if (cap_ib_sa(id_priv->id.device, id_priv->id.port_num)) {
if (id_priv->query)
ib_sa_cancel_query(id_priv->query_id, id_priv->query);
}
@@ -1981,7 +1981,7 @@ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms)
return -EINVAL;

atomic_inc(&id_priv->refcount);
- if (rdma_tech_ib(id->device, id->port_num))
+ if (cap_ib_sa(id->device, id->port_num))
ret = cma_resolve_ib_route(id_priv, timeout_ms);
else if (rdma_tech_iboe(id->device, id->port_num))
ret = cma_resolve_iboe_route(id_priv);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 803ccf7..fc7e161 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
struct ib_sa_port *port =
&sa_dev->port[event->element.port_num - sa_dev->start_port];

- if (WARN_ON(!rdma_tech_ib(handler->device, port->port_num)))
+ if (WARN_ON(!cap_ib_sa(handler->device, port->port_num)))
return;

spin_lock_irqsave(&port->ah_lock, flags);
@@ -1173,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)

for (i = 0; i <= e - s; ++i) {
spin_lock_init(&sa_dev->port[i].ah_lock);
- if (!rdma_tech_ib(device, i + 1))
+ if (!cap_ib_sa(device, i + 1))
continue;

sa_dev->port[i].sm_ah = NULL;
@@ -1210,7 +1210,7 @@ static void ib_sa_add_one(struct ib_device *device)
goto err;

for (i = 0; i <= e - s; ++i) {
- if (rdma_tech_ib(device, i + 1))
+ if (cap_ib_sa(device, i + 1))
update_sm_ah(&sa_dev->port[i].update_task);
}

@@ -1218,7 +1218,7 @@ static void ib_sa_add_one(struct ib_device *device)

err:
while (--i >= 0) {
- if (rdma_tech_ib(device, i + 1))
+ if (cap_ib_sa(device, i + 1))
ib_unregister_mad_agent(sa_dev->port[i].agent);
}

@@ -1240,7 +1240,7 @@ static void ib_sa_remove_one(struct ib_device *device)
flush_workqueue(ib_wq);

for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) {
- if (rdma_tech_ib(device, i + 1)) {
+ if (cap_ib_sa(device, i + 1)) {
ib_unregister_mad_agent(sa_dev->port[i].agent);
if (sa_dev->port[i].sm_ah)
kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah);
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 7331c6c..bed7957 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -723,7 +723,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
resp.port_num = ctx->cm_id->port_num;

- if (rdma_tech_ib(ctx->cm_id->device, ctx->cm_id->port_num))
+ if (cap_ib_sa(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_ib_route(&resp, &ctx->cm_id->route);
else if (rdma_tech_iboe(ctx->cm_id->device, ctx->cm_id->port_num))
ucma_copy_iboe_route(&resp, &ctx->cm_id->route);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index e4999f6..3bfdf81 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1833,6 +1833,21 @@ static inline int cap_iw_cm(struct ib_device *device, u8 port_num)
return rdma_tech_iwarp(device, port_num);
}

+/**
+ * cap_ib_sa - Check if the port of device has the capability Infiniband
+ * Subnet Administrator.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Subnet Administrator.
+ */
+static inline int cap_ib_sa(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_ib(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-13 12:33:31

by Michael Wang

Subject: [PATCH v3 21/28] IB/Verbs: Use management helper cap_ib_mcast()


Introduce the helper cap_ib_mcast() to check whether a port of an
IB device supports InfiniBand multicast.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 6 +++---
drivers/infiniband/core/multicast.c | 6 +++---
include/rdma/ib_verbs.h | 15 +++++++++++++++
3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 013930c..4ed582e 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1029,7 +1029,7 @@ static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
mc = container_of(id_priv->mc_list.next,
struct cma_multicast, list);
list_del(&mc->list);
- if (rdma_tech_ib(id_priv->cma_dev->device,
+ if (cap_ib_mcast(id_priv->cma_dev->device,
id_priv->id.port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
@@ -3345,7 +3345,7 @@ int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
if (rdma_tech_iboe(id->device, id->port_num)) {
kref_init(&mc->mcref);
ret = cma_iboe_join_multicast(id_priv, mc);
- } else if (rdma_tech_ib(id->device, id->port_num))
+ } else if (cap_ib_mcast(id->device, id->port_num))
ret = cma_join_ib_multicast(id_priv, mc);
else
ret = -ENOSYS;
@@ -3379,7 +3379,7 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr)

BUG_ON(id_priv->cma_dev->device != id->device);

- if (rdma_tech_ib(id->device, id->port_num)) {
+ if (cap_ib_mcast(id->device, id->port_num)) {
ib_sa_free_multicast(mc->multicast.ib);
kfree(mc);
} else if (rdma_tech_iboe(id->device,
diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index 24d93f5..bdc1880 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -780,7 +780,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
int index;

dev = container_of(handler, struct mcast_device, event_handler);
- if (WARN_ON(!rdma_tech_ib(dev->device, event->element.port_num)))
+ if (WARN_ON(!cap_ib_mcast(dev->device, event->element.port_num)))
return;

index = event->element.port_num - dev->start_port;
@@ -820,7 +820,7 @@ static void mcast_add_one(struct ib_device *device)
}

for (i = 0; i <= dev->end_port - dev->start_port; i++) {
- if (!rdma_tech_ib(device, dev->start_port + i))
+ if (!cap_ib_mcast(device, dev->start_port + i))
continue;
port = &dev->port[i];
port->dev = dev;
@@ -858,7 +858,7 @@ static void mcast_remove_one(struct ib_device *device)
flush_workqueue(mcast_wq);

for (i = 0; i <= dev->end_port - dev->start_port; i++) {
- if (rdma_tech_ib(device, dev->start_port + i)) {
+ if (cap_ib_mcast(device, dev->start_port + i)) {
port = &dev->port[i];
deref_port(port);
wait_for_completion(&port->comp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 3bfdf81..b2cee8d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1848,6 +1848,21 @@ static inline int cap_ib_sa(struct ib_device *device, u8 port_num)
return rdma_tech_ib(device, port_num);
}

+/**
+ * cap_ib_mcast - Check if the port of device has the capability Infiniband
+ * Multicast.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support Infiniband
+ * Multicast.
+ */
+static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
+{
+ return cap_ib_sa(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-13 12:33:56

by Michael Wang

Subject: [PATCH v3 22/28] IB/Verbs: Use management helper cap_ipoib()


Introduce the helper cap_ipoib() to check whether a port of an
IB device supports IP over InfiniBand (IPoIB).

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 97372b1..150768f 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1671,7 +1671,7 @@ static void ipoib_add_one(struct ib_device *device)
}

for (p = s; p <= e; ++p) {
- if (!rdma_tech_ib(device, p))
+ if (!cap_ipoib(device, p))
continue;
dev = ipoib_add_port("ib%d", device, p);
if (!IS_ERR(dev)) {
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index b2cee8d..ac62b3a 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1863,6 +1863,21 @@ static inline int cap_ib_mcast(struct ib_device *device, u8 port_num)
return cap_ib_sa(device, port_num);
}

+/**
+ * cap_ipoib - Check if the port of device has the capability
+ * IP over Infiniband.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support
+ * IP over Infiniband.
+ */
+static inline int cap_ipoib(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_ib(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0

2015-04-13 12:34:21

by Michael Wang

Subject: [PATCH v3 23/28] IB/Verbs: Use management helper cap_read_multi_sge()


Introduce the helper cap_read_multi_sge() to check whether a port of an
IB device supports RDMA Read with multiple scatter-gather entries.
For now the helper returns 0 only for iWARP ports.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
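Note (not part of the commit message): a userspace sketch of how the
inverted check reads in rdma_read_max_sge(). The mock_ functions and
the is_iwarp flag are hypothetical simplifications of the device/port
lookup, used here only to make the example stand alone.

```c
/* Same rule as cap_read_multi_sge(): every transport except iWARP. */
static int mock_cap_read_multi_sge(int is_iwarp)
{
	return !is_iwarp;
}

/*
 * Same decision as rdma_read_max_sge(): a port without the capability is
 * limited to one SGE per RDMA Read, otherwise the request is capped at
 * the transport's maximum.
 */
static int mock_read_max_sge(int is_iwarp, int sge_count, int max_sge)
{
	if (!mock_cap_read_multi_sge(is_iwarp))
		return 1;

	return sge_count < max_sge ? sge_count : max_sge;
}
```

The caller asks about the capability it needs rather than about the
transport, so a future transport with the same limitation only has to
change the helper.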
include/rdma/ib_verbs.h | 15 +++++++++++++++
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 ++-
2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index ac62b3a..60f7efb 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1878,6 +1878,21 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
return rdma_tech_ib(device, port_num);
}

+/**
+ * cap_read_multi_sge - Check if the port of device has the capability
+ * RDMA Read Multiple Scatter-Gather Entries.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when the port of the device doesn't support
+ * RDMA Read Multiple Scatter-Gather Entries.
+ */
+static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
+{
+ return !rdma_tech_iwarp(device, port_num);
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index a5bed5b..7711b7a 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -117,7 +117,8 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,

static int rdma_read_max_sge(struct svcxprt_rdma *xprt, int sge_count)
{
- if (rdma_tech_iwarp(xprt->sc_cm_id->device, xprt->sc_cm_id->port_num))
+ if (!cap_read_multi_sge(xprt->sc_cm_id->device,
+ xprt->sc_cm_id->port_num))
return 1;
else
return min_t(int, sge_count, xprt->sc_max_sge);
--
2.1.0

2015-04-13 12:34:47

by Michael Wang

Subject: [PATCH v3 24/28] IB/Verbs: Use management helper cap_ib_cm_dev()


Introduce the helper cap_ib_cm_dev() to check whether any port of
an IB device supports the InfiniBand Communication Manager.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
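Note (not part of the commit message): a userspace sketch of the
"any port" semantics of cap_ib_cm_dev(), which replaces the old
device-wide transport comparison. The mock_ types and functions are
hypothetical stand-ins for struct ib_device and cap_ib_cm().

```c
/* Hypothetical userspace stand-ins; not the kernel definitions. */
enum mock_transport { MOCK_IB, MOCK_IBOE, MOCK_IWARP };

struct mock_device {
	int phys_port_cnt;
	enum mock_transport port_tp[4];	/* index 0 holds port 1 */
};

/* Per-port check, same rule as cap_ib_cm(). */
static int mock_cap_ib_cm(const struct mock_device *dev, int port_num)
{
	enum mock_transport tp = dev->port_tp[port_num - 1];

	return tp == MOCK_IB || tp == MOCK_IBOE;
}

/* Device-level check: true as soon as one port has the capability. */
static int mock_cap_ib_cm_dev(const struct mock_device *dev)
{
	int i;

	for (i = 1; i <= dev->phys_port_cnt; i++) {
		if (mock_cap_ib_cm(dev, i))
			return 1;
	}
	return 0;
}
```

A device with mixed ports passes the check when at least one port uses
the IB CM; a device with no ports, or only iWARP ports, does not.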
drivers/infiniband/core/cma.c | 5 ++---
drivers/infiniband/core/ucm.c | 3 +--
include/rdma/ib_verbs.h | 20 ++++++++++++++++++++
3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 4ed582e..65e41f4 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1639,8 +1639,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv,
struct rdma_cm_id *id;
int ret;

- if (cma_family(id_priv) == AF_IB &&
- rdma_node_get_transport(cma_dev->device->node_type) != RDMA_TRANSPORT_IB)
+ if (cma_family(id_priv) == AF_IB && !cap_ib_cm_dev(cma_dev->device))
return;

id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
@@ -2031,7 +2030,7 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
mutex_lock(&lock);
list_for_each_entry(cur_dev, &dev_list, list) {
if (cma_family(id_priv) == AF_IB &&
- rdma_node_get_transport(cur_dev->device->node_type) != RDMA_TRANSPORT_IB)
+ !cap_ib_cm_dev(cur_dev->device))
continue;

if (!cma_dev)
diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
index f2f6393..065405e 100644
--- a/drivers/infiniband/core/ucm.c
+++ b/drivers/infiniband/core/ucm.c
@@ -1253,8 +1253,7 @@ static void ib_ucm_add_one(struct ib_device *device)
dev_t base;
struct ib_ucm_device *ucm_dev;

- if (!device->alloc_ucontext ||
- rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
+ if (!device->alloc_ucontext || !cap_ib_cm_dev(device))
return;

ucm_dev = kzalloc(sizeof *ucm_dev, GFP_KERNEL);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 60f7efb..29ddd14 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1893,6 +1893,26 @@ static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
return !rdma_tech_iwarp(device, port_num);
}

+/**
+ * cap_ib_cm_dev - Check if any port of device has the capability Infiniband
+ * Communication Manager.
+ *
+ * @device: Device to be checked
+ *
+ * Return 0 when all port of the device don't support Infiniband
+ * Communication Manager.
+ */
+static inline int cap_ib_cm_dev(struct ib_device *device)
+{
+ int i;
+
+ for (i = 1; i <= device->phys_port_cnt; i++) {
+ if (cap_ib_cm(device, i))
+ return 1;
+ }
+ return 0;
+}
+
int ib_query_gid(struct ib_device *device,
u8 port_num, int index, union ib_gid *gid);

--
2.1.0
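
For readers following along outside the kernel tree, the "any port" semantics of cap_ib_cm_dev() can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct fake_device`, `enum fake_transport`, `fake_cap_ib_cm()` and `fake_cap_ib_cm_dev()` are hypothetical stand-ins invented for illustration, not the ib_verbs.h API.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; this is not the ib_verbs.h API. */
enum fake_transport {
	FAKE_TRANSPORT_IB,
	FAKE_TRANSPORT_IWARP,
	FAKE_TRANSPORT_IBOE,
};

#define FAKE_MAX_PORTS 4

struct fake_device {
	int phys_port_cnt;
	/* Slot 0 is unused because ports are numbered from 1. */
	enum fake_transport port_transport[FAKE_MAX_PORTS + 1];
};

/* Per-port check, mirroring the IB-or-IBoE test behind cap_ib_cm(). */
static bool fake_cap_ib_cm(const struct fake_device *dev, int port_num)
{
	enum fake_transport tp = dev->port_transport[port_num];

	return tp == FAKE_TRANSPORT_IB || tp == FAKE_TRANSPORT_IBOE;
}

/* Per-device check: true as soon as any one port passes the per-port check. */
static bool fake_cap_ib_cm_dev(const struct fake_device *dev)
{
	for (int i = 1; i <= dev->phys_port_cnt; i++) {
		if (fake_cap_ib_cm(dev, i))
			return true;
	}
	return false;
}
```

Note that a device with one iWARP port and one IBoE port reports true here, which is exactly the mixed case the later review comments worry about.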

2015-04-13 12:35:23

by Michael Wang

Subject: [PATCH v3 25/28] IB/Verbs: Use management helper cap_af_ib()


Introduce helper cap_af_ib() to check whether a port of an
IB device supports Native Infiniband Address.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 65e41f4..7f5815d 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -470,7 +470,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)

list_for_each_entry(cur_dev, &dev_list, list) {
for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
- if (!rdma_ib_or_iboe(cur_dev->device, p))
+ if (!cap_af_ib(cur_dev->device, p))
continue;

if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 29ddd14..dfe33f3 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1879,6 +1879,21 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
}

/**
+ * cap_af_ib - Check if the port of device has the capability
+ * Native Infiniband Address.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support
+ * Native Infiniband Address.
+ */
+static inline int cap_af_ib(struct ib_device *device, u8 port_num)
+{
+ return rdma_ib_or_iboe(device, port_num);
+}
+
+/**
* cap_read_multi_sge - Check if the port of device has the capability
* RDMA Read Multiple Scatter-Gather Entries.
*
--
2.1.0

2015-04-13 12:36:18

by Michael Wang

Subject: [PATCH v3 26/28] IB/Verbs: Use management helper cap_eth_ah()


Introduce helper cap_eth_ah() to check whether a port of an
IB device supports Ethernet Address Handler.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/cma.c | 6 +++---
drivers/infiniband/core/sa_query.c | 2 +-
drivers/infiniband/core/verbs.c | 2 +-
include/rdma/ib_verbs.h | 15 +++++++++++++++
4 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 7f5815d..4b083f5 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -733,10 +733,10 @@ static int cma_ib_init_qp_attr(struct rdma_id_private *id_priv,
int ret;
u16 pkey;

- if (rdma_tech_ib(id_priv->id.device, id_priv->id.port_num))
- pkey = ib_addr_get_pkey(dev_addr);
- else
+ if (cap_eth_ah(id_priv->id.device, id_priv->id.port_num))
pkey = 0xffff;
+ else
+ pkey = ib_addr_get_pkey(dev_addr);

ret = ib_find_cached_pkey(id_priv->id.device, id_priv->id.port_num,
pkey, &qp_attr->pkey_index);
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index fc7e161..409ec4e 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
ah_attr->port_num = port_num;
ah_attr->static_rate = rec->rate;

- force_grh = rdma_tech_iboe(device, port_num);
+ force_grh = cap_eth_ah(device, port_num);

if (rec->hop_limit > 1 || force_grh) {
ah_attr->ah_flags = IB_AH_GRH;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 6b5fd9d..2e0c2cf1 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -202,7 +202,7 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
int ret;

memset(ah_attr, 0, sizeof *ah_attr);
- if (rdma_tech_iboe(device, port_num)) {
+ if (cap_eth_ah(device, port_num)) {
if (!(wc->wc_flags & IB_WC_GRH))
return -EPROTOTYPE;

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index dfe33f3..78a5cdb2 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1894,6 +1894,21 @@ static inline int cap_af_ib(struct ib_device *device, u8 port_num)
}

/**
+ * cap_eth_ah - Check if the port of device has the capability
+ * Ethernet Address Handler.
+ *
+ * @device: Device to be checked
+ * @port_num: Port number of the device
+ *
+ * Return 0 when port of the device don't support
+ * Ethernet Address Handler.
+ */
+static inline int cap_eth_ah(struct ib_device *device, u8 port_num)
+{
+ return rdma_tech_iboe(device, port_num);
+}
+
+/**
* cap_read_multi_sge - Check if the port of device has the capability
* RDMA Read Multiple Scatter-Gather Entries.
*
--
2.1.0

2015-04-13 12:36:54

by Michael Wang

Subject: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()


We have finished introducing the cap_XX(), and raw helper rdma_ib_or_iboe()
is no longer necessary, thus clean it up.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
include/rdma/ib_verbs.h | 19 +++++++++----------
net/sunrpc/xprtrdma/svc_rdma_transport.c | 6 ++++--
2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 78a5cdb2..9f6b88e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1766,13 +1766,6 @@ static inline int rdma_tech_iwarp(struct ib_device *device, u8 port_num)
== RDMA_TRANSPORT_IWARP;
}

-static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
-{
- enum rdma_transport_type tp = device->query_transport(device, port_num);
-
- return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
-}
-
/**
* cap_ib_mad - Check if the port of device has the capability Infiniband
* Management Datagrams.
@@ -1785,7 +1778,9 @@ static inline int rdma_ib_or_iboe(struct ib_device *device, u8 port_num)
*/
static inline int cap_ib_mad(struct ib_device *device, u8 port_num)
{
- return rdma_ib_or_iboe(device, port_num);
+ enum rdma_transport_type tp = device->query_transport(device, port_num);
+
+ return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

/**
@@ -1815,7 +1810,9 @@ static inline int cap_ib_smi(struct ib_device *device, u8 port_num)
*/
static inline int cap_ib_cm(struct ib_device *device, u8 port_num)
{
- return rdma_ib_or_iboe(device, port_num);
+ enum rdma_transport_type tp = device->query_transport(device, port_num);
+
+ return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

/**
@@ -1890,7 +1887,9 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
*/
static inline int cap_af_ib(struct ib_device *device, u8 port_num)
{
- return rdma_ib_or_iboe(device, port_num);
+ enum rdma_transport_type tp = device->query_transport(device, port_num);
+
+ return (tp == RDMA_TRANSPORT_IB || tp == RDMA_TRANSPORT_IBOE);
}

/**
diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
index a09b7a1..8af6f92 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
@@ -987,8 +987,10 @@ static struct svc_xprt *svc_rdma_accept(struct svc_xprt *xprt)
*/
if (!rdma_tech_iwarp(newxprt->sc_cm_id->device,
newxprt->sc_cm_id->port_num) &&
- !rdma_ib_or_iboe(newxprt->sc_cm_id->device,
- newxprt->sc_cm_id->port_num))
+ !rdma_tech_ib(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num) &&
+ !rdma_tech_iboe(newxprt->sc_cm_id->device,
+ newxprt->sc_cm_id->port_num))
goto errout;

if (!(newxprt->sc_dev_caps & SVCRDMA_DEVCAP_FAST_REG) ||
--
2.1.0

2015-04-13 12:37:22

by Michael Wang

Subject: [PATCH v3 28/28] IB/Verbs: Cleanup rdma_node_get_transport()


We have got rid of every user of the legacy rdma_node_get_transport(),
so now clean it up.

Cc: Steve Wise <[email protected]>
Cc: Tom Talpey <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Sean Hefty <[email protected]>
Signed-off-by: Michael Wang <[email protected]>
---
drivers/infiniband/core/verbs.c | 21 ---------------------
include/rdma/ib_verbs.h | 3 ---
2 files changed, 24 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 2e0c2cf1..d2a2b52 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -107,27 +107,6 @@ __attribute_const__ int ib_rate_to_mbps(enum ib_rate rate)
}
EXPORT_SYMBOL(ib_rate_to_mbps);

-__attribute_const__ enum rdma_transport_type
-rdma_node_get_transport(enum rdma_node_type node_type)
-{
- switch (node_type) {
- case RDMA_NODE_IB_CA:
- case RDMA_NODE_IB_SWITCH:
- case RDMA_NODE_IB_ROUTER:
- return RDMA_TRANSPORT_IB;
- case RDMA_NODE_RNIC:
- return RDMA_TRANSPORT_IWARP;
- case RDMA_NODE_USNIC:
- return RDMA_TRANSPORT_USNIC;
- case RDMA_NODE_USNIC_UDP:
- return RDMA_TRANSPORT_USNIC_UDP;
- default:
- BUG();
- return 0;
- }
-}
-EXPORT_SYMBOL(rdma_node_get_transport);
-
enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_num)
{
if (device->get_link_layer)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 9f6b88e..3cc3f53 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -84,9 +84,6 @@ enum rdma_transport_type {
RDMA_TRANSPORT_IBOE,
};

-__attribute_const__ enum rdma_transport_type
-rdma_node_get_transport(enum rdma_node_type node_type);
-
enum rdma_link_layer {
IB_LINK_LAYER_UNSPECIFIED,
IB_LINK_LAYER_INFINIBAND,
--
2.1.0

2015-04-13 18:13:04

by Ira Weiny

Subject: Re: [PATCH v3 04/28] IB/Verbs: Reform IB-core cm

On Mon, Apr 13, 2015 at 02:23:46PM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-core cm.
>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/cm.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index e28a494..50321fe 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -3761,9 +3761,7 @@ static void cm_add_one(struct ib_device *ib_device)
> unsigned long flags;
> int ret;
> u8 i;
> -
> - if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;

I'm ok with this as an intermediate patch but going forward if we are going to
have calls like

static inline int cap_ib_cm_dev(struct ib_device *device)

Then I think we should have similar calls like

cap_ib_mad_dev(device)

Which eliminates the clean up below...

>
> cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
> ib_device->phys_port_cnt, GFP_KERNEL);
> @@ -3783,6 +3781,9 @@ static void cm_add_one(struct ib_device *ib_device)
>
> set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
> for (i = 1; i <= ib_device->phys_port_cnt; i++) {
> + if (!rdma_ib_or_iboe(ib_device, i))
> + continue;
> +
> port = kzalloc(sizeof *port, GFP_KERNEL);
> if (!port)
> goto error1;
> @@ -3809,7 +3810,16 @@ static void cm_add_one(struct ib_device *ib_device)
> ret = ib_modify_port(ib_device, i, 0, &port_modify);
> if (ret)
> goto error3;
> +
> + count++;
> }
> +
> + if (!count) {
> + device_unregister(cm_dev->device);
> + kfree(cm_dev);
> + return;

Here.

I worry about mistakes being made when we loop through only to find that none
of the ports support the feature and then we have to clean up. As this is
initialization code I don't see any issue with looping through the ports 2
times and making the code cleaner.

This applies to the SA and CM modules as well.

However, in the ib_cm module you already have cap_ib_cm_dev(device) so you
should use it at the start of cm_add_one.

Ira

2015-04-13 18:14:16

by Ira Weiny

Subject: Re: [PATCH v3 05/28] IB/Verbs: Reform IB-core sa_query

On Mon, Apr 13, 2015 at 02:24:18PM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-core sa_query.
>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/sa_query.c | 29 ++++++++++++++++++-----------
> 1 file changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
> index c38f030..803ccf7 100644
> --- a/drivers/infiniband/core/sa_query.c
> +++ b/drivers/infiniband/core/sa_query.c
> @@ -450,7 +450,7 @@ static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event
> struct ib_sa_port *port =
> &sa_dev->port[event->element.port_num - sa_dev->start_port];
>
> - if (rdma_port_get_link_layer(handler->device, port->port_num) != IB_LINK_LAYER_INFINIBAND)
> + if (WARN_ON(!rdma_tech_ib(handler->device, port->port_num)))
> return;
>
> spin_lock_irqsave(&port->ah_lock, flags);
> @@ -540,7 +540,7 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
> ah_attr->port_num = port_num;
> ah_attr->static_rate = rec->rate;
>
> - force_grh = rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_ETHERNET;
> + force_grh = rdma_tech_iboe(device, port_num);
>
> if (rec->hop_limit > 1 || force_grh) {
> ah_attr->ah_flags = IB_AH_GRH;
> @@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
> {
> struct ib_sa_device *sa_dev;
> int s, e, i;
> -
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;

Same comment here as for the user_mad.c change.

Ira

2015-04-13 18:16:59

by Ira Weiny

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Mon, Apr 13, 2015 at 02:25:16PM +0200, Michael Wang wrote:
>
> Use raw management helpers to reform IB-ulp ipoib.
>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/ulp/ipoib/ipoib_main.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> index 58b5aa3..97372b1 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -1654,9 +1654,7 @@ static void ipoib_add_one(struct ib_device *device)
> struct net_device *dev;
> struct ipoib_dev_priv *priv;
> int s, e, p;
> -
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;

Same comment as before.

rdma_tech_ib_dev(device)

Ira

2015-04-13 18:40:43

by Hefty, Sean

Subject: RE: [PATCH v3 04/28] IB/Verbs: Reform IB-core cm

> > - if (rdma_node_get_transport(ib_device->node_type) !=
> RDMA_TRANSPORT_IB)
> > - return;
> > + int count = 0;
>
> I'm ok with this as an intermediate patch but going forward if we are
> going to
> have calls like
>
> static inline int cap_ib_cm_dev(struct ib_device *device)

I would rather keep everything to checking per port, not per device. Specifically, because we have code that does this:

> > port = kzalloc(sizeof *port, GFP_KERNEL);
> > if (!port)
> > goto error1;
> > @@ -3809,7 +3810,16 @@ static void cm_add_one(struct ib_device
> *ib_device)
> > ret = ib_modify_port(ib_device, i, 0, &port_modify);
> > if (ret)
> > goto error3;

It will also help keep the checks consistent, so that we don't end up with CM checks being per device, but SA checks being per port.

- Sean

2015-04-13 18:45:14

by Hefty, Sean

Subject: RE: [PATCH v3 05/28] IB/Verbs: Reform IB-core sa_query

> @@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
> {
> struct ib_sa_device *sa_dev;
> int s, e, i;
> -
> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
> - return;
> + int count = 0;
>
> if (device->node_type == RDMA_NODE_IB_SWITCH)
> s = e = 0;
> @@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
>
> for (i = 0; i <= e - s; ++i) {
> spin_lock_init(&sa_dev->port[i].ah_lock);
> - if (rdma_port_get_link_layer(device, i + 1) !=
> IB_LINK_LAYER_INFINIBAND)
> + if (!rdma_tech_ib(device, i + 1))

Note for someone who cares. This patch didn't introduce this problem, but I think the port number should be "i + s".
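
The index-to-port mapping behind this remark can be sketched in a few lines of standalone C. This is an illustration only: `struct demo_port_range`, `demo_get_port_range()` and `demo_index_to_port()` are hypothetical names, and the non-switch range of 1..phys_port_cnt is an assumption, since the quoted hunk only shows the switch branch.

```c
/*
 * Hypothetical model of the port range chosen in ib_sa_add_one():
 * a switch exposes only port 0; other devices are assumed to expose
 * ports 1..phys_port_cnt (the quoted hunk elides that branch).
 */
struct demo_port_range {
	int start;	/* "s" in the patch */
	int end;	/* "e" in the patch */
};

static struct demo_port_range demo_get_port_range(int is_switch, int phys_port_cnt)
{
	struct demo_port_range r;

	if (is_switch) {
		r.start = 0;
		r.end = 0;
	} else {
		r.start = 1;
		r.end = phys_port_cnt;
	}
	return r;
}

/* Array index i (0 .. end - start) corresponds to port number i + start. */
static int demo_index_to_port(struct demo_port_range r, int i)
{
	return i + r.start;
}
```

With this mapping, index 0 on a switch resolves to port 0, whereas "i + 1" would ask about port 1; on a non-switch device the two expressions agree, which is presumably why the difference went unnoticed.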


2015-04-13 19:27:19

by Hefty, Sean

Subject: RE: [PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma

> @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> mutex_unlock(&id_priv->handler_mutex);
>
> if (id_priv->cma_dev) {
> - switch (rdma_node_get_transport(id_priv->id.device-
> >node_type)) {
> - case RDMA_TRANSPORT_IB:
> + if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))

A listen id can be associated with a device without being associated with a port (see the listen_any_list). Some other check is needed to handle this case. I guess the code could check the first port on the device (replace port_num with hardcoded value 1). Then we wouldn't be any more broken than the code already is. (The 'break' is conceptual, not practical.)

This appears to be highlighting an architectural flaw in the iboe integration.

> {
> if (id_priv->cm_id.ib)
> ib_destroy_cm_id(id_priv->cm_id.ib);
> - break;
> - case RDMA_TRANSPORT_IWARP:
> + } else if (rdma_tech_iwarp(id_priv->id.device,
> + id_priv->id.port_num)) {
> if (id_priv->cm_id.iw)
> iw_destroy_cm_id(id_priv->cm_id.iw);
> - break;
> - default:
> - break;
> }
> cma_leave_mc_groups(id_priv);
> cma_release_dev(id_priv);

- Sean

2015-04-13 19:27:24

by Jason Gunthorpe

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Mon, Apr 13, 2015 at 02:25:16PM +0200, Michael Wang wrote:
> dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
> if (!dev_list)
> @@ -1673,13 +1671,19 @@ static void ipoib_add_one(struct ib_device *device)
> }
>
> for (p = s; p <= e; ++p) {
> - if (rdma_port_get_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
> + if (!rdma_tech_ib(device, p))
> continue;
> dev = ipoib_add_port("ib%d", device, p);
> if (!IS_ERR(dev)) {
> priv = netdev_priv(dev);
> list_add_tail(&priv->list, dev_list);
> }
> + count++;
> + }
> +
> + if (!count) {
> + kfree(dev_list);
> + return;
> }

This doesn't quite look right, it should be 'goto error1'

But then I read 'goto error1' and it doesn't look like it can handle
cm_dev->port being NULL, so more fixing is needed.

Ditto for cm_remove_one

Should audit all uses of cm_dev->port[] to make sure they can all
handle NULL.

Jason

2015-04-13 19:29:37

by Jason Gunthorpe

Subject: Re: [PATCH v3 04/28] IB/Verbs: Reform IB-core cm

On Mon, Apr 13, 2015 at 06:40:35PM +0000, Hefty, Sean wrote:
> > > - if (rdma_node_get_transport(ib_device->node_type) !=
> > RDMA_TRANSPORT_IB)
> > > - return;
> > > + int count = 0;
> >
> > I'm ok with this as an intermediate patch but going forward if we are
> > going to
> > have calls like
> >
> > static inline int cap_ib_cm_dev(struct ib_device *device)
>
> I would rather keep everything to checking per port, not per device.
> Specifically, because we have code that does this:

Agree.

I asked Michael for it and stand by it, the property is per-port, not
per device. Having the per-device tests just muddles the logic, look
at the trouble Sean notices in #10 when we are now forced to think of
things clearly.

Jason

2015-04-13 19:46:14

by Ira Weiny

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Mon, Apr 13, 2015 at 01:27:01PM -0600, Jason Gunthorpe wrote:
> On Mon, Apr 13, 2015 at 02:25:16PM +0200, Michael Wang wrote:
> > dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
> > if (!dev_list)
> > @@ -1673,13 +1671,19 @@ static void ipoib_add_one(struct ib_device *device)
> > }
> >
> > for (p = s; p <= e; ++p) {
> > - if (rdma_port_get_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
> > + if (!rdma_tech_ib(device, p))
> > continue;
> > dev = ipoib_add_port("ib%d", device, p);
> > if (!IS_ERR(dev)) {
> > priv = netdev_priv(dev);
> > list_add_tail(&priv->list, dev_list);
> > }
> > + count++;
> > + }
> > +
> > + if (!count) {
> > + kfree(dev_list);
> > + return;
> > }
>
> This doesn't quite look right, it should be 'goto error1'

Looks like you replied to the wrong patch. I don't see error1 in ipoib_add_one.

For the ib_cm module...

Yes I think it should go to "error1". However, see below...

This is the type of clean up error which would be avoided if a call to
cap_ib_cm_dev() were done at the top of the function.

>
> But then I read 'goto error1' and it doesn't look like it can handle
> cm_dev->port being NULL, so more fixing is needed.

I think this is highlighting an existing dependency which we don't want but
exists.

It appears the code currently does not support this situation.

Dev
port 1 : cap_is_cm == true
port 2 : cap_is_cm == false
port 3 : cap_is_cm == true

Because the error handling bails when port 2 fails... Leaving both port 1 and
port 3 uninitialized. :-(

Ira

>
> Ditto for cm_remove_one
>
> Should audit all uses of cm_dev->port[] to make sure they can all
> handle NULL.
>
> Jason
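
The mixed-port case discussed in this message (ports 1 and 3 capable, port 2 not) can be sketched in userspace C to show what "handle NULL" would mean for the port array. Everything here is hypothetical: `struct demo_cm_dev`, `demo_add_one()` and `demo_remove_one()` are invented for illustration and make no claim about how cm.c should actually be fixed.

```c
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_PORTS 3

struct demo_port {
	int port_num;
};

struct demo_cm_dev {
	/* Slot i holds port i + 1, or NULL if that port lacks the capability. */
	struct demo_port *port[DEMO_PORTS];
};

/* Teardown that tolerates NULL slots left behind by skipped ports. */
static void demo_remove_one(struct demo_cm_dev *cm)
{
	for (int i = 0; i < DEMO_PORTS; i++) {
		if (!cm->port[i])
			continue;
		free(cm->port[i]);
		cm->port[i] = NULL;
	}
}

/* Returns the number of ports set up, or -1 if an allocation failed. */
static int demo_add_one(struct demo_cm_dev *cm, const bool cap[DEMO_PORTS])
{
	int count = 0;

	for (int i = 0; i < DEMO_PORTS; i++)
		cm->port[i] = NULL;

	for (int i = 0; i < DEMO_PORTS; i++) {
		if (!cap[i])
			continue;	/* unsupported port: skip it, do not bail out */
		cm->port[i] = malloc(sizeof(*cm->port[i]));
		if (!cm->port[i]) {
			demo_remove_one(cm);
			return -1;
		}
		cm->port[i]->port_num = i + 1;
		count++;
	}
	return count;
}
```

The design point is that an unsupported port is skipped rather than treated as a failure, so ports 1 and 3 stay initialized, and every later walk over the array has to check for NULL.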

2015-04-13 19:50:51

by Jason Gunthorpe

Subject: Re: [PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma

On Mon, Apr 13, 2015 at 07:25:48PM +0000, Hefty, Sean wrote:
> > @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> > mutex_unlock(&id_priv->handler_mutex);
> >
> > if (id_priv->cma_dev) {
> > - switch (rdma_node_get_transport(id_priv->id.device-
> > >node_type)) {
> > - case RDMA_TRANSPORT_IB:
> > + if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
>
> A listen id can be associated with a device without being associated
> with a port (see the listen_any_list). Some other check is needed
> to handle this case. I guess the code could check the first port on
> the device (replace port_num with hardcoded value 1). Then we
> wouldn't be any more broken than the code already is. (The 'break'
> is conceptual, not practical.)

Hum. So, ports on a device must have some compatibility when it comes
to these invariants. It looks like all ports must have the same
iwarpyness, for multiple reasons.

Less clear is how rocee vs ib work within a device... Can you APM
between those two kinds of ports?

All these switches are so ugly :| Function pointers setup in
iw_/ib_create_cm_id would be a lot clearer and safer.

> This appears to be highlighting an architectural flaw in the iboe integration.

You mean iwarp?

Jason
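
The function-pointer idea mentioned above can be illustrated with a small standalone C sketch. It assumes nothing about the real cma or cm code: `struct demo_cm_ops`, `struct demo_cm_id` and the `demo_*` functions are hypothetical names, and the `destroyed_by` field exists only so the example can be checked.

```c
#include <stddef.h>

struct demo_cm_id;

/* Hypothetical per-transport operations table. */
struct demo_cm_ops {
	void (*destroy)(struct demo_cm_id *id);
};

enum { DEMO_BY_NONE, DEMO_BY_IB, DEMO_BY_IW };

struct demo_cm_id {
	const struct demo_cm_ops *ops;	/* chosen once, at create time */
	int destroyed_by;		/* demo-only: records which handler ran */
};

static void demo_ib_destroy(struct demo_cm_id *id)
{
	id->destroyed_by = DEMO_BY_IB;
}

static void demo_iw_destroy(struct demo_cm_id *id)
{
	id->destroyed_by = DEMO_BY_IW;
}

static const struct demo_cm_ops demo_ib_ops = { .destroy = demo_ib_destroy };
static const struct demo_cm_ops demo_iw_ops = { .destroy = demo_iw_destroy };

/* The create paths bind the ops table, so later code never asks "which transport?". */
static void demo_ib_create_cm_id(struct demo_cm_id *id)
{
	id->ops = &demo_ib_ops;
	id->destroyed_by = DEMO_BY_NONE;
}

static void demo_iw_create_cm_id(struct demo_cm_id *id)
{
	id->ops = &demo_iw_ops;
	id->destroyed_by = DEMO_BY_NONE;
}

/* Replaces an if/else chain on transport type with a single indirect call. */
static void demo_destroy_id(struct demo_cm_id *id)
{
	if (id->ops && id->ops->destroy)
		id->ops->destroy(id);
}
```

Because the table is bound at create time, the destroy path no longer needs a port number to decide which branch to take, which sidesteps the listen-id case raised earlier in this sub-thread.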

2015-04-13 19:52:27

by Ira Weiny

Subject: Re: [PATCH v3 04/28] IB/Verbs: Reform IB-core cm

On Mon, Apr 13, 2015 at 01:29:30PM -0600, Jason Gunthorpe wrote:
> On Mon, Apr 13, 2015 at 06:40:35PM +0000, Hefty, Sean wrote:
> > > > - if (rdma_node_get_transport(ib_device->node_type) !=
> > > RDMA_TRANSPORT_IB)
> > > > - return;
> > > > + int count = 0;
> > >
> > > I'm ok with this as an intermediate patch but going forward if we are
> > > going to
> > > have calls like
> > >
> > > static inline int cap_ib_cm_dev(struct ib_device *device)
> >
> > I would rather keep everything to checking per port, not per device.
> > Specifically, because we have code that does this:
>
> Agree.
>
> I asked Michael for it and stand by it, the property is per-port, not
> per device. Having the per-device tests just muddles the logic, look
> at the trouble Sean notices in #10 when we are now forced to think of
> things clearly.

What about having those be helpers within the corresponding C code?

For example move cap_ib_cm_dev() into cm.c. Or just put the logic at the top
of cm_add_one()?

I think that having the ib_umad, ib_sa, and ib_cm modules skip devices which
have no ports which support those functions makes the code clean. But I
understand the desire to have checks from the devices be per port.

Also for these function clean up patches we preserve the existing logic.

Ira

2015-04-13 20:01:52

by Jason Gunthorpe

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Mon, Apr 13, 2015 at 03:46:03PM -0400, ira.weiny wrote:

> > This doesn't quite look right, it should be 'goto error1'
>
> Looks like you replied to the wrong patch. I don't see error1 in ipoib_add_one.

> For the ib_cm module...

Right, sorry.

> Yes I think it should go to "error1". However, see below...
>
> This is the type of clean up error which would be avoided if a call to
> cap_ib_cm_dev() were done at the top of the function.

So what does cap_ib_cm_dev return in your example:

> Dev
> port 1 : cap_is_cm == true
> port 2 : cap_is_cm == false
> port 3 : cap_is_cm == true

True? Then the code is still broken, having cap_ib_cm_dev doesn't help
anything.

If we make it possible to be per port then it has to be fixed.

If you want to argue the above example is illegal and port 2 has to be
on a different device, I'd be interested to see what that looks like.

Thinking about it some more, cap_foo_dev only makes sense if all ports
are either true or false. Mixed is a BUG.

That seems reasonable, and solves the #10 problem, but we should
enforce this invariant during device register.

Typically the ports seem to be completely orthogonal (like SA), so in those
cases the _dev helper and its restriction make no sense.

CM seems to be different, so it should probably enforce its rules.

Jason
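
A registration-time check of the "all ports true or all ports false" invariant could look roughly like the following standalone C sketch. It is an assumption-laden illustration: `demo_ports_uniform()` and `demo_register_check()` are hypothetical, and the real device-register path is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if every entry in cap[0 .. nports - 1] has the same value. */
static bool demo_ports_uniform(const bool *cap, size_t nports)
{
	for (size_t i = 1; i < nports; i++) {
		if (cap[i] != cap[0])
			return false;
	}
	return true;
}

/*
 * Hypothetical registration-time check for a capability that must be
 * uniform across a device: 0 if the ports agree, -1 if they are mixed.
 */
static int demo_register_check(const bool *cap_cm, size_t nports)
{
	return demo_ports_uniform(cap_cm, nports) ? 0 : -1;
}
```

Under such a check the port 1/2/3 example quoted above would be rejected at registration, and a per-device helper could then safely look at any single port.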

2015-04-13 20:31:15

by Hefty, Sean

Subject: RE: [PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma

> On Mon, Apr 13, 2015 at 07:25:48PM +0000, Hefty, Sean wrote:
> > > @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> > > mutex_unlock(&id_priv->handler_mutex);
> > >
> > > if (id_priv->cma_dev) {
> > > - switch (rdma_node_get_transport(id_priv->id.device-
> > > >node_type)) {
> > > - case RDMA_TRANSPORT_IB:
> > > + if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
> >
> > A listen id can be associated with a device without being associated
> > with a port (see the listen_any_list). Some other check is needed
> > to handle this case. I guess the code could check the first port on
> > the device (replace port_num with hardcoded value 1). Then we
> > wouldn't be any more broken than the code already is. (The 'break'
> > is conceptual, not practical.)
>
> Hum. So, ports on a device must have some compatibility when it comes
> to these invariants. It looks like all ports must have the same
> iwarpyness, for multiple reasons.
>
> Less clear is how rocee vs ib work within a device... Can you APM
> between those two kinds of ports?

No idea

> All these switches are so ugly :| Function pointers setup in
> iw_/ib_create_cm_id would be a lot clearer and safer.

I noticed this too. The if checks throughout the cma are becoming unmaintainable. It may be cleaner if all devices adopted using the cm device function pointers.

> > This appears to be highlighting an architectural flaw in the iboe
> integration.
>
> You mean iwarp?

I meant iboe. Wildcard listens map to multiple listens, one per device. The assumption being that all ports on the device are the same. IBoE changed that assumption.

2015-04-13 20:34:01

by Jason Gunthorpe

Subject: Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()

On Mon, Apr 13, 2015 at 02:36:45PM +0200, Michael Wang wrote:
> We have finished introducing the cap_XX(), and raw helper rdma_ib_or_iboe()
> is no longer necessary, thus clean it up.

So, the net result is not looking too bad, but I'm confused about the
structure of this series.

Why introduce query_transport early on?

Why is the patch series going through this progression for most lines?

- if (rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND) {
+ if (rdma_tech_ib(device, port_num)) {
+ if (cap_ib_smi(device, port_num)) {

This would be a lot shorter and simpler to look at if the cap's were
introduced first, then their implementation finalized.

I thought we agreed Doug's bitmask plan should be the final
destination for this series, so this progress seems even stranger?

I would be very happy to see a patch that adds cap_ib_smi to the
current tree and states 'This patch is tested to have no change on the
binary compilation results'

Jason

2015-04-14 07:51:15

by Michael Wang

Subject: Re: [PATCH v3 04/28] IB/Verbs: Reform IB-core cm



On 04/13/2015 08:12 PM, ira.weiny wrote:
[snip]
>> -
>> - if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
>> - return;
>> + int count = 0;
>
> I'm ok with this as an intermediate patch but going forward if we are going to
> have calls like
>
> static inline int cap_ib_cm_dev(struct ib_device *device)

Actually I really don't want to introduce this kind of helper; it's slow, ugly,
and breaks consistency, but I can't find a good way to avoid it...

For example, for the check inside cma_listen_on_dev(), how could we do a per-port
check when we don't even know which port will be used later...

>
> Then I think we should have similar calls like
>
> cap_ib_mad_dev(device)
>
> Which eliminates the clean up below...

I'd like to avoid using such helpers for as long as possible :-P

>
>>
>> cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
>> ib_device->phys_port_cnt, GFP_KERNEL);
>> @@ -3783,6 +3781,9 @@ static void cm_add_one(struct ib_device *ib_device)
>>
>> set_bit(IB_MGMT_METHOD_SEND, reg_req.method_mask);
>> for (i = 1; i <= ib_device->phys_port_cnt; i++) {
>> + if (!rdma_ib_or_iboe(ib_device, i))
>> + continue;
>> +
>> port = kzalloc(sizeof *port, GFP_KERNEL);
>> if (!port)
>> goto error1;
>> @@ -3809,7 +3810,16 @@ static void cm_add_one(struct ib_device *ib_device)
>> ret = ib_modify_port(ib_device, i, 0, &port_modify);
>> if (ret)
>> goto error3;
>> +
>> + count++;
>> }
>> +
>> + if (!count) {
>> + device_unregister(cm_dev->device);
>> + kfree(cm_dev);
>> + return;
>
> Here.
>
> I worry about mistakes being made when we loop through only to find that none
> of the ports support the feature and then we have to clean up. As this is
> initialization code I don't see any issue with looping through the ports 2
> times and making the code cleaner.

This style of logic can be found in other core modules too, so maybe keeping it
consistent is not a bad idea?

After all, it's just initialization code, which is relatively rarely used :-)

Regards,
Michael Wang

>
> This applies to the SA and CM modules as well.
>
> However, in the ib_cm module you already have cap_ib_cm_dev(device) so you
> should use it at the start of cm_add_one.
>
> Ira
>
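
A minimal sketch of the two-pass approach Ira describes above, assuming a simplified stand-in for struct ib_device. The names `struct sketch_device`, `port_has_cm` and `sketch_cm_add_one()` are hypothetical and are not kernel APIs. Checking the ports before allocating anything means the "no port supports the feature" case needs no cleanup path:

```c
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_MAX_PORTS 8

/* Hypothetical stand-in for struct ib_device; not the kernel structure. */
struct sketch_device {
	int phys_port_cnt;			/* ports are numbered 1..phys_port_cnt */
	bool port_has_cm[SKETCH_MAX_PORTS + 1];	/* index 0 unused */
};

/* First pass: does any port support the feature? Nothing is allocated here. */
static bool sketch_any_port_has_cm(const struct sketch_device *dev)
{
	int i;

	for (i = 1; i <= dev->phys_port_cnt; i++)
		if (dev->port_has_cm[i])
			return true;
	return false;
}

/*
 * Second pass: allocate per-port state only after the first pass succeeded.
 * Returns the number of ports set up, or -1 on allocation failure.
 */
static int sketch_cm_add_one(const struct sketch_device *dev)
{
	int i, count = 0;
	int *port_state;

	if (!sketch_any_port_has_cm(dev))
		return 0;	/* early exit: nothing to undo */

	port_state = calloc(dev->phys_port_cnt + 1, sizeof(*port_state));
	if (!port_state)
		return -1;

	for (i = 1; i <= dev->phys_port_cnt; i++) {
		if (!dev->port_has_cm[i])
			continue;
		port_state[i] = 1;
		count++;
	}

	free(port_state);	/* the sketch keeps no state around */
	return count;
}
```

The cost is one extra loop over the ports at initialization time; the benefit is that the allocation and registration code never has to be unwound just because no port qualified.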

2015-04-14 07:57:13

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 04/28] IB/Verbs: Reform IB-core cm



On 04/13/2015 09:29 PM, Jason Gunthorpe wrote:
> On Mon, Apr 13, 2015 at 06:40:35PM +0000, Hefty, Sean wrote:
>>>> - if (rdma_node_get_transport(ib_device->node_type) != RDMA_TRANSPORT_IB)
>>>> - return;
>>>> + int count = 0;
>>>
>>> I'm ok with this as an intermediate patch but going forward if we are
>>> going to
>>> have calls like
>>>
>>> static inline int cap_ib_cm_dev(struct ib_device *device)
>>
>> I would rather keep everything to checking per port, not per device.
>> Specifically, because we have code that does this:
>
> Argee.
>
> I asked Michael for it and stand by it, the property is per-port, not
> per device. Having the per-device tests just muddles the logic, look
> at the trouble Sean notices in #10 when we are now forced to think of
> things clearly.

The only per-dev checks left are all in #24 (and now maybe #10 too), inside:

1. cma_listen_on_dev
2. ib_ucm_add_one

I can't find a good way to apply a per-port check in these two; they seem to
run at a stage that is not tied to any port yet... any ideas on how
to improve that?

Regards,
Michael Wang

>
> Jason
>

2015-04-14 08:03:51

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 05/28] IB/Verbs: Reform IB-core sa_query

On 04/13/2015 08:45 PM, Hefty, Sean wrote:
>> @@ -1153,9 +1153,7 @@ static void ib_sa_add_one(struct ib_device *device)
>> {
>> struct ib_sa_device *sa_dev;
>> int s, e, i;
>> -
>> - if (rdma_node_get_transport(device->node_type) != RDMA_TRANSPORT_IB)
>> - return;
>> + int count = 0;
>>
>> if (device->node_type == RDMA_NODE_IB_SWITCH)
>> s = e = 0;
>> @@ -1175,7 +1173,7 @@ static void ib_sa_add_one(struct ib_device *device)
>>
>> for (i = 0; i <= e - s; ++i) {
>> spin_lock_init(&sa_dev->port[i].ah_lock);
>> - if (rdma_port_get_link_layer(device, i + 1) !=
>> IB_LINK_LAYER_INFINIBAND)
>> + if (!rdma_tech_ib(device, i + 1))
>
> Note for someone who cares. This patch didn't introduce this problem, but I think the port number should be "i + s".

Actually I'm planning to clean up the places that play with 's' and 'e' too. For
example, both cache.c and device.c implement the helpers start_port() and end_port()
with exactly the same logic, and there are also many places like this one that
handle port numbers in an ugly way. I'd like to refine these parts later if no one else
is interested :-P

Regards,
Michael Wang

>
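
A small sketch of the port-numbering point, assuming simplified stand-ins for the kernel types. `struct sketch_device`, `sketch_start_port()`, `sketch_end_port()` and `sketch_index_to_port()` are hypothetical names modelled on the start_port()/end_port() helpers mentioned above, not the kernel's own code:

```c
/* Hypothetical stand-ins; the kernel uses struct ib_device and RDMA_NODE_IB_SWITCH. */
enum sketch_node_type { SKETCH_NODE_CA, SKETCH_NODE_SWITCH };

struct sketch_device {
	enum sketch_node_type node_type;
	int phys_port_cnt;
};

/* A switch exposes only port 0 here; every other node starts at port 1. */
static int sketch_start_port(const struct sketch_device *dev)
{
	return dev->node_type == SKETCH_NODE_SWITCH ? 0 : 1;
}

static int sketch_end_port(const struct sketch_device *dev)
{
	return dev->node_type == SKETCH_NODE_SWITCH ? 0 : dev->phys_port_cnt;
}

/* Map an index into a per-port array back to the port number ("i + s"). */
static int sketch_index_to_port(const struct sketch_device *dev, int index)
{
	return index + sketch_start_port(dev);
}
```

With "i + 1" a switch would be queried at port 1 rather than port 0, while "i + s" gives the right port number for both switches and channel adapters.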

2015-04-14 08:08:40

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib



On 04/13/2015 09:27 PM, Jason Gunthorpe wrote:
> On Mon, Apr 13, 2015 at 02:25:16PM +0200, Michael Wang wrote:
>> dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL);
>> if (!dev_list)
>> @@ -1673,13 +1671,19 @@ static void ipoib_add_one(struct ib_device *device)
>> }
>>
>> for (p = s; p <= e; ++p) {
>> - if (rdma_port_get_link_layer(device, p) != IB_LINK_LAYER_INFINIBAND)
>> + if (!rdma_tech_ib(device, p))
>> continue;
>> dev = ipoib_add_port("ib%d", device, p);
>> if (!IS_ERR(dev)) {
>> priv = netdev_priv(dev);
>> list_add_tail(&priv->list, dev_list);
>> }
>> + count++;
>> + }
>> +
>> + if (!count) {
>> + kfree(dev_list);
>> + return;
>> }
>
> This doesn't quite look right, it should be 'goto error1'
>
> But then I read 'goto error1' and it doesn't look like it can handle
> cm_dev->port being NULL, so more fixing is needed.

Nice catch ;-) I guess the 'count++' should be inside 'if (!IS_ERR(dev))', after
linking the node.

Regards,
Michael Wang

>
> Ditto for cm_remove_one
>
> Should audit all uses of cm_dev->port[] to make sure they can all
> handle NULL.
>
> Jason
>
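
A rough sketch of the counting fix discussed above, assuming a mock device in which adding a port can fail. `struct sketch_device`, `add_fails` and `sketch_add_ports()` are hypothetical and only model the control flow of the loop, not ipoib itself:

```c
#include <stdbool.h>

#define SKETCH_MAX_PORTS 8

/* Hypothetical stand-in for the device; index 0 of each array is unused. */
struct sketch_device {
	int phys_port_cnt;
	bool port_is_ib[SKETCH_MAX_PORTS + 1];
	bool add_fails[SKETCH_MAX_PORTS + 1];	/* simulates a failing port add */
};

/* Returns how many ports were actually added; 0 means the caller frees its list. */
static int sketch_add_ports(const struct sketch_device *dev)
{
	int p, count = 0;

	for (p = 1; p <= dev->phys_port_cnt; p++) {
		if (!dev->port_is_ib[p])
			continue;
		if (dev->add_fails[p])
			continue;	/* failed add: not linked, not counted */
		count++;		/* counted only after a successful add */
	}
	return count;
}
```

Counting inside the success branch keeps `count == 0` equivalent to "the list is empty", so the early free never throws away a list that still has entries, and never keeps an empty one.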

2015-04-14 08:35:51

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma



On 04/13/2015 09:25 PM, Hefty, Sean wrote:
>> @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
>> mutex_unlock(&id_priv->handler_mutex);
>>
>> if (id_priv->cma_dev) {
>> - switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
>> - case RDMA_TRANSPORT_IB:
>> + if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
>
> A listen id can be associated with a device without being associated with a port (see the listen_any_list).
> Some other check is needed to handle this case. I guess the code could check the first port on the device
> (replace port_num with hardcoded value 1). Then we wouldn't be any more broken than the code already is.
> (The 'break' is conceptual, not practical.)

Agreed, this seems very similar to the case of cma_listen_on_dev() in #24, which
is not associated with any particular port.

If port 1 is enough to represent the whole device's CM capability, maybe we can
get rid of cap_ib_cm_dev() too?

And maybe cap_ib_cm(device, device->node_type == RDMA_NODE_IB_SWITCH ? 0 : 1) would
be safer?

Regards,
Michael Wang


>
> This appears to be highlighting an architectural flaw in the iboe integration.
>
>> {
>> if (id_priv->cm_id.ib)
>> ib_destroy_cm_id(id_priv->cm_id.ib);
>> - break;
>> - case RDMA_TRANSPORT_IWARP:
>> + } else if (rdma_tech_iwarp(id_priv->id.device,
>> + id_priv->id.port_num)) {
>> if (id_priv->cm_id.iw)
>> iw_destroy_cm_id(id_priv->cm_id.iw);
>> - break;
>> - default:
>> - break;
>> }
>> cma_leave_mc_groups(id_priv);
>> cma_release_dev(id_priv);
>
> - Sean
>

2015-04-14 09:13:19

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()



On 04/13/2015 10:33 PM, Jason Gunthorpe wrote:
> On Mon, Apr 13, 2015 at 02:36:45PM +0200, Michael Wang wrote:
>> We have finished introducing the cap_XX(), and raw helper rdma_ib_or_iboe()
>> is no longer necessary, thus clean it up.
>
> So, the net result is not looking too bad, but I'm confused about the
> structure of this series.
>
> Why introduce query_transport early on?

This won't be removed until the bitmask is introduced; at the moment it's the basic
method for the helpers to get a port's transport from the device.

Sure, we could still use the legacy method, but IMHO this abstraction is
more readable for the internal reform: it's 'mapping from tech to bits'
vs. 'mapping from transport and link layer to bits'.

>
> Why is the patch series going through this progression most lines?
>
> - if (rdma_port_get_link_layer(device, port_num) == IB_LINK_LAYER_INFINIBAND) {
> + if (rdma_tech_ib(device, port_num)) {

This step mostly focuses on reforming the logic.

> + if (cap_ib_smi(device, port_num)) {

This step focuses on the description and semantics; it contains no logic changes
and just replaces the helper.

I structured it this way to help us focus on one main point at a time during the review :-)

>
> This would be a lot shorter and simpler to look at if the cap's were
> introduced first, then their implementation finalized.
>
> I thought we agreed Doug's bitmask plan should be the final
> destination for this series, so this progress seems even stranger?
>
> I would be very happy to see a patch that adds cap_ib_smi to the
> current tree and states 'This patch is tested to have no change on the
> binary compilation results'

There is too much reform there (per-dev to per-port), so I guess the binary
will change more or less anyway...

BTW, I may have misunderstood your point in "Re: [PATCH v2 03/17]":

> I would prefer to see these changes in control flow as dedicated
> patches, at the top of your patch stack.
> For this kind of work a patch should be mechanical changes only, it is
> easier to review that way.
> Same comment applies throughout.

I thought you preferred introducing cap_XX() on top of the reforming...

But anyway, please let me know if this really gets in the way of the review :-)

Regards,
Michael Wang

>
> Jason
>
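
A minimal sketch of the "mapping from transport and link layer" step, following the mapping list in the cover letter (old transport IB on an ETH link layer becomes IBOE; everything else keeps its transport). All `sketch_*` names are hypothetical; the real series obtains the value from each driver's query_transport() callback rather than computing it like this:

```c
/* Hypothetical mirrors of the kernel enums, for illustration only. */
enum sketch_transport {
	SKETCH_TRANSPORT_IB,
	SKETCH_TRANSPORT_IWARP,
	SKETCH_TRANSPORT_USNIC,
	SKETCH_TRANSPORT_USNIC_UDP,
	SKETCH_TRANSPORT_IBOE,
};

enum sketch_link_layer { SKETCH_LINK_IB, SKETCH_LINK_ETH };

/* Collapse the old (transport, link layer) pair into one new transport value. */
static enum sketch_transport
sketch_query_transport(enum sketch_transport old, enum sketch_link_layer ll)
{
	if (old == SKETCH_TRANSPORT_IB && ll == SKETCH_LINK_ETH)
		return SKETCH_TRANSPORT_IBOE;
	return old;
}

/* Helpers in the spirit of rdma_tech_ib() and rdma_ib_or_iboe(). */
static int sketch_tech_ib(enum sketch_transport t)
{
	return t == SKETCH_TRANSPORT_IB;
}

static int sketch_ib_or_iboe(enum sketch_transport t)
{
	return t == SKETCH_TRANSPORT_IB || t == SKETCH_TRANSPORT_IBOE;
}
```

Once the pair is collapsed into a single value, a check such as "transport is IB and link layer is ETH" becomes a single comparison against the IBOE value.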

2015-04-14 14:18:26

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Mon, Apr 13, 2015 at 02:01:38PM -0600, Jason Gunthorpe wrote:
> On Mon, Apr 13, 2015 at 03:46:03PM -0400, ira.weiny wrote:
>
> > > This doesn't quite look right, it should be 'goto error1'
> >
> > Looks like you replied to the wrong patch. ?? I don't see error1 in ipoib_add_one.
>
> > For the ib_cm module...
>
> Right, sorry.
>
> > Yes I think it should go to "error1". However, see below...
> >
> > This is the type of clean up error which would be avoided if a call to
> > cap_ib_cm_dev() were done at the top of the function.
>
> So what does cap_ib_cm_dev return in your example:

I was thinking it was all or nothing. But I see now that the cap_ib_cm_dev is
actually true if "any" port supports the CM.

> > Dev
> > port 1 : cap_is_cm == true
> > port 2 : cap_is_cm == false
> > port 3 : cap_is_cm == true
>
> True? Then the code is still broken, having cap_ib_cm_dev doesn't help
> anything.

I see that now.

>
> If we make it possible to be per port then it has to be fixed.

Yes

>
> If you want to argue the above example is illegal and port 2 has to be
> on a different device, I'd be interested to see what that looks like.

I think the easiest fix for this series is to add this to the final CM code
(applies after the end of the series; compile tested only):

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 63418ee..0d0fc24 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3761,7 +3761,11 @@ static void cm_add_one(struct ib_device *ib_device)
unsigned long flags;
int ret;
u8 i;
- int count = 0;
+
+ for (i = 1; i <= ib_device->phys_port_cnt; i++) {
+ if (!cap_ib_cm(ib_device, i))
+ return;
+ }

cm_dev = kzalloc(sizeof(*cm_dev) + sizeof(*port) *
ib_device->phys_port_cnt, GFP_KERNEL);
@@ -3810,14 +3814,6 @@ static void cm_add_one(struct ib_device *ib_device)
ret = ib_modify_port(ib_device, i, 0, &port_modify);
if (ret)
goto error3;
-
- count++;
- }
-
- if (!count) {
- device_unregister(cm_dev->device);
- kfree(cm_dev);
- return;
}

ib_set_client_data(ib_device, &cm_client, cm_dev);


>
> Thinking about it some more, cap_foo_dev only makes sense if all ports
> are either true or false. Mixed is a BUG.

Agree

After more thought and reading other opinions, I must agree we should not
have cap_foo_dev.

For the CM case, which has some need to support itself device-wide, what
about this patch (compile tested only):


10:03:57 > git di
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 4b083f5..7347445 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1639,7 +1639,7 @@ static void cma_listen_on_dev(struct rdma_id_private *id_priv,
struct rdma_cm_id *id;
int ret;

- if (cma_family(id_priv) == AF_IB && !cap_ib_cm_dev(cma_dev->device))
+ if (cma_family(id_priv) == AF_IB && !cap_ib_cm_any_port(cma_dev->device))
return;

id = rdma_create_id(cma_listen_handler, id_priv, id_priv->id.ps,
@@ -2030,7 +2030,7 @@ static int cma_bind_loopback(struct rdma_id_private *id_priv)
mutex_lock(&lock);
list_for_each_entry(cur_dev, &dev_list, list) {
if (cma_family(id_priv) == AF_IB &&
- !cap_ib_cm_dev(cur_dev->device))
+ !cap_ib_cm_any_port(cur_dev->device))
continue;

if (!cma_dev)
diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
index 065405e..dc4caae 100644
--- a/drivers/infiniband/core/ucm.c
+++ b/drivers/infiniband/core/ucm.c
@@ -1253,7 +1253,7 @@ static void ib_ucm_add_one(struct ib_device *device)
dev_t base;
struct ib_ucm_device *ucm_dev;

- if (!device->alloc_ucontext || !cap_ib_cm_dev(device))
+ if (!device->alloc_ucontext || !cap_ib_cm_any_port(device))
return;

ucm_dev = kzalloc(sizeof *ucm_dev, GFP_KERNEL);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 3cc3f53..a8fa1f5 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1920,15 +1920,14 @@ static inline int cap_read_multi_sge(struct ib_device *device, u8 port_num)
}

/**
- * cap_ib_cm_dev - Check if any port of device has the capability Infiniband
- * Communication Manager.
+ * cap_ib_cm_any_port - Check if any port of the device has Infiniband
+ * Communication Manager (CM) support.
*
* @device: Device to be checked
*
- * Return 0 when all port of the device don't support Infiniband
- * Communication Manager.
+ * Return 1 if any port of the device supports the IB CM.
*/
-static inline int cap_ib_cm_dev(struct ib_device *device)
+static inline int cap_ib_cm_any_port(struct ib_device *device)
{
int i;


>
> That seems reasonable, and solves the #10 problem, but we should
> enforce this invariant during device register.
>
> Typically the ports seem to be completely orthogonal (like SA), so in those
> cases the _dev and restriction makes no sense.

While the ports in ib_sa and ib_umad probably can be orthogonal the current
implementation does not support that and this patch series obscures that a bit.

>
> CM seems to be different, so it should probably enforce its rules

Technically, the implementations of the ib_sa and ib_umad modules are no
different; it is just that Michael's patch is not broken.

I'm not saying we can't change the ib_sa and ib_umad modules but the current
logic is all or nothing. And I think changing this:

1) requires more review and testing
2) is not the purpose of this series.

Ira

>
> Jason

2015-04-14 14:33:11

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib



On 04/14/2015 04:18 PM, ira.weiny wrote:
[snip]
>
> /**
> - * cap_ib_cm_dev - Check if any port of device has the capability Infiniband
> - * Communication Manager.
> + * cap_ib_cm_any_port - Check if any port of the device has Infiniband
> + * Communication Manager (CM) support.
> *
> * @device: Device to be checked
> *
> - * Return 0 when all port of the device don't support Infiniband
> - * Communication Manager.
> + * Return 1 if any port of the device supports the IB CM.
> */
> -static inline int cap_ib_cm_dev(struct ib_device *device)
> +static inline int cap_ib_cm_any_port(struct ib_device *device)
> {
> int i;

I think we may be able to get rid of this helper, following Sean's suggestion :-)

We just need to check port 1 of the HCA to see whether it supports IB CM; it seems
there is currently no case where port 1 supports CM while the others don't.

Regards,
Michael Wang

>
>
>>
>> That seems reasonable, and solves the #10 problem, but we should
>> enforce this invariant during device register.
>>
>> Typically the ports seem to be completely orthogonal (like SA), so in those
>> cases the _dev and restriction makes no sense.
>
> While the ports in ib_sa and ib_umad probably can be orthogonal the current
> implementation does not support that and this patch series obscures that a bit.
>
>>
>> CM seems to be different, so it should probably enforce its rules
>
> Technically, the implementation of the ib_sa and ib_umad modules are not
> different it is just that Michaels patch is not broken.
>
> I'm not saying we can't change the ib_sa and ib_umad modules but the current
> logic is all or nothing. And I think changing this :
>
> 1) require more review and testing
> 2) is not the purpose of this series.
>
> Ira
>
>>
>> Jason

2015-04-14 15:40:28

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Tue, Apr 14, 2015 at 04:32:57PM +0200, Michael Wang wrote:
>
>
> On 04/14/2015 04:18 PM, ira.weiny wrote:
> [snip]
> >
> > /**
> > - * cap_ib_cm_dev - Check if any port of device has the capability Infiniband
> > - * Communication Manager.
> > + * cap_ib_cm_any_port - Check if any port of the device has Infiniband
> > + * Communication Manager (CM) support.
> > *
> > * @device: Device to be checked
> > *
> > - * Return 0 when all port of the device don't support Infiniband
> > - * Communication Manager.
> > + * Return 1 if any port of the device supports the IB CM.
> > */
> > -static inline int cap_ib_cm_dev(struct ib_device *device)
> > +static inline int cap_ib_cm_any_port(struct ib_device *device)
> > {
> > int i;
>
> I think we maybe able to get rid of this helper according to Sean's suggestion :-)
>
> We just need to check the port 1 of HCA see if it support ib cm, seems like
> currently there is no case that port 1 support cm while others doesn't.

But that moves us in the wrong direction. If we later support port 2 without
port 1 that code will be broken.

Ira

2015-04-14 15:50:50

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma

On Tue, Apr 14, 2015 at 10:35:34AM +0200, Michael Wang wrote:
>
>
> On 04/13/2015 09:25 PM, Hefty, Sean wrote:
> >> @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
> >> mutex_unlock(&id_priv->handler_mutex);
> >>
> >> if (id_priv->cma_dev) {
> >> - switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
> >> - case RDMA_TRANSPORT_IB:
> >> + if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
> >
> > A listen id can be associated with a device without being associated with a port (see the listen_any_list).
> > Some other check is needed to handle this case. I guess the code could check the first port on the device
> > (replace port_num with hardcoded value 1). Then we wouldn't be any more broken than the code already is.
> > (The 'break' is conceptual, not practical.)
>
> Agree, seems like this is very similar to the case of cma_listen_on_dev() which
> do not associated with any particular port in #24.
>
> If the port 1 is enough to present the whole device's cm capability, maybe we can
> get rid of cap_ib_cm_dev() too?
>
> And maybe cap_ib_cm(device, device->node_type == RDMA_NODE_IB_SWITCH ? 0:1) would
> be safer?

I don't see support for switch port 0 in cm_add_one() now. Are switches supposed
to be supported?

Ira

>
> Regards,
> Michael Wang
>

2015-04-14 15:51:47

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib



On 04/14/2015 05:40 PM, ira.weiny wrote:
> On Tue, Apr 14, 2015 at 04:32:57PM +0200, Michael Wang wrote:
>>
>>
>> On 04/14/2015 04:18 PM, ira.weiny wrote:
>> [snip]
>>>
>>> /**
>>> - * cap_ib_cm_dev - Check if any port of device has the capability Infiniband
>>> - * Communication Manager.
>>> + * cap_ib_cm_any_port - Check if any port of the device has Infiniband
>>> + * Communication Manager (CM) support.
>>> *
>>> * @device: Device to be checked
>>> *
>>> - * Return 0 when all port of the device don't support Infiniband
>>> - * Communication Manager.
>>> + * Return 1 if any port of the device supports the IB CM.
>>> */
>>> -static inline int cap_ib_cm_dev(struct ib_device *device)
>>> +static inline int cap_ib_cm_any_port(struct ib_device *device)
>>> {
>>> int i;
>>
>> I think we maybe able to get rid of this helper according to Sean's suggestion :-)
>>
>> We just need to check the port 1 of HCA see if it support ib cm, seems like
>> currently there is no case that port 1 support cm while others doesn't.
>
> But that moves us in the wrong direction. If we later support port 2 without
> port 1 that code will be broken.

I'm not sure we should sacrifice consistency at this moment for such a 'future'
capability... maybe we can leave such reform work to those who introduce the new capability?

Regards,
Michael Wang

>
> Ira
>

2015-04-14 15:59:00

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma



On 04/14/2015 05:50 PM, ira.weiny wrote:
> On Tue, Apr 14, 2015 at 10:35:34AM +0200, Michael Wang wrote:
>>
>>
>> On 04/13/2015 09:25 PM, Hefty, Sean wrote:
>>>> @@ -1037,17 +1033,13 @@ void rdma_destroy_id(struct rdma_cm_id *id)
>>>> mutex_unlock(&id_priv->handler_mutex);
>>>>
>>>> if (id_priv->cma_dev) {
>>>> - switch (rdma_node_get_transport(id_priv->id.device->node_type)) {
>>>> - case RDMA_TRANSPORT_IB:
>>>> + if (rdma_ib_or_iboe(id_priv->id.device, id_priv->id.port_num))
>>>
>>> A listen id can be associated with a device without being associated with a port (see the listen_any_list).
>>> Some other check is needed to handle this case. I guess the code could check the first port on the device
>>> (replace port_num with hardcoded value 1). Then we wouldn't be any more broken than the code already is.
>>> (The 'break' is conceptual, not practical.)
>>
>> Agree, seems like this is very similar to the case of cma_listen_on_dev() which
>> do not associated with any particular port in #24.
>>
>> If the port 1 is enough to present the whole device's cm capability, maybe we can
>> get rid of cap_ib_cm_dev() too?
>>
>> And maybe cap_ib_cm(device, device->node_type == RDMA_NODE_IB_SWITCH ? 0:1) would
>> be safer?
>
> I don't see support for switch port 0 in cm_add_one() now. Are switches supposed
> to be supported?

I'm just concerned about the validity of the port... is it possible that the device we check
here doesn't have a port 1? (forgive me if the question is too silly :-P)

Regards,
Michael Wang

>
> Ira
>
>>
>> Regards,
>> Michael Wang
>>

2015-04-14 17:10:06

by Hefty, Sean

[permalink] [raw]
Subject: RE: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

> But that moves us in the wrong direction. If we later support port 2
> without
> port 1 that code will be broken.

I agree that the code will be broken, but supporting that model requires a lot more work in how the ib_cm listens across devices.

2015-04-14 17:25:42

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Tue, Apr 14, 2015 at 10:18:07AM -0400, ira.weiny wrote:

> After more thought and reading other opinions, I must agree we should not
> have cap_foo_dev.

I looked at it a bit, and I think Sean has also basically said, CM
does not support certain mixed port combinations. iWarp and IB simply
cannot be mixed with the current CM and it doesn't look easy to fix
that. We can fix a few point areas simply, but not all of it.

So we have to have the _dev tests, only to fill the CM's need and they
must have the all true/all false/BUG semantics CM demands.

Verify on register.

> While the ports in ib_sa and ib_umad probably can be orthogonal the current
> implementation does not support that and this patch series obscures that a bit.

Really? Do you see any bugs/missed things? Both were made port
orthogonal when RoCEE was added, because RoCEE needs that.

CM wasn't because RoCEE and IB seem to use almost the same code,
though I wonder if mixing really works 100%..

Jason
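
A minimal sketch of the "all true / all false / BUG" rule and the "verify on register" step, assuming a simplified device model. `struct sketch_device`, `sketch_cm_dev_state()` and `sketch_register_device()` are hypothetical names; the sketch rejects a mixed device with an error code instead of triggering a kernel BUG:

```c
#include <stdbool.h>

#define SKETCH_MAX_PORTS 8

/* Hypothetical stand-in for struct ib_device; index 0 of the array is unused. */
struct sketch_device {
	int phys_port_cnt;
	bool port_has_cm[SKETCH_MAX_PORTS + 1];
};

enum sketch_cm_state { SKETCH_CM_NONE, SKETCH_CM_ALL, SKETCH_CM_MIXED };

/* Classify the device: no port, every port, or only some ports support CM. */
static enum sketch_cm_state sketch_cm_dev_state(const struct sketch_device *dev)
{
	int i, supported = 0;

	for (i = 1; i <= dev->phys_port_cnt; i++)
		if (dev->port_has_cm[i])
			supported++;

	if (supported == 0)
		return SKETCH_CM_NONE;
	if (supported == dev->phys_port_cnt)
		return SKETCH_CM_ALL;
	return SKETCH_CM_MIXED;
}

/* Registration-time check: returns 0 if the invariant holds, -1 if ports are mixed. */
static int sketch_register_device(const struct sketch_device *dev)
{
	return sketch_cm_dev_state(dev) == SKETCH_CM_MIXED ? -1 : 0;
}
```

Enforcing the invariant once at registration means a later per-device test can safely look at any single port, because a mixed device never gets past the check.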

2015-04-14 17:43:50

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Tue, Apr 14, 2015 at 11:25:15AM -0600, Jason Gunthorpe wrote:
> On Tue, Apr 14, 2015 at 10:18:07AM -0400, ira.weiny wrote:
>
> > After more thought and reading other opinions, I must agree we should not
> > have cap_foo_dev.
>
> I looked at it a bit, and I think Sean has also basically said, CM
> does not support certain mixed port combinations. iWarp and IB simply
> cannot be mixed with the current CM and it doesn't look easy to fix
> that. We can fix a few point areas simply, but not all of it.
>
> So we have to have the _dev tests, only to fill the CM's need and they
> must have the all true/all false/BUG semantics CM demands.
>
> Verify on register.
>
> > While the ports in ib_sa and ib_umad probably can be orthogonal the current
> > implementation does not support that and this patch series obscures that a bit.
>
> Really? Do you see any bugs/missed things? Both were made port
> orthogonal when RoCEE was added, because RoCEE needs that.

They are not completely orthogonal:

A failure to init port 2 ends up "killing" port 1 and releasing the
device associated resources.

static void ib_umad_add_one(struct ib_device *device)
{
...
	if (ib_umad_init_port(device, i, umad_dev,
			      &umad_dev->port[i - s]))
		goto err;
...

err:
	while (--i >= s) {
		if (!cap_ib_mad(device, i))
			continue;

		ib_umad_kill_port(&umad_dev->port[i - s]);
	}

	kobject_put(&umad_dev->kobj);
}

>
> CM wasn't because RoCEE and IB seem to use almost the same code,
> though I wonder if mixing really works 100%..

The support can (and should) be orthogonal but the implementation is
incomplete.

Ira

>
> Jason

2015-04-14 17:59:23

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Tue, Apr 14, 2015 at 01:43:11PM -0400, ira.weiny wrote:

> A failure to init port 2 ends up ends up "killing" port 1 and releasing the
> device associated resources.

Yes, that is the only reasonable thing that could happen.

init failure should only be possible under exceptional cases (OOM).

The only system response is to call ib_umad_add_one again - so of
course the first call had to completely clean up everything it did.

Hopefully all these errors propagate enough so that driver insmod fails
with a perfect clean up. Otherwise it is broken :|

Jason

2015-04-14 18:02:57

by Hefty, Sean

[permalink] [raw]
Subject: RE: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

> Yes, that is the only reasonable thing that could happen.
>
> init failure should only be possible under exceptional cases (OOM).
>
> The only system response is to call ib_umad_add_one again - so of
> course the first call had to completely clean up everything it did.

A reasonable follow up change would be to replace the add device callbacks with add port callbacks.

2015-04-14 18:21:54

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Tue, Apr 14, 2015 at 06:02:47PM +0000, Hefty, Sean wrote:
> > Yes, that is the only reasonable thing that could happen.
> >
> > init failure should only be possible under exceptional cases (OOM).
> >
> > The only system response is to call ib_umad_add_one again - so of
> > course the first call had to completely clean up everything it did.
>
> A reasonable follow up change would be to replace the add device
> callbacks with add port callbacks.

Yes, combined with a port argument to ib_set_client_data /
ib_get_client_data it would be a nice simplifying clean up.

It would be nice to have sane error handling too :( In an ideal world
the add call back should return an error and the thing that triggered
it should unwind back to module load failure.

Jason

2015-04-15 07:58:33

by Michael Wang

[permalink] [raw]
Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib



On 04/14/2015 08:21 PM, Jason Gunthorpe wrote:
> On Tue, Apr 14, 2015 at 06:02:47PM +0000, Hefty, Sean wrote:
>>> Yes, that is the only reasonable thing that could happen.
>>>
>>> init failure should only be possible under exceptional cases (OOM).
>>>
>>> The only system response is to call ib_umad_add_one again - so of
>>> course the first call had to completely clean up everything it did.
>>
>> A reasonable follow up change would be to replace the add device
>> callbacks with add port callbacks.
>
> Yes, combined with a port argument to ib_set_client_data /
> ib_get_client_data it would be a nice simplifying clean up.
>
> It would be nice to have sane error handling too :( In an ideal world
> the add call back should return an error and the thing that triggered
> it should unwind back to module load failure.

We could give the client->add() callback a return value and make ib_register_device()
return -ENOMEM when it fails. I'm just wondering why this wasn't done in the first
place; any special reason?

Regards,
Michael Wang

>
> Jason
>
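
A rough sketch of the error-returning add callback discussed above, assuming a heavily simplified client model. `struct sketch_client`, `sketch_register_device()` and the callbacks are hypothetical; in the code under discussion the add callback returns void, so this only illustrates the proposed unwinding, not existing behaviour:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device: just counts how many clients are currently attached. */
struct sketch_device {
	int active_clients;
};

struct sketch_client {
	int (*add)(struct sketch_device *dev);	/* 0 on success, negative errno on failure */
	void (*remove)(struct sketch_device *dev);
};

static int sketch_add_ok(struct sketch_device *dev)
{
	dev->active_clients++;
	return 0;
}

static int sketch_add_fail(struct sketch_device *dev)
{
	(void)dev;
	return -ENOMEM;		/* simulated allocation failure */
}

static void sketch_remove(struct sketch_device *dev)
{
	dev->active_clients--;
}

/* Call every client's add(); on failure, undo the earlier ones in reverse order. */
static int sketch_register_device(struct sketch_device *dev,
				  const struct sketch_client *clients, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = clients[i].add(dev);
		if (ret) {
			while (i-- > 0)
				clients[i].remove(dev);
			return ret;	/* propagate so the caller can fail cleanly */
		}
	}
	return 0;
}
```

Returning the error lets the caller unwind all the way back to module load, instead of leaving a device registered with only some of its clients attached.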

2015-04-15 18:36:25

by Hal Rosenstock

[permalink] [raw]
Subject: Re: [PATCH v3 01/28] IB/Verbs: Implement new callback query_transport()

On 4/13/2015 8:22 AM, Michael Wang wrote:
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 65994a1..d54f91e 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -75,10 +75,13 @@ enum rdma_node_type {
> };
>
> enum rdma_transport_type {
> + /* legacy for users */
> RDMA_TRANSPORT_IB,
> RDMA_TRANSPORT_IWARP,
> RDMA_TRANSPORT_USNIC,
> - RDMA_TRANSPORT_USNIC_UDP
> + RDMA_TRANSPORT_USNIC_UDP,
> + /* new transport */
> + RDMA_TRANSPORT_IBOE,
> };
>
> __attribute_const__ enum rdma_transport_type
> @@ -1501,6 +1504,8 @@ struct ib_device {
> int (*query_port)(struct ib_device *device,
> u8 port_num,
> struct ib_port_attr *port_attr);
> + enum rdma_transport_type (*query_transport)(struct ib_device *device,
> + u8 port_num);
> enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
> u8 port_num);
> int (*query_gid)(struct ib_device *device,

libibverbs also exposes transport at the device level. Isn't a change to
make transport per port rather than per device needed there as well to
be consistent with these proposed kernel changes ? If so, would the
additional IBoE transport be exposed ? We also need to worry about
backward compatibility for existing applications.

-- Hal

2015-04-15 18:36:44

by Hal Rosenstock

[permalink] [raw]
Subject: Re: [PATCH v3 10/28] IB/Verbs: Reform cm related part in IB-core cma

On 4/13/2015 3:50 PM, Jason Gunthorpe wrote:
> Less clear is how rocee vs ib work within a device... Can you APM
> between those two kinds of ports?

The specs allow this to work but AFAIK it's not implemented.

2015-04-15 19:29:57

by Hefty, Sean

[permalink] [raw]
Subject: RE: [PATCH v3 01/28] IB/Verbs: Implement new callback query_transport()

> libibverbs also exposes transport at the device level. Isn't a change to
> make transport per port rather than per device needed there as well to
> be consistent with these proposed kernel changes ? If so, would the
> additional IBoE transport be exposed ? We also need to worry about
> backward compatibility for existing applications.

Libibverbs probably can't change without bumping the major version number. If the kernel attribute structures change, this is probably something that the user_verbs module would need to handle to avoid breaking the ABI.

- Sean

2015-04-15 20:33:50

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 01/28] IB/Verbs: Implement new callback query_transport()

On Wed, Apr 15, 2015 at 02:36:13PM -0400, Hal Rosenstock wrote:
> On 4/13/2015 8:22 AM, Michael Wang wrote:
> > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> > index 65994a1..d54f91e 100644
> > --- a/include/rdma/ib_verbs.h
> > +++ b/include/rdma/ib_verbs.h
> > @@ -75,10 +75,13 @@ enum rdma_node_type {
> > };
> >
> > enum rdma_transport_type {
> > + /* legacy for users */
> > RDMA_TRANSPORT_IB,
> > RDMA_TRANSPORT_IWARP,
> > RDMA_TRANSPORT_USNIC,
> > - RDMA_TRANSPORT_USNIC_UDP
> > + RDMA_TRANSPORT_USNIC_UDP,
> > + /* new transport */
> > + RDMA_TRANSPORT_IBOE,
> > };
> >
> > __attribute_const__ enum rdma_transport_type
> > @@ -1501,6 +1504,8 @@ struct ib_device {
> > int (*query_port)(struct ib_device *device,
> > u8 port_num,
> > struct ib_port_attr *port_attr);
> > + enum rdma_transport_type (*query_transport)(struct ib_device *device,
> > + u8 port_num);
> > enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
> > u8 port_num);
> > int (*query_gid)(struct ib_device *device,
>
> libibverbs also exposes transport at the device level. Isn't a change to
> make transport per port rather than per device needed there as well to
> be consistent with these proposed kernel changes ? If so, would the
> additional IBoE transport be exposed ? We also need to worry about
> backward compatibility for existing applications.

Early on in this thread we agreed that user space would stay the same until we
get the kernel straightened out. The above enum and function are not exported
to user space so I believe this is in alignment with that plan.

Ira

2015-04-15 20:40:02

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 01/28] IB/Verbs: Implement new callback query_transport()

On Wed, Apr 15, 2015 at 01:29:31PM -0600, Hefty, Sean wrote:
> > libibverbs also exposes transport at the device level. Isn't a change to
> > make transport per port rather than per device needed there as well to
> > be consistent with these proposed kernel changes ? If so, would the
> > additional IBoE transport be exposed ? We also need to worry about
> > backward compatibility for existing applications.
>
> Libibverbs probably can't change without bumping the major version number. If the kernel attribute structures change, this is probably something that the user_verbs module would need to handle to avoid breaking the ABI.
>

Transport is not directly sent through the kernel ABI. It is fabricated when
the HCA provider driver is loaded.

static struct ibv_device *try_driver(struct ibv_driver *driver,
				     struct ibv_sysfs_dev *sysfs_dev)
{
...
	switch (dev->node_type) {
	case IBV_NODE_CA:
	case IBV_NODE_SWITCH:
	case IBV_NODE_ROUTER:
		dev->transport_type = IBV_TRANSPORT_IB;
		break;
	case IBV_NODE_RNIC:
		dev->transport_type = IBV_TRANSPORT_IWARP;
		break;
	case IBV_NODE_USNIC:
		dev->transport_type = IBV_TRANSPORT_USNIC;
		break;
	case IBV_NODE_USNIC_UDP:
		dev->transport_type = IBV_TRANSPORT_USNIC_UDP;
		break;
	default:
		dev->transport_type = IBV_TRANSPORT_UNKNOWN;
		break;
	}
...
}

I'm not sure how to fix this once we get to the user space problem. But I
don't think it affects this set of patches.

Ira

2015-04-16 07:30:47

by Michael Wang

Subject: Re: [PATCH v3 01/28] IB/Verbs: Implement new callback query_transport()

Hi, Hal

On 04/15/2015 08:36 PM, Hal Rosenstock wrote:
> On 4/13/2015 8:22 AM, Michael Wang wrote:
[snip]
>> __attribute_const__ enum rdma_transport_type
>> @@ -1501,6 +1504,8 @@ struct ib_device {
>> int (*query_port)(struct ib_device *device,
>> u8 port_num,
>> struct ib_port_attr *port_attr);
>> + enum rdma_transport_type (*query_transport)(struct ib_device *device,
>> + u8 port_num);
>> enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
>> u8 port_num);
>> int (*query_gid)(struct ib_device *device,
>
> libibverbs also exposes transport at the device level. Isn't a change to
> make transport per port rather than per device needed there as well to
> be consistent with these proposed kernel changes ? If so, would the
> additional IBoE transport be exposed ? We also need to worry about
> backward compatibility for existing applications.

The purpose of this patch set is to consolidate the management checks in
the IB core layer, without the user layer noticing.

The later bitmask reform should in turn go unnoticed by the core layer, so
user-layer applications should have no idea at all what happened inside the
kernel ;-)

Regards,
Michael Wang

>
> -- Hal
>

2015-04-16 11:40:34

by Hal Rosenstock

Subject: Re: [PATCH v3 01/28] IB/Verbs: Implement new callback query_transport()

On 4/15/2015 4:33 PM, ira.weiny wrote:
> On Wed, Apr 15, 2015 at 02:36:13PM -0400, Hal Rosenstock wrote:
>> On 4/13/2015 8:22 AM, Michael Wang wrote:
>>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>>> index 65994a1..d54f91e 100644
>>> --- a/include/rdma/ib_verbs.h
>>> +++ b/include/rdma/ib_verbs.h
>>> @@ -75,10 +75,13 @@ enum rdma_node_type {
>>> };
>>>
>>> enum rdma_transport_type {
>>> + /* legacy for users */
>>> RDMA_TRANSPORT_IB,
>>> RDMA_TRANSPORT_IWARP,
>>> RDMA_TRANSPORT_USNIC,
>>> - RDMA_TRANSPORT_USNIC_UDP
>>> + RDMA_TRANSPORT_USNIC_UDP,
>>> + /* new transport */
>>> + RDMA_TRANSPORT_IBOE,
>>> };
>>>
>>> __attribute_const__ enum rdma_transport_type
>>> @@ -1501,6 +1504,8 @@ struct ib_device {
>>> int (*query_port)(struct ib_device *device,
>>> u8 port_num,
>>> struct ib_port_attr *port_attr);
>>> + enum rdma_transport_type (*query_transport)(struct ib_device *device,
>>> + u8 port_num);
>>> enum rdma_link_layer (*get_link_layer)(struct ib_device *device,
>>> u8 port_num);
>>> int (*query_gid)(struct ib_device *device,
>>
>> libibverbs also exposes transport at the device level. Isn't a change to
>> make transport per port rather than per device needed there as well to
>> be consistent with these proposed kernel changes ? If so, would the
>> additional IBoE transport be exposed ? We also need to worry about
>> backward compatibility for existing applications.
>
> Early on in this thread we agreed that user space would stay the same until we
> get the kernel straightened out.

I missed that in this very long thread. I just wanted to be sure that
this will be addressed.

-- Hal

> The above enum and function are not exported
> to user space so I believe this is in alignment with that plan.
>
> Ira
>
> .
>

2015-04-16 13:58:04

by Or Gerlitz

Subject: Re: [PATCH v3 25/28] IB/Verbs: Use management helper cap_af_ib()

On Mon, Apr 13, 2015 at 3:35 PM, Michael Wang <[email protected]> wrote:
>
> Introduce helper cap_af_ib() to help us check if the port of an
> IB device support Native Infiniband Address.
>
> Cc: Steve Wise <[email protected]>
> Cc: Tom Talpey <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Doug Ledford <[email protected]>
> Cc: Ira Weiny <[email protected]>
> Cc: Sean Hefty <[email protected]>
> Signed-off-by: Michael Wang <[email protected]>
> ---
> drivers/infiniband/core/cma.c | 2 +-
> include/rdma/ib_verbs.h | 15 +++++++++++++++
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 65e41f4..7f5815d 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -470,7 +470,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
>
> list_for_each_entry(cur_dev, &dev_list, list) {
> for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
> - if (!rdma_ib_or_iboe(cur_dev->device, p))
> + if (!cap_af_ib(cur_dev->device, p))
> continue;
>
> if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 29ddd14..dfe33f3 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1879,6 +1879,21 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
> }
>
> /**
> + * cap_af_ib - Check if the port of device has the capability
> + * Native Infiniband Address.
> + *
> + * @device: Device to be checked
> + * @port_num: Port number of the device
> + *
> + * Return 0 when port of the device don't support
> + * Native Infiniband Address.
> + */
> +static inline int cap_af_ib(struct ib_device *device, u8 port_num)
> +{
> + return rdma_ib_or_iboe(device, port_num);
> +}

Sean, can you please put together a precise writeup of what it takes to
support AF_IB? I am a bit confused here and wasn't sure whether this can
be supported with RoCE.

2015-04-16 14:16:49

by Hal Rosenstock

Subject: Re: [PATCH v3 25/28] IB/Verbs: Use management helper cap_af_ib()

On 4/16/2015 9:57 AM, Or Gerlitz wrote:
> On Mon, Apr 13, 2015 at 3:35 PM, Michael Wang <[email protected]> wrote:
>>
>> Introduce helper cap_af_ib() to help us check if the port of an
>> IB device support Native Infiniband Address.
>>
>> Cc: Steve Wise <[email protected]>
>> Cc: Tom Talpey <[email protected]>
>> Cc: Jason Gunthorpe <[email protected]>
>> Cc: Doug Ledford <[email protected]>
>> Cc: Ira Weiny <[email protected]>
>> Cc: Sean Hefty <[email protected]>
>> Signed-off-by: Michael Wang <[email protected]>
>> ---
>> drivers/infiniband/core/cma.c | 2 +-
>> include/rdma/ib_verbs.h | 15 +++++++++++++++
>> 2 files changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index 65e41f4..7f5815d 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -470,7 +470,7 @@ static int cma_resolve_ib_dev(struct rdma_id_private *id_priv)
>>
>> list_for_each_entry(cur_dev, &dev_list, list) {
>> for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
>> - if (!rdma_ib_or_iboe(cur_dev->device, p))
>> + if (!cap_af_ib(cur_dev->device, p))
>> continue;
>>
>> if (ib_find_cached_pkey(cur_dev->device, p, pkey, &index))
>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>> index 29ddd14..dfe33f3 100644
>> --- a/include/rdma/ib_verbs.h
>> +++ b/include/rdma/ib_verbs.h
>> @@ -1879,6 +1879,21 @@ static inline int cap_ipoib(struct ib_device *device, u8 port_num)
>> }
>>
>> /**
>> + * cap_af_ib - Check if the port of device has the capability
>> + * Native Infiniband Address.
>> + *
>> + * @device: Device to be checked
>> + * @port_num: Port number of the device
>> + *
>> + * Return 0 when port of the device don't support
>> + * Native Infiniband Address.
>> + */
>> +static inline int cap_af_ib(struct ib_device *device, u8 port_num)
>> +{
>> + return rdma_ib_or_iboe(device, port_num);
>> +}
>
> Sean, can you please put a precise writeup what does it take to
> support AF_IB... I am a bit
> confused here and wasn't sure if this can be supported with RoCE.

I think this means the check is for IB GID addressing (Native InfiniBand
Address) and not for AF_IB (which is a socket address/protocol family like
INET and INET6).

I think this naming is confusing; maybe cap_ib_gid is better?

-- Hal


2015-04-16 15:12:21

by Hefty, Sean

Subject: RE: [PATCH v3 25/28] IB/Verbs: Use management helper cap_af_ib()

> >> diff --git a/drivers/infiniband/core/cma.c
> b/drivers/infiniband/core/cma.c
> >> index 65e41f4..7f5815d 100644
> >> --- a/drivers/infiniband/core/cma.c
> >> +++ b/drivers/infiniband/core/cma.c
> >> @@ -470,7 +470,7 @@ static int cma_resolve_ib_dev(struct
> rdma_id_private *id_priv)
> >>
> >> list_for_each_entry(cur_dev, &dev_list, list) {
> >> for (p = 1; p <= cur_dev->device->phys_port_cnt; ++p) {
> >> - if (!rdma_ib_or_iboe(cur_dev->device, p))
> >> + if (!cap_af_ib(cur_dev->device, p))
> >> continue;
> >>
> >> if (ib_find_cached_pkey(cur_dev->device, p,
> pkey, &index))
> >> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> >> index 29ddd14..dfe33f3 100644
> >> --- a/include/rdma/ib_verbs.h
> >> +++ b/include/rdma/ib_verbs.h
> >> @@ -1879,6 +1879,21 @@ static inline int cap_ipoib(struct ib_device
> *device, u8 port_num)
> >> }
> >>
> >> /**
> >> + * cap_af_ib - Check if the port of device has the capability
> >> + * Native Infiniband Address.
> >> + *
> >> + * @device: Device to be checked
> >> + * @port_num: Port number of the device
> >> + *
> >> + * Return 0 when port of the device don't support
> >> + * Native Infiniband Address.
> >> + */
> >> +static inline int cap_af_ib(struct ib_device *device, u8 port_num)
> >> +{
> >> + return rdma_ib_or_iboe(device, port_num);
> >> +}
> >
> > Sean, can you please put a precise writeup what does it take to
> > support AF_IB... I am a bit
> > confused here and wasn't sure if this can be supported with RoCE.
>
> I think this means IB GID addressing is checked (Native Infiniband
> Address) and not AF_IB (which is socket address/protocol family like
> INET and INET6).
>
> I think this naming is confusing and maybe cap_ib_gid is better ?

I need to trace back through the code to see where this is used, but I thought the check was determining if an address conveyed using sockaddr_ib applied to the port. Sockaddr_ib was intended to target IB-classic specifically. It should apply to RoCE/IBoE as well. I don't think it applies to RoCEv2/IBoUDP without a translation.

2015-04-16 15:28:52

by Michael Wang

Subject: Re: [PATCH v3 25/28] IB/Verbs: Use management helper cap_af_ib()



On 04/16/2015 05:09 PM, Hefty, Sean wrote:
[snip]
>>> Sean, can you please put a precise writeup what does it take to
>>> support AF_IB... I am a bit
>>> confused here and wasn't sure if this can be supported with RoCE.
>>
>> I think this means IB GID addressing is checked (Native Infiniband
>> Address) and not AF_IB (which is socket address/protocol family like
>> INET and INET6).
>>
>> I think this naming is confusing and maybe cap_ib_gid is better ?
>
> I need to trace back through the code to see where this is used, but I thought the check was determining if an address conveyed using sockaddr_ib applied to the port. Sockaddr_ib was intended to target IB-classic specifically. It should apply to RoCE/IBoE as well. I don't think it applies to RoCEv2/IBoUDP without a translation.

The usage is:

rdma_resolve_addr
{
	...
	if dst_addr->sa_family == AF_IB
		cma_resolve_ib_addr --> cma_resolve_ib_dev
	else
		rdma_resolve_ip
	...
}

So I guess IBoE uses an IB address rather than an IP address here?

Regards,
Michael Wang
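
The dispatch outlined above can be written as a small stand-alone C example. This is a simplification under stated assumptions: `MOCK_AF_IB`, `enum resolve_path` and `pick_resolve_path()` are hypothetical names invented here, and the real `rdma_resolve_addr()` does considerably more than pick a branch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Assumption: stand-in for the kernel's AF_IB so the sketch builds anywhere. */
#define MOCK_AF_IB 27

enum resolve_path {
	RESOLVE_VIA_IB_ADDR,	/* cma_resolve_ib_addr --> cma_resolve_ib_dev */
	RESOLVE_VIA_IP,		/* rdma_resolve_ip */
};

/* Hypothetical helper: models only the address-family branch. */
static enum resolve_path pick_resolve_path(const struct sockaddr *dst_addr)
{
	assert(dst_addr != NULL);
	if (dst_addr->sa_family == MOCK_AF_IB)
		return RESOLVE_VIA_IB_ADDR;
	return RESOLVE_VIA_IP;
}
```

Only destinations with the IB address family reach the device scan in cma_resolve_ib_dev, which is where the cap_af_ib() check from the patch sits.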

>

2015-04-16 16:43:31

by Jason Gunthorpe

Subject: Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()

On Tue, Apr 14, 2015 at 11:13:03AM +0200, Michael Wang wrote:

> > I would be very happy to see a patch that adds cap_ib_smi to the
> > current tree and states 'This patch is tested to have no change on the
> > binary compilation results'
>
> There are too much reform there (per-dev to per-port), I guess the binary
> will changed more or less anyway...

I think this patch series is huge, and every time someone new looks at
it small functional errors seem to pop up..

Doing something to reduce the review surface would be really helpful
here. Not changing the same line twice, using tools to perform these
transforms and then asserting the patch is a NOP because .. tools. Some
other idea?

Jason

2015-04-16 16:44:31

by Jason Gunthorpe

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Wed, Apr 15, 2015 at 09:58:18AM +0200, Michael Wang wrote:

> We can give client->add() callback a return value and make
> ib_register_device() return -ENOMEM when it failed, just wondering
> why we don't do this at first, any special reason?

No idea, but having ib_register_device fail and unwind if a client
fails to attach makes sense to me.

Jason

2015-04-16 17:03:18

by Roland Dreier

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Thu, Apr 16, 2015 at 9:44 AM, Jason Gunthorpe
<[email protected]> wrote:
>> We can give client->add() callback a return value and make
>> ib_register_device() return -ENOMEM when it failed, just wondering
>> why we don't do this at first, any special reason?

> No idea, but having ib_register_device fail and unwind if a client
> fails to attach makes sense to me.

It seems a bit unfriendly to fail an entire device if one ULP has a
problem. Let's say you have a system whose main network connection is
IPoIB. Would you want that connection to come up even if, say, the
NFS/RDMA server fails to find the memory registration type it likes?

- R.

2015-04-16 17:05:49

by Ira Weiny

Subject: RE: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

>
> On Wed, Apr 15, 2015 at 09:58:18AM +0200, Michael Wang wrote:
>
> > We can give client->add() callback a return value and make
> > ib_register_device() return -ENOMEM when it failed, just wondering why
> > we don't do this at first, any special reason?
>
> No idea, but having ib_register_device fail and unwind if a client fails to attach
> makes sense to me.

Yes that is what we should do _but_

I think we should tackle that in a different series.

As you said in another email, this series is getting very long and hard to review and prove correct. This is why I was advocating keeping a check at the top of cm_add_one which verifies that all ports support the CM. This is the logic in place today and it is proven to work for the devices/use cases out there.

We can clean up the initialization code and implement support for individual ports in follow-on patches.

Ira
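
The up-front check described above might look roughly like this. It is a user-space sketch only: `mock_device`, `cap_ib_cm_mock()` and `all_ports_support_cm()` are hypothetical names, and the per-port capability is modelled as a plain flag array rather than the real helper from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_PORTS 4

/* Hypothetical device model: one "supports CM" flag per port. */
struct mock_device {
	uint8_t phys_port_cnt;
	bool port_has_cm[MOCK_MAX_PORTS];
};

/* Hypothetical per-port capability helper; ports are numbered from 1. */
static bool cap_ib_cm_mock(const struct mock_device *device, uint8_t port_num)
{
	assert(device != NULL);
	assert(port_num >= 1 && port_num <= device->phys_port_cnt);
	return device->port_has_cm[port_num - 1];
}

/*
 * Guard meant for the top of an add_one()-style callback: report failure
 * unless every port supports the CM. This keeps the all-or-nothing
 * behaviour while the individual checks are already per-port.
 */
static bool all_ports_support_cm(const struct mock_device *device)
{
	uint8_t p;

	for (p = 1; p <= device->phys_port_cnt; ++p) {
		if (!cap_ib_cm_mock(device, p))
			return false;
	}
	return true;
}
```

A device with mixed ports is rejected as a whole here; handling such a device port by port is the follow-on work mentioned above.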

2015-04-16 17:21:32

by Hefty, Sean

Subject: RE: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

> > No idea, but having ib_register_device fail and unwind if a client
> > fails to attach makes sense to me.
>
> It seems a bit unfriendly to fail an entire device if one ULP has a
> problem. Let's say you have a system whose main network connection is
> IPoIB. Would you want that connection to come up even if, say, the
> NFS/RDMA server fails to find the memory registration type it likes?

What's missing is some way for the device to indicate which modules are actually necessary for it to run. Without having something like that, I agree with Roland.

- Sean

2015-04-16 17:46:07

by Jason Gunthorpe

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On Thu, Apr 16, 2015 at 10:02:46AM -0700, Roland Dreier wrote:
> On Thu, Apr 16, 2015 at 9:44 AM, Jason Gunthorpe
> <[email protected]> wrote:
> >> We can give client->add() callback a return value and make
> >> ib_register_device() return -ENOMEM when it failed, just wondering
> >> why we don't do this at first, any special reason?
>
> > No idea, but having ib_register_device fail and unwind if a client
> > fails to attach makes sense to me.
>
> It seems a bit unfriendly to fail an entire device if one ULP has a
> problem. Let's say you have a system whose main network connection is
> IPoIB. Would you want that connection to come up even if, say, the
> NFS/RDMA server fails to find the memory registration type it likes?

That is true, but we also never test any of the cases where one
expected ULP fails to load but another one needs it. Like IPoIB needs
sa_query, for instance.

I'm not saying ULPs should error for soft reasons, like your NFS
example, this unwind would be for OOM, or other 'impossible' class
errors.

That said, the driver core does not fail a bus driver register if a
client probe fails (we could copy this), but it also doesn't just
silently eat the error code either.

What about ib_register_client? If any of the add() calls fail, should it
unwind and fail upwards?

Jason
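
One way to picture the two positions in this sub-thread is the sketch below. It is an illustration, not the kernel's ib_register_device(): every `mock_` name and the example callbacks are hypothetical, and treating -ENOMEM as the only "hard" error is an assumption made here to show the split between unwinding and carrying on.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_MAX_CLIENTS 4

struct mock_device;

/* Hypothetical client: add() returns 0 or a negative errno. */
struct mock_client {
	const char *name;
	int (*add)(struct mock_device *device);
	void (*remove)(struct mock_device *device);
};

struct mock_device {
	int attached[MOCK_MAX_CLIENTS];	/* 1 if the client at that index attached */
};

/*
 * Hypothetical registration loop. A hard failure (-ENOMEM here) unwinds the
 * clients attached so far and fails the registration. Any other error is
 * treated as soft: that client is skipped and the device stays up.
 */
static int mock_register_device(struct mock_device *device,
				struct mock_client *const *clients, size_t n)
{
	size_t i;
	int ret;

	assert(device != NULL && n <= MOCK_MAX_CLIENTS);
	for (i = 0; i < n; i++) {
		ret = clients[i]->add(device);
		if (ret == 0) {
			device->attached[i] = 1;
			continue;
		}
		if (ret != -ENOMEM)
			continue;	/* soft failure: leave this client detached */
		while (i-- > 0) {	/* hard failure: unwind in reverse order */
			if (device->attached[i]) {
				clients[i]->remove(device);
				device->attached[i] = 0;
			}
		}
		return ret;
	}
	return 0;
}

/* Example callbacks used to exercise the loop. */
static int add_ok(struct mock_device *device) { (void)device; return 0; }
static int add_unsupported(struct mock_device *device) { (void)device; return -EOPNOTSUPP; }
static int add_oom(struct mock_device *device) { (void)device; return -ENOMEM; }
static void remove_noop(struct mock_device *device) { (void)device; }
```

With this split, a client that cannot attach for a soft reason does not take the device down, while an out-of-memory failure is still reported to the caller instead of being silently dropped.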

2015-04-16 18:07:53

by Steve Wise

Subject: RE: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()



> -----Original Message-----
> From: Jason Gunthorpe [mailto:[email protected]]
> Sent: Thursday, April 16, 2015 11:43 AM
> To: Michael Wang
> Cc: Roland Dreier; Sean Hefty; Hal Rosenstock; [email protected]; [email protected]; Tom Tucker; Steve Wise;
> Hoang-Nam Nguyen; Christoph Raisch; Mike Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran; Ira Weiny;
> Tom Talpey; Doug Ledford
> Subject: Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()
>
> On Tue, Apr 14, 2015 at 11:13:03AM +0200, Michael Wang wrote:
>
> > > I would be very happy to see a patch that adds cap_ib_smi to the
> > > current tree and states 'This patch is tested to have no change on the
> > > binary compilation results'
> >
> > There are too much reform there (per-dev to per-port), I guess the binary
> > will changed more or less anyway...
>
> I think this patch series is huge, and everytime someone new looks at
> it small functional errors seem to pop up..
>
> Doing something to reduce the review surface would be really helpful
> here. Not changing the same line twice, using tools too perform these
> transforms and then assert the patch is a NOP because .. tools. Some
> other idea?
>

Don't try and change everything in one giant series. Just do some changes this cycle (keep it at < 8 or 10 patches), and do more later...

2015-04-17 07:35:58

by Michael Wang

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

Hi, Roland

Thanks for the comment :-)

On 04/16/2015 07:02 PM, Roland Dreier wrote:
> On Thu, Apr 16, 2015 at 9:44 AM, Jason Gunthorpe
> <[email protected]> wrote:
>>> We can give client->add() callback a return value and make
>>> ib_register_device() return -ENOMEM when it failed, just wondering
>>> why we don't do this at first, any special reason?
>
>> No idea, but having ib_register_device fail and unwind if a client
>> fails to attach makes sense to me.
>
> It seems a bit unfriendly to fail an entire device if one ULP has a
> problem. Let's say you have a system whose main network connection is
> IPoIB. Would you want that connection to come up even if, say, the
> NFS/RDMA server fails to find the memory registration type it likes?

Agreed, the idea is right: one client's initialization failure should not
affect the whole device, as long as the remaining clients can keep the device
working (but how do we determine that...).

Still, simply ignoring the failure seems really strange...

Regards,
Michael Wang

>
> - R.
>

2015-04-17 07:40:45

by Michael Wang

Subject: Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib

On 04/16/2015 07:05 PM, Weiny, Ira wrote:
>>
>> On Wed, Apr 15, 2015 at 09:58:18AM +0200, Michael Wang wrote:
>>
>>> We can give client->add() callback a return value and make
>>> ib_register_device() return -ENOMEM when it failed, just wondering why
>>> we don't do this at first, any special reason?
>>
>> No idea, but having ib_register_device fail and unwind if a client fails to attach
>> makes sense to me.
>
> Yes that is what we should do _but_
>
> I think we should tackle that in a different series.
>
> As you said in another email, this series is getting very long and hard to review/prove is correct. This is why I was advocating keeping a check at the top of cm_add_one which verified all Ports supported the CM. This is the current logic today and is proven to work for the devices/use cases out there.
>
> We can clean up the initialization code and implement support for individual ports in follow on patches.

Agreed, as long as this series does not introduce any bugs, I suggest we
put the other reform ideas into the next series :-)

We have already eliminated the old way of inferring and folded all the
cases into helpers, so further reform should be far clearer on top of
this foundation.

Regards,
Michael Wang

>
> Ira
>

2015-04-17 08:00:15

by Michael Wang

Subject: Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()



On 04/16/2015 06:43 PM, Jason Gunthorpe wrote:
> On Tue, Apr 14, 2015 at 11:13:03AM +0200, Michael Wang wrote:
>
>>> I would be very happy to see a patch that adds cap_ib_smi to the
>>> current tree and states 'This patch is tested to have no change on the
>>> binary compilation results'
>>
>> There are too much reform there (per-dev to per-port), I guess the binary
>> will changed more or less anyway...
>
> I think this patch series is huge, and everytime someone new looks at
> it small functional errors seem to pop up..

This is a big change after all :-P

As Doug suggested at the very beginning, all these changes are necessary
in order to eliminate the old inferring method; then we will have a clean
stage for the next reform.

And since it's big, I tried to group the patches by logic to make them
easier to review. I'm not sure, but compressing the series may make the
review harder...

>
> Doing something to reduce the review surface would be really helpful
> here. Not changing the same line twice, using tools too perform these
> transforms and then assert the patch is a NOP because .. tools. Some
> other idea?

Actually the main reform work is finished in patches 1~15; the rest just
introduce cap_XX helpers, for which we only need to check the description
and usage. Thus I'd like to suggest we focus on reviewing patches 1~15; after
all, the rest won't introduce bugs and we can edit them at any time :-P

Frankly speaking, I think it's a good thing that we are locating errors at
this point: whenever someone finds an issue, it means the patch has been
reviewed thoroughly. I think maybe just a few more versions and this
series will become stable ;-)

Regards,
Michael Wang


>
> Jason
>

2015-04-17 08:04:56

by Michael Wang

Subject: Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()

On 04/16/2015 08:07 PM, Steve Wise wrote:
>
>
>> -----Original Message-----
>> From: Jason Gunthorpe [mailto:[email protected]]
>> Sent: Thursday, April 16, 2015 11:43 AM
>> To: Michael Wang
>> Cc: Roland Dreier; Sean Hefty; Hal Rosenstock; [email protected]; [email protected]; Tom Tucker; Steve Wise;
>> Hoang-Nam Nguyen; Christoph Raisch; Mike Marciniszyn; Eli Cohen; Faisal Latif; Jack Morgenstein; Or Gerlitz; Haggai Eran; Ira Weiny;
>> Tom Talpey; Doug Ledford
>> Subject: Re: [PATCH v3 27/28] IB/Verbs: Clean up rdma_ib_or_iboe()
>>
>> On Tue, Apr 14, 2015 at 11:13:03AM +0200, Michael Wang wrote:
>>
>>>> I would be very happy to see a patch that adds cap_ib_smi to the
>>>> current tree and states 'This patch is tested to have no change on the
>>>> binary compilation results'
>>>
>>> There are too much reform there (per-dev to per-port), I guess the binary
>>> will changed more or less anyway...
>>
>> I think this patch series is huge, and everytime someone new looks at
>> it small functional errors seem to pop up..
>>
>> Doing something to reduce the review surface would be really helpful
>> here. Not changing the same line twice, using tools too perform these
>> transforms and then assert the patch is a NOP because .. tools. Some
>> other idea?
>>
>
> Don't try and change everything in one giant series. Just do some changes this cycle (keep it at < 8 or 10 patches), and do more
> later...

Actually only patches 1~15 are related to the logic reform; the rest are just
replacements :-)

I too would like to stop introducing new stuff at this point and focus on
improving what we have already settled on.

Regards,
Michael Wang

>