2016-10-17 16:31:59

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 0/8] infiniband: Remove semaphores

Hi,

These are a set of patches which removes semaphores from infiniband.
These are part of a bigger effort to eliminate all semaphores from the
linux kernel.

NB: A few semaphores which are counting ones are replaced with an open-coded
implementation by introducing a new type in 'include/rdma/ib_sa.h'. Need to
see if this can be programmed in a generic way using wait queues.

Thanks,
Binoy

Binoy Jayan (8):
IB/core: iwpm_nlmsg_request: Replace semaphore with completion
IB/core: Replace semaphore sm_sem with completion
IB/hns: Replace semaphore poll_sem with mutex
IB/mthca: Replace semaphore poll_sem with mutex
IB/isert: Replace semaphore sem with completion
IB/hns: Replace counting semaphore event_sem with wait condition
IB/mthca: Replace counting semaphore event_sem with wait condition
IB/mlx5: Replace counting semaphore sem with wait condition

drivers/infiniband/core/iwpm_msg.c | 8 ++++----
drivers/infiniband/core/iwpm_util.c | 7 +++----
drivers/infiniband/core/iwpm_util.h | 3 ++-
drivers/infiniband/core/user_mad.c | 14 ++++++++------
drivers/infiniband/hw/hns/hns_roce_cmd.c | 28 ++++++++++++++++++----------
drivers/infiniband/hw/hns/hns_roce_device.h | 6 ++++--
drivers/infiniband/hw/mlx5/main.c | 3 ++-
drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
drivers/infiniband/hw/mlx5/mr.c | 28 +++++++++++++++++++---------
drivers/infiniband/hw/mthca/mthca_cmd.c | 22 +++++++++++++---------
drivers/infiniband/hw/mthca/mthca_cmd.h | 1 +
drivers/infiniband/hw/mthca/mthca_dev.h | 5 +++--
drivers/infiniband/ulp/isert/ib_isert.c | 6 +++---
drivers/infiniband/ulp/isert/ib_isert.h | 3 ++-
include/rdma/ib_sa.h | 5 +++++
15 files changed, 89 insertions(+), 53 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


2016-10-17 16:32:09

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 2/8] IB/core: Replace semaphore sm_sem with completion

The semaphore 'sm_sem' is used as completion, so convert it to
struct completion. Semaphores are going away in the future. The initial
status of the completion variable is marked as completed by a call to
the function 'complete' immediately following the initialization.

Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/core/user_mad.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 415a318..df070cc 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -47,6 +47,7 @@
#include <linux/kref.h>
#include <linux/compat.h>
#include <linux/sched.h>
+#include <linux/completion.h>
#include <linux/semaphore.h>
#include <linux/slab.h>

@@ -87,7 +88,7 @@ struct ib_umad_port {

struct cdev sm_cdev;
struct device *sm_dev;
- struct semaphore sm_sem;
+ struct completion sm_comp;

struct mutex file_mutex;
struct list_head file_list;
@@ -1030,12 +1031,12 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
port = container_of(inode->i_cdev, struct ib_umad_port, sm_cdev);

if (filp->f_flags & O_NONBLOCK) {
- if (down_trylock(&port->sm_sem)) {
+ if (!try_wait_for_completion(&port->sm_comp)) {
ret = -EAGAIN;
goto fail;
}
} else {
- if (down_interruptible(&port->sm_sem)) {
+ if (wait_for_completion_interruptible(&port->sm_comp)) {
ret = -ERESTARTSYS;
goto fail;
}
@@ -1060,7 +1061,7 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
ib_modify_port(port->ib_dev, port->port_num, 0, &props);

err_up_sem:
- up(&port->sm_sem);
+ complete(&port->sm_comp);

fail:
return ret;
@@ -1079,7 +1080,7 @@ static int ib_umad_sm_close(struct inode *inode, struct file *filp)
ret = ib_modify_port(port->ib_dev, port->port_num, 0, &props);
mutex_unlock(&port->file_mutex);

- up(&port->sm_sem);
+ complete(&port->sm_comp);

kobject_put(&port->umad_dev->kobj);

@@ -1177,7 +1178,8 @@ static int ib_umad_init_port(struct ib_device *device, int port_num,

port->ib_dev = device;
port->port_num = port_num;
- sema_init(&port->sm_sem, 1);
+ init_completion(&port->sm_comp);
+ complete(&port->sm_comp);
mutex_init(&port->file_mutex);
INIT_LIST_HEAD(&port->file_list);

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2016-10-17 16:32:18

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 8/8] IB/mlx5: Replace counting semaphore sem with wait condition

Counting semaphores are going away in the future, so replace the semaphore
umr_common::sem with an open-coded implementation.

Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/mlx5/main.c | 3 ++-
drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 ++-
drivers/infiniband/hw/mlx5/mr.c | 28 +++++++++++++++++++---------
3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 2217477..5667ea8 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2520,7 +2520,8 @@ static int create_umr_res(struct mlx5_ib_dev *dev)
dev->umrc.cq = cq;
dev->umrc.pd = pd;

- sema_init(&dev->umrc.sem, MAX_UMR_WR);
+ init_waitqueue_head(&dev->umrc.sem.wq);
+ atomic_set(&dev->umrc.sem.count, MAX_UMR_WR);
ret = mlx5_mr_cache_init(dev);
if (ret) {
mlx5_ib_warn(dev, "mr cache init failed %d\n", ret);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index dcdcd19..60e2d29 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -45,6 +45,7 @@
#include <linux/mlx5/transobj.h>
#include <rdma/ib_user_verbs.h>
#include <rdma/mlx5-abi.h>
+#include <rdma/ib_sa.h>

#define mlx5_ib_dbg(dev, format, arg...) \
pr_debug("%s:%s:%d:(pid %d): " format, (dev)->ib_dev.name, __func__, \
@@ -533,7 +534,7 @@ struct umr_common {
struct ib_qp *qp;
/* control access to UMR QP
*/
- struct semaphore sem;
+ struct ib_semaphore sem;
};

enum {
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index d4ad672..7c2af26 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -900,7 +900,9 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
prep_umr_reg_wqe(pd, &umrwr.wr, &sg, dma, npages, mr->mmkey.key,
page_shift, virt_addr, len, access_flags);

- down(&umrc->sem);
+ wait_event(umrc->sem.wq,
+ atomic_add_unless(&umrc->sem.count, -1, 0));
+
err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
if (err) {
mlx5_ib_warn(dev, "post send failed, err %d\n", err);
@@ -920,7 +922,8 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
mr->live = 1;

unmap_dma:
- up(&umrc->sem);
+ if (atomic_inc_return(&umrc->sem.count) == 1)
+ wake_up(&umrc->sem.wq);
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);

kfree(mr_pas);
@@ -1031,7 +1034,8 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
wr.mkey = mr->mmkey.key;
wr.target.offset = start_page_index;

- down(&umrc->sem);
+ wait_event(umrc->sem.wq,
+ atomic_add_unless(&umrc->sem.count, -1, 0));
err = ib_post_send(umrc->qp, &wr.wr, &bad);
if (err) {
mlx5_ib_err(dev, "UMR post send failed, err %d\n", err);
@@ -1043,7 +1047,8 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
err = -EFAULT;
}
}
- up(&umrc->sem);
+ if (atomic_inc_return(&umrc->sem.count) == 1)
+ wake_up(&umrc->sem.wq);
}
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);

@@ -1224,15 +1229,18 @@ static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
umrwr.wr.wr_cqe = &umr_context.cqe;
prep_umr_unreg_wqe(dev, &umrwr.wr, mr->mmkey.key);

- down(&umrc->sem);
+ wait_event(umrc->sem.wq,
+ atomic_add_unless(&umrc->sem.count, -1, 0));
err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
if (err) {
- up(&umrc->sem);
+ if (atomic_inc_return(&umrc->sem.count) == 1)
+ wake_up(&umrc->sem.wq);
mlx5_ib_dbg(dev, "err %d\n", err);
goto error;
} else {
wait_for_completion(&umr_context.done);
- up(&umrc->sem);
+ if (atomic_inc_return(&umrc->sem.count) == 1)
+ wake_up(&umrc->sem.wq);
}
if (umr_context.status != IB_WC_SUCCESS) {
mlx5_ib_warn(dev, "unreg umr failed\n");
@@ -1291,7 +1299,8 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
}

/* post send request to UMR QP */
- down(&umrc->sem);
+ wait_event(umrc->sem.wq,
+ atomic_add_unless(&umrc->sem.count, -1, 0));
err = ib_post_send(umrc->qp, &umrwr.wr, &bad);

if (err) {
@@ -1305,7 +1314,8 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
}
}

- up(&umrc->sem);
+ if (atomic_inc_return(&umrc->sem.count) == 1)
+ wake_up(&umrc->sem.wq);
if (flags & IB_MR_REREG_TRANS) {
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
kfree(mr_pas);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2016-10-17 16:32:24

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 7/8] IB/mthca: Replace counting semaphore event_sem with wait condition

Counting semaphores are going away in the future, so replace the semaphore
mthca_cmd::event_sem with an open-coded implementation.

Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/mthca/mthca_cmd.c | 12 ++++++++----
drivers/infiniband/hw/mthca/mthca_dev.h | 3 ++-
2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index 0cb58ea..5b8d5ea 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -417,7 +417,8 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
int err = 0;
struct mthca_cmd_context *context;

- down(&dev->cmd.event_sem);
+ wait_event(dev->cmd.event_sem.wq,
+ atomic_add_unless(&dev->cmd.event_sem.count, -1, 0));

spin_lock(&dev->cmd.context_lock);
BUG_ON(dev->cmd.free_head < 0);
@@ -459,7 +460,8 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
dev->cmd.free_head = context - dev->cmd.context;
spin_unlock(&dev->cmd.context_lock);

- up(&dev->cmd.event_sem);
+ if (atomic_inc_return(&dev->cmd.event_sem.count) == 1)
+ wake_up(&dev->cmd.event_sem.wq);
return err;
}

@@ -571,7 +573,8 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
dev->cmd.context[dev->cmd.max_cmds - 1].next = -1;
dev->cmd.free_head = 0;

- sema_init(&dev->cmd.event_sem, dev->cmd.max_cmds);
+ init_waitqueue_head(&dev->cmd.event_sem.wq);
+ atomic_set(&dev->cmd.event_sem.count, dev->cmd.max_cmds);
spin_lock_init(&dev->cmd.context_lock);

for (dev->cmd.token_mask = 1;
@@ -597,7 +600,8 @@ void mthca_cmd_use_polling(struct mthca_dev *dev)
dev->cmd.flags &= ~MTHCA_CMD_USE_EVENTS;

for (i = 0; i < dev->cmd.max_cmds; ++i)
- down(&dev->cmd.event_sem);
+ wait_event(dev->cmd.event_sem.wq,
+ atomic_add_unless(&dev->cmd.event_sem.count, -1, 0));

kfree(dev->cmd.context);

diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 87ab964..1f88835c 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -46,6 +46,7 @@
#include <linux/list.h>
#include <linux/semaphore.h>

+#include <rdma/ib_sa.h>
#include "mthca_provider.h"
#include "mthca_doorbell.h"

@@ -121,7 +122,7 @@ struct mthca_cmd {
struct pci_pool *pool;
struct mutex hcr_mutex;
struct mutex poll_mutex;
- struct semaphore event_sem;
+ struct ib_semaphore event_sem;
int max_cmds;
spinlock_t context_lock;
int free_head;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2016-10-17 16:32:36

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 6/8] IB/hns: Replace counting semaphore event_sem with wait condition

Counting semaphores are going away in the future, so replace the semaphore
hns_roce_cmdq::event_sem with an open-coded implementation.

Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/hns/hns_roce_cmd.c | 16 ++++++++++++----
drivers/infiniband/hw/hns/hns_roce_device.h | 3 ++-
include/rdma/ib_sa.h | 5 +++++
3 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 1421fdb..3e76717 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -248,10 +248,14 @@ static int hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
{
int ret = 0;

- down(&hr_dev->cmd.event_sem);
+ wait_event(hr_dev->cmd.event_sem.wq,
+ atomic_add_unless(&hr_dev->cmd.event_sem.count, -1, 0));
+
ret = __hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
in_modifier, op_modifier, op, timeout);
- up(&hr_dev->cmd.event_sem);
+
+ if (atomic_inc_return(&hr_dev->cmd.event_sem.count) == 1)
+ wake_up(&hr_dev->cmd.event_sem.wq);

return ret;
}
@@ -313,7 +317,9 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
hr_cmd->context[hr_cmd->max_cmds - 1].next = -1;
hr_cmd->free_head = 0;

- sema_init(&hr_cmd->event_sem, hr_cmd->max_cmds);
+ init_waitqueue_head(&hr_cmd->event_sem.wq);
+ atomic_set(&hr_cmd->event_sem.count, hr_cmd->max_cmds);
+
spin_lock_init(&hr_cmd->context_lock);

hr_cmd->token_mask = CMD_TOKEN_MASK;
@@ -332,7 +338,9 @@ void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
hr_cmd->use_events = 0;

for (i = 0; i < hr_cmd->max_cmds; ++i)
- down(&hr_cmd->event_sem);
+ wait_event(hr_cmd->event_sem.wq,
+ atomic_add_unless(
+ &hr_dev->cmd.event_sem.count, -1, 0));

kfree(hr_cmd->context);
mutex_unlock(&hr_cmd->poll_mutex);
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 2afe075..6aed04a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -34,6 +34,7 @@
#define _HNS_ROCE_DEVICE_H

#include <rdma/ib_verbs.h>
+#include <rdma/ib_sa.h>
#include <linux/mutex.h>

#define DRV_NAME "hns_roce"
@@ -364,7 +365,7 @@ struct hns_roce_cmdq {
* Event mode: cmd register mutex protection,
* ensure to not exceed max_cmds and user use limit region
*/
- struct semaphore event_sem;
+ struct ib_semaphore event_sem;
int max_cmds;
spinlock_t context_lock;
int free_head;
diff --git a/include/rdma/ib_sa.h b/include/rdma/ib_sa.h
index 5ee7aab..1901042 100644
--- a/include/rdma/ib_sa.h
+++ b/include/rdma/ib_sa.h
@@ -291,6 +291,11 @@ struct ib_sa_service_rec {
#define IB_SA_GUIDINFO_REC_GID6 IB_SA_COMP_MASK(10)
#define IB_SA_GUIDINFO_REC_GID7 IB_SA_COMP_MASK(11)

+struct ib_semaphore {
+ wait_queue_head_t wq;
+ atomic_t count;
+};
+
struct ib_sa_guidinfo_rec {
__be16 lid;
u8 block_num;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2016-10-17 16:32:48

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 4/8] IB/mthca: Replace semaphore poll_sem with mutex

The semaphore 'poll_sem' is a simple mutex, so it should be written as one.
Semaphores are going away in the future.

Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/mthca/mthca_cmd.c | 10 +++++-----
drivers/infiniband/hw/mthca/mthca_cmd.h | 1 +
drivers/infiniband/hw/mthca/mthca_dev.h | 2 +-
3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index c7f49bb..0cb58ea 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -347,7 +347,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
unsigned long end;
u8 status;

- down(&dev->cmd.poll_sem);
+ mutex_lock(&dev->cmd.poll_mutex);

err = mthca_cmd_post(dev, in_param,
out_param ? *out_param : 0,
@@ -382,7 +382,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
}

out:
- up(&dev->cmd.poll_sem);
+ mutex_unlock(&dev->cmd.poll_mutex);
return err;
}

@@ -520,7 +520,7 @@ static int mthca_cmd_imm(struct mthca_dev *dev,
int mthca_cmd_init(struct mthca_dev *dev)
{
mutex_init(&dev->cmd.hcr_mutex);
- sema_init(&dev->cmd.poll_sem, 1);
+ mutex_init(&dev->cmd.poll_mutex);
dev->cmd.flags = 0;

dev->hcr = ioremap(pci_resource_start(dev->pdev, 0) + MTHCA_HCR_BASE,
@@ -582,7 +582,7 @@ int mthca_cmd_use_events(struct mthca_dev *dev)

dev->cmd.flags |= MTHCA_CMD_USE_EVENTS;

- down(&dev->cmd.poll_sem);
+ mutex_lock(&dev->cmd.poll_mutex);

return 0;
}
@@ -601,7 +601,7 @@ void mthca_cmd_use_polling(struct mthca_dev *dev)

kfree(dev->cmd.context);

- up(&dev->cmd.poll_sem);
+ mutex_unlock(&dev->cmd.poll_mutex);
}

struct mthca_mailbox *mthca_alloc_mailbox(struct mthca_dev *dev,
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.h b/drivers/infiniband/hw/mthca/mthca_cmd.h
index d2e5b19..a7f197e 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.h
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.h
@@ -35,6 +35,7 @@
#ifndef MTHCA_CMD_H
#define MTHCA_CMD_H

+#include <linux/mutex.h>
#include <rdma/ib_verbs.h>

#define MTHCA_MAILBOX_SIZE 4096
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 4393a02..87ab964 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -120,7 +120,7 @@ enum {
struct mthca_cmd {
struct pci_pool *pool;
struct mutex hcr_mutex;
- struct semaphore poll_sem;
+ struct mutex poll_mutex;
struct semaphore event_sem;
int max_cmds;
spinlock_t context_lock;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2016-10-17 16:33:00

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 5/8] IB/isert: Replace semaphore sem with completion

The semaphore 'sem' in isert_device is used as completion, so convert
it to struct completion. Semaphores are going away in the future.

Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/ulp/isert/ib_isert.c | 6 +++---
drivers/infiniband/ulp/isert/ib_isert.h | 3 ++-
2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 6dd43f6..de80f56 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -619,7 +619,7 @@
mutex_unlock(&isert_np->mutex);

isert_info("np %p: Allow accept_np to continue\n", isert_np);
- up(&isert_np->sem);
+ complete(&isert_np->comp);
}

static void
@@ -2311,7 +2311,7 @@ struct rdma_cm_id *
isert_err("Unable to allocate struct isert_np\n");
return -ENOMEM;
}
- sema_init(&isert_np->sem, 0);
+ init_completion(&isert_np->comp);
mutex_init(&isert_np->mutex);
INIT_LIST_HEAD(&isert_np->accepted);
INIT_LIST_HEAD(&isert_np->pending);
@@ -2427,7 +2427,7 @@ struct rdma_cm_id *
int ret;

accept_wait:
- ret = down_interruptible(&isert_np->sem);
+ ret = wait_for_completion_interruptible(&isert_np->comp);
if (ret)
return -ENODEV;

diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index c02ada5..a1277c0 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -3,6 +3,7 @@
#include <linux/in6.h>
#include <rdma/ib_verbs.h>
#include <rdma/rdma_cm.h>
+#include <linux/completion.h>
#include <rdma/rw.h>
#include <scsi/iser.h>

@@ -190,7 +191,7 @@ struct isert_device {

struct isert_np {
struct iscsi_np *np;
- struct semaphore sem;
+ struct completion comp;
struct rdma_cm_id *cm_id;
struct mutex mutex;
struct list_head accepted;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2016-10-17 16:33:38

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 3/8] IB/hns: Replace semaphore poll_sem with mutex

The semaphore 'poll_sem' is a simple mutex, so it should be written as one.
Semaphores are going away in the future.

Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/hns/hns_roce_cmd.c | 12 ++++++------
drivers/infiniband/hw/hns/hns_roce_device.h | 3 ++-
2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 2a0b6c0..1421fdb 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -119,7 +119,7 @@ static int hns_roce_cmd_mbox_post_hw(struct hns_roce_dev *hr_dev, u64 in_param,
return ret;
}

-/* this should be called with "poll_sem" */
+/* this should be called with "poll_mutex" */
static int __hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
u64 out_param, unsigned long in_modifier,
u8 op_modifier, u16 op,
@@ -167,10 +167,10 @@ static int hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
{
int ret;

- down(&hr_dev->cmd.poll_sem);
+ mutex_lock(&hr_dev->cmd.poll_mutex);
ret = __hns_roce_cmd_mbox_poll(hr_dev, in_param, out_param, in_modifier,
op_modifier, op, timeout);
- up(&hr_dev->cmd.poll_sem);
+ mutex_unlock(&hr_dev->cmd.poll_mutex);

return ret;
}
@@ -275,7 +275,7 @@ int hns_roce_cmd_init(struct hns_roce_dev *hr_dev)
struct device *dev = &hr_dev->pdev->dev;

mutex_init(&hr_dev->cmd.hcr_mutex);
- sema_init(&hr_dev->cmd.poll_sem, 1);
+ mutex_init(&hr_dev->cmd.poll_mutex);
hr_dev->cmd.use_events = 0;
hr_dev->cmd.toggle = 1;
hr_dev->cmd.max_cmds = CMD_MAX_NUM;
@@ -319,7 +319,7 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
hr_cmd->token_mask = CMD_TOKEN_MASK;
hr_cmd->use_events = 1;

- down(&hr_cmd->poll_sem);
+ mutex_lock(&hr_cmd->poll_mutex);

return 0;
}
@@ -335,7 +335,7 @@ void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
down(&hr_cmd->event_sem);

kfree(hr_cmd->context);
- up(&hr_cmd->poll_sem);
+ mutex_unlock(&hr_cmd->poll_mutex);
}

struct hns_roce_cmd_mailbox
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 3417315..2afe075 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -34,6 +34,7 @@
#define _HNS_ROCE_DEVICE_H

#include <rdma/ib_verbs.h>
+#include <linux/mutex.h>

#define DRV_NAME "hns_roce"

@@ -358,7 +359,7 @@ struct hns_roce_cmdq {
struct dma_pool *pool;
u8 __iomem *hcr;
struct mutex hcr_mutex;
- struct semaphore poll_sem;
+ struct mutex poll_mutex;
/*
* Event mode: cmd register mutex protection,
* ensure to not exceed max_cmds and user use limit region
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2016-10-17 16:33:47

by Binoy Jayan

[permalink] [raw]
Subject: [PATCH 1/8] IB/core: iwpm_nlmsg_request: Replace semaphore with completion

Semaphore sem in iwpm_nlmsg_request is used as completion, so
convert it to a struct completion type. Semaphores are going
away in the future.

Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/core/iwpm_msg.c | 8 ++++----
drivers/infiniband/core/iwpm_util.c | 7 +++----
drivers/infiniband/core/iwpm_util.h | 3 ++-
3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/core/iwpm_msg.c b/drivers/infiniband/core/iwpm_msg.c
index 1c41b95..761358f 100644
--- a/drivers/infiniband/core/iwpm_msg.c
+++ b/drivers/infiniband/core/iwpm_msg.c
@@ -394,7 +394,7 @@ int iwpm_register_pid_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found nlmsg_request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_register_pid_cb);
@@ -463,7 +463,7 @@ int iwpm_add_mapping_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_add_mapping_cb);
@@ -555,7 +555,7 @@ int iwpm_add_and_query_mapping_cb(struct sk_buff *skb,
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_add_and_query_mapping_cb);
@@ -749,7 +749,7 @@ int iwpm_mapping_error_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_mapping_error_cb);
diff --git a/drivers/infiniband/core/iwpm_util.c b/drivers/infiniband/core/iwpm_util.c
index ade71e7..08ddd2e 100644
--- a/drivers/infiniband/core/iwpm_util.c
+++ b/drivers/infiniband/core/iwpm_util.c
@@ -323,8 +323,7 @@ struct iwpm_nlmsg_request *iwpm_get_nlmsg_request(__u32 nlmsg_seq,
nlmsg_request->nl_client = nl_client;
nlmsg_request->request_done = 0;
nlmsg_request->err_code = 0;
- sema_init(&nlmsg_request->sem, 1);
- down(&nlmsg_request->sem);
+ init_completion(&nlmsg_request->comp);
return nlmsg_request;
}

@@ -368,8 +367,8 @@ int iwpm_wait_complete_req(struct iwpm_nlmsg_request *nlmsg_request)
{
int ret;

- ret = down_timeout(&nlmsg_request->sem, IWPM_NL_TIMEOUT);
- if (ret) {
+ ret = wait_for_completion_timeout(&nlmsg_request->comp, IWPM_NL_TIMEOUT);
+ if (!ret) {
ret = -EINVAL;
pr_info("%s: Timeout %d sec for netlink request (seq = %u)\n",
__func__, (IWPM_NL_TIMEOUT/HZ), nlmsg_request->nlmsg_seq);
diff --git a/drivers/infiniband/core/iwpm_util.h b/drivers/infiniband/core/iwpm_util.h
index af1fc14..ea6c299 100644
--- a/drivers/infiniband/core/iwpm_util.h
+++ b/drivers/infiniband/core/iwpm_util.h
@@ -43,6 +43,7 @@
#include <linux/delay.h>
#include <linux/workqueue.h>
#include <linux/mutex.h>
+#include <linux/completion.h>
#include <linux/jhash.h>
#include <linux/kref.h>
#include <net/netlink.h>
@@ -69,7 +70,7 @@ struct iwpm_nlmsg_request {
u8 nl_client;
u8 request_done;
u16 err_code;
- struct semaphore sem;
+ struct completion comp;
struct kref kref;
};

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2016-10-17 16:39:42

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 6/8] IB/hns: Replace counting semaphore event_sem with wait condition

On Mon, Oct 17, 2016 at 10:01:00PM +0530, Binoy Jayan wrote:
> Counting semaphores are going away in the future, so replace the semaphore
> hns_roce_cmdq::event_sem with an open-coded implementation.

Sorry, but no. Using a proper semaphore abstraction is always
better than open coding it. While most semaphore users should move
to better primitives, those that actually are counting semaphore
should stick to the existing primitive.

2016-10-17 16:40:28

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 8/8] IB/mlx5: Replace counting semaphore sem with wait condition

On Mon, Oct 17, 2016 at 10:01:02PM +0530, Binoy Jayan wrote:
> Counting semaphores are going away in the future, so replace the semaphore
> umr_common::sem with an open-coded implementation.

And NAK again..

2016-10-17 17:01:03

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH 0/8] infiniband: Remove semaphores

On 10/17/2016 09:30 AM, Binoy Jayan wrote:
> These are a set of patches which removes semaphores from infiniband.
> These are part of a bigger effort to eliminate all semaphores from the
> linux kernel.

Hello Binoy,

Why do you think it would be a good idea to eliminate all semaphores
from the Linux kernel? I don't know anyone who doesn't consider
semaphores a useful abstraction.

Thanks,

Bart.

2016-10-17 20:08:42

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 0/8] infiniband: Remove semaphores

On Monday, October 17, 2016 9:57:34 AM CEST Bart Van Assche wrote:
> On 10/17/2016 09:30 AM, Binoy Jayan wrote:
> > These are a set of patches which removes semaphores from infiniband.
> > These are part of a bigger effort to eliminate all semaphores from the
> > linux kernel.
>
> Hello Binoy,
>
> Why do you think it would be a good idea to eliminate all semaphores
> from the Linux kernel? I don't know anyone who doesn't consider
> semaphores a useful abstraction.

There are a several reasons why the semaphores as defined in the
kernel are bad and we should get rid of them:

- semaphores cannot be analysed using lockdep, since they don't
always fit in the simpler 'mutex' semantics

- those that are basically mutexes should be converted to mutexes
for efficiency and consistency anyway

- the semaphores that are not just used as mutexes are typically
used as completions and should just be converted to completions
for simplicity

- when running a preempt-rt kernel, semaphores suffer from priority
inversion problems, while mutexes use use priority inheritance
as a countermeasure

There are very few remaining semaphores in the kernel and generally
speaking we'd be better off removing them all so no new users
show up in the future. Most of them are trivial to replace with
mutexes or completions. For the ones that are not trivially replaced,
we have to look at each one and decide what to do about them,
there usually is some solution that actually improves the code.

Using an open-coded semaphore as a replacement is probably just
the last resort that we can consider once we are down to the
last handful of users. I haven't looked at drivers/infiniband/
yet for this, but I believe that drivers/acpi/ is a case for
which I see no better alternative (the AML bytecode requires
counting semaphore semantics).

Arnd

2016-10-17 20:28:18

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH 0/8] infiniband: Remove semaphores

On 10/17/2016 01:06 PM, Arnd Bergmann wrote:
> Using an open-coded semaphore as a replacement is probably just
> the last resort that we can consider once we are down to the
> last handful of users. I haven't looked at drivers/infiniband/
> yet for this, but I believe that drivers/acpi/ is a case for
> which I see no better alternative (the AML bytecode requires
> counting semaphore semantics).

Hello Arnd,

Thanks for the detailed reply. However, I doubt that removing and
open-coding counting semaphores is the best alternative. Counting
semaphores are a useful abstraction. I think open-coding counting
semaphores everywhere counting semaphores are used would be a step back
instead of a step forward.

Bart.

2016-10-17 20:30:36

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 6/8] IB/hns: Replace counting semaphore event_sem with wait condition

On Monday, October 17, 2016 10:01:00 PM CEST Binoy Jayan wrote:
> --- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
> +++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
> @@ -248,10 +248,14 @@ static int hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
> {
> int ret = 0;
>
> - down(&hr_dev->cmd.event_sem);
> + wait_event(hr_dev->cmd.event_sem.wq,
> + atomic_add_unless(&hr_dev->cmd.event_sem.count, -1, 0));
> +
> ret = __hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
> in_modifier, op_modifier, op, timeout);
> - up(&hr_dev->cmd.event_sem);
> +
> + if (atomic_inc_return(&hr_dev->cmd.event_sem.count) == 1)
> + wake_up(&hr_dev->cmd.event_sem.wq);
>
> return ret;
> }

This is the only interesting use of the event_sem that cares about
the counting and it protects the __hns_roce_cmd_mbox_wait() from being
entered too often. The count here is the number of size of the
hr_dev->cmd.context[] array.

However, that function already use a spinlock to protect that array
and pick the correct context. I think changing the inner function
to handle the case of 'no context available' by using a waitqueue
without counting anything would be a reasonable transformation
away from the semaphore.

Arnd

2016-10-17 20:33:00

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 7/8] IB/mthca: Replace counting semaphore event_sem with wait condition

On Monday, October 17, 2016 9:39:54 AM CEST Christoph Hellwig wrote:
> Same NAK as for the last patch.

Right, whatever we decide to do for patch 6 is what should be
done here too, as the hns code was clearly copied unchanged.

Arnd

2016-10-17 20:38:06

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 0/8] infiniband: Remove semaphores

On Monday, October 17, 2016 1:28:01 PM CEST Bart Van Assche wrote:
> On 10/17/2016 01:06 PM, Arnd Bergmann wrote:
> > Using an open-coded semaphore as a replacement is probably just
> > the last resort that we can consider once we are down to the
> > last handful of users. I haven't looked at drivers/infiniband/
> > yet for this, but I believe that drivers/acpi/ is a case for
> > which I see no better alternative (the AML bytecode requires
> > counting semaphore semantics).
>
> Hello Arnd,
>
> Thanks for the detailed reply. However, I doubt that removing and
> open-coding counting semaphores is the best alternative. Counting
> semaphores are a useful abstraction. I think open-coding counting
> semaphores everywhere counting semaphores are used would be a step back
> instead of a step forward.

Absolutely agreed, that's why I said 'last resort' above. I don't
think that we need to go there for infiniband. See my reply
for patch 6 for one idea on how to handle hns and mthca. There
might be better ways.

These fall into the general category of using the counting semaphore
to count something that we already know in the code that uses
the semaphore, so we can remove the count and just need some other
waiting primitive.

Arnd

2016-10-18 05:16:56

by Binoy Jayan

[permalink] [raw]
Subject: Re: [PATCH 6/8] IB/hns: Replace counting semaphore event_sem with wait condition

On 18 October 2016 at 01:59, Arnd Bergmann <[email protected]> wrote:
> On Monday, October 17, 2016 10:01:00 PM CEST Binoy Jayan wrote:
>> --- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
>> +++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
>> @@ -248,10 +248,14 @@ static int hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
>> {
>> int ret = 0;
>>
>> - down(&hr_dev->cmd.event_sem);
>> + wait_event(hr_dev->cmd.event_sem.wq,
>> + atomic_add_unless(&hr_dev->cmd.event_sem.count, -1, 0));
>> +
>> ret = __hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
>> in_modifier, op_modifier, op, timeout);
>> - up(&hr_dev->cmd.event_sem);
>> +
>> + if (atomic_inc_return(&hr_dev->cmd.event_sem.count) == 1)
>> + wake_up(&hr_dev->cmd.event_sem.wq);
>>
>> return ret;
>> }
>
> This is the only interesting use of the event_sem that cares about
> the counting and it protects the __hns_roce_cmd_mbox_wait() from being
> entered too often. The count here is the number of size of the
> hr_dev->cmd.context[] array.
>
> However, that function already use a spinlock to protect that array
> and pick the correct context. I think changing the inner function
> to handle the case of 'no context available' by using a waitqueue
> without counting anything would be a reasonable transformation
> away from the semaphore.
>
> Arnd


Hi Arnd,

Thank you for replying for the questions. I''ll look for alternatives
for patches
6,7 and 8 and resend the series.

-Binoy

2016-10-19 15:15:27

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH 6/8] IB/hns: Replace counting semaphore event_sem with wait condition

On Tuesday, October 18, 2016 10:46:45 AM CEST Binoy Jayan wrote:
> Thank you for replying for the questions. I''ll look for alternatives
> for patches 6,7 and 8 and resend the series.

Ok, thanks!

I also looked at patch 8 some more and noticed that those four
functions all do the exact same sequence:

- initialize a mlx5_ib_umr_context on the stack
- assign "umrwr.wr.wr_cqe = &umr_context.cqe"
- take the semaphore
- call ib_post_send with a single ib_send_wr
- wait for the mlx5_ib_umr_done() function to get called
- if we get back a failure, print a warning and return -EFAULT.
- release the semaphore

Moving all of these into a shared helper function would be
a good cleanup, and it leaves only a single function using
the semaphore, which can then be rewritten to use something
else.

The existing completion in there can be simplified to a
wait_event, since we are waiting for the return value to
be filled.

Arnd