Hi,
This is a set of patches [v2] which remove semaphores from InfiniBand.
They are part of a bigger effort to eliminate all semaphores from the
Linux kernel.
v1 -> v2:
IB/hns: Use wait_event instead of open coding counting semaphores
IB/mthca: Use wait_event instead of open coding counting semaphores
IB/mthca: Remove mutex_[un]lock from *_cmd_use_events/*_cmd_use_polling
IB/mlx5: Cleanup, add helper mlx5_ib_post_send_wait
Thanks,
Binoy
Binoy Jayan (8):
IB/core: iwpm_nlmsg_request: Replace semaphore with completion
IB/core: Replace semaphore sm_sem with completion
IB/hns: Replace semaphore poll_sem with mutex
IB/mthca: Replace semaphore poll_sem with mutex
IB/isert: Replace semaphore sem with completion
IB/hns: Replace counting semaphore event_sem with wait_event
IB/mthca: Replace counting semaphore event_sem with wait_event
IB/mlx5: Add helper mlx5_ib_post_send_wait
drivers/infiniband/core/iwpm_msg.c | 8 +--
drivers/infiniband/core/iwpm_util.c | 7 +--
drivers/infiniband/core/iwpm_util.h | 3 +-
drivers/infiniband/core/user_mad.c | 14 +++--
drivers/infiniband/hw/hns/hns_roce_cmd.c | 48 ++++++++++-----
drivers/infiniband/hw/hns/hns_roce_device.h | 5 +-
drivers/infiniband/hw/mlx5/mr.c | 96 +++++++++--------------------
drivers/infiniband/hw/mthca/mthca_cmd.c | 47 +++++++++-----
drivers/infiniband/hw/mthca/mthca_cmd.h | 1 +
drivers/infiniband/hw/mthca/mthca_dev.h | 5 +-
drivers/infiniband/ulp/isert/ib_isert.c | 6 +-
drivers/infiniband/ulp/isert/ib_isert.h | 3 +-
12 files changed, 119 insertions(+), 124 deletions(-)
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
The semaphore 'sem' in iwpm_nlmsg_request is used as a completion, so
convert it to a struct completion. Semaphores are going away in the
future.
Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/core/iwpm_msg.c | 8 ++++----
drivers/infiniband/core/iwpm_util.c | 7 +++----
drivers/infiniband/core/iwpm_util.h | 3 ++-
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/core/iwpm_msg.c b/drivers/infiniband/core/iwpm_msg.c
index 1c41b95..761358f 100644
--- a/drivers/infiniband/core/iwpm_msg.c
+++ b/drivers/infiniband/core/iwpm_msg.c
@@ -394,7 +394,7 @@ int iwpm_register_pid_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found nlmsg_request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_register_pid_cb);
@@ -463,7 +463,7 @@ int iwpm_add_mapping_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_add_mapping_cb);
@@ -555,7 +555,7 @@ int iwpm_add_and_query_mapping_cb(struct sk_buff *skb,
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_add_and_query_mapping_cb);
@@ -749,7 +749,7 @@ int iwpm_mapping_error_cb(struct sk_buff *skb, struct netlink_callback *cb)
/* always for found request */
kref_put(&nlmsg_request->kref, iwpm_free_nlmsg_request);
barrier();
- up(&nlmsg_request->sem);
+ complete(&nlmsg_request->comp);
return 0;
}
EXPORT_SYMBOL(iwpm_mapping_error_cb);
diff --git a/drivers/infiniband/core/iwpm_util.c b/drivers/infiniband/core/iwpm_util.c
index ade71e7..08ddd2e 100644
--- a/drivers/infiniband/core/iwpm_util.c
+++ b/drivers/infiniband/core/iwpm_util.c
@@ -323,8 +323,7 @@ struct iwpm_nlmsg_request *iwpm_get_nlmsg_request(__u32 nlmsg_seq,
nlmsg_request->nl_client = nl_client;
nlmsg_request->request_done = 0;
nlmsg_request->err_code = 0;
- sema_init(&nlmsg_request->sem, 1);
- down(&nlmsg_request->sem);
+ init_completion(&nlmsg_request->comp);
return nlmsg_request;
}
@@ -368,8 +367,8 @@ int iwpm_wait_complete_req(struct iwpm_nlmsg_request *nlmsg_request)
{
int ret;
- ret = down_timeout(&nlmsg_request->sem, IWPM_NL_TIMEOUT);
- if (ret) {
+ ret = wait_for_completion_timeout(&nlmsg_request->comp, IWPM_NL_TIMEOUT);
+ if (!ret) {
ret = -EINVAL;
pr_info("%s: Timeout %d sec for netlink request (seq = %u)\n",
__func__, (IWPM_NL_TIMEOUT/HZ), nlmsg_request->nlmsg_seq);
diff --git a/drivers/infiniband/core/iwpm_util.h b/drivers/infiniband/core/iwpm_util.h
index af1fc14..ea6c299 100644
--- a/drivers/infiniband/core/iwpm_util.h
+++ b/drivers/infiniband/core/iwpm_util.h
@@ -43,6 +43,7 @@
#include <linux/delay.h>
#include <linux/workqueue.h>
#include <linux/mutex.h>
+#include <linux/completion.h>
#include <linux/jhash.h>
#include <linux/kref.h>
#include <net/netlink.h>
@@ -69,7 +70,7 @@ struct iwpm_nlmsg_request {
u8 nl_client;
u8 request_done;
u16 err_code;
- struct semaphore sem;
+ struct completion comp;
struct kref kref;
};
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
The semaphore 'poll_sem' is a simple mutex, so it should be written as one.
Semaphores are going away in the future, so replace the semaphore 'poll_sem'
with a mutex. Also, remove the mutex_[un]lock calls from mthca_cmd_use_events
and mthca_cmd_use_polling.
Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/mthca/mthca_cmd.c | 10 +++-------
drivers/infiniband/hw/mthca/mthca_cmd.h | 1 +
drivers/infiniband/hw/mthca/mthca_dev.h | 2 +-
3 files changed, 5 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index c7f49bb..49c6e19 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -347,7 +347,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
unsigned long end;
u8 status;
- down(&dev->cmd.poll_sem);
+ mutex_lock(&dev->cmd.poll_mutex);
err = mthca_cmd_post(dev, in_param,
out_param ? *out_param : 0,
@@ -382,7 +382,7 @@ static int mthca_cmd_poll(struct mthca_dev *dev,
}
out:
- up(&dev->cmd.poll_sem);
+ mutex_unlock(&dev->cmd.poll_mutex);
return err;
}
@@ -520,7 +520,7 @@ static int mthca_cmd_imm(struct mthca_dev *dev,
int mthca_cmd_init(struct mthca_dev *dev)
{
mutex_init(&dev->cmd.hcr_mutex);
- sema_init(&dev->cmd.poll_sem, 1);
+ mutex_init(&dev->cmd.poll_mutex);
dev->cmd.flags = 0;
dev->hcr = ioremap(pci_resource_start(dev->pdev, 0) + MTHCA_HCR_BASE,
@@ -582,8 +582,6 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
dev->cmd.flags |= MTHCA_CMD_USE_EVENTS;
- down(&dev->cmd.poll_sem);
-
return 0;
}
@@ -600,8 +598,6 @@ void mthca_cmd_use_polling(struct mthca_dev *dev)
down(&dev->cmd.event_sem);
kfree(dev->cmd.context);
-
- up(&dev->cmd.poll_sem);
}
struct mthca_mailbox *mthca_alloc_mailbox(struct mthca_dev *dev,
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.h b/drivers/infiniband/hw/mthca/mthca_cmd.h
index d2e5b19..a7f197e 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.h
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.h
@@ -35,6 +35,7 @@
#ifndef MTHCA_CMD_H
#define MTHCA_CMD_H
+#include <linux/mutex.h>
#include <rdma/ib_verbs.h>
#define MTHCA_MAILBOX_SIZE 4096
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 4393a02..87ab964 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -120,7 +120,7 @@ enum {
struct mthca_cmd {
struct pci_pool *pool;
struct mutex hcr_mutex;
- struct semaphore poll_sem;
+ struct mutex poll_mutex;
struct semaphore event_sem;
int max_cmds;
spinlock_t context_lock;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
The semaphore 'sm_sem' is used as a completion, so convert it to a
struct completion. Semaphores are going away in the future. The initial
state of the completion is marked as completed by a call to 'complete'
immediately following the initialization, mirroring the semaphore's
initial count of one.
Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/core/user_mad.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 415a318..df070cc 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -47,6 +47,7 @@
#include <linux/kref.h>
#include <linux/compat.h>
#include <linux/sched.h>
+#include <linux/completion.h>
#include <linux/semaphore.h>
#include <linux/slab.h>
@@ -87,7 +88,7 @@ struct ib_umad_port {
struct cdev sm_cdev;
struct device *sm_dev;
- struct semaphore sm_sem;
+ struct completion sm_comp;
struct mutex file_mutex;
struct list_head file_list;
@@ -1030,12 +1031,12 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
port = container_of(inode->i_cdev, struct ib_umad_port, sm_cdev);
if (filp->f_flags & O_NONBLOCK) {
- if (down_trylock(&port->sm_sem)) {
+ if (!try_wait_for_completion(&port->sm_comp)) {
ret = -EAGAIN;
goto fail;
}
} else {
- if (down_interruptible(&port->sm_sem)) {
+ if (wait_for_completion_interruptible(&port->sm_comp)) {
ret = -ERESTARTSYS;
goto fail;
}
@@ -1060,7 +1061,7 @@ static int ib_umad_sm_open(struct inode *inode, struct file *filp)
ib_modify_port(port->ib_dev, port->port_num, 0, &props);
err_up_sem:
- up(&port->sm_sem);
+ complete(&port->sm_comp);
fail:
return ret;
@@ -1079,7 +1080,7 @@ static int ib_umad_sm_close(struct inode *inode, struct file *filp)
ret = ib_modify_port(port->ib_dev, port->port_num, 0, &props);
mutex_unlock(&port->file_mutex);
- up(&port->sm_sem);
+ complete(&port->sm_comp);
kobject_put(&port->umad_dev->kobj);
@@ -1177,7 +1178,8 @@ static int ib_umad_init_port(struct ib_device *device, int port_num,
port->ib_dev = device;
port->port_num = port_num;
- sema_init(&port->sm_sem, 1);
+ init_completion(&port->sm_comp);
+ complete(&port->sm_comp);
mutex_init(&port->file_mutex);
INIT_LIST_HEAD(&port->file_list);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
The semaphore 'poll_sem' is a simple mutex, so it should be written as one.
Semaphores are going away in the future, so replace the semaphore 'poll_sem'
with a mutex. Also, remove the mutex_[un]lock calls from
hns_roce_cmd_use_events and hns_roce_cmd_use_polling.
Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/hns/hns_roce_cmd.c | 11 ++++-------
drivers/infiniband/hw/hns/hns_roce_device.h | 3 ++-
2 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 2a0b6c0..51a0675 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -119,7 +119,7 @@ static int hns_roce_cmd_mbox_post_hw(struct hns_roce_dev *hr_dev, u64 in_param,
return ret;
}
-/* this should be called with "poll_sem" */
+/* this should be called with "poll_mutex" */
static int __hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
u64 out_param, unsigned long in_modifier,
u8 op_modifier, u16 op,
@@ -167,10 +167,10 @@ static int hns_roce_cmd_mbox_poll(struct hns_roce_dev *hr_dev, u64 in_param,
{
int ret;
- down(&hr_dev->cmd.poll_sem);
+ mutex_lock(&hr_dev->cmd.poll_mutex);
ret = __hns_roce_cmd_mbox_poll(hr_dev, in_param, out_param, in_modifier,
op_modifier, op, timeout);
- up(&hr_dev->cmd.poll_sem);
+ mutex_unlock(&hr_dev->cmd.poll_mutex);
return ret;
}
@@ -275,7 +275,7 @@ int hns_roce_cmd_init(struct hns_roce_dev *hr_dev)
struct device *dev = &hr_dev->pdev->dev;
mutex_init(&hr_dev->cmd.hcr_mutex);
- sema_init(&hr_dev->cmd.poll_sem, 1);
+ mutex_init(&hr_dev->cmd.poll_mutex);
hr_dev->cmd.use_events = 0;
hr_dev->cmd.toggle = 1;
hr_dev->cmd.max_cmds = CMD_MAX_NUM;
@@ -319,8 +319,6 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
hr_cmd->token_mask = CMD_TOKEN_MASK;
hr_cmd->use_events = 1;
- down(&hr_cmd->poll_sem);
-
return 0;
}
@@ -335,7 +333,6 @@ void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
down(&hr_cmd->event_sem);
kfree(hr_cmd->context);
- up(&hr_cmd->poll_sem);
}
struct hns_roce_cmd_mailbox
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 3417315..2afe075 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -34,6 +34,7 @@
#define _HNS_ROCE_DEVICE_H
#include <rdma/ib_verbs.h>
+#include <linux/mutex.h>
#define DRV_NAME "hns_roce"
@@ -358,7 +359,7 @@ struct hns_roce_cmdq {
struct dma_pool *pool;
u8 __iomem *hcr;
struct mutex hcr_mutex;
- struct semaphore poll_sem;
+ struct mutex poll_mutex;
/*
* Event mode: cmd register mutex protection,
* ensure to not exceed max_cmds and user use limit region
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Counting semaphores are going away in the future, so replace the semaphore
hns_roce_cmdq::event_sem with a conditional wait_event.
Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/hns/hns_roce_cmd.c | 37 +++++++++++++++++++++--------
drivers/infiniband/hw/hns/hns_roce_device.h | 2 +-
2 files changed, 28 insertions(+), 11 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_cmd.c b/drivers/infiniband/hw/hns/hns_roce_cmd.c
index 51a0675..efc5c48 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cmd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cmd.c
@@ -189,6 +189,26 @@ void hns_roce_cmd_event(struct hns_roce_dev *hr_dev, u16 token, u8 status,
complete(&context->done);
}
+/* Similar to atomic_cmpxchg but with the complementary condition. Returns
+ * the index of a free node. It also sets cmd->free_head to 'new', ensuring
+ * atomicity between the call to 'wait_event' and manipulation of free_head.
+ */
+
+static inline int atomic_free_node(struct hns_roce_cmdq *cmd, int new)
+{
+ int orig;
+
+ spin_lock(&cmd->context_lock);
+
+ orig = cmd->free_head;
+ if (likely(cmd->free_head != -1))
+ cmd->free_head = new;
+
+ spin_unlock(&cmd->context_lock);
+
+ return orig;
+}
+
/* this should be called with "use_events" */
static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
u64 out_param, unsigned long in_modifier,
@@ -198,11 +218,12 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
struct hns_roce_cmdq *cmd = &hr_dev->cmd;
struct device *dev = &hr_dev->pdev->dev;
struct hns_roce_cmd_context *context;
- int ret = 0;
+ int orig_free_head, ret = 0;
+
+ wait_event(cmd->wq, (orig_free_head = atomic_free_node(cmd, -1)) != -1);
spin_lock(&cmd->context_lock);
- WARN_ON(cmd->free_head < 0);
- context = &cmd->context[cmd->free_head];
+ context = &cmd->context[orig_free_head];
context->token += cmd->token_mask + 1;
cmd->free_head = context->next;
spin_unlock(&cmd->context_lock);
@@ -238,6 +259,7 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
context->next = cmd->free_head;
cmd->free_head = context - cmd->context;
spin_unlock(&cmd->context_lock);
+ wake_up(&cmd->wq);
return ret;
}
@@ -248,10 +270,8 @@ static int hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
{
int ret = 0;
- down(&hr_dev->cmd.event_sem);
ret = __hns_roce_cmd_mbox_wait(hr_dev, in_param, out_param,
in_modifier, op_modifier, op, timeout);
- up(&hr_dev->cmd.event_sem);
return ret;
}
@@ -313,7 +333,7 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
hr_cmd->context[hr_cmd->max_cmds - 1].next = -1;
hr_cmd->free_head = 0;
- sema_init(&hr_cmd->event_sem, hr_cmd->max_cmds);
+ init_waitqueue_head(&hr_cmd->wq);
spin_lock_init(&hr_cmd->context_lock);
hr_cmd->token_mask = CMD_TOKEN_MASK;
@@ -325,12 +345,9 @@ int hns_roce_cmd_use_events(struct hns_roce_dev *hr_dev)
void hns_roce_cmd_use_polling(struct hns_roce_dev *hr_dev)
{
struct hns_roce_cmdq *hr_cmd = &hr_dev->cmd;
- int i;
hr_cmd->use_events = 0;
-
- for (i = 0; i < hr_cmd->max_cmds; ++i)
- down(&hr_cmd->event_sem);
+ hr_cmd->free_head = -1;
kfree(hr_cmd->context);
}
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 2afe075..ac95f52 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -364,7 +364,7 @@ struct hns_roce_cmdq {
* Event mode: cmd register mutex protection,
* ensure to not exceed max_cmds and user use limit region
*/
- struct semaphore event_sem;
+ wait_queue_head_t wq;
int max_cmds;
spinlock_t context_lock;
int free_head;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Counting semaphores are going away in the future, so replace the semaphore
mthca_cmd::event_sem with a conditional wait_event.
Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/mthca/mthca_cmd.c | 37 ++++++++++++++++++++++++---------
drivers/infiniband/hw/mthca/mthca_dev.h | 3 ++-
2 files changed, 29 insertions(+), 11 deletions(-)
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index 49c6e19..87c475c 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -405,6 +405,26 @@ void mthca_cmd_event(struct mthca_dev *dev,
complete(&context->done);
}
+/* Similar to atomic_cmpxchg but with the complementary condition. Returns
+ * the index of a free node. It also sets cmd->free_head to 'new', ensuring
+ * atomicity between the call to 'wait_event' and manipulation of free_head.
+ */
+
+static inline int atomic_free_node(struct mthca_cmd *cmd, int new)
+{
+ int orig;
+
+ spin_lock(&cmd->context_lock);
+
+ orig = cmd->free_head;
+ if (likely(cmd->free_head != -1))
+ cmd->free_head = new;
+
+ spin_unlock(&cmd->context_lock);
+
+ return orig;
+}
+
static int mthca_cmd_wait(struct mthca_dev *dev,
u64 in_param,
u64 *out_param,
@@ -414,14 +434,14 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
u16 op,
unsigned long timeout)
{
- int err = 0;
+ int orig_free_head, err = 0;
struct mthca_cmd_context *context;
- down(&dev->cmd.event_sem);
+ wait_event(dev->cmd.wq,
+ (orig_free_head = atomic_free_node(&dev->cmd, -1)) != -1);
spin_lock(&dev->cmd.context_lock);
- BUG_ON(dev->cmd.free_head < 0);
- context = &dev->cmd.context[dev->cmd.free_head];
+ context = &dev->cmd.context[orig_free_head];
context->token += dev->cmd.token_mask + 1;
dev->cmd.free_head = context->next;
spin_unlock(&dev->cmd.context_lock);
@@ -458,8 +478,8 @@ static int mthca_cmd_wait(struct mthca_dev *dev,
context->next = dev->cmd.free_head;
dev->cmd.free_head = context - dev->cmd.context;
spin_unlock(&dev->cmd.context_lock);
+ wake_up(&dev->cmd.wq);
- up(&dev->cmd.event_sem);
return err;
}
@@ -571,7 +591,7 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
dev->cmd.context[dev->cmd.max_cmds - 1].next = -1;
dev->cmd.free_head = 0;
- sema_init(&dev->cmd.event_sem, dev->cmd.max_cmds);
+ init_waitqueue_head(&dev->cmd.wq);
spin_lock_init(&dev->cmd.context_lock);
for (dev->cmd.token_mask = 1;
@@ -590,12 +610,9 @@ int mthca_cmd_use_events(struct mthca_dev *dev)
*/
void mthca_cmd_use_polling(struct mthca_dev *dev)
{
- int i;
-
dev->cmd.flags &= ~MTHCA_CMD_USE_EVENTS;
- for (i = 0; i < dev->cmd.max_cmds; ++i)
- down(&dev->cmd.event_sem);
+ dev->cmd.free_head = -1;
kfree(dev->cmd.context);
}
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 87ab964..2fc86db 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -46,6 +46,7 @@
#include <linux/list.h>
#include <linux/semaphore.h>
+#include <rdma/ib_sa.h>
#include "mthca_provider.h"
#include "mthca_doorbell.h"
@@ -121,7 +122,7 @@ struct mthca_cmd {
struct pci_pool *pool;
struct mutex hcr_mutex;
struct mutex poll_mutex;
- struct semaphore event_sem;
+ wait_queue_head_t wq;
int max_cmds;
spinlock_t context_lock;
int free_head;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Clean up the common code (posting a list of work requests to the send queue
of the specified QP) at various places and add a helper function
'mlx5_ib_post_send_wait' that implements it. The counting semaphore
'umr_common::sem' is also moved into the helper. This may later be modified
to replace the semaphore with an alternative.
Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/hw/mlx5/mr.c | 96 +++++++++++++----------------------------
1 file changed, 29 insertions(+), 67 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index d4ad672..261984b 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -856,16 +856,38 @@ static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context *context)
init_completion(&context->done);
}
+static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
+ struct mlx5_ib_umr_context *umr_context,
+ struct mlx5_umr_wr *umrwr)
+{
+ struct umr_common *umrc = &dev->umrc;
+ struct ib_send_wr __maybe_unused *bad;
+ int err;
+
+ down(&umrc->sem);
+ err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
+ if (err) {
+ mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
+ } else {
+ wait_for_completion(&umr_context->done);
+ if (umr_context->status != IB_WC_SUCCESS) {
+ mlx5_ib_warn(dev, "reg umr failed (%u)\n",
+ umr_context->status);
+ err = -EFAULT;
+ }
+ }
+ up(&umrc->sem);
+ return err;
+}
+
static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
u64 virt_addr, u64 len, int npages,
int page_shift, int order, int access_flags)
{
struct mlx5_ib_dev *dev = to_mdev(pd->device);
struct device *ddev = dev->ib_dev.dma_device;
- struct umr_common *umrc = &dev->umrc;
struct mlx5_ib_umr_context umr_context;
struct mlx5_umr_wr umrwr = {};
- struct ib_send_wr *bad;
struct mlx5_ib_mr *mr;
struct ib_sge sg;
int size;
@@ -900,18 +922,9 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
prep_umr_reg_wqe(pd, &umrwr.wr, &sg, dma, npages, mr->mmkey.key,
page_shift, virt_addr, len, access_flags);
- down(&umrc->sem);
- err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
- if (err) {
- mlx5_ib_warn(dev, "post send failed, err %d\n", err);
+ err = mlx5_ib_post_send_wait(dev, &umr_context, &umrwr);
+ if (err != -EFAULT)
goto unmap_dma;
- } else {
- wait_for_completion(&umr_context.done);
- if (umr_context.status != IB_WC_SUCCESS) {
- mlx5_ib_warn(dev, "reg umr failed\n");
- err = -EFAULT;
- }
- }
mr->mmkey.iova = virt_addr;
mr->mmkey.size = len;
@@ -920,7 +933,6 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
mr->live = 1;
unmap_dma:
- up(&umrc->sem);
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
kfree(mr_pas);
@@ -940,13 +952,11 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
{
struct mlx5_ib_dev *dev = mr->dev;
struct device *ddev = dev->ib_dev.dma_device;
- struct umr_common *umrc = &dev->umrc;
struct mlx5_ib_umr_context umr_context;
struct ib_umem *umem = mr->umem;
int size;
__be64 *pas;
dma_addr_t dma;
- struct ib_send_wr *bad;
struct mlx5_umr_wr wr;
struct ib_sge sg;
int err = 0;
@@ -1031,19 +1041,7 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
wr.mkey = mr->mmkey.key;
wr.target.offset = start_page_index;
- down(&umrc->sem);
- err = ib_post_send(umrc->qp, &wr.wr, &bad);
- if (err) {
- mlx5_ib_err(dev, "UMR post send failed, err %d\n", err);
- } else {
- wait_for_completion(&umr_context.done);
- if (umr_context.status != IB_WC_SUCCESS) {
- mlx5_ib_err(dev, "UMR completion failed, code %d\n",
- umr_context.status);
- err = -EFAULT;
- }
- }
- up(&umrc->sem);
+ err = mlx5_ib_post_send_wait(dev, &umr_context, &wr);
}
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
@@ -1210,11 +1208,8 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
{
struct mlx5_core_dev *mdev = dev->mdev;
- struct umr_common *umrc = &dev->umrc;
struct mlx5_ib_umr_context umr_context;
struct mlx5_umr_wr umrwr = {};
- struct ib_send_wr *bad;
- int err;
if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
return 0;
@@ -1224,25 +1219,7 @@ static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
umrwr.wr.wr_cqe = &umr_context.cqe;
prep_umr_unreg_wqe(dev, &umrwr.wr, mr->mmkey.key);
- down(&umrc->sem);
- err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
- if (err) {
- up(&umrc->sem);
- mlx5_ib_dbg(dev, "err %d\n", err);
- goto error;
- } else {
- wait_for_completion(&umr_context.done);
- up(&umrc->sem);
- }
- if (umr_context.status != IB_WC_SUCCESS) {
- mlx5_ib_warn(dev, "unreg umr failed\n");
- err = -EFAULT;
- goto error;
- }
- return 0;
-
-error:
- return err;
+ return mlx5_ib_post_send_wait(dev, &umr_context, &umrwr);
}
static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
@@ -1252,10 +1229,8 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
struct mlx5_ib_dev *dev = to_mdev(pd->device);
struct device *ddev = dev->ib_dev.dma_device;
struct mlx5_ib_umr_context umr_context;
- struct ib_send_wr *bad;
struct mlx5_umr_wr umrwr = {};
struct ib_sge sg;
- struct umr_common *umrc = &dev->umrc;
dma_addr_t dma = 0;
__be64 *mr_pas = NULL;
int size;
@@ -1291,21 +1266,8 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
}
/* post send request to UMR QP */
- down(&umrc->sem);
- err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
-
- if (err) {
- mlx5_ib_warn(dev, "post send failed, err %d\n", err);
- } else {
- wait_for_completion(&umr_context.done);
- if (umr_context.status != IB_WC_SUCCESS) {
- mlx5_ib_warn(dev, "reg umr failed (%u)\n",
- umr_context.status);
- err = -EFAULT;
- }
- }
+ err = mlx5_ib_post_send_wait(dev, &umr_context, &umrwr);
- up(&umrc->sem);
if (flags & IB_MR_REREG_TRANS) {
dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
kfree(mr_pas);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
The semaphore 'sem' in isert_np is used as a completion, so convert
it to a struct completion. Semaphores are going away in the future.
Signed-off-by: Binoy Jayan <[email protected]>
---
drivers/infiniband/ulp/isert/ib_isert.c | 6 +++---
drivers/infiniband/ulp/isert/ib_isert.h | 3 ++-
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 6dd43f6..de80f56 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -619,7 +619,7 @@
mutex_unlock(&isert_np->mutex);
isert_info("np %p: Allow accept_np to continue\n", isert_np);
- up(&isert_np->sem);
+ complete(&isert_np->comp);
}
static void
@@ -2311,7 +2311,7 @@ struct rdma_cm_id *
isert_err("Unable to allocate struct isert_np\n");
return -ENOMEM;
}
- sema_init(&isert_np->sem, 0);
+ init_completion(&isert_np->comp);
mutex_init(&isert_np->mutex);
INIT_LIST_HEAD(&isert_np->accepted);
INIT_LIST_HEAD(&isert_np->pending);
@@ -2427,7 +2427,7 @@ struct rdma_cm_id *
int ret;
accept_wait:
- ret = down_interruptible(&isert_np->sem);
+ ret = wait_for_completion_interruptible(&isert_np->comp);
if (ret)
return -ENODEV;
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index c02ada5..a1277c0 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -3,6 +3,7 @@
#include <linux/in6.h>
#include <rdma/ib_verbs.h>
#include <rdma/rdma_cm.h>
+#include <linux/completion.h>
#include <rdma/rw.h>
#include <scsi/iser.h>
@@ -190,7 +191,7 @@ struct isert_device {
struct isert_np {
struct iscsi_np *np;
- struct semaphore sem;
+ struct completion comp;
struct rdma_cm_id *cm_id;
struct mutex mutex;
struct list_head accepted;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
On Tuesday, October 25, 2016 5:31:59 PM CEST Binoy Jayan wrote:
> Clean up the common code (posting a list of work requests to the send queue
> of the specified QP) at various places and add a helper function
> 'mlx5_ib_post_send_wait' that implements it. The counting semaphore
> 'umr_common::sem' is also moved into the helper. This may later be modified
> to replace the semaphore with an alternative.
>
> Signed-off-by: Binoy Jayan <[email protected]>
Looks reasonable.
> ---
> drivers/infiniband/hw/mlx5/mr.c | 96 +++++++++++++----------------------------
> 1 file changed, 29 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index d4ad672..261984b 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -856,16 +856,38 @@ static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context *context)
> init_completion(&context->done);
> }
>
> +static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
> + struct mlx5_ib_umr_context *umr_context,
> + struct mlx5_umr_wr *umrwr)
> +{
> + struct umr_common *umrc = &dev->umrc;
> + struct ib_send_wr __maybe_unused *bad;
Did you get a warning about 'bad' being unused here? I would have
guessed not, since the original code was not that different and
it does get passed into a function.
Why not move the umr_context variable into this function too?
The only thing we ever seem to do with it is initialize it and
assign the wr_cqe pointer, both of which can be done here.
Arnd
On Tue, Oct 25, 2016 at 05:31:59PM +0530, Binoy Jayan wrote:
> Clean up the common code (posting a list of work requests to the send queue
> of the specified QP) at various places and add a helper function
> 'mlx5_ib_post_send_wait' that implements it. The counting semaphore
> 'umr_common::sem' is also moved into the helper. This may later be modified
> to replace the semaphore with an alternative.
>
> Signed-off-by: Binoy Jayan <[email protected]>
> ---
> drivers/infiniband/hw/mlx5/mr.c | 96 +++++++++++++----------------------------
> 1 file changed, 29 insertions(+), 67 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index d4ad672..261984b 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -856,16 +856,38 @@ static inline void mlx5_ib_init_umr_context(struct mlx5_ib_umr_context *context)
> init_completion(&context->done);
> }
>
> +static inline int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
> + struct mlx5_ib_umr_context *umr_context,
> + struct mlx5_umr_wr *umrwr)
> +{
> + struct umr_common *umrc = &dev->umrc;
> + struct ib_send_wr __maybe_unused *bad;
> + int err;
> +
> + down(&umrc->sem);
> + err = ib_post_send(umrc->qp, &umrwr->wr, &bad);
> + if (err) {
> + mlx5_ib_warn(dev, "UMR post send failed, err %d\n", err);
> + } else {
> + wait_for_completion(&umr_context->done);
> + if (umr_context->status != IB_WC_SUCCESS) {
> + mlx5_ib_warn(dev, "reg umr failed (%u)\n",
> + umr_context->status);
> + err = -EFAULT;
> + }
> + }
> + up(&umrc->sem);
> + return err;
> +}
> +
> static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
> u64 virt_addr, u64 len, int npages,
> int page_shift, int order, int access_flags)
> {
> struct mlx5_ib_dev *dev = to_mdev(pd->device);
> struct device *ddev = dev->ib_dev.dma_device;
> - struct umr_common *umrc = &dev->umrc;
> struct mlx5_ib_umr_context umr_context;
> struct mlx5_umr_wr umrwr = {};
> - struct ib_send_wr *bad;
> struct mlx5_ib_mr *mr;
> struct ib_sge sg;
> int size;
> @@ -900,18 +922,9 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
> prep_umr_reg_wqe(pd, &umrwr.wr, &sg, dma, npages, mr->mmkey.key,
> page_shift, virt_addr, len, access_flags);
>
> - down(&umrc->sem);
> - err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
> - if (err) {
> - mlx5_ib_warn(dev, "post send failed, err %d\n", err);
> + err = mlx5_ib_post_send_wait(dev, &umr_context, &umrwr);
> + if (err != -EFAULT)
> goto unmap_dma;
In case of success (err == 0), you will jump to unmap_dma instead of
following the normal flow.
NAK,
Leon Romanovsky <[email protected]>
> - } else {
> - wait_for_completion(&umr_context.done);
> - if (umr_context.status != IB_WC_SUCCESS) {
> - mlx5_ib_warn(dev, "reg umr failed\n");
> - err = -EFAULT;
> - }
> - }
>
> mr->mmkey.iova = virt_addr;
> mr->mmkey.size = len;
> @@ -920,7 +933,6 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
> mr->live = 1;
>
> unmap_dma:
> - up(&umrc->sem);
> dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
>
> kfree(mr_pas);
> @@ -940,13 +952,11 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
> {
> struct mlx5_ib_dev *dev = mr->dev;
> struct device *ddev = dev->ib_dev.dma_device;
> - struct umr_common *umrc = &dev->umrc;
> struct mlx5_ib_umr_context umr_context;
> struct ib_umem *umem = mr->umem;
> int size;
> __be64 *pas;
> dma_addr_t dma;
> - struct ib_send_wr *bad;
> struct mlx5_umr_wr wr;
> struct ib_sge sg;
> int err = 0;
> @@ -1031,19 +1041,7 @@ int mlx5_ib_update_mtt(struct mlx5_ib_mr *mr, u64 start_page_index, int npages,
> wr.mkey = mr->mmkey.key;
> wr.target.offset = start_page_index;
>
> - down(&umrc->sem);
> - err = ib_post_send(umrc->qp, &wr.wr, &bad);
> - if (err) {
> - mlx5_ib_err(dev, "UMR post send failed, err %d\n", err);
> - } else {
> - wait_for_completion(&umr_context.done);
> - if (umr_context.status != IB_WC_SUCCESS) {
> - mlx5_ib_err(dev, "UMR completion failed, code %d\n",
> - umr_context.status);
> - err = -EFAULT;
> - }
> - }
> - up(&umrc->sem);
> + err = mlx5_ib_post_send_wait(dev, &umr_context, &wr);
> }
> dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
>
> @@ -1210,11 +1208,8 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
> static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
> {
> struct mlx5_core_dev *mdev = dev->mdev;
> - struct umr_common *umrc = &dev->umrc;
> struct mlx5_ib_umr_context umr_context;
> struct mlx5_umr_wr umrwr = {};
> - struct ib_send_wr *bad;
> - int err;
>
> if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
> return 0;
> @@ -1224,25 +1219,7 @@ static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
> umrwr.wr.wr_cqe = &umr_context.cqe;
> prep_umr_unreg_wqe(dev, &umrwr.wr, mr->mmkey.key);
>
> - down(&umrc->sem);
> - err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
> - if (err) {
> - up(&umrc->sem);
> - mlx5_ib_dbg(dev, "err %d\n", err);
> - goto error;
> - } else {
> - wait_for_completion(&umr_context.done);
> - up(&umrc->sem);
> - }
> - if (umr_context.status != IB_WC_SUCCESS) {
> - mlx5_ib_warn(dev, "unreg umr failed\n");
> - err = -EFAULT;
> - goto error;
> - }
> - return 0;
> -
> -error:
> - return err;
> + return mlx5_ib_post_send_wait(dev, &umr_context, &umrwr);
> }
>
> static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
> @@ -1252,10 +1229,8 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
> struct mlx5_ib_dev *dev = to_mdev(pd->device);
> struct device *ddev = dev->ib_dev.dma_device;
> struct mlx5_ib_umr_context umr_context;
> - struct ib_send_wr *bad;
> struct mlx5_umr_wr umrwr = {};
> struct ib_sge sg;
> - struct umr_common *umrc = &dev->umrc;
> dma_addr_t dma = 0;
> __be64 *mr_pas = NULL;
> int size;
> @@ -1291,21 +1266,8 @@ static int rereg_umr(struct ib_pd *pd, struct mlx5_ib_mr *mr, u64 virt_addr,
> }
>
> /* post send request to UMR QP */
> - down(&umrc->sem);
> - err = ib_post_send(umrc->qp, &umrwr.wr, &bad);
> -
> - if (err) {
> - mlx5_ib_warn(dev, "post send failed, err %d\n", err);
> - } else {
> - wait_for_completion(&umr_context.done);
> - if (umr_context.status != IB_WC_SUCCESS) {
> - mlx5_ib_warn(dev, "reg umr failed (%u)\n",
> - umr_context.status);
> - err = -EFAULT;
> - }
> - }
> + err = mlx5_ib_post_send_wait(dev, &umr_context, &umrwr);
>
> - up(&umrc->sem);
> if (flags & IB_MR_REREG_TRANS) {
> dma_unmap_single(ddev, dma, size, DMA_TO_DEVICE);
> kfree(mr_pas);
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
On Tuesday, October 25, 2016 5:31:57 PM CEST Binoy Jayan wrote:
> static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
> u64 out_param, unsigned long in_modifier,
> @@ -198,11 +218,12 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
> struct hns_roce_cmdq *cmd = &hr_dev->cmd;
> struct device *dev = &hr_dev->pdev->dev;
> struct hns_roce_cmd_context *context;
> - int ret = 0;
> + int orig_free_head, ret = 0;
> +
> + wait_event(cmd->wq, (orig_free_head = atomic_free_node(cmd, -1)) != -1);
>
> spin_lock(&cmd->context_lock);
> - WARN_ON(cmd->free_head < 0);
> - context = &cmd->context[cmd->free_head];
> + context = &cmd->context[orig_free_head];
> context->token += cmd->token_mask + 1;
> cmd->free_head = context->next;
> spin_unlock(&cmd->context_lock);
>
You get the lock in atomic_free_node() and then again right after that.
Why not combine the two and only take the lock inside of that
function that returns a context?
Arnd
Hi Binoy,
snip
>
> port->ib_dev = device;
> port->port_num = port_num;
> - sema_init(&port->sm_sem, 1);
> + init_completion(&port->sm_comp);
> + complete(&port->sm_comp);
Why complete here?
> mutex_init(&port->file_mutex);
> INIT_LIST_HEAD(&port->file_list);
>
> --
KR,
Jinpu
On 25 October 2016 at 17:53, Arnd Bergmann <[email protected]> wrote:
> On Tuesday, October 25, 2016 5:31:59 PM CEST Binoy Jayan wrote:
> Looks reasonable.
Thank you Arnd for looking at it again.
> Did you get a warning about 'bad' being unused here? I would have
> guessed not, since the original code was not that different and
> it does get passed into a function.
I remembered getting a warning like that, but when I remove the unused
attribute now, it does not appear. It was probably a side effect of some
other error. Will change it.
> Why not move the umr_context variable into this function too?
> The only thing we ever seem to do with it is initialize it and
> assign the wr_cqe pointer, both of which can be done here.
I guess it could be done.
Binoy
On 25 October 2016 at 17:58, Arnd Bergmann <[email protected]> wrote:
> On Tuesday, October 25, 2016 5:31:57 PM CEST Binoy Jayan wrote:
>> static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
>> u64 out_param, unsigned long in_modifier,
>> @@ -198,11 +218,12 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
>> struct hns_roce_cmdq *cmd = &hr_dev->cmd;
>> struct device *dev = &hr_dev->pdev->dev;
>> struct hns_roce_cmd_context *context;
>> - int ret = 0;
>> + int orig_free_head, ret = 0;
>> +
>> + wait_event(cmd->wq, (orig_free_head = atomic_free_node(cmd, -1)) != -1);
>>
>> spin_lock(&cmd->context_lock);
>> - WARN_ON(cmd->free_head < 0);
>> - context = &cmd->context[cmd->free_head];
>> + context = &cmd->context[orig_free_head];
>> context->token += cmd->token_mask + 1;
>> cmd->free_head = context->next;
>> spin_unlock(&cmd->context_lock);
>>
>
> You get the lock in atomic_free_node() and then again right after that.
> Why not combine the two and only take the lock inside of that
> function that returns a context?
Hi Arnd,
I couldn't figure out a way to wait for a node to be free and then acquire
the lock in an atomic fashion. If the lock is acquired after the wait_event,
there could be a race between the wait_event and acquiring the lock. If the
lock is acquired before the wait_event, the process may go to sleep with the
lock held, which is not desired. Could you suggest a way to circumvent this?
-Binoy
On 25 October 2016 at 17:56, Leon Romanovsky <[email protected]> wrote:
> On Tue, Oct 25, 2016 at 05:31:59PM +0530, Binoy Jayan wrote:
> In case of success (err == 0), you will call to unmap_dma instead of
> normal flow.
>
> NAK,
> Leon Romanovsky <[email protected]>
Hi Leon,
Even in the original code, the regular flow seems to reach 'unmap_dma' after
returning from 'wait_for_completion()'.
-Binoy
On Tuesday, October 25, 2016 6:29:45 PM CEST Binoy Jayan wrote:
> On 25 October 2016 at 17:58, Arnd Bergmann <[email protected]> wrote:
> > On Tuesday, October 25, 2016 5:31:57 PM CEST Binoy Jayan wrote:
> >> static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
> >> u64 out_param, unsigned long in_modifier,
> >> @@ -198,11 +218,12 @@ static int __hns_roce_cmd_mbox_wait(struct hns_roce_dev *hr_dev, u64 in_param,
> >> struct hns_roce_cmdq *cmd = &hr_dev->cmd;
> >> struct device *dev = &hr_dev->pdev->dev;
> >> struct hns_roce_cmd_context *context;
> >> - int ret = 0;
> >> + int orig_free_head, ret = 0;
> >> +
> >> + wait_event(cmd->wq, (orig_free_head = atomic_free_node(cmd, -1)) != -1);
> >>
> >> spin_lock(&cmd->context_lock);
> >> - WARN_ON(cmd->free_head < 0);
> >> - context = &cmd->context[cmd->free_head];
> >> + context = &cmd->context[orig_free_head];
> >> context->token += cmd->token_mask + 1;
> >> cmd->free_head = context->next;
> >> spin_unlock(&cmd->context_lock);
> >>
> >
> > You get the lock in atomic_free_node() and then again right after that.
> > Why not combine the two and only take the lock inside of that
> > function that returns a context?
>
>
> Hi Arnd,
>
> I couldn't figure out a way to wait for a node to be free and then acquire
> the lock in an atomic fashion. If the lock is acquired after the wait_event,
> there could be a race between the wait_event and acquiring the lock. If the
> lock is acquired before the wait_event, the process may go to sleep with the
> lock held, which is not desired. Could you suggest a way to circumvent this?
Something like
static struct hns_roce_cmd_context *hns_roce_try_get_context(struct hns_roce_cmdq *cmd)
{
struct hns_roce_cmd_context *context = NULL;
spin_lock(&cmd->context_lock);
if (cmd->free_head < 0)
goto out;
context = &cmd->context[cmd->free_head];
... /* update free_head */
out:
spin_unlock(&cmd->context_lock);
return context;
}
...
static struct hns_roce_cmd_context *hns_roce_get_context(struct hns_roce_cmdq *cmd)
{
struct hns_roce_cmd_context *context;
wait_event(cmd->wq, (context = hns_roce_try_get_context(cmd)));
return context;
}
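__hns_roce_cmd_mbox_wait() would then start with just (a sketch reusing
the names above; once the context is off the free list it is exclusively
owned, so the token update should no longer need context_lock):

        context = hns_roce_get_context(cmd);
        context->token += cmd->token_mask + 1;

and the release path stays as in your patch: put the context back on the
free list under context_lock, then wake_up(&cmd->wq).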
Arnd
On 25 October 2016 at 18:51, Arnd Bergmann <[email protected]> wrote:
> On Tuesday, October 25, 2016 6:29:45 PM CEST Binoy Jayan wrote:
>
> Something like
>
> static struct hns_roce_cmd_context *hns_roce_try_get_context(struct hns_roce_cmdq *cmd)
> {
> struct hns_roce_cmd_context *context = NULL;
>
> spin_lock(&cmd->context_lock);
>
> if (cmd->free_head < 0)
> goto out;
>
> context = &cmd->context[cmd->free_head];
>
> ... /* update free_head */
>
> out:
> spin_unlock(&cmd->context_lock);
>
> return context;
> }
> ...
>
> static struct hns_roce_cmd_context *hns_roce_get_context(struct hns_roce_cmdq *cmd)
> {
> struct hns_roce_cmd_context *context;
>
> wait_event(cmd->wq, (context = hns_roce_try_get_context(cmd)));
>
> return context;
> }
That looks more elegant. I didn't think of that. Thank you, Arnd. :)
On 25 October 2016 at 18:13, Jack Wang <[email protected]> wrote:
> Hi Binoy,
>
> snip
>>
>> port->ib_dev = device;
>> port->port_num = port_num;
>> - sema_init(&port->sm_sem, 1);
>> + init_completion(&port->sm_comp);
>> + complete(&port->sm_comp);
>
> Why complete here?
>
>> mutex_init(&port->file_mutex);
>> INIT_LIST_HEAD(&port->file_list);
>>
>> --
> KR,
> Jinpu
Hi Jack,
ib_umad_sm_open() calls wait_for_completion_interruptible(), which comes
before ib_umad_sm_close() calls complete(). On the initial open() there
will not be anybody to signal the completion, so complete() is called to
mark the initial state. I am not sure if this is the right way to do it,
though.
-Binoy
Hi Binoy,
2016-10-25 17:08 GMT+02:00 Binoy Jayan <[email protected]>:
> On 25 October 2016 at 18:13, Jack Wang <[email protected]> wrote:
>> Hi Binoy,
>>
>> snip
>>>
>>> port->ib_dev = device;
>>> port->port_num = port_num;
>>> - sema_init(&port->sm_sem, 1);
>>> + init_completion(&port->sm_comp);
>>> + complete(&port->sm_comp);
>>
>> Why complete here?
>>
>>> mutex_init(&port->file_mutex);
>>> INIT_LIST_HEAD(&port->file_list);
>>>
>>> --
>> KR,
>> Jinpu
>
>
> Hi Jack,
>
> ib_umad_sm_open() calls wait_for_completion_interruptible(), which comes
> before ib_umad_sm_close() calls complete(). On the initial open() there
> will not be anybody to signal the completion, so complete() is called to
> mark the initial state. I am not sure if this is the right way to do it,
> though.
>
> -Binoy
From Documentation/scheduler/completion.txt:
"
This is not implying any temporal order on wait_for_completion() and the
call to complete() - if the call to complete() happened before the call
to wait_for_completion() then the waiting side simply will continue
immediately as all dependencies are satisfied; if not, it will block until
completion is signaled by complete().
"
In this case, if sm_open/sm_close are paired, it should work.
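For the initial state it boils down to this pattern (a minimal sketch
with made-up names, not the actual patch):

        struct completion free;         /* "resource available" */

        init_completion(&free);
        complete(&free);        /* start out available, like sema_init(sem, 1) */

        /* first open(): returns immediately, the resource is now held */
        wait_for_completion(&free);

        /* close(): release it for the next opener */
        complete(&free);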
KR
Jack
On Tue, Oct 25, 2016 at 08:38:21PM +0530, Binoy Jayan wrote:
> On 25 October 2016 at 18:13, Jack Wang <[email protected]> wrote:
> > Hi Binoy,
> >
> > snip
> >>
> >> port->ib_dev = device;
> >> port->port_num = port_num;
> >> - sema_init(&port->sm_sem, 1);
> >> + init_completion(&port->sm_comp);
> >> + complete(&port->sm_comp);
> >
> > Why complete here?
> >
> >> mutex_init(&port->file_mutex);
> >> INIT_LIST_HEAD(&port->file_list);
> >>
> > KR,
> > Jinpu
>
>
> Hi Jack,
>
> ib_umad_sm_open() calls wait_for_completion_interruptible() which
> comes before ib_umad_sm_close() that calls complete(). In the
> initial open() there will not be anybody to signal the completion,
> so the complete is called to mark the initial state. I am not sure
> if this is the right way to do it, though.
Using a completion to model exclusive ownership seems convoluted to
me - is that a thing now? What about an atomic?
open:
while (true) {
wait_event_interruptible(priv->queue, !test_bit(CLAIMED_BIT, &priv->flags));
if (!test_and_set_bit(CLAIMED_BIT, &priv->flags))
break;
}
close():
clear_bit(CLAIMED_BIT, &priv->flags)
wake_up(&priv->queue);
??
Jason
On Tuesday, October 25, 2016 9:48:22 AM CEST Jason Gunthorpe wrote:
>
> Using a completion to model exclusive ownership seems convoluted to
> me - is that a thing now? What about an atomic?
I also totally missed how this is used. I initially mentioned to Binoy
that almost all semaphores can either be converted to a mutex or a
completion unless they rely on the counting behavior of the semaphore.
This driver clearly falls into another category and none of the above
apply here.
> open:
>
> while (true) {
> wait_event_interruptible(priv->queue, !test_bit(CLAIMED_BIT, &priv->flags));
> if (!test_and_set_bit(CLAIMED_BIT, &priv->flags))
> break;
> }
I'd fold the test_and_set_bit() into the wait_event() and get rid of the
loop here, otherwise this looks nice.
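Something like this (untested):

        wait_event_interruptible(priv->queue,
                                 !test_and_set_bit(CLAIMED_BIT, &priv->flags));

If the bit was already set we go back to sleep; if it was clear,
test_and_set_bit() claims it and returns 0, so the condition is true and
we proceed. close() then does clear_bit() followed by wake_up() as before.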
I'm generally a bit suspicious of open() functions that try to
protect you from having multiple file descriptors open, as dup()
or fork() will also result in extra file descriptors, but this
use here seems harmless, as the device itself only supports
open() and close() and the only thing it does is to keep track
of whether it is open or not.
It's also interesting that the ib_modify_port() function also
keeps track of a flag that is set in open() and cleared
in close(), so in principle we should be able to unify this
with the semaphore, but the wait_event() approach is certainly
much simpler.
Arnd
On Tue, Oct 25, 2016 at 06:46:58PM +0530, Binoy Jayan wrote:
> On 25 October 2016 at 17:56, Leon Romanovsky <[email protected]> wrote:
> > On Tue, Oct 25, 2016 at 05:31:59PM +0530, Binoy Jayan wrote:
>
> > In case of success (err == 0), you will call to unmap_dma instead of
> > normal flow.
> >
> > NAK,
> > Leon Romanovsky <[email protected]>
>
> Hi Leon,
>
> Even in the original code, the regular flow seems to reach 'unmap_dma' after
> returning from 'wait_for_completion()'.
In the original flow, the code executed the assignments to mr->mmkey. In your
code, they are skipped.
http://lxr.free-electrons.com/source/drivers/infiniband/hw/mlx5/mr.c#L900
>
> -Binoy
On 27 October 2016 at 11:35, Leon Romanovsky <[email protected]> wrote:
> On Tue, Oct 25, 2016 at 06:46:58PM +0530, Binoy Jayan wrote:
>> On 25 October 2016 at 17:56, Leon Romanovsky <[email protected]> wrote:
>> > On Tue, Oct 25, 2016 at 05:31:59PM +0530, Binoy Jayan wrote:
>>
>> > In case of success (err == 0), you will call to unmap_dma instead of
>> > normal flow.
>> >
>> > NAK,
>> > Leon Romanovsky <[email protected]>
>>
>> Hi Leon,
>>
>> Even in the original code, the regular flow seems to reach 'unmap_dma' after
>> returning from 'wait_for_completion()'.
>
> In the original flow, the code executed the assignments to mr->mmkey. In your
> code, they are skipped.
>
Yes, you are right, I just noticed it. My bad. I've changed it now.
Thanks,
Binoy