The documentation of the function rvt_error_qp says both r_lock and
s_lock need to be held when calling that function.
It also asserts using lockdep that both of those locks are held.
rvt_error_qp is called form rvt_send_cq, which is called from
rvt_qp_complete_swqe, which is called from rvt_send_complete, which is
called from rvt_ruc_loopback in two places. Both of these places do not
hold r_lock. Fix this by acquiring a spin_lock of r_lock in both of
these places.
The r_lock acquiring cannot be added in rvt_qp_complete_swqe because
some of its other callers already have r_lock acquired.
Signed-off-by: Niels Dossche <[email protected]>
---
drivers/infiniband/sw/rdmavt/qp.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index ae50b56e8913..3bac4f90a6d6 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -2775,7 +2775,7 @@ void rvt_qp_iter(struct rvt_dev_info *rdi,
EXPORT_SYMBOL(rvt_qp_iter);
/*
- * This should be called with s_lock held.
+ * This should be called with s_lock and r_lock held.
*/
void rvt_send_complete(struct rvt_qp *qp, struct rvt_swqe *wqe,
enum ib_wc_status status)
@@ -3134,7 +3134,9 @@ void rvt_ruc_loopback(struct rvt_qp *sqp)
rvp->n_loop_pkts++;
flush_send:
sqp->s_rnr_retry = sqp->s_rnr_retry_cnt;
+ spin_lock(&sqp->r_lock);
rvt_send_complete(sqp, wqe, send_status);
+ spin_unlock(&sqp->r_lock);
if (local_ops) {
atomic_dec(&sqp->local_ops_pending);
local_ops = 0;
@@ -3188,7 +3190,9 @@ void rvt_ruc_loopback(struct rvt_qp *sqp)
spin_unlock_irqrestore(&qp->r_lock, flags);
serr_no_r_lock:
spin_lock_irqsave(&sqp->s_lock, flags);
+ spin_lock(&sqp->r_lock);
rvt_send_complete(sqp, wqe, send_status);
+ spin_unlock(&sqp->r_lock);
if (sqp->ibqp.qp_type == IB_QPT_RC) {
int lastwqe = rvt_error_qp(sqp, IB_WC_WR_FLUSH_ERR);
--
2.35.1
On Mon, Feb 28, 2022 at 08:51:44PM +0100, Niels Dossche wrote:
> The documentation of the function rvt_error_qp says both r_lock and
> s_lock need to be held when calling that function.
> It also asserts using lockdep that both of those locks are held.
> rvt_error_qp is called form rvt_send_cq, which is called from
> rvt_qp_complete_swqe, which is called from rvt_send_complete, which is
> called from rvt_ruc_loopback in two places. Both of these places do not
> hold r_lock. Fix this by acquiring a spin_lock of r_lock in both of
> these places.
> The r_lock acquiring cannot be added in rvt_qp_complete_swqe because
> some of its other callers already have r_lock acquired.
>
> Signed-off-by: Niels Dossche <[email protected]>
> ---
> drivers/infiniband/sw/rdmavt/qp.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
Applied to for-next, thanks
Jason