2017-06-05 07:37:02

by Jia-Ju Bai

[permalink] [raw]
Subject: [PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
init_send_wqe
copy_from_user --> may sleep

To fix it, the lock is released before copy_from_user, and the lock is
acquired again after this function. The parameter "flags" is used to
restore and save the irq status.


Signed-off-by: Jia-Ju Bai <[email protected]>
---
V3:
* It corrects the mistakes of remaining legacy code in V2.
(Thank Ram for pointing it out)

V2:
* The parameter "flags" is added to restore and save the irq status.
Thank Leon for good advice.

drivers/infiniband/sw/rxe/rxe_verbs.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 83d709e..5293d15 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct rxe_send_wr *wr,

static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,
unsigned int mask, unsigned int length,
- struct rxe_send_wqe *wqe)
+ struct rxe_send_wqe *wqe, unsigned long *flags)
{
int num_sge = ibwr->num_sge;
struct ib_sge *sge;
- int i;
+ int i, err;
u8 *p;

init_send_wr(qp, &wqe->wr, ibwr);
@@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,

sge = ibwr->sg_list;
for (i = 0; i < num_sge; i++, sge++) {
- if (qp->is_user && copy_from_user(p, (__user void *)
- (uintptr_t)sge->addr, sge->length))
+ spin_unlock_irqrestore(&qp->sq.sq_lock, *flags);
+ err = copy_from_user(p, (__user void *)
+ (uintptr_t)sge->addr, sge->length);
+ spin_lock_irqsave(&qp->sq.sq_lock, *flags);
+ if (qp->is_user && err)
return -EFAULT;

else if (!qp->is_user)
@@ -794,7 +797,7 @@ static int post_one_send(struct rxe_qp *qp, struct ib_send_wr *ibwr,

send_wqe = producer_addr(sq->queue);

- err = init_send_wqe(qp, ibwr, mask, length, send_wqe);
+ err = init_send_wqe(qp, ibwr, mask, length, send_wqe, &flags);
if (unlikely(err))
goto err1;

--
1.7.9.5



2017-06-05 07:45:22

by Yuval Shaia

[permalink] [raw]
Subject: Re: [PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

On Mon, Jun 05, 2017 at 03:39:02PM +0800, Jia-Ju Bai wrote:
> The driver may sleep under a spin lock, and the function call path is:
> post_one_send (acquire the lock by spin_lock_irqsave)
> init_send_wqe
> copy_from_user --> may sleep
>
> To fix it, the lock is released before copy_from_user, and the lock is
> acquired again after this function. The parameter "flags" is used to
> restore and save the irq status.
>
>
> Signed-off-by: Jia-Ju Bai <[email protected]>
> ---
> V3:
> * It corrects the mistakes of remaining legacy code in V2.
> (Thank Ram for pointing it out)
>
> V2:
> * The parameter "flags" is added to restore and save the irq status.
> Thank Leon for good advice.
>

My mistake for not mentioning it specifically but in addition to moving
such version history after the "---" you should also add (manually) a
terminating "---" after the block.

> drivers/infiniband/sw/rxe/rxe_verbs.c | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 83d709e..5293d15 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -721,11 +721,11 @@ static void init_send_wr(struct rxe_qp *qp, struct rxe_send_wr *wr,
>
> static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,
> unsigned int mask, unsigned int length,
> - struct rxe_send_wqe *wqe)
> + struct rxe_send_wqe *wqe, unsigned long *flags)
> {
> int num_sge = ibwr->num_sge;
> struct ib_sge *sge;
> - int i;
> + int i, err;
> u8 *p;
>
> init_send_wr(qp, &wqe->wr, ibwr);
> @@ -740,8 +740,11 @@ static int init_send_wqe(struct rxe_qp *qp, struct ib_send_wr *ibwr,
>
> sge = ibwr->sg_list;
> for (i = 0; i < num_sge; i++, sge++) {
> - if (qp->is_user && copy_from_user(p, (__user void *)
> - (uintptr_t)sge->addr, sge->length))
> + spin_unlock_irqrestore(&qp->sq.sq_lock, *flags);
> + err = copy_from_user(p, (__user void *)
> + (uintptr_t)sge->addr, sge->length);
> + spin_lock_irqsave(&qp->sq.sq_lock, *flags);
> + if (qp->is_user && err)
> return -EFAULT;
>
> else if (!qp->is_user)
> @@ -794,7 +797,7 @@ static int post_one_send(struct rxe_qp *qp, struct ib_send_wr *ibwr,
>
> send_wqe = producer_addr(sq->queue);
>
> - err = init_send_wqe(qp, ibwr, mask, length, send_wqe);
> + err = init_send_wqe(qp, ibwr, mask, length, send_wqe, &flags);
> if (unlikely(err))
> goto err1;
>
> --
> 1.7.9.5
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2017-06-05 08:30:17

by Moni Shoua

[permalink] [raw]
Subject: Re: [PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

> - if (qp->is_user && copy_from_user(p, (__user void *)
> - (uintptr_t)sge->addr, sge->length))
> + spin_unlock_irqrestore(&qp->sq.sq_lock, *flags);
> + err = copy_from_user(p, (__user void *)
> + (uintptr_t)sge->addr, sge->length);
> + spin_lock_irqsave(&qp->sq.sq_lock, *flags);
> + if (qp->is_user && err)
> return -EFAULT;
qp-_is_user is always false in this function (flow starts from
rxe_post_send_kernel) so this line is a dead code
In fact, this patch seems to add a serious bug when it uses
copy_from_user() from a non user pointer.
Do you agree?

2017-06-05 08:39:14

by Jia-Ju Bai

[permalink] [raw]
Subject: Re: [PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

On 06/05/2017 04:30 PM, Moni Shoua wrote:
>> - if (qp->is_user&& copy_from_user(p, (__user void *)
>> - (uintptr_t)sge->addr, sge->length))
>> + spin_unlock_irqrestore(&qp->sq.sq_lock, *flags);
>> + err = copy_from_user(p, (__user void *)
>> + (uintptr_t)sge->addr, sge->length);
>> + spin_lock_irqsave(&qp->sq.sq_lock, *flags);
>> + if (qp->is_user&& err)
>> return -EFAULT;
> qp-_is_user is always false in this function (flow starts from
> rxe_post_send_kernel) so this line is a dead code
> In fact, this patch seems to add a serious bug when it uses
> copy_from_user() from a non user pointer.
> Do you agree?
I agree.
So, it is fine to me to remove this line, as you said in the former email:

> Second, I think that there is no flow that leads to this function
> when qp->is user is true so maybe the correct action is to remove this
> line completely
> if (qp->is_user&& copy_from_user(p, (__user void *)



2017-06-05 09:21:22

by Moni Shoua

[permalink] [raw]
Subject: Re: [PATCH V3] rxe: Fix a sleep-in-atomic bug in post_one_send

> I agree.
> So, it is fine to me to remove this line, as you said in the former email:
>
Thanks.
Can you please send a patch like that?