2018-09-01 12:08:39

by Jia-Ju Bai

Subject: [PATCH] infiniband: core: mad: Fix a sleep-in-atomic-context bug in ib_mad_recv_done()

The driver may sleep while holding a spinlock.

The function call paths (from bottom to top) in Linux-4.16 are:

[FUNC] alloc_mad_private(GFP_KERNEL)
drivers/infiniband/core/mad.c, 2264:
alloc_mad_private in ib_mad_recv_done
drivers/infiniband/core/cq.c, 45:
[FUNC_PTR]ib_mad_recv_done in __ib_process_cq
drivers/infiniband/core/cq.c, 77:
__ib_process_cq in ib_process_cq_direct
drivers/infiniband/ulp/srp/ib_srp.c, 2010:
ib_process_cq_direct in __srp_get_tx_iu
drivers/infiniband/ulp/srp/ib_srp.c, 2353:
__srp_get_tx_iu in srp_queuecommand
drivers/infiniband/ulp/srp/ib_srp.c, 2352:
_raw_spin_lock_irqsave in srp_queuecommand

[FUNC] alloc_mad_private(GFP_KERNEL)
drivers/infiniband/core/mad.c, 2264:
alloc_mad_private in ib_mad_recv_done
drivers/infiniband/core/cq.c, 45:
[FUNC_PTR]ib_mad_recv_done in __ib_process_cq
drivers/infiniband/core/cq.c, 77:
__ib_process_cq in ib_process_cq_direct
drivers/infiniband/ulp/srp/ib_srp.c, 2010:
ib_process_cq_direct in __srp_get_tx_iu
drivers/infiniband/ulp/srp/ib_srp.c, 2903:
__srp_get_tx_iu in srp_send_tsk_mgmt
drivers/infiniband/ulp/srp/ib_srp.c, 2902:
spin_lock_irq in srp_send_tsk_mgmt

To fix this bug, GFP_KERNEL is replaced with GFP_ATOMIC.

This bug was found by my static analysis tool DSAC.

Signed-off-by: Jia-Ju Bai <[email protected]>
---
drivers/infiniband/core/mad.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index f742ae7a768b..0db954f6958a 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -2263,7 +2263,7 @@ static void ib_mad_recv_done(struct ib_cq *cq, struct ib_wc *wc)
 		goto out;

 	mad_size = recv->mad_size;
-	response = alloc_mad_private(mad_size, GFP_KERNEL);
+	response = alloc_mad_private(mad_size, GFP_ATOMIC);
 	if (!response)
 		goto out;

--
2.17.0
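
The class of bug this patch targets can be illustrated with a small
userspace model. This is only a sketch: every model_* name and the
MODEL_GFP_* flags are hypothetical stand-ins, not kernel APIs, and
malloc() stands in for the real allocator. The model counts a violation
whenever an allocation that is allowed to block is requested while a
modelled spinlock is held.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GFP_KERNEL (may block) and GFP_ATOMIC (may not). */
enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

static int atomic_depth;    /* > 0 while a modelled spinlock is held */
static int sleep_in_atomic; /* number of detected violations */

static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* Allocate memory and record a violation if a blocking request
 * is made in a modelled atomic context. */
static void *model_alloc(size_t size, enum model_gfp flags)
{
	if (flags == MODEL_GFP_KERNEL && atomic_depth > 0)
		sleep_in_atomic++;
	return malloc(size);
}

/* Allocate under the modelled lock; return the number of new violations. */
static int model_alloc_under_lock(enum model_gfp flags)
{
	int before = sleep_in_atomic;
	void *p;

	model_spin_lock();
	p = model_alloc(64, flags);
	model_spin_unlock();
	free(p);
	return sleep_in_atomic - before;
}
```

In this model, model_alloc_under_lock(MODEL_GFP_KERNEL) reports one
violation and model_alloc_under_lock(MODEL_GFP_ATOMIC) reports none.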



2018-09-02 20:43:53

by Jason Gunthorpe

Subject: Re: [PATCH] infiniband: core: mad: Fix a sleep-in-atomic-context bug in ib_mad_recv_done()

On Sat, Sep 01, 2018 at 08:06:59PM +0800, Jia-Ju Bai wrote:
> The driver may sleep while holding a spinlock.
>
> The function call paths (from bottom to top) in Linux-4.16 are:
>
> [FUNC] alloc_mad_private(GFP_KERNEL)
> drivers/infiniband/core/mad.c, 2264:
> alloc_mad_private in ib_mad_recv_done
> drivers/infiniband/core/cq.c, 45:
> [FUNC_PTR]ib_mad_recv_done in __ib_process_cq
> drivers/infiniband/core/cq.c, 77:
> __ib_process_cq in ib_process_cq_direct
> drivers/infiniband/ulp/srp/ib_srp.c, 2010:
> ib_process_cq_direct in __srp_get_tx_iu
> drivers/infiniband/ulp/srp/ib_srp.c, 2353:
> __srp_get_tx_iu in srp_queuecommand
> drivers/infiniband/ulp/srp/ib_srp.c, 2352:
> _raw_spin_lock_irqsave in srp_queuecommand
>
> [FUNC] alloc_mad_private(GFP_KERNEL)
> drivers/infiniband/core/mad.c, 2264:
> alloc_mad_private in ib_mad_recv_done
> drivers/infiniband/core/cq.c, 45:
> [FUNC_PTR]ib_mad_recv_done in __ib_process_cq
> drivers/infiniband/core/cq.c, 77:
> __ib_process_cq in ib_process_cq_direct

This trace doesn't seem right, the CQ used by SRP will never have
ib_mad_recv_done as a function pointer.

Jason

2018-09-03 01:43:50

by Jia-Ju Bai

Subject: Re: [PATCH] infiniband: core: mad: Fix a sleep-in-atomic-context bug in ib_mad_recv_done()



On 2018/9/3 4:32, Jason Gunthorpe wrote:
> On Sat, Sep 01, 2018 at 08:06:59PM +0800, Jia-Ju Bai wrote:
>> The driver may sleep while holding a spinlock.
>>
>> The function call paths (from bottom to top) in Linux-4.16 are:
>>
>> [FUNC] alloc_mad_private(GFP_KERNEL)
>> drivers/infiniband/core/mad.c, 2264:
>> alloc_mad_private in ib_mad_recv_done
>> drivers/infiniband/core/cq.c, 45:
>> [FUNC_PTR]ib_mad_recv_done in __ib_process_cq
>> drivers/infiniband/core/cq.c, 77:
>> __ib_process_cq in ib_process_cq_direct
>> drivers/infiniband/ulp/srp/ib_srp.c, 2010:
>> ib_process_cq_direct in __srp_get_tx_iu
>> drivers/infiniband/ulp/srp/ib_srp.c, 2353:
>> __srp_get_tx_iu in srp_queuecommand
>> drivers/infiniband/ulp/srp/ib_srp.c, 2352:
>> _raw_spin_lock_irqsave in srp_queuecommand
>>
>> [FUNC] alloc_mad_private(GFP_KERNEL)
>> drivers/infiniband/core/mad.c, 2264:
>> alloc_mad_private in ib_mad_recv_done
>> drivers/infiniband/core/cq.c, 45:
>> [FUNC_PTR]ib_mad_recv_done in __ib_process_cq
>> drivers/infiniband/core/cq.c, 77:
>> __ib_process_cq in ib_process_cq_direct
> This trace doesn't seem right, the CQ used by SRP will never have
> ib_mad_recv_done as a function pointer.

Okay, sorry for this false positive.


Best wishes,
Jia-Ju Bai
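
The objection raised in the review can be illustrated with a simplified
userspace model of completion dispatch. This is a sketch under stated
assumptions: it assumes that each completion carries a pointer to its
own completion entry with a "done" callback, and that processing a CQ
calls only the callbacks of completions queued on that CQ. All model_*
names are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_CQ_DEPTH 8

struct model_wc;

/* Per-work-request completion entry carrying its own callback. */
struct model_cqe {
	void (*done)(struct model_wc *wc);
};

/* A completion; it points back to the entry of the request that caused it. */
struct model_wc {
	struct model_cqe *wr_cqe;
};

struct model_cq {
	struct model_wc queue[MODEL_CQ_DEPTH];
	size_t head, tail;
};

static int mad_calls; /* calls into the modelled MAD receive handler */
static int srp_calls; /* calls into the modelled SRP handler */

static void model_mad_recv_done(struct model_wc *wc) { (void)wc; mad_calls++; }
static void model_srp_done(struct model_wc *wc) { (void)wc; srp_calls++; }

/* Queue a completion for a request that was posted with the given entry. */
static void model_post(struct model_cq *cq, struct model_cqe *cqe)
{
	assert(cq->tail < MODEL_CQ_DEPTH);
	cq->queue[cq->tail++].wr_cqe = cqe;
}

/* Drain one CQ, calling the callback attached to each completion. */
static int model_process_cq(struct model_cq *cq)
{
	int n = 0;

	while (cq->head < cq->tail) {
		struct model_wc *wc = &cq->queue[cq->head++];

		if (wc->wr_cqe && wc->wr_cqe->done)
			wc->wr_cqe->done(wc);
		n++;
	}
	return n;
}

/* Poll only the SRP CQ; return how often the MAD handler ran. */
static int model_demo(void)
{
	struct model_cq srp_cq = {0}, mad_cq = {0};
	struct model_cqe srp_cqe = { .done = model_srp_done };
	struct model_cqe mad_cqe = { .done = model_mad_recv_done };

	mad_calls = srp_calls = 0;
	model_post(&srp_cq, &srp_cqe);
	model_post(&srp_cq, &srp_cqe);
	model_post(&mad_cq, &mad_cqe);

	model_process_cq(&srp_cq);
	return mad_calls;
}
```

In this model, draining the SRP CQ runs only the SRP callback. An
analysis that resolves the function pointer to every callback of the
same type, regardless of which CQ the request was posted on, would
report the MAD handler as reachable here even though it is never called.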