2018-09-05 08:42:12

by Jia-Ju Bai

[permalink] [raw]
Subject: [PATCH v2] lightnvm: pblk-recovery: lightnvm: pblk: Fix two sleep-in-atomic-context bugs

The driver may sleep with holding a spinlock.

The function call paths (from bottom to top) in Linux-4.16 are:

[FUNC] nvm_dev_dma_alloc(GFP_KERNEL)
drivers/lightnvm/pblk-core.c, 754:
nvm_dev_dma_alloc in pblk_line_submit_smeta_io
drivers/lightnvm/pblk-core.c, 1048:
pblk_line_submit_smeta_io in pblk_line_init_bb
drivers/lightnvm/pblk-core.c, 1434:
pblk_line_init_bb in pblk_line_replace_data
drivers/lightnvm/pblk-recovery.c, 980:
pblk_line_replace_data in pblk_recov_l2p
drivers/lightnvm/pblk-recovery.c, 976:
spin_lock in pblk_recov_l2p

[FUNC] bio_map_kern(GFP_KERNEL)
drivers/lightnvm/pblk-core.c, 762:
bio_map_kern in pblk_line_submit_smeta_io
drivers/lightnvm/pblk-core.c, 1048:
pblk_line_submit_smeta_io in pblk_line_init_bb
drivers/lightnvm/pblk-core.c, 1434:
pblk_line_init_bb in pblk_line_replace_data
drivers/lightnvm/pblk-recovery.c, 980:
pblk_line_replace_data in pblk_recov_l2p
drivers/lightnvm/pblk-recovery.c, 976:
spin_lock in pblk_recov_l2p

To fix these bugs, the call to pblk_line_replace_data()
is moved out of the spinlock protection.

These bugs are found by my static analysis tool DSAC.

Signed-off-by: Jia-Ju Bai <[email protected]>
---
v2:
* Move the call to pblk_line_replace_data() out of the spinlock
protection, instead of v1 that changes GFP_KERNEL to GFP_ATOMIC in
the calls to bio_map_kern() and nvm_dev_dma_alloc().
Thanks Javier for good advice.
---
drivers/lightnvm/pblk-recovery.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index 3a5069183859..5fde414d78bb 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -955,12 +955,14 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
}
}

- spin_lock(&l_mg->free_lock);
if (!open_lines) {
+ spin_lock(&l_mg->free_lock);
WARN_ON_ONCE(!test_and_clear_bit(meta_line,
&l_mg->meta_bitmap));
+ spin_unlock(&l_mg->free_lock);
pblk_line_replace_data(pblk);
} else {
+ spin_lock(&l_mg->free_lock);
/* Allocate next line for preparation */
l_mg->data_next = pblk_line_get(pblk);
if (l_mg->data_next) {
@@ -968,8 +970,8 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
l_mg->data_next->type = PBLK_LINETYPE_DATA;
is_next = 1;
}
+ spin_unlock(&l_mg->free_lock);
}
- spin_unlock(&l_mg->free_lock);

if (is_next)
pblk_line_erase(pblk, l_mg->data_next);
--
2.17.0



2018-09-06 09:42:18

by Hans Holmberg

[permalink] [raw]
Subject: Re: [PATCH v2] lightnvm: pblk-recovery: lightnvm: pblk: Fix two sleep-in-atomic-context bugs

On Wed, Sep 5, 2018 at 10:40 AM Jia-Ju Bai <[email protected]> wrote:
>
> The driver may sleep with holding a spinlock.
>
> The function call paths (from bottom to top) in Linux-4.16 are:
>
> [FUNC] nvm_dev_dma_alloc(GFP_KERNEL)
> drivers/lightnvm/pblk-core.c, 754:
> nvm_dev_dma_alloc in pblk_line_submit_smeta_io
> drivers/lightnvm/pblk-core.c, 1048:
> pblk_line_submit_smeta_io in pblk_line_init_bb
> drivers/lightnvm/pblk-core.c, 1434:
> pblk_line_init_bb in pblk_line_replace_data
> drivers/lightnvm/pblk-recovery.c, 980:
> pblk_line_replace_data in pblk_recov_l2p
> drivers/lightnvm/pblk-recovery.c, 976:
> spin_lock in pblk_recov_l2p
>
> [FUNC] bio_map_kern(GFP_KERNEL)
> drivers/lightnvm/pblk-core.c, 762:
> bio_map_kern in pblk_line_submit_smeta_io
> drivers/lightnvm/pblk-core.c, 1048:
> pblk_line_submit_smeta_io in pblk_line_init_bb
> drivers/lightnvm/pblk-core.c, 1434:
> pblk_line_init_bb in pblk_line_replace_data
> drivers/lightnvm/pblk-recovery.c, 980:
> pblk_line_replace_data in pblk_recov_l2p
> drivers/lightnvm/pblk-recovery.c, 976:
> spin_lock in pblk_recov_l2p
>
> To fix these bugs, the call to pblk_line_replace_data()
> is moved out of the spinlock protection.
>
> These bugs are found by my static analysis tool DSAC.
>
> Signed-off-by: Jia-Ju Bai <[email protected]>
> ---
> v2:
> * Move the call to pblk_line_replace_data() out of the spinlock
> protection, instead of v1 that changes GFP_KERNEL to GFP_ATOMIC in
> the calls to bio_map_kern() and nvm_dev_dma_alloc().
> Thanks Javier for good advice.
> ---
> drivers/lightnvm/pblk-recovery.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index 3a5069183859..5fde414d78bb 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -955,12 +955,14 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
> }
> }
>
> - spin_lock(&l_mg->free_lock);
> if (!open_lines) {
> + spin_lock(&l_mg->free_lock);
> WARN_ON_ONCE(!test_and_clear_bit(meta_line,
> &l_mg->meta_bitmap));
> + spin_unlock(&l_mg->free_lock);
> pblk_line_replace_data(pblk);
> } else {
> + spin_lock(&l_mg->free_lock);
> /* Allocate next line for preparation */
> l_mg->data_next = pblk_line_get(pblk);
> if (l_mg->data_next) {
> @@ -968,8 +970,8 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
> l_mg->data_next->type = PBLK_LINETYPE_DATA;
> is_next = 1;
> }
> + spin_unlock(&l_mg->free_lock);
> }
> - spin_unlock(&l_mg->free_lock);

Wouldn't the most straight forward solution here to not take the lock?

As pblk is doing all the recovery sequentially during initialization,
I don't see the
reason for grabbing the lock.

>
> if (is_next)
> pblk_line_erase(pblk, l_mg->data_next);
> --
> 2.17.0
>

2018-09-06 13:13:32

by Javier Gonzalez

[permalink] [raw]
Subject: Re: [PATCH v2] lightnvm: pblk-recovery: lightnvm: pblk: Fix two sleep-in-atomic-context bugs

> On 6 Sep 2018, at 00.51, Hans Holmberg <[email protected]> wrote:
>
> On Wed, Sep 5, 2018 at 10:40 AM Jia-Ju Bai <[email protected]> wrote:
>> The driver may sleep with holding a spinlock.
>>
>> The function call paths (from bottom to top) in Linux-4.16 are:
>>
>> [FUNC] nvm_dev_dma_alloc(GFP_KERNEL)
>> drivers/lightnvm/pblk-core.c, 754:
>> nvm_dev_dma_alloc in pblk_line_submit_smeta_io
>> drivers/lightnvm/pblk-core.c, 1048:
>> pblk_line_submit_smeta_io in pblk_line_init_bb
>> drivers/lightnvm/pblk-core.c, 1434:
>> pblk_line_init_bb in pblk_line_replace_data
>> drivers/lightnvm/pblk-recovery.c, 980:
>> pblk_line_replace_data in pblk_recov_l2p
>> drivers/lightnvm/pblk-recovery.c, 976:
>> spin_lock in pblk_recov_l2p
>>
>> [FUNC] bio_map_kern(GFP_KERNEL)
>> drivers/lightnvm/pblk-core.c, 762:
>> bio_map_kern in pblk_line_submit_smeta_io
>> drivers/lightnvm/pblk-core.c, 1048:
>> pblk_line_submit_smeta_io in pblk_line_init_bb
>> drivers/lightnvm/pblk-core.c, 1434:
>> pblk_line_init_bb in pblk_line_replace_data
>> drivers/lightnvm/pblk-recovery.c, 980:
>> pblk_line_replace_data in pblk_recov_l2p
>> drivers/lightnvm/pblk-recovery.c, 976:
>> spin_lock in pblk_recov_l2p
>>
>> To fix these bugs, the call to pblk_line_replace_data()
>> is moved out of the spinlock protection.
>>
>> These bugs are found by my static analysis tool DSAC.
>>
>> Signed-off-by: Jia-Ju Bai <[email protected]>
>> ---
>> v2:
>> * Move the call to pblk_line_replace_data() out of the spinlock
>> protection, instead of v1 that changes GFP_KERNEL to GFP_ATOMIC in
>> the calls to bio_map_kern() and nvm_dev_dma_alloc().
>> Thanks Javier for good advice.
>> ---
>> drivers/lightnvm/pblk-recovery.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>> index 3a5069183859..5fde414d78bb 100644
>> --- a/drivers/lightnvm/pblk-recovery.c
>> +++ b/drivers/lightnvm/pblk-recovery.c
>> @@ -955,12 +955,14 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
>> }
>> }
>>
>> - spin_lock(&l_mg->free_lock);
>> if (!open_lines) {
>> + spin_lock(&l_mg->free_lock);
>> WARN_ON_ONCE(!test_and_clear_bit(meta_line,
>> &l_mg->meta_bitmap));
>> + spin_unlock(&l_mg->free_lock);
>> pblk_line_replace_data(pblk);
>> } else {
>> + spin_lock(&l_mg->free_lock);
>> /* Allocate next line for preparation */
>> l_mg->data_next = pblk_line_get(pblk);
>> if (l_mg->data_next) {
>> @@ -968,8 +970,8 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
>> l_mg->data_next->type = PBLK_LINETYPE_DATA;
>> is_next = 1;
>> }
>> + spin_unlock(&l_mg->free_lock);
>> }
>> - spin_unlock(&l_mg->free_lock);
>
> Wouldn't the most straight forward solution here to not take the lock?
>
> As pblk is doing all the recovery sequentially during initialization,
> I don't see the
> reason for grabbing the lock.
>

We start doing GC as we recover and, even though right now there is no
interference, the code might change. For such a little change, I would
rather maintain lock consistency than being smart about where we can
avoid locking - specially when this is not the fast path.

Javier


Attachments:
signature.asc (849.00 B)
Message signed with OpenPGP

2018-09-06 13:20:35

by Javier Gonzalez

[permalink] [raw]
Subject: Re: [PATCH v2] lightnvm: pblk-recovery: lightnvm: pblk: Fix two sleep-in-atomic-context bugs

> On 5 Sep 2018, at 01.40, Jia-Ju Bai <[email protected]> wrote:
>
> The driver may sleep with holding a spinlock.
>
> The function call paths (from bottom to top) in Linux-4.16 are:
>
> [FUNC] nvm_dev_dma_alloc(GFP_KERNEL)
> drivers/lightnvm/pblk-core.c, 754:
> nvm_dev_dma_alloc in pblk_line_submit_smeta_io
> drivers/lightnvm/pblk-core.c, 1048:
> pblk_line_submit_smeta_io in pblk_line_init_bb
> drivers/lightnvm/pblk-core.c, 1434:
> pblk_line_init_bb in pblk_line_replace_data
> drivers/lightnvm/pblk-recovery.c, 980:
> pblk_line_replace_data in pblk_recov_l2p
> drivers/lightnvm/pblk-recovery.c, 976:
> spin_lock in pblk_recov_l2p
>
> [FUNC] bio_map_kern(GFP_KERNEL)
> drivers/lightnvm/pblk-core.c, 762:
> bio_map_kern in pblk_line_submit_smeta_io
> drivers/lightnvm/pblk-core.c, 1048:
> pblk_line_submit_smeta_io in pblk_line_init_bb
> drivers/lightnvm/pblk-core.c, 1434:
> pblk_line_init_bb in pblk_line_replace_data
> drivers/lightnvm/pblk-recovery.c, 980:
> pblk_line_replace_data in pblk_recov_l2p
> drivers/lightnvm/pblk-recovery.c, 976:
> spin_lock in pblk_recov_l2p
>
> To fix these bugs, the call to pblk_line_replace_data()
> is moved out of the spinlock protection.
>
> These bugs are found by my static analysis tool DSAC.
>
> Signed-off-by: Jia-Ju Bai <[email protected]>
> ---
> v2:
> * Move the call to pblk_line_replace_data() out of the spinlock
> protection, instead of v1 that changes GFP_KERNEL to GFP_ATOMIC in
> the calls to bio_map_kern() and nvm_dev_dma_alloc().
> Thanks Javier for good advice.
> ---
> drivers/lightnvm/pblk-recovery.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
> index 3a5069183859..5fde414d78bb 100644
> --- a/drivers/lightnvm/pblk-recovery.c
> +++ b/drivers/lightnvm/pblk-recovery.c
> @@ -955,12 +955,14 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
> }
> }
>
> - spin_lock(&l_mg->free_lock);
> if (!open_lines) {
> + spin_lock(&l_mg->free_lock);
> WARN_ON_ONCE(!test_and_clear_bit(meta_line,
> &l_mg->meta_bitmap));
> + spin_unlock(&l_mg->free_lock);
> pblk_line_replace_data(pblk);
> } else {
> + spin_lock(&l_mg->free_lock);
> /* Allocate next line for preparation */
> l_mg->data_next = pblk_line_get(pblk);
> if (l_mg->data_next) {
> @@ -968,8 +970,8 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
> l_mg->data_next->type = PBLK_LINETYPE_DATA;
> is_next = 1;
> }
> + spin_unlock(&l_mg->free_lock);
> }
> - spin_unlock(&l_mg->free_lock);
>
> if (is_next)
> pblk_line_erase(pblk, l_mg->data_next);
> --
> 2.17.0

It looks good to me. Thanks Jia-Ju.

Reviewed-by: Javier González <[email protected]>


Attachments:
signature.asc (849.00 B)
Message signed with OpenPGP

2018-09-07 10:11:06

by Matias Bjørling

[permalink] [raw]
Subject: Re: [PATCH v2] lightnvm: pblk-recovery: lightnvm: pblk: Fix two sleep-in-atomic-context bugs

On 09/06/2018 03:18 PM, Javier Gonzalez wrote:
>> On 5 Sep 2018, at 01.40, Jia-Ju Bai <[email protected]> wrote:
>>
>> The driver may sleep with holding a spinlock.
>>
>> The function call paths (from bottom to top) in Linux-4.16 are:
>>
>> [FUNC] nvm_dev_dma_alloc(GFP_KERNEL)
>> drivers/lightnvm/pblk-core.c, 754:
>> nvm_dev_dma_alloc in pblk_line_submit_smeta_io
>> drivers/lightnvm/pblk-core.c, 1048:
>> pblk_line_submit_smeta_io in pblk_line_init_bb
>> drivers/lightnvm/pblk-core.c, 1434:
>> pblk_line_init_bb in pblk_line_replace_data
>> drivers/lightnvm/pblk-recovery.c, 980:
>> pblk_line_replace_data in pblk_recov_l2p
>> drivers/lightnvm/pblk-recovery.c, 976:
>> spin_lock in pblk_recov_l2p
>>
>> [FUNC] bio_map_kern(GFP_KERNEL)
>> drivers/lightnvm/pblk-core.c, 762:
>> bio_map_kern in pblk_line_submit_smeta_io
>> drivers/lightnvm/pblk-core.c, 1048:
>> pblk_line_submit_smeta_io in pblk_line_init_bb
>> drivers/lightnvm/pblk-core.c, 1434:
>> pblk_line_init_bb in pblk_line_replace_data
>> drivers/lightnvm/pblk-recovery.c, 980:
>> pblk_line_replace_data in pblk_recov_l2p
>> drivers/lightnvm/pblk-recovery.c, 976:
>> spin_lock in pblk_recov_l2p
>>
>> To fix these bugs, the call to pblk_line_replace_data()
>> is moved out of the spinlock protection.
>>
>> These bugs are found by my static analysis tool DSAC.
>>
>> Signed-off-by: Jia-Ju Bai <[email protected]>
>> ---
>> v2:
>> * Move the call to pblk_line_replace_data() out of the spinlock
>> protection, instead of v1 that changes GFP_KERNEL to GFP_ATOMIC in
>> the calls to bio_map_kern() and nvm_dev_dma_alloc().
>> Thanks Javier for good advice.
>> ---
>> drivers/lightnvm/pblk-recovery.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
>> index 3a5069183859..5fde414d78bb 100644
>> --- a/drivers/lightnvm/pblk-recovery.c
>> +++ b/drivers/lightnvm/pblk-recovery.c
>> @@ -955,12 +955,14 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
>> }
>> }
>>
>> - spin_lock(&l_mg->free_lock);
>> if (!open_lines) {
>> + spin_lock(&l_mg->free_lock);
>> WARN_ON_ONCE(!test_and_clear_bit(meta_line,
>> &l_mg->meta_bitmap));
>> + spin_unlock(&l_mg->free_lock);
>> pblk_line_replace_data(pblk);
>> } else {
>> + spin_lock(&l_mg->free_lock);
>> /* Allocate next line for preparation */
>> l_mg->data_next = pblk_line_get(pblk);
>> if (l_mg->data_next) {
>> @@ -968,8 +970,8 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
>> l_mg->data_next->type = PBLK_LINETYPE_DATA;
>> is_next = 1;
>> }
>> + spin_unlock(&l_mg->free_lock);
>> }
>> - spin_unlock(&l_mg->free_lock);
>>
>> if (is_next)
>> pblk_line_erase(pblk, l_mg->data_next);
>> --
>> 2.17.0
>
> It looks good to me. Thanks Jia-Ju.
>
> Reviewed-by: Javier González <[email protected]>
>

Thanks. Applied for 4.20.