2019-03-14 11:17:56

by Hans Holmberg

Subject: [PATCH RFC] lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs

From: Hans Holmberg <[email protected]>

Ever since '07173c3ec276 ("block: enable multipage bvecs")' we
need to handle bios with multipage bvecs in pblk.

Currently, a multipage bvec results in a crash[1].
Fix this by using bvec iterators instead of direct bvec indexing.

Also add a dcache flush, for the same reasons as in:
'6e6e811d747b ("block: Add missing flush_dcache_page() call")'

[1] https://github.com/OpenChannelSSD/linux/issues/61

Reported-by: Klaus Jensen <[email protected]>
Signed-off-by: Hans Holmberg <[email protected]>

---

RFC to get more eyes on this - note that we're doing something
very similar to bio_copy_data_iter.

This works in my QEMU setup, and I'll start more regression testing now.

drivers/lightnvm/pblk-read.c | 50 ++++++++++++++++++++----------------
1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index 1f9b319c9737..b8eb6bdb983b 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -231,14 +231,14 @@ static void pblk_end_partial_read(struct nvm_rq *rqd)
struct pblk_sec_meta *meta;
struct bio *new_bio = rqd->bio;
struct bio *bio = pr_ctx->orig_bio;
- struct bio_vec src_bv, dst_bv;
void *meta_list = rqd->meta_list;
- int bio_init_idx = pr_ctx->bio_init_idx;
unsigned long *read_bitmap = pr_ctx->bitmap;
+ struct bvec_iter orig_iter = BVEC_ITER_ALL_INIT;
+ struct bvec_iter new_iter = BVEC_ITER_ALL_INIT;
int nr_secs = pr_ctx->orig_nr_secs;
int nr_holes = nr_secs - bitmap_weight(read_bitmap, nr_secs);
void *src_p, *dst_p;
- int hole, i;
+ int bit, i;

if (unlikely(nr_holes == 1)) {
struct ppa_addr ppa;
@@ -257,33 +257,39 @@ static void pblk_end_partial_read(struct nvm_rq *rqd)

/* Fill the holes in the original bio */
i = 0;
- hole = find_first_zero_bit(read_bitmap, nr_secs);
- do {
- struct pblk_line *line;
+ for (bit = 0; bit < nr_secs; bit++) {
+ if (!test_bit(bit, read_bitmap)) {
+ struct bio_vec dst_bv, src_bv;
+ struct pblk_line *line;

- line = pblk_ppa_to_line(pblk, rqd->ppa_list[i]);
- kref_put(&line->ref, pblk_line_put);
+ line = pblk_ppa_to_line(pblk, rqd->ppa_list[i]);
+ kref_put(&line->ref, pblk_line_put);

- meta = pblk_get_meta(pblk, meta_list, hole);
- meta->lba = cpu_to_le64(pr_ctx->lba_list_media[i]);
+ meta = pblk_get_meta(pblk, meta_list, bit);
+ meta->lba = cpu_to_le64(pr_ctx->lba_list_media[i]);

- src_bv = new_bio->bi_io_vec[i++];
- dst_bv = bio->bi_io_vec[bio_init_idx + hole];
+ dst_bv = bio_iter_iovec(bio, orig_iter);
+ src_bv = bio_iter_iovec(new_bio, new_iter);

- src_p = kmap_atomic(src_bv.bv_page);
- dst_p = kmap_atomic(dst_bv.bv_page);
+ src_p = kmap_atomic(src_bv.bv_page);
+ dst_p = kmap_atomic(dst_bv.bv_page);

- memcpy(dst_p + dst_bv.bv_offset,
- src_p + src_bv.bv_offset,
- PBLK_EXPOSED_PAGE_SIZE);
+ memcpy(dst_p + dst_bv.bv_offset,
+ src_p + src_bv.bv_offset,
+ PBLK_EXPOSED_PAGE_SIZE);

- kunmap_atomic(src_p);
- kunmap_atomic(dst_p);
+ kunmap_atomic(src_p);
+ kunmap_atomic(dst_p);

- mempool_free(src_bv.bv_page, &pblk->page_bio_pool);
+ flush_dcache_page(dst_bv.bv_page);
+ mempool_free(src_bv.bv_page, &pblk->page_bio_pool);

- hole = find_next_zero_bit(read_bitmap, nr_secs, hole + 1);
- } while (hole < nr_secs);
+ bio_advance_iter(new_bio, &new_iter,
+ PBLK_EXPOSED_PAGE_SIZE);
+ i++;
+ }
+ bio_advance_iter(bio, &orig_iter, PBLK_EXPOSED_PAGE_SIZE);
+ }

bio_put(new_bio);
kfree(pr_ctx);
--
2.17.1
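For readers unfamiliar with the iterator pattern the patch switches to, the fixed loop can be modeled in plain userspace C. This is a sketch only: SEC_SIZE stands in for PBLK_EXPOSED_PAGE_SIZE, fill_holes() is a hypothetical name rather than pblk API, and the real code maps pages with kmap_atomic() and advances a struct bvec_iter via bio_advance_iter() instead of using byte offsets:

```c
#include <assert.h>
#include <string.h>

#define SEC_SIZE 4096 /* stands in for PBLK_EXPOSED_PAGE_SIZE */

/* Model of the patched loop: walk the original buffer sector by
 * sector; whenever the read bitmap says a sector was NOT already
 * satisfied (bit clear, i.e. a "hole"), copy the next sector from
 * the side buffer and advance the side-buffer offset. The
 * original-buffer offset advances unconditionally, mirroring the
 * bio_advance_iter() call on the orig bio outside the if-branch. */
static void fill_holes(unsigned char *orig, const unsigned char *side,
		       const unsigned long *bitmap, int nr_secs)
{
	size_t orig_off = 0, side_off = 0;
	int bit;

	for (bit = 0; bit < nr_secs; bit++) {
		unsigned long word = bitmap[bit / (8 * sizeof(unsigned long))];

		if (!((word >> (bit % (8 * sizeof(unsigned long)))) & 1UL)) {
			memcpy(orig + orig_off, side + side_off, SEC_SIZE);
			side_off += SEC_SIZE; /* new_iter advances only on holes */
		}
		orig_off += SEC_SIZE; /* orig_iter advances every sector */
	}
}
```

The key point the crash fix makes is visible here: the side buffer is consumed densely (one sector per hole), while the original buffer is walked uniformly, so no direct bi_io_vec index arithmetic is needed.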



2019-03-14 13:52:05

by Igor Konopko

Subject: Re: [PATCH RFC] lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs

While reading this patch, an idea came to my mind - maybe it would
simply be better to get rid of the partial read handling in pblk in its
current form altogether and use a bio_split() & bio_chain() combination
instead?

This would definitely reduce the amount of this 'kind of complicated'
code inside pblk and let the block layer help us with that. Even though
the performance of such requests could be a little worse (a few smaller
IOs instead of a single vector IO), I believe that partial reads are
more of a corner case than a typical use case, so maybe this would be
the path to go.

Let me know what you think about such an approach - I can make a patch
with that if you want.

Igor
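As a rough illustration of what the suggestion above implies, the split boundaries a bio_split()/bio_chain() approach would have to compute can be sketched in plain C. find_runs() and struct run are hypothetical names, not pblk or block-layer API; the run-finding is the part pblk would still do itself, and real code would then carve each run off with bio_split() and link the children with bio_chain():

```c
#include <assert.h>

struct run { int start, len, cached; };

/* Partition nr_secs sectors into maximal runs of equal cached-ness,
 * as given by test_bit()-style lookups on the read bitmap. Each run
 * would become one child bio (cached runs served from the write
 * buffer, non-cached runs read from the device). Returns the number
 * of runs written to out. */
static int find_runs(const unsigned long *bitmap, int nr_secs,
		     struct run *out)
{
	int n = 0;
	int bit;

	for (bit = 0; bit < nr_secs; bit++) {
		unsigned long word = bitmap[bit / (8 * sizeof(unsigned long))];
		int cached = (word >> (bit % (8 * sizeof(unsigned long)))) & 1UL;

		if (n && out[n - 1].cached == cached) {
			out[n - 1].len++; /* extend the current run */
		} else {
			out[n].start = bit;
			out[n].len = 1;
			out[n].cached = cached;
			n++;
		}
	}
	return n;
}
```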

On 14.03.2019 12:16, [email protected] wrote:
> From: Hans Holmberg <[email protected]>
>
> Ever since '07173c3ec276 ("block: enable multipage bvecs")' we
> need to handle bios with multipage bvecs in pblk.
>
> Currently, a multipage bvec results in a crash[1].
> Fix this by using bvec iterators instead of direct bvec indexing.
>
> Also add a dcache flush, for the same reasons as in:
> '6e6e811d747b ("block: Add missing flush_dcache_page() call")'
>
> [1] https://github.com/OpenChannelSSD/linux/issues/61
>
> Reported-by: Klaus Jensen <[email protected]>
> Signed-off-by: Hans Holmberg <[email protected]>
>
> ---
>
> RFC to get more eyes on this - note that we're doing something
> very similar to bio_copy_data_iter.
>
> This works in my QEMU setup, and I'll start more regression testing now.
>
> drivers/lightnvm/pblk-read.c | 50 ++++++++++++++++++++----------------
> 1 file changed, 28 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
> index 1f9b319c9737..b8eb6bdb983b 100644
> --- a/drivers/lightnvm/pblk-read.c
> +++ b/drivers/lightnvm/pblk-read.c
> @@ -231,14 +231,14 @@ static void pblk_end_partial_read(struct nvm_rq *rqd)
> struct pblk_sec_meta *meta;
> struct bio *new_bio = rqd->bio;
> struct bio *bio = pr_ctx->orig_bio;
> - struct bio_vec src_bv, dst_bv;
> void *meta_list = rqd->meta_list;
> - int bio_init_idx = pr_ctx->bio_init_idx;
> unsigned long *read_bitmap = pr_ctx->bitmap;
> + struct bvec_iter orig_iter = BVEC_ITER_ALL_INIT;
> + struct bvec_iter new_iter = BVEC_ITER_ALL_INIT;
> int nr_secs = pr_ctx->orig_nr_secs;
> int nr_holes = nr_secs - bitmap_weight(read_bitmap, nr_secs);
> void *src_p, *dst_p;
> - int hole, i;
> + int bit, i;
>
> if (unlikely(nr_holes == 1)) {
> struct ppa_addr ppa;
> @@ -257,33 +257,39 @@ static void pblk_end_partial_read(struct nvm_rq *rqd)
>
> /* Fill the holes in the original bio */
> i = 0;
> - hole = find_first_zero_bit(read_bitmap, nr_secs);
> - do {
> - struct pblk_line *line;
> + for (bit = 0; bit < nr_secs; bit++) {
> + if (!test_bit(bit, read_bitmap)) {
> + struct bio_vec dst_bv, src_bv;
> + struct pblk_line *line;
>
> - line = pblk_ppa_to_line(pblk, rqd->ppa_list[i]);
> - kref_put(&line->ref, pblk_line_put);
> + line = pblk_ppa_to_line(pblk, rqd->ppa_list[i]);
> + kref_put(&line->ref, pblk_line_put);
>
> - meta = pblk_get_meta(pblk, meta_list, hole);
> - meta->lba = cpu_to_le64(pr_ctx->lba_list_media[i]);
> + meta = pblk_get_meta(pblk, meta_list, bit);
> + meta->lba = cpu_to_le64(pr_ctx->lba_list_media[i]);
>
> - src_bv = new_bio->bi_io_vec[i++];
> - dst_bv = bio->bi_io_vec[bio_init_idx + hole];
> + dst_bv = bio_iter_iovec(bio, orig_iter);
> + src_bv = bio_iter_iovec(new_bio, new_iter);
>
> - src_p = kmap_atomic(src_bv.bv_page);
> - dst_p = kmap_atomic(dst_bv.bv_page);
> + src_p = kmap_atomic(src_bv.bv_page);
> + dst_p = kmap_atomic(dst_bv.bv_page);
>
> - memcpy(dst_p + dst_bv.bv_offset,
> - src_p + src_bv.bv_offset,
> - PBLK_EXPOSED_PAGE_SIZE);
> + memcpy(dst_p + dst_bv.bv_offset,
> + src_p + src_bv.bv_offset,
> + PBLK_EXPOSED_PAGE_SIZE);
>
> - kunmap_atomic(src_p);
> - kunmap_atomic(dst_p);
> + kunmap_atomic(src_p);
> + kunmap_atomic(dst_p);
>
> - mempool_free(src_bv.bv_page, &pblk->page_bio_pool);
> + flush_dcache_page(dst_bv.bv_page);
> + mempool_free(src_bv.bv_page, &pblk->page_bio_pool);
>
> - hole = find_next_zero_bit(read_bitmap, nr_secs, hole + 1);
> - } while (hole < nr_secs);
> + bio_advance_iter(new_bio, &new_iter,
> + PBLK_EXPOSED_PAGE_SIZE);
> + i++;
> + }
> + bio_advance_iter(bio, &orig_iter, PBLK_EXPOSED_PAGE_SIZE);
> + }
>
> bio_put(new_bio);
> kfree(pr_ctx);
>

2019-03-14 14:18:29

by Javier González

Subject: Re: [PATCH RFC] lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs

> On 14 Mar 2019, at 06.49, Igor Konopko <[email protected]> wrote:
>
> While reading this patch, an idea came to my mind - maybe it would simply be better to get rid of the partial read handling in pblk in its current form altogether and use a bio_split() & bio_chain() combination instead?
>
> This would definitely reduce the amount of this 'kind of complicated' code inside pblk and let the block layer help us with that. Even though the performance of such requests could be a little worse (a few smaller IOs instead of a single vector IO), I believe that partial reads are more of a corner case than a typical use case, so maybe this would be the path to go.
>
> Let me know what you think about such an approach - I can make a patch with that if you want.

I agree with Igor.

As I mentioned offline, we should fix this in a way that survives
further changes in struct bio; either making pblk handling visible to
the outside or rethinking the whole thing.

Igor: If you can send the patch you mention, it would be great. I have
been trying the helper approach for some time, but it is too specific,
as fixing holes in the bvec breaks the bio advance-only semantics. Your
approach seems much better.

Thanks,
Javier


Attachments:
signature.asc (849.00 B)
Message signed with OpenPGP

2019-03-14 16:17:37

by Heiner Litz

Subject: Re: [PATCH RFC] lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs

On Thu, Mar 14, 2019 at 7:16 AM Javier González <[email protected]> wrote:
>
> > On 14 Mar 2019, at 06.49, Igor Konopko <[email protected]> wrote:
> >
> > While reading this patch, an idea came to my mind - maybe it would simply be better to get rid of the partial read handling in pblk in its current form altogether and use a bio_split() & bio_chain() combination instead?
> >
> > This would definitely reduce the amount of this 'kind of complicated' code inside pblk and let the block layer help us with that. Even though the performance of such requests could be a little worse (a few smaller IOs instead of a single vector IO), I believe that partial reads are more of a corner case than a typical use case, so maybe this would be the path to go.
> >
> > Let me know what you think about such an approach - I can make a patch with that if you want.

I like this idea as it will clean up a lot of code and get rid of the
read_bitmap.
Note that a single request can be turned into up to 32 requests if
cached and non-cached sectors alternate, each one probably requiring
an l2p lookup. I still think it's worth doing though.
When you split, note that you have to release all line_refs which were
acquired during l2p lookup.
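The worst case mentioned above can be sanity-checked with a toy calculation (plain C, not pblk code): counting maximal runs of clear bits in a 64-sector read bitmap, where each clear-bit run would become one device read after splitting. With strictly alternating cached and non-cached sectors, that is 32 reads:

```c
#include <assert.h>
#include <stdint.h>

/* Count maximal runs of clear bits in a 64-sector read bitmap:
 * each such run is one device read after splitting. */
static int count_device_reads(uint64_t bitmap)
{
	int runs = 0;
	int prev = 1; /* treat "before bit 0" as cached */
	int bit;

	for (bit = 0; bit < 64; bit++) {
		int cached = (bitmap >> bit) & 1;

		if (!cached && prev)
			runs++; /* a new non-cached run starts here */
		prev = cached;
	}
	return runs;
}
```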

>
> I agree with Igor.
>
> As I mentioned offline, we should fix this in a way that survives
> further changes in struct bio; either making pblk handling visible to
> the outside or rethinking the whole thing.
>
> Igor: If you can send a patch you mention, it would be great. I have
> been trying the helper approach for some time, but it is too specific,
> as fixing holes in the bvec breaks the bio advance-only semantics. Your
> approach seems much better.
>
> Thanks,
> Javier

2019-03-15 08:35:25

by Hans Holmberg

Subject: Re: [PATCH RFC] lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs

On Thu, Mar 14, 2019 at 5:15 PM Heiner Litz <[email protected]> wrote:
>
> On Thu, Mar 14, 2019 at 7:16 AM Javier González <[email protected]> wrote:
> >
> > > On 14 Mar 2019, at 06.49, Igor Konopko <[email protected]> wrote:
> > >
> > > While reading this patch, an idea came to my mind - maybe it would simply be better to get rid of the partial read handling in pblk in its current form altogether and use a bio_split() & bio_chain() combination instead?
> > >
> > > This would definitely reduce the amount of this 'kind of complicated' code inside pblk and let the block layer help us with that. Even though the performance of such requests could be a little worse (a few smaller IOs instead of a single vector IO), I believe that partial reads are more of a corner case than a typical use case, so maybe this would be the path to go.
> > >
> > > Let me know what you think about such an approach - I can make a patch with that if you want.
>
> I like this idea as it will clean up a lot of code and get rid of the
> read_bitmap.
> Note that a single request can be turned into up to 32 requests if
> cached and non-cached sectors alternate, each one probably requiring
> an l2p lookup. I still think it's worth doing though.
> When you split, note that you have to release all line_refs which were
> acquired during l2p lookup.
>
> >
> > I agree with Igor.
> >

I agree as well, this path deserves some loving care and cleanup.
Let's just try to avoid read tail latency degradation.

Simplifying the code is a great first step; then we can optimize if need be.

Thanks for the feedback y'all!


> > As I mentioned offline, we should fix this in a way that survives
> > further changes in struct bio; either making pblk handling visible to
> > the outside or rethinking the whole thing.
> >
> > Igor: If you can send a patch you mention, it would be great. I have
> > been trying the helper approach for some time, but it is too specific,
> > as fixing holes in the bvec breaks the bio advance-only semantics. Your
> > approach seems much better.
> >
> > Thanks,
> > Javier