2019-02-05 06:52:03

by Javier González

Subject: [PATCH V3] lightnvm: pblk: prevent stall due to wb threshold

In order to respect mw_cunits, pblk's write buffer maintains a
backpointer to protect data not yet persisted; when writing to the write
buffer, this backpointer defines a threshold that pblk's rate-limiter
enforces.

On small PU configurations, the following scenarios might take place: (i)
the threshold is larger than the write buffer and (ii) the threshold is
smaller than the write buffer, but larger than the maximum allowed
split bio - 256KB at this moment (Note that writes are not always
split - we only do this when the size of the buffer is smaller
than the write). In both cases, pblk's rate-limiter prevents the I/O from
being written to the buffer, thus stalling.
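
To make the two scenarios concrete, here is a minimal user-space sketch
(not pblk code). It assumes a simplified admission rule in which a write of
nr entries needs nr + threshold free entries in the buffer, and it assumes
4KB entries so that a 256KB split bio is 64 entries. The function name and
the sample numbers are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, not the pblk implementation: a write of `nr`
 * entries is assumed to need `nr + threshold` free entries. Even a
 * completely empty buffer offers at most `buffer_entries` free entries,
 * so a false return means the write can never be admitted (a stall).
 */
static bool may_ever_fit(unsigned int buffer_entries, unsigned int threshold,
			 unsigned int nr)
{
	assert(buffer_entries > 0);

	return nr + threshold <= buffer_entries;
}
```

Under these assumptions, with a 128-entry buffer, may_ever_fit(128, 192, 1)
is false (case (i): the threshold alone exceeds the buffer), and
may_ever_fit(128, 96, 64) is also false (case (ii): a 64-entry split bio
plus the threshold no longer fits). may_ever_fit(128, 32, 64) is true.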

This patch fixes the original backpointer implementation by considering
the threshold both on buffer creation and on the rate-limiter's path,
when bio_split is triggered (case (ii) above).

Fixes: 766c8ceb16fc ("lightnvm: pblk: guarantee that backpointer is respected on writer stall")
Signed-off-by: Javier González <[email protected]>
---

Changes since V1:
- Fix bad arithmetic in the rate-limiter max_io calculation (from
Hans)
Changes since V2:
- Address case where mw_cunits = 0 in the new math
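
For reference, the new sizing math can be tried out in user space. The
sketch below mirrors pblk_rb_calculate_size() and the rb_max_io assignment
from this patch. count_order() is a stand-in for the kernel's
get_count_order(), the function names are illustrative, and the value 64
for NVM_MAX_VLBA is an assumption.

```c
#include <assert.h>

#define MAX_VLBA 64u	/* assumed value of NVM_MAX_VLBA */

/* Stand-in for get_count_order(): smallest order with (1 << order) >= count */
static int count_order(unsigned int count)
{
	int order = 0;

	assert(count > 0 && count <= (1u << 31));
	while ((1u << order) < count)
		order++;

	return order;
}

/* Mirrors the buffer sizing in the patch: returns the number of entries */
static unsigned int rb_calculate_size(unsigned int nr_entries,
				      unsigned int threshold)
{
	unsigned int thr_sz = 1u << count_order(threshold + MAX_VLBA);
	unsigned int max_sz = thr_sz > nr_entries ? thr_sz : nr_entries;
	int order = count_order(max_sz);
	int min_order = count_order(MAX_VLBA << 1);	/* two split bios */
	unsigned int size = 1u << (order > min_order ? order : min_order);

	/* Grow once more if threshold plus one max I/O would fill the buffer */
	if (threshold + MAX_VLBA >= size)
		size <<= 1;

	return size;
}

/* Mirrors the rb_max_io assignment, including the threshold == 0 case */
static unsigned int rl_max_io(unsigned int budget, unsigned int threshold)
{
	return threshold ? budget - threshold : budget - 1;
}
```

For example, rb_calculate_size(100, 0) gives 128 and rl_max_io(128, 0)
gives 127 (the mw_cunits = 0 case), while rb_calculate_size(64, 192) gives
512 and rl_max_io(512, 192) gives 320.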

drivers/lightnvm/pblk-rb.c | 25 +++++++++++++++++++------
drivers/lightnvm/pblk-rl.c | 5 ++---
drivers/lightnvm/pblk.h | 2 +-
3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
index d4ca8c64ee0f..a6133b50ed9c 100644
--- a/drivers/lightnvm/pblk-rb.c
+++ b/drivers/lightnvm/pblk-rb.c
@@ -45,10 +45,23 @@ void pblk_rb_free(struct pblk_rb *rb)
/*
* pblk_rb_calculate_size -- calculate the size of the write buffer
*/
-static unsigned int pblk_rb_calculate_size(unsigned int nr_entries)
+static unsigned int pblk_rb_calculate_size(unsigned int nr_entries,
+ unsigned int threshold)
{
- /* Alloc a write buffer that can at least fit 128 entries */
- return (1 << max(get_count_order(nr_entries), 7));
+ unsigned int thr_sz = 1 << (get_count_order(threshold + NVM_MAX_VLBA));
+ unsigned int max_sz = max(thr_sz, nr_entries);
+ unsigned int max_io;
+
+ /* Alloc a write buffer that can (i) fit at least two split bios
+ * (considering max I/O size NVM_MAX_VLBA, and (ii) guarantee that the
+ * threshold will be respected
+ */
+ max_io = (1 << max((int)(get_count_order(max_sz)),
+ (int)(get_count_order(NVM_MAX_VLBA << 1))));
+ if ((threshold + NVM_MAX_VLBA) >= max_io)
+ max_io <<= 1;
+
+ return max_io;
}

/*
@@ -67,12 +80,12 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
unsigned int alloc_order, order, iter;
unsigned int nr_entries;

- nr_entries = pblk_rb_calculate_size(size);
+ nr_entries = pblk_rb_calculate_size(size, threshold);
entries = vzalloc(array_size(nr_entries, sizeof(struct pblk_rb_entry)));
if (!entries)
return -ENOMEM;

- power_size = get_count_order(size);
+ power_size = get_count_order(nr_entries);
power_seg_sz = get_count_order(seg_size);

down_write(&pblk_rb_lock);
@@ -149,7 +162,7 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
* Initialize rate-limiter, which controls access to the write buffer
* by user and GC I/O
*/
- pblk_rl_init(&pblk->rl, rb->nr_entries);
+ pblk_rl_init(&pblk->rl, rb->nr_entries, threshold);

return 0;
}
diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
index 76116d5f78e4..b014957dde0b 100644
--- a/drivers/lightnvm/pblk-rl.c
+++ b/drivers/lightnvm/pblk-rl.c
@@ -207,7 +207,7 @@ void pblk_rl_free(struct pblk_rl *rl)
del_timer(&rl->u_timer);
}

-void pblk_rl_init(struct pblk_rl *rl, int budget)
+void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
{
struct pblk *pblk = container_of(rl, struct pblk, rl);
struct nvm_tgt_dev *dev = pblk->dev;
@@ -217,7 +217,6 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
int sec_meta, blk_meta;
unsigned int rb_windows;

-
/* Consider sectors used for metadata */
sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
@@ -234,7 +233,7 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
/* To start with, all buffer is available to user I/O writers */
rl->rb_budget = budget;
rl->rb_user_max = budget;
- rl->rb_max_io = budget >> 1;
+ rl->rb_max_io = threshold ? (budget - threshold) : (budget - 1);
rl->rb_gc_max = 0;
rl->rb_state = PBLK_RL_HIGH;

diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 72ae8755764e..a6386d5acd73 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -924,7 +924,7 @@ int pblk_gc_sysfs_force(struct pblk *pblk, int force);
/*
* pblk rate limiter
*/
-void pblk_rl_init(struct pblk_rl *rl, int budget);
+void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold);
void pblk_rl_free(struct pblk_rl *rl);
void pblk_rl_update_rates(struct pblk_rl *rl);
int pblk_rl_high_thrs(struct pblk_rl *rl);
--
2.17.1



2019-02-05 09:13:44

by Matias Bjørling

Subject: Re: [PATCH V3] lightnvm: pblk: prevent stall due to wb threshold

Thanks. Applied for 5.1.