Subject: Re: [PATCH V2] lightnvm: pblk: prevent stall due to wb threshold
From: Javier González
Date: Thu, 31 Jan 2019 16:33:06 +0000
To: Hans Holmberg
Cc: Matias Bjorling, Hans Holmberg, linux-block@vger.kernel.org, Linux Kernel Mailing List
References: <20190130102604.14496-1-javier@javigon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> On 31 Jan 2019, at 11.41, Hans Holmberg wrote:
>
> Hi Javier!
>
> How did you test this? I'm trying to add a test case to our testing framework.
>
> This is what I ran in qemu, and I got a hang (with this version of the patch)
>
> nvme lnvm create -d nvme0n1 -t pblk -n pblk0 -f -b 0 -e 0

I ran several low configurations without problems. Can you share the qemu configuration and version?
I’m traveling until Friday - I’ll get back to you over the weekend.

>
> kernel log: [  116.381799] pblk pblk0: luns:1, lines:280, secs:212736, buf entries:128
>
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=4k count=1
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000480941 s, 8.5 MB/s
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=64k count=1
> 1+0 records in
> 1+0 records out
> 65536 bytes (66 kB, 64 KiB) copied, 0.000477373 s, 137 MB/s
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=128k count=1
> 1+0 records in
> 1+0 records out
> 131072 bytes (131 kB, 128 KiB) copied, 0.000548722 s, 239 MB/s
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=256k count=1
> 1+0 records in
> 1+0 records out
> 262144 bytes (262 kB, 256 KiB) copied, 0.000718515 s, 365 MB/s
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=512k count=1
>
>
>
>> On Wed, Jan 30, 2019 at 11:28 AM Javier González wrote:
>>
>> In order to respect mw_cunits, pblk's write buffer maintains a
>> backpointer to protect data not yet persisted; when writing to the write
>> buffer, this backpointer defines a threshold that pblk's rate-limiter
>> enforces.
>>
>> On small PU configurations, the following scenarios might take place: (i)
>> the threshold is larger than the write buffer and (ii) the threshold is
>> smaller than the write buffer, but larger than the maximum allowed
>> split bio - 256KB at this moment (Note that writes are not always
>> split - we only do this when the size of the buffer is smaller
>> than the I/O). In both cases, pblk's rate-limiter prevents the I/O from
>> being written to the buffer, thus stalling.
>>
>> This patch fixes the original backpointer implementation by considering
>> the threshold both on buffer creation and on the rate-limiter's path,
>> when bio_split is triggered (case (ii) above).
>>
>> Fixes: 766c8ceb16fc ("lightnvm: pblk: guarantee that backpointer is respected on writer stall")
>> Signed-off-by: Javier González
>> ---
>>
>> Changes since V1:
>> - Fix bad arithmetic in the rate-limiter max_io calculation (from
>>   Hans)
>>
>> drivers/lightnvm/pblk-rb.c | 25 +++++++++++++++++++------
>> drivers/lightnvm/pblk-rl.c |  5 ++---
>> drivers/lightnvm/pblk.h    |  2 +-
>> 3 files changed, 22 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
>> index d4ca8c64ee0f..a6133b50ed9c 100644
>> --- a/drivers/lightnvm/pblk-rb.c
>> +++ b/drivers/lightnvm/pblk-rb.c
>> @@ -45,10 +45,23 @@ void pblk_rb_free(struct pblk_rb *rb)
>>  /*
>>   * pblk_rb_calculate_size -- calculate the size of the write buffer
>>   */
>> -static unsigned int pblk_rb_calculate_size(unsigned int nr_entries)
>> +static unsigned int pblk_rb_calculate_size(unsigned int nr_entries,
>> +					   unsigned int threshold)
>>  {
>> -	/* Alloc a write buffer that can at least fit 128 entries */
>> -	return (1 << max(get_count_order(nr_entries), 7));
>> +	unsigned int thr_sz = 1 << (get_count_order(threshold + NVM_MAX_VLBA));
>> +	unsigned int max_sz = max(thr_sz, nr_entries);
>> +	unsigned int max_io;
>> +
>> +	/* Alloc a write buffer that can (i) fit at least two split bios
>> +	 * (considering max I/O size NVM_MAX_VLBA, and (ii) guarantee that the
>> +	 * threshold will be respected
>> +	 */
>> +	max_io = (1 << max((int)(get_count_order(max_sz)),
>> +				(int)(get_count_order(NVM_MAX_VLBA << 1))));
>> +	if ((threshold + NVM_MAX_VLBA) >= max_io)
>> +		max_io <<= 1;
>> +
>> +	return max_io;
>>  }
>>
>>  /*
>> @@ -67,12 +80,12 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
>>  	unsigned int alloc_order, order, iter;
>>  	unsigned int nr_entries;
>>
>> -	nr_entries = pblk_rb_calculate_size(size);
>> +	nr_entries = pblk_rb_calculate_size(size, threshold);
>>  	entries = vzalloc(array_size(nr_entries, sizeof(struct pblk_rb_entry)));
>>  	if (!entries)
>>  		return -ENOMEM;
>>
>> -	power_size = get_count_order(size);
>> +	power_size = get_count_order(nr_entries);
>>  	power_seg_sz = get_count_order(seg_size);
>>
>>  	down_write(&pblk_rb_lock);
>> @@ -149,7 +162,7 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
>>  	 * Initialize rate-limiter, which controls access to the write buffer
>>  	 * by user and GC I/O
>>  	 */
>> -	pblk_rl_init(&pblk->rl, rb->nr_entries);
>> +	pblk_rl_init(&pblk->rl, rb->nr_entries, threshold);
>>
>>  	return 0;
>>  }
>> diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
>> index 76116d5f78e4..e9e0af0df165 100644
>> --- a/drivers/lightnvm/pblk-rl.c
>> +++ b/drivers/lightnvm/pblk-rl.c
>> @@ -207,7 +207,7 @@ void pblk_rl_free(struct pblk_rl *rl)
>>  	del_timer(&rl->u_timer);
>>  }
>>
>> -void pblk_rl_init(struct pblk_rl *rl, int budget)
>> +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
>>  {
>>  	struct pblk *pblk = container_of(rl, struct pblk, rl);
>>  	struct nvm_tgt_dev *dev = pblk->dev;
>> @@ -217,7 +217,6 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
>>  	int sec_meta, blk_meta;
>>  	unsigned int rb_windows;
>>
>> -
>>  	/* Consider sectors used for metadata */
>>  	sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
>>  	blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
>> @@ -234,7 +233,7 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
>>  	/* To start with, all buffer is available to user I/O writers */
>>  	rl->rb_budget = budget;
>>  	rl->rb_user_max = budget;
>> -	rl->rb_max_io = budget >> 1;
>> +	rl->rb_max_io = budget - threshold;
>>  	rl->rb_gc_max = 0;
>>  	rl->rb_state = PBLK_RL_HIGH;
>>
>> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
>> index 72ae8755764e..a6386d5acd73 100644
>> --- a/drivers/lightnvm/pblk.h
>> +++ b/drivers/lightnvm/pblk.h
>> @@ -924,7 +924,7 @@ int pblk_gc_sysfs_force(struct pblk *pblk, int force);
>>  /*
>>   * pblk rate limiter
>>   */
>> -void pblk_rl_init(struct pblk_rl *rl, int budget);
>> +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold);
>>  void pblk_rl_free(struct pblk_rl *rl);
>>  void pblk_rl_update_rates(struct pblk_rl *rl);
>>  int pblk_rl_high_thrs(struct pblk_rl *rl);
>> --
>> 2.17.1
>>