Received: by 10.223.176.5 with SMTP id f5csp546739wra; Tue, 6 Feb 2018 03:30:40 -0800 (PST) X-Google-Smtp-Source: AH8x225KdAqk8APQeYSySivG1Hh5Uv9LC35DbLZa6jpqGzt3dgqD14evnlnU7E9zDiGOAaIcYvep X-Received: by 10.98.163.15 with SMTP id s15mr2129163pfe.67.1517916640351; Tue, 06 Feb 2018 03:30:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1517916640; cv=none; d=google.com; s=arc-20160816; b=VH6rSphbShoNvUgso65lsEryg8FxUTJuZDYZkJrLTg4ExW+SXNfpVzjj2tmRnktuSU ITt5WHweU9t3FaHdrVQfgLLqwvdki0hEMgH8T3NHMU9SUt79hyk8Asp8jzBjz96Ciy2c kJSM2t8FUzbHjciRQp1mJuUBLkbp/Gd94VFIj+4WfrNG2fATzD+T7i0z7/rP/BP94Cnf EObjPWNwzypiU7CIgAgzwdaeL3HTp3ezY0cdnvIY9xOcsxdjNh9QwRc18u7A2V1gL3ig NpOJVsoMmtP3Ld0OVaSHOxrv8BFMHCfKHtsBiKOUhT8IjWjM9O9eoPf6nWt8wm88GqBV IAFA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:references:in-reply-to:mime-version :dkim-signature:arc-authentication-results; bh=KZkji49wCAygECLgvGpRu0x9NAE+Dc8gDtunoSTxrLA=; b=yt0zUojO47WdBPfJHIom05aS5Z3x+ZuCw2+7BfF2/wUAZ2cJ/UoKengRjewpWKcoDN smW5nvvqkjDC13dQeo3hEauKkP4BtN9oKOVfem0jeuiFrqbF+88h2ldKukltN9XNICi5 QnoZ+2yhE9p4AkfgrYmsCrrW/ofewTsB9WRraXJTew1Bc0qmwH+p0lQrJkZkYTTcnI0B 0TX9bgcYt+PuJ48Z0oL+fxsycrNwHYOgxlJoGq+bHngzFyCflic5kHHBQMawzBx2rMeV pdObVwsRj+cVmO2do9kfnVKFKUyT9W2DWkva+mHicP1NdRjzq5l/L9De7Zkb/uTHUc3a wJUA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@owltronix-com.20150623.gappssmtp.com header.s=20150623 header.b=ADbLjvVi; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u12si649659pfl.24.2018.02.06.03.30.25; Tue, 06 Feb 2018 03:30:40 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@owltronix-com.20150623.gappssmtp.com header.s=20150623 header.b=ADbLjvVi; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752691AbeBFL2v (ORCPT + 99 others); Tue, 6 Feb 2018 06:28:51 -0500 Received: from mail-ua0-f194.google.com ([209.85.217.194]:43985 "EHLO mail-ua0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752278AbeBFL2p (ORCPT ); Tue, 6 Feb 2018 06:28:45 -0500 Received: by mail-ua0-f194.google.com with SMTP id i5so934767uai.10 for ; Tue, 06 Feb 2018 03:28:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=owltronix-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=KZkji49wCAygECLgvGpRu0x9NAE+Dc8gDtunoSTxrLA=; b=ADbLjvVidnGAHCZz3AOZRv0WBvu4hxqwzh0gTiCmvFfcd510g5UdGLXzsJJAh6jHsi RgF3k2MfpOimxUPP1JWa+s2i1CSvWtzwhWX62S6cguSlDOyLJoaiSady+ZaWeOze6VwI QcxItkAl0p7owyJL8wVq2Tw9rTF6XoLQpDqIhdPxHJeKsdwGWe/hgBZfqqa/Jc9bODoZ 62BdsrlsKYjBHyakzgy5Y+VDyxoMj1Owm7PGcNKSnTeCpaZgyKo3ht7NdW4fyiOiPl2b 9HrC9UyUsi8gYm8qaydOBPIS8gkp81MdCxtP8CtXXnfDlLGuAUtLqQ76VWrvxoIBwHzr TpLg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=KZkji49wCAygECLgvGpRu0x9NAE+Dc8gDtunoSTxrLA=; b=Mp25m3plxF0rKXHrzg0gGVUZxyGcMn9vmf5iqc/dRHlwrcJN35DEtuKCnhLJfKteCM b18jMFN+eIBAYJYcInnMCzgjDdy7piMSReKwcdlPhdZP9ZWrU8qtgFBrBMVnjQ1cHuzn HXXCFXuI8j8tpo5BsK1KmIHC38+TdeXjNxWCbvCWs6701JjGkcx/K9AlouoA7yyHZeXq QLAaF7YV+p1w8QxddBr6PhxdqaBjhClmciTbILv7t8KTV4AoGXusbmkSfBUYZUrNWoa6 qQRoo+pvrgsJJf30AKSeP3zBOPjpG5rObh2uMyOMD06ZzAmrRIlrid9oqozFBzp0P8bi m8+w== X-Gm-Message-State: APf1xPBAVWQeP5nBrxYlCTfQfSk0j//wQn1X8ZKyB+2itFiHLe4c3Igt S5YMYwdz3cC8vMXyjaq6Np34qcuvHOc11ASZ2i/MTmhSq44= X-Received: by 10.159.58.106 with SMTP id r42mr2058459uag.4.1517916524767; Tue, 06 Feb 2018 03:28:44 -0800 (PST) MIME-Version: 1.0 Received: by 10.103.130.7 with HTTP; Tue, 6 Feb 2018 03:28:44 -0800 (PST) In-Reply-To: <3aaf482e-0f4f-9001-8df1-1ccd2004ce02@lightnvm.io> References: <1517364411-22386-1-git-send-email-javier@cnexlabs.com> <1517364411-22386-4-git-send-email-javier@cnexlabs.com> <3aaf482e-0f4f-9001-8df1-1ccd2004ce02@lightnvm.io> From: Hans Holmberg Date: Tue, 6 Feb 2018 12:28:44 +0100 Message-ID: Subject: Re: [PATCH 4/5] lightnvm: pblk: add padding distribution sysfs attribute To: =?UTF-8?Q?Matias_Bj=C3=B8rling?= Cc: =?UTF-8?Q?Javier_Gonz=C3=A1lez?= , linux-block@vger.kernel.org, Linux Kernel Mailing List , Hans Holmberg , =?UTF-8?Q?Javier_Gonz=C3=A1lez?= Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Feb 6, 2018 at 11:50 AM, Matias Bj=C3=B8rling wrot= e: > On 02/06/2018 10:27 AM, Hans Holmberg wrote: >> >> Hi Matias, >> >> On Wed, Jan 31, 2018 at 9:44 AM, Matias Bj=C3=B8rling w= rote: >>> >>> On 01/31/2018 03:06 AM, Javier Gonz=C3=A1lez wrote: >>>> >>>> >>>> From: Hans Holmberg >>>> >>>> When pblk receives a sync, all data up to that point in the write buff= er >>>> must be comitted to persistent storage, and as flash memory comes with= a >>>> minimal write size there is a significant cost involved both in terms >>>> of time for completing the sync and in terms of write amplification >>>> padded sectors for filling up to the minimal write size. >>>> >>>> In order to get a better understanding of the costs involved for syncs= , >>>> Add a sysfs attribute to pblk: padded_dist, showing a normalized >>>> distribution of sectors padded. In order to facilitate measurements of >>>> specific workloads during the lifetime of the pblk instance, the >>>> distribution can be reset by writing 0 to the attribute. >>>> >>>> Do this by introducing counters for each possible padding: >>>> {0..(minimal write size - 1)} and calculate the normalized distributio= n >>>> when showing the attribute. >>>> >>>> Signed-off-by: Hans Holmberg >>>> Signed-off-by: Javier Gonz=C3=A1lez >>>> --- >>>> drivers/lightnvm/pblk-init.c | 16 ++++++++-- >>>> drivers/lightnvm/pblk-rb.c | 15 +++++----- >>>> drivers/lightnvm/pblk-sysfs.c | 68 >>>> +++++++++++++++++++++++++++++++++++++++++++ >>>> drivers/lightnvm/pblk.h | 6 +++- >>>> 4 files changed, 95 insertions(+), 10 deletions(-) >>>> >>>> diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init= .c >>>> index 7eedc5daebc8..a5e3510c0f38 100644 >>>> --- a/drivers/lightnvm/pblk-init.c >>>> +++ b/drivers/lightnvm/pblk-init.c >>>> @@ -921,6 +921,7 @@ static void pblk_free(struct pblk *pblk) >>>> { >>>> pblk_luns_free(pblk); >>>> pblk_lines_free(pblk); >>>> + kfree(pblk->pad_dist); >>>> pblk_line_meta_free(pblk); >>>> pblk_core_free(pblk); >>>> pblk_l2p_free(pblk); >>>> @@ -998,11 +999,13 @@ static void *pblk_init(struct nvm_tgt_dev *dev, >>>> struct gendisk *tdisk, >>>> pblk->pad_rst_wa =3D 0; >>>> pblk->gc_rst_wa =3D 0; >>>> + atomic_long_set(&pblk->nr_flush, 0); >>>> + pblk->nr_flush_rst =3D 0; >>>> + >>>> #ifdef CONFIG_NVM_DEBUG >>>> atomic_long_set(&pblk->inflight_writes, 0); >>>> atomic_long_set(&pblk->padded_writes, 0); >>>> atomic_long_set(&pblk->padded_wb, 0); >>>> - atomic_long_set(&pblk->nr_flush, 0); >>>> atomic_long_set(&pblk->req_writes, 0); >>>> atomic_long_set(&pblk->sub_writes, 0); >>>> atomic_long_set(&pblk->sync_writes, 0); >>>> @@ -1034,10 +1037,17 @@ static void *pblk_init(struct nvm_tgt_dev *dev= , >>>> struct gendisk *tdisk, >>>> goto fail_free_luns; >>>> } >>>> + pblk->pad_dist =3D kzalloc((pblk->min_write_pgs - 1) * >>>> sizeof(atomic64_t), >>>> + GFP_KERNEL); >>>> + if (!pblk->pad_dist) { >>>> + ret =3D -ENOMEM; >>>> + goto fail_free_line_meta; >>>> + } >>>> + >>>> ret =3D pblk_core_init(pblk); >>>> if (ret) { >>>> pr_err("pblk: could not initialize core\n"); >>>> - goto fail_free_line_meta; >>>> + goto fail_free_pad_dist; >>>> } >>>> ret =3D pblk_l2p_init(pblk); >>>> @@ -1097,6 +1107,8 @@ static void *pblk_init(struct nvm_tgt_dev *dev, >>>> struct gendisk *tdisk, >>>> pblk_l2p_free(pblk); >>>> fail_free_core: >>>> pblk_core_free(pblk); >>>> +fail_free_pad_dist: >>>> + kfree(pblk->pad_dist); >>>> fail_free_line_meta: >>>> pblk_line_meta_free(pblk); >>>> fail_free_luns: >>>> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c >>>> index 7044b5599cc4..264372d7b537 100644 >>>> --- a/drivers/lightnvm/pblk-rb.c >>>> +++ b/drivers/lightnvm/pblk-rb.c >>>> @@ -437,9 +437,7 @@ static int pblk_rb_may_write_flush(struct pblk_rb >>>> *rb, >>>> unsigned int nr_entries, >>>> if (bio->bi_opf & REQ_PREFLUSH) { >>>> struct pblk *pblk =3D container_of(rb, struct pblk, r= wb); >>>> -#ifdef CONFIG_NVM_DEBUG >>>> atomic_long_inc(&pblk->nr_flush); >>>> -#endif >>>> if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem)) >>>> *io_ret =3D NVM_IO_OK; >>>> } >>>> @@ -620,14 +618,17 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb >>>> *rb, >>>> struct nvm_rq *rqd, >>>> pr_err("pblk: could not pad page in write >>>> bio\n"); >>>> return NVM_IO_ERR; >>>> } >>>> + >>>> + if (pad < pblk->min_write_pgs) >>>> + atomic64_inc(&pblk->pad_dist[pad - 1]); >>>> + else >>>> + pr_warn("pblk: padding more than min. >>>> sectors\n"); >>> >>> >>> >>> Curious, when would this happen? Would it be an implementation error or >>> something a user did wrong? >> >> >> This would be an implementation error - and this check is just a >> safeguard to make sure we won't go out of bounds when updating the >> statistics. >> > > Ah, does it make sense to have such defensive programming, when the value= is > never persisted? An 64 bit integer is quite large, and if only used for > workloads for a single instance, it would practically never trigger, and = if > it did, do one care? Ah, sorry for being unclear, it's not for checking if the pad counters overflow - I wanted to make sure that we won't index pad_dist with negative numbers if we mess up the padding calculations in the future. > > > >>> >>> Looks good to me. Let me know if you want to send a new patch, or if I >>> may >>> make the change when I pick it up. >> >> >> Thanks for the review, i saw that you already reworked the patch a bit >> on your core branch. >> However, i found a build issue when building for i386, so i'll fix >> that and submit a V2(that includes your update) > > > Ok. I assume this is the kbuild mail I asked about a couple of days ago. > I'll wait for v2. I did see that particular mail, but it's most likely the same issue that the V2 will fix. Thanks, Hans