Date: Tue, 14 Aug 2018 19:56:12 -0700
From: Jaegeuk Kim
To: Chao Yu
Cc: linux-f2fs-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org, chao@kernel.org
Subject: Re: [PATCH 2/2] f2fs: tune discard speed with storage usage rate
Message-ID: <20180815025612.GC84720@jaegeuk-macbookpro.roam.corp.google.com>
References: <20180810100806.9298-1-yuchao0@huawei.com>
 <20180810100806.9298-2-yuchao0@huawei.com>
 <20180814041906.GC52730@jaegeuk-macbookpro.roam.corp.google.com>
 <57d9b6ea-68a5-4736-0b34-74db539d8959@huawei.com>
 <20180814172313.GC56510@jaegeuk-macbookpro.roam.corp.google.com>
 <20180815023326.GB84720@jaegeuk-macbookpro.roam.corp.google.com>
 <17aa1e09-6bfd-eb67-0f28-e13d8bbb17ff@huawei.com>
In-Reply-To: <17aa1e09-6bfd-eb67-0f28-e13d8bbb17ff@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/15, Chao Yu wrote:
> On 2018/8/15 10:33, Jaegeuk Kim wrote:
> > On 08/15, Chao Yu wrote:
> >> On 2018/8/15 1:23, Jaegeuk Kim wrote:
> >>> On 08/14, Chao Yu wrote:
> >>>> On 2018/8/14 12:19, Jaegeuk Kim wrote:
> >>>>> On 08/10, Chao Yu wrote:
> >>>>>> Previously, discard speed was mostly fixed, and on a device with a
> >>>>>> high usage rate we speed up issuing discards, but it doesn't make
> >>>>>> sense that in a non-full filesystem we still issue discards slowly.
> >>>>>
> >>>>> Could you please elaborate on the problem in more detail? The speed
> >>>>> depends on how many candidates?
> >>>>
> >>>> undiscard blocks are all 4k granularity.
> >>>> a) utility: filesystem: 20% + undiscard blocks: 20% = flash storage: 40%
> >>>> b) utility: filesystem: 40% + undiscard blocks: 25% = flash storage: 65%
> >>>> c) utility: filesystem: 60% + undiscard blocks: 30% = flash storage: 90%
> >>>>
> >>>>
> >>>> 1. for case c), we need to speed up issuing discard based on utilization of
> >>>> "filesystem + undiscard" instead of just utilization of filesystem.
> >>>>
> >>>> -		if (utilization(sbi) > DEF_DISCARD_URGENT_UTIL) {
> >>>> -			dpolicy->granularity = 1;
> >>>> -			dpolicy->max_interval = DEF_MIN_DISCARD_ISSUE_TIME;
> >>>> -		}
> >>>>
> >>>> 2. If free space in storage reaches its threshold, performance becomes very
> >>>> sensitive. In low-end storage with high space usage, even if free space is
> >>>> reduced by only 1%, performance will decrease a lot.
> >>>
> >>> So, we may need to distinguish low-end vs. high-end storage. In the high-end
> >>> case, it'd be better to avoid IO contention, while a low-end device wants to
> >>> get as many discard commands as possible. So, how about adding an option for
> >>> this as a tunable point?
> >>
> >> Agreed, how about adding a sysfs entry discard_tunning:
> >> 1: enabled, use 4k granularity, self-adapted speed based on real device free space.
> >> 0: disabled, use dcc->discard_granularity, fixed speed.
> >>
> >> By default: enabled
> >>
> >> What do you think?
> >
> > I don't think this is proper with a sysfs entry, since we already know the
>
> You mean by storage capacity? <= 32GB means low-end?

Yes, that's the current condition for judging it. If there is any other method,
it'd be better to change to it.

>
> Thanks,
>
> > device type when mounting the partition. We won't need to change the policy
> > on the fly. And I still don't get why we should change the default.
> >
> >>
> >> Thanks,
> >>
> >>>
> >>>>
> >>>> IMO, in the above cases, we'd better issue discards at high speed for c),
> >>>> medium speed for b), and low speed for a).
> >>>>
> >>>> What do you think?
> >>>>
> >>>> Thanks,
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>>
> >>>>>> Anyway, it turns out that undiscarded blocks make FTL GC less efficient
> >>>>>> and cause high lifetime overhead.
> >>>>>>
> >>>>>> Let's tune discard speed as below:
> >>>>>>
> >>>>>> a. adjust default issue interval:
> >>>>>>                 original        after
> >>>>>> min_interval:   50ms            100ms
> >>>>>> mid_interval:   500ms           1000ms
> >>>>>> max_interval:   60000ms         10000ms
> >>>>>>
> >>>>>> b. if we stopped issuing discards last time because user IO interrupted us,
> >>>>>> let's reset all {min,mid,max}_interval to their defaults.
> >>>>>>
> >>>>>> c. tune {min,mid,max}_interval with the calculation method below:
> >>>>>>
> >>>>>> base_interval = default_interval / 10;
> >>>>>> total_interval = default_interval - base_interval;
> >>>>>> interval = base_interval + total_interval * (100 - dev_util) / 100;
> >>>>>>
> >>>>>> For example:
> >>>>>> min_interval (:100ms)
> >>>>>> dev_util (%)    interval (ms)
> >>>>>> 0               100
> >>>>>> 10              91
> >>>>>> 20              82
> >>>>>> 30              73
> >>>>>> ...
> >>>>>> 80              28
> >>>>>> 90              19
> >>>>>> 100             10
> >>>>>>
> >>>>>> Signed-off-by: Chao Yu
> >>>>>> ---
> >>>>>>  fs/f2fs/f2fs.h    | 11 ++++----
> >>>>>>  fs/f2fs/segment.c | 64 +++++++++++++++++++++++++++++++++++++----------
> >>>>>>  fs/f2fs/segment.h |  9 +++++++
> >>>>>>  fs/f2fs/super.c   |  2 +-
> >>>>>>  4 files changed, 67 insertions(+), 19 deletions(-)
> >>>>>>
> >>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>>>>> index 273ffdaf4891..a1dd2e1c3cb9 100644
> >>>>>> --- a/fs/f2fs/f2fs.h
> >>>>>> +++ b/fs/f2fs/f2fs.h
> >>>>>> @@ -185,10 +185,9 @@ enum {
> >>>>>>
> >>>>>>  #define MAX_DISCARD_BLOCKS(sbi)		BLKS_PER_SEC(sbi)
> >>>>>>  #define DEF_MAX_DISCARD_REQUEST		8	/* issue 8 discards per round */
> >>>>>> -#define DEF_MIN_DISCARD_ISSUE_TIME	50	/* 50 ms, if exists */
> >>>>>> -#define DEF_MID_DISCARD_ISSUE_TIME	500	/* 500 ms, if device busy */
> >>>>>> -#define DEF_MAX_DISCARD_ISSUE_TIME	60000	/* 60 s, if no candidates */
> >>>>>> -#define DEF_DISCARD_URGENT_UTIL		80	/* do more discard over 80% */
> >>>>>> +#define DEF_MIN_DISCARD_ISSUE_TIME	100	/* 100 ms, if exists */
> >>>>>> +#define DEF_MID_DISCARD_ISSUE_TIME	1000	/* 1000 ms, if device busy */
> >>>>>> +#define DEF_MAX_DISCARD_ISSUE_TIME	10000	/* 10000 ms, if no candidates */
> >>>>>>  #define DEF_CP_INTERVAL			60	/* 60 secs */
> >>>>>>  #define DEF_IDLE_INTERVAL		5	/* 5 secs */
> >>>>>>
> >>>>>> @@ -248,7 +247,8 @@ struct discard_entry {
> >>>>>>  };
> >>>>>>
> >>>>>>  /* default discard granularity of inner discard thread, unit: block count */
> >>>>>> -#define DEFAULT_DISCARD_GRANULARITY		1
> >>>>>> +#define MID_DISCARD_GRANULARITY			16
> >>>>>> +#define MIN_DISCARD_GRANULARITY			1
> >>>>>>
> >>>>>>  /* max discard pend list number */
> >>>>>>  #define MAX_PLIST_NUM		512
> >>>>>> @@ -330,6 +330,7 @@ struct discard_cmd_control {
> >>>>>>  	atomic_t discard_cmd_cnt;		/* # of cached cmd count */
> >>>>>>  	struct rb_root root;			/* root of discard rb-tree */
> >>>>>>  	bool rbtree_check;			/* config for consistence check */
> >>>>>> +	bool io_interrupted;			/* last state of io interrupted */
> >>>>>>  };
> >>>>>>
> >>>>>>  /* for the list of fsync inodes, used only during recovery */
> >>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >>>>>> index 8b52e8dfb12f..9564aaf1f27b 100644
> >>>>>> --- a/fs/f2fs/segment.c
> >>>>>> +++ b/fs/f2fs/segment.c
> >>>>>> @@ -968,6 +968,44 @@ static void __check_sit_bitmap(struct f2fs_sb_info *sbi,
> >>>>>>  #endif
> >>>>>>  }
> >>>>>>
> >>>>>> +static void __adjust_discard_speed(unsigned int *interval,
> >>>>>> +				unsigned int def_interval, int dev_util)
> >>>>>> +{
> >>>>>> +	unsigned int base_interval, total_interval;
> >>>>>> +
> >>>>>> +	base_interval = def_interval / 10;
> >>>>>> +	total_interval = def_interval - base_interval;
> >>>>>> +
> >>>>>> +	/*
> >>>>>> +	 * if def_interval = 100, adjusted interval should be in range of
> >>>>>> +	 * [10, 100].
> >>>>>> +	 */
> >>>>>> +	*interval = base_interval + total_interval * (100 - dev_util) / 100;
> >>>>>> +}
> >>>>>> +
> >>>>>> +static void __tune_discard_policy(struct f2fs_sb_info *sbi,
> >>>>>> +					struct discard_policy *dpolicy)
> >>>>>> +{
> >>>>>> +	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> >>>>>> +	int dev_util;
> >>>>>> +
> >>>>>> +	if (dcc->io_interrupted) {
> >>>>>> +		dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
> >>>>>> +		dpolicy->mid_interval = DEF_MID_DISCARD_ISSUE_TIME;
> >>>>>> +		dpolicy->max_interval = DEF_MAX_DISCARD_ISSUE_TIME;
> >>>>>> +		return;
> >>>>>> +	}
> >>>>>> +
> >>>>>> +	dev_util = dev_utilization(sbi);
> >>>>>> +
> >>>>>> +	__adjust_discard_speed(&dpolicy->min_interval,
> >>>>>> +				DEF_MIN_DISCARD_ISSUE_TIME, dev_util);
> >>>>>> +	__adjust_discard_speed(&dpolicy->mid_interval,
> >>>>>> +				DEF_MID_DISCARD_ISSUE_TIME, dev_util);
> >>>>>> +	__adjust_discard_speed(&dpolicy->max_interval,
> >>>>>> +				DEF_MAX_DISCARD_ISSUE_TIME, dev_util);
> >>>>>> +}
> >>>>>> +
> >>>>>>  static void __init_discard_policy(struct f2fs_sb_info *sbi,
> >>>>>>  				struct discard_policy *dpolicy,
> >>>>>>  				int discard_type, unsigned int granularity)
> >>>>>> @@ -982,20 +1020,11 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi,
> >>>>>>  	dpolicy->io_aware_gran = MAX_PLIST_NUM;
> >>>>>>
> >>>>>>  	if (discard_type == DPOLICY_BG) {
> >>>>>> -		dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
> >>>>>> -		dpolicy->mid_interval = DEF_MID_DISCARD_ISSUE_TIME;
> >>>>>> -		dpolicy->max_interval = DEF_MAX_DISCARD_ISSUE_TIME;
> >>>>>>  		dpolicy->io_aware = true;
> >>>>>>  		dpolicy->sync = false;
> >>>>>>  		dpolicy->ordered = true;
> >>>>>> -		if (utilization(sbi) > DEF_DISCARD_URGENT_UTIL) {
> >>>>>> -			dpolicy->granularity = 1;
> >>>>>> -			dpolicy->max_interval = DEF_MIN_DISCARD_ISSUE_TIME;
> >>>>>> -		}
> >>>>>> +		__tune_discard_policy(sbi, dpolicy);
> >>>>>>  	} else if (discard_type == DPOLICY_FORCE) {
> >>>>>> -		dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
> >>>>>> -		dpolicy->mid_interval = DEF_MID_DISCARD_ISSUE_TIME;
> >>>>>> -		dpolicy->max_interval = DEF_MAX_DISCARD_ISSUE_TIME;
> >>>>>>  		dpolicy->io_aware = false;
> >>>>>>  	} else if (discard_type == DPOLICY_FSTRIM) {
> >>>>>>  		dpolicy->io_aware = false;
> >>>>>> @@ -1353,6 +1382,8 @@ static unsigned int __issue_discard_cmd_orderly(struct f2fs_sb_info *sbi,
> >>>>>>  	if (!issued && io_interrupted)
> >>>>>>  		issued = -1;
> >>>>>>
> >>>>>> +	dcc->io_interrupted = io_interrupted;
> >>>>>> +
> >>>>>>  	return issued;
> >>>>>>  }
> >>>>>>
> >>>>>> @@ -1370,7 +1401,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> >>>>>>  		if (i + 1 < dpolicy->granularity)
> >>>>>>  			break;
> >>>>>>
> >>>>>> -		if (i < DEFAULT_DISCARD_GRANULARITY && dpolicy->ordered)
> >>>>>> +		if (i < MID_DISCARD_GRANULARITY && dpolicy->ordered)
> >>>>>>  			return __issue_discard_cmd_orderly(sbi, dpolicy);
> >>>>>>
> >>>>>>  		pend_list = &dcc->pend_list[i];
> >>>>>> @@ -1407,6 +1438,8 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> >>>>>>  	if (!issued && io_interrupted)
> >>>>>>  		issued = -1;
> >>>>>>
> >>>>>> +	dcc->io_interrupted = io_interrupted;
> >>>>>> +
> >>>>>>  	return issued;
> >>>>>>  }
> >>>>>>
> >>>>>> @@ -1576,7 +1609,11 @@ static int issue_discard_thread(void *data)
> >>>>>>  	struct f2fs_sb_info *sbi = data;
> >>>>>>  	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> >>>>>>  	wait_queue_head_t *q = &dcc->discard_wait_queue;
> >>>>>> -	struct discard_policy dpolicy;
> >>>>>> +	struct discard_policy dpolicy = {
> >>>>>> +		.min_interval = DEF_MIN_DISCARD_ISSUE_TIME,
> >>>>>> +		.mid_interval = DEF_MID_DISCARD_ISSUE_TIME,
> >>>>>> +		.max_interval = DEF_MAX_DISCARD_ISSUE_TIME,
> >>>>>> +	};
> >>>>>>  	unsigned int wait_ms = DEF_MIN_DISCARD_ISSUE_TIME;
> >>>>>>  	int issued;
> >>>>>>
> >>>>>> @@ -1929,7 +1966,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
> >>>>>>  	if (!dcc)
> >>>>>>  		return -ENOMEM;
> >>>>>>
> >>>>>> -	dcc->discard_granularity = DEFAULT_DISCARD_GRANULARITY;
> >>>>>> +	dcc->discard_granularity = MIN_DISCARD_GRANULARITY;
> >>>>>>  	INIT_LIST_HEAD(&dcc->entry_list);
> >>>>>>  	for (i = 0; i < MAX_PLIST_NUM; i++)
> >>>>>>  		INIT_LIST_HEAD(&dcc->pend_list[i]);
> >>>>>> @@ -1945,6 +1982,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
> >>>>>>  	dcc->next_pos = 0;
> >>>>>>  	dcc->root = RB_ROOT;
> >>>>>>  	dcc->rbtree_check = false;
> >>>>>> +	dcc->io_interrupted = false;
> >>>>>>
> >>>>>>  	init_waitqueue_head(&dcc->discard_wait_queue);
> >>>>>>  	SM_I(sbi)->dcc_info = dcc;
> >>>>>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> >>>>>> index 422b0ceb1eaa..63b4da72cd34 100644
> >>>>>> --- a/fs/f2fs/segment.h
> >>>>>> +++ b/fs/f2fs/segment.h
> >>>>>> @@ -616,6 +616,15 @@ static inline int utilization(struct f2fs_sb_info *sbi)
> >>>>>>  					sbi->user_block_count);
> >>>>>>  }
> >>>>>>
> >>>>>> +static inline int dev_utilization(struct f2fs_sb_info *sbi)
> >>>>>> +{
> >>>>>> +	unsigned int dev_blks;
> >>>>>> +
> >>>>>> +	dev_blks = valid_user_blocks(sbi) + SM_I(sbi)->dcc_info->undiscard_blks;
> >>>>>> +	return div_u64((u64)dev_blks * 100,
> >>>>>> +			MAIN_SEGS(sbi) << sbi->log_blocks_per_seg);
> >>>>>> +}
> >>>>>> +
> >>>>>>  /*
> >>>>>>   * Sometimes f2fs may be better to drop out-of-place update policy.
> >>>>>>   * And, users can control the policy through sysfs entries.
> >>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >>>>>> index b055f2ea77c5..55ed76daad23 100644
> >>>>>> --- a/fs/f2fs/super.c
> >>>>>> +++ b/fs/f2fs/super.c
> >>>>>> @@ -2862,7 +2862,7 @@ static void f2fs_tuning_parameters(struct f2fs_sb_info *sbi)
> >>>>>>  	/* adjust parameters according to the volume size */
> >>>>>>  	if (sm_i->main_segments <= SMALL_VOLUME_SEGMENTS) {
> >>>>>>  		F2FS_OPTION(sbi).alloc_mode = ALLOC_MODE_REUSE;
> >>>>>> -		sm_i->dcc_info->discard_granularity = 1;
> >>>>>> +		sm_i->dcc_info->discard_granularity = MIN_DISCARD_GRANULARITY;
> >>>>>>  		sm_i->ipu_policy = 1 << F2FS_IPU_FORCE;
> >>>>>>  	}
> >>>>>>
> >>>>>> --
> >>>>>> 2.18.0.rc1
> >>>>>
> >>>>> .
> >>>>>
> >>>
> >>> .
> >>>
> >
> > .
> >
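
For reference, the interval arithmetic from the quoted patch description can be
checked in user space. This is a minimal sketch, not the kernel code:
scaled_interval() mirrors the arithmetic of __adjust_discard_speed() but returns
the value instead of writing through a pointer, and dev_util_percent() is a
hypothetical stand-in for dev_utilization() that takes plain block counts rather
than a struct f2fs_sb_info. The assert() range checks are an addition of this
sketch and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Default intervals proposed in the quoted patch, in milliseconds. */
#define DEF_MIN_DISCARD_ISSUE_TIME	100
#define DEF_MID_DISCARD_ISSUE_TIME	1000
#define DEF_MAX_DISCARD_ISSUE_TIME	10000

/*
 * Hypothetical stand-in for dev_utilization(): percentage of blocks that
 * are either valid or still waiting to be discarded.
 */
static unsigned int dev_util_percent(uint64_t valid_blks,
				     uint64_t undiscard_blks,
				     uint64_t total_blks)
{
	assert(total_blks > 0);
	return (unsigned int)((valid_blks + undiscard_blks) * 100 / total_blks);
}

/*
 * Same arithmetic as __adjust_discard_speed(): scale linearly from
 * def_interval (device empty) down to def_interval / 10 (device full).
 */
static unsigned int scaled_interval(unsigned int def_interval,
				    unsigned int dev_util)
{
	unsigned int base_interval = def_interval / 10;
	unsigned int total_interval = def_interval - base_interval;

	/* Sketch-only guard: (100 - dev_util) must not wrap around. */
	assert(dev_util <= 100);
	return base_interval + total_interval * (100 - dev_util) / 100;
}
```

With these definitions, scaled_interval(DEF_MIN_DISCARD_ISSUE_TIME, 80) yields
28, in line with the table in the patch description, and
dev_util_percent(60, 30, 100) yields 90 for a device with 60% valid and 30%
undiscarded blocks.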