Subject: Re: [PATCH 2/2] f2fs: tune discard speed with storage usage rate
To: Jaegeuk Kim
References: <20180810100806.9298-1-yuchao0@huawei.com>
 <20180810100806.9298-2-yuchao0@huawei.com>
 <20180814041906.GC52730@jaegeuk-macbookpro.roam.corp.google.com>
 <57d9b6ea-68a5-4736-0b34-74db539d8959@huawei.com>
 <20180814172313.GC56510@jaegeuk-macbookpro.roam.corp.google.com>
 <20180815023326.GB84720@jaegeuk-macbookpro.roam.corp.google.com>
 <17aa1e09-6bfd-eb67-0f28-e13d8bbb17ff@huawei.com>
 <20180815025612.GC84720@jaegeuk-macbookpro.roam.corp.google.com>
From: Chao Yu
Message-ID: <03eec946-0cdb-7767-b5d9-9830072a3e71@huawei.com>
Date: Wed, 15 Aug 2018 11:01:25 +0800
In-Reply-To:
 <20180815025612.GC84720@jaegeuk-macbookpro.roam.corp.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018/8/15 10:56, Jaegeuk Kim wrote:
> On 08/15, Chao Yu wrote:
>> On 2018/8/15 10:33, Jaegeuk Kim wrote:
>>> On 08/15, Chao Yu wrote:
>>>> On 2018/8/15 1:23, Jaegeuk Kim wrote:
>>>>> On 08/14, Chao Yu wrote:
>>>>>> On 2018/8/14 12:19, Jaegeuk Kim wrote:
>>>>>>> On 08/10, Chao Yu wrote:
>>>>>>>> Previously, discard speed was fixed mostly, and in high usage rate
>>>>>>>> device, we will speed up issuing discard, but it doesn't make sense
>>>>>>>> that in a non-full filesystem, we still issue discard with slow speed.
>>>>>>>
>>>>>>> Could you please elaborate the problem in more detail? The speed depends
>>>>>>> on how many candidates?
>>>>>>
>>>>>> undiscard blocks are all 4k granularity.
>>>>>> a) utility: filesystem: 20% + undiscard blocks: 20% = flash storage: 40%
>>>>>> b) utility: filesystem: 40% + undiscard blocks: 25% = flash storage: 65%
>>>>>> c) utility: filesystem: 60% + undiscard blocks: 30% = flash storage: 100%
>>>>>>
>>>>>>
>>>>>> 1. for case c), we need to speed up issuing discard based on utilization of
>>>>>> "filesystem + undiscard" instead of just utilization of filesystem.
>>>>>>
>>>>>> -		if (utilization(sbi) > DEF_DISCARD_URGENT_UTIL) {
>>>>>> -			dpolicy->granularity = 1;
>>>>>> -			dpolicy->max_interval = DEF_MIN_DISCARD_ISSUE_TIME;
>>>>>> -		}
>>>>>>
>>>>>> 2. If free space in storage touches therein threshold, performance will be very
>>>>>> sensitive. In low-end storage, with high usage in space, even free space is
>>>>>> reduced by 1%, performance will decrease a lot.
>>>>>
>>>>> So, we may need to distinguish low-end vs. high-end storage. In high-end case,
>>>>> it'd be better to avoid IO contention, while low-end device wants to get more
>>>>> discard commands as much as possible. So, how about adding an option for this
>>>>> as a tunable point?
>>>>
>>>> Agreed, how about adding a sysfs entry discard_tunning:
>>>> 1: enabled, use 4k granularity, self-adapted speed based on real device free space.
>>>> 0: disabled, use dcc->discard_granularity, fixed speed.
>>>>
>>>> By default: enabled
>>>>
>>>> How do you think?
>>>
>>> I don't think this is proper with a sysfs entry, since we already know the
>>
>> You mean by storage capacity? <= 32GB means low-end?
>
> Yes, that's current condition to judge it. If there is any other method, it'd be

That would be hard code...

Still I have not got any other method to do the judgment except capacity.

Thanks,

> better to change it.
>
>>
>> Thanks,
>>
>>> device type when mounting the partition. We won't require to change the policy
>>> on the fly. And, I still don't get to change the default.
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> IMO, in above cases, we'd better to issue discard with high speed for c), middle
>>>>>> speed for b), and low speed for a).
>>>>>>
>>>>>> How do you think?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>>>
>>>>>>>> Anyway, it comes out undiscarded block makes FTL GC be lower efficient
>>>>>>>> and causing high lifetime overhead.
>>>>>>>>
>>>>>>>> Let's tune discard speed as below:
>>>>>>>>
>>>>>>>> a. adjust default issue interval:
>>>>>>>>                   original    after
>>>>>>>> min_interval:     50ms        100ms
>>>>>>>> mid_interval:     500ms       1000ms
>>>>>>>> max_interval:     60000ms     10000ms
>>>>>>>>
>>>>>>>> b. if last time we stop issuing discard due to IO interruption of user,
>>>>>>>> let's reset all {min,mid,max}_interval to default one.
>>>>>>>>
>>>>>>>> c. tune {min,mid,max}_interval with below calculation method:
>>>>>>>>
>>>>>>>> base_interval = default_interval / 10;
>>>>>>>> total_interval = default_interval - base_interval;
>>>>>>>> interval = base_interval + total_interval * (100 - dev_util) / 100;
>>>>>>>>
>>>>>>>> For example:
>>>>>>>> min_interval (:100ms)
>>>>>>>> dev_util (%)      interval (ms)
>>>>>>>> 0                 100
>>>>>>>> 10                91
>>>>>>>> 20                82
>>>>>>>> 30                73
>>>>>>>> ...
>>>>>>>> 80                28
>>>>>>>> 90                19
>>>>>>>> 100               10
>>>>>>>>
>>>>>>>> Signed-off-by: Chao Yu
>>>>>>>> ---
>>>>>>>>  fs/f2fs/f2fs.h    | 11 ++++----
>>>>>>>>  fs/f2fs/segment.c | 64 +++++++++++++++++++++++++++++++++++++----------
>>>>>>>>  fs/f2fs/segment.h |  9 +++++++
>>>>>>>>  fs/f2fs/super.c   |  2 +-
>>>>>>>>  4 files changed, 67 insertions(+), 19 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>>>>> index 273ffdaf4891..a1dd2e1c3cb9 100644
>>>>>>>> --- a/fs/f2fs/f2fs.h
>>>>>>>> +++ b/fs/f2fs/f2fs.h
>>>>>>>> @@ -185,10 +185,9 @@ enum {
>>>>>>>>
>>>>>>>>  #define MAX_DISCARD_BLOCKS(sbi)		BLKS_PER_SEC(sbi)
>>>>>>>>  #define DEF_MAX_DISCARD_REQUEST		8	/* issue 8 discards per round */
>>>>>>>> -#define DEF_MIN_DISCARD_ISSUE_TIME	50	/* 50 ms, if exists */
>>>>>>>> -#define DEF_MID_DISCARD_ISSUE_TIME	500	/* 500 ms, if device busy */
>>>>>>>> -#define DEF_MAX_DISCARD_ISSUE_TIME	60000	/* 60 s, if no candidates */
>>>>>>>> -#define DEF_DISCARD_URGENT_UTIL		80	/* do more discard over 80% */
>>>>>>>> +#define DEF_MIN_DISCARD_ISSUE_TIME	100	/* 100 ms, if exists */
>>>>>>>> +#define DEF_MID_DISCARD_ISSUE_TIME	1000	/* 1000 ms, if device busy */
>>>>>>>> +#define DEF_MAX_DISCARD_ISSUE_TIME	10000	/* 10000 ms, if no candidates */
>>>>>>>>  #define DEF_CP_INTERVAL			60	/* 60 secs */
>>>>>>>>  #define DEF_IDLE_INTERVAL		5	/* 5 secs */
>>>>>>>>
>>>>>>>> @@ -248,7 +247,8 @@ struct discard_entry {
>>>>>>>>  };
>>>>>>>>
>>>>>>>>  /* default discard granularity of inner discard thread, unit: block count */
>>>>>>>> -#define DEFAULT_DISCARD_GRANULARITY		1
>>>>>>>> +#define MID_DISCARD_GRANULARITY			16
>>>>>>>> +#define MIN_DISCARD_GRANULARITY			1
>>>>>>>>
>>>>>>>>  /* max discard pend list number */
>>>>>>>>  #define MAX_PLIST_NUM		512
>>>>>>>> @@ -330,6 +330,7 @@ struct discard_cmd_control {
>>>>>>>>  	atomic_t discard_cmd_cnt;		/* # of cached cmd count */
>>>>>>>>  	struct rb_root root;			/* root of discard rb-tree */
>>>>>>>>  	bool rbtree_check;			/* config for consistence check */
>>>>>>>> +	bool io_interrupted;			/* last state of io interrupted */
>>>>>>>>  };
>>>>>>>>
>>>>>>>>  /* for the list of fsync inodes, used only during recovery */
>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>>>>>>>> index 8b52e8dfb12f..9564aaf1f27b 100644
>>>>>>>> --- a/fs/f2fs/segment.c
>>>>>>>> +++ b/fs/f2fs/segment.c
>>>>>>>> @@ -968,6 +968,44 @@ static void __check_sit_bitmap(struct f2fs_sb_info *sbi,
>>>>>>>>  #endif
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +static void __adjust_discard_speed(unsigned int *interval,
>>>>>>>> +				unsigned int def_interval, int dev_util)
>>>>>>>> +{
>>>>>>>> +	unsigned int base_interval, total_interval;
>>>>>>>> +
>>>>>>>> +	base_interval = def_interval / 10;
>>>>>>>> +	total_interval = def_interval - base_interval;
>>>>>>>> +
>>>>>>>> +	/*
>>>>>>>> +	 * if def_interval = 100, adjusted interval should be in range of
>>>>>>>> +	 * [10, 100].
>>>>>>>> +	 */
>>>>>>>> +	*interval = base_interval + total_interval * (100 - dev_util) / 100;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static void __tune_discard_policy(struct f2fs_sb_info *sbi,
>>>>>>>> +					struct discard_policy *dpolicy)
>>>>>>>> +{
>>>>>>>> +	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
>>>>>>>> +	int dev_util;
>>>>>>>> +
>>>>>>>> +	if (dcc->io_interrupted) {
>>>>>>>> +		dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
>>>>>>>> +		dpolicy->mid_interval = DEF_MID_DISCARD_ISSUE_TIME;
>>>>>>>> +		dpolicy->max_interval = DEF_MAX_DISCARD_ISSUE_TIME;
>>>>>>>> +		return;
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>> +	dev_util = dev_utilization(sbi);
>>>>>>>> +
>>>>>>>> +	__adjust_discard_speed(&dpolicy->min_interval,
>>>>>>>> +				DEF_MIN_DISCARD_ISSUE_TIME, dev_util);
>>>>>>>> +	__adjust_discard_speed(&dpolicy->mid_interval,
>>>>>>>> +				DEF_MID_DISCARD_ISSUE_TIME, dev_util);
>>>>>>>> +	__adjust_discard_speed(&dpolicy->max_interval,
>>>>>>>> +				DEF_MAX_DISCARD_ISSUE_TIME, dev_util);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  static void __init_discard_policy(struct f2fs_sb_info *sbi,
>>>>>>>>  				struct discard_policy *dpolicy,
>>>>>>>>  				int discard_type, unsigned int granularity)
>>>>>>>> @@ -982,20 +1020,11 @@ static void __init_discard_policy(struct f2fs_sb_info *sbi,
>>>>>>>>  	dpolicy->io_aware_gran = MAX_PLIST_NUM;
>>>>>>>>
>>>>>>>>  	if (discard_type == DPOLICY_BG) {
>>>>>>>> -		dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
>>>>>>>> -		dpolicy->mid_interval = DEF_MID_DISCARD_ISSUE_TIME;
>>>>>>>> -		dpolicy->max_interval = DEF_MAX_DISCARD_ISSUE_TIME;
>>>>>>>>  		dpolicy->io_aware = true;
>>>>>>>>  		dpolicy->sync = false;
>>>>>>>>  		dpolicy->ordered = true;
>>>>>>>> -		if (utilization(sbi) > DEF_DISCARD_URGENT_UTIL) {
>>>>>>>> -			dpolicy->granularity = 1;
>>>>>>>> -			dpolicy->max_interval = DEF_MIN_DISCARD_ISSUE_TIME;
>>>>>>>> -		}
>>>>>>>> +		__tune_discard_policy(sbi, dpolicy);
>>>>>>>>  	} else if (discard_type == DPOLICY_FORCE) {
>>>>>>>> -		dpolicy->min_interval = DEF_MIN_DISCARD_ISSUE_TIME;
>>>>>>>> -		dpolicy->mid_interval = DEF_MID_DISCARD_ISSUE_TIME;
>>>>>>>> -		dpolicy->max_interval = DEF_MAX_DISCARD_ISSUE_TIME;
>>>>>>>>  		dpolicy->io_aware = false;
>>>>>>>>  	} else if (discard_type == DPOLICY_FSTRIM) {
>>>>>>>>  		dpolicy->io_aware = false;
>>>>>>>> @@ -1353,6 +1382,8 @@ static unsigned int __issue_discard_cmd_orderly(struct f2fs_sb_info *sbi,
>>>>>>>>  	if (!issued && io_interrupted)
>>>>>>>>  		issued = -1;
>>>>>>>>
>>>>>>>> +	dcc->io_interrupted = io_interrupted;
>>>>>>>> +
>>>>>>>>  	return issued;
>>>>>>>>  }
>>>>>>>>
>>>>>>>> @@ -1370,7 +1401,7 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>>>>>>>>  		if (i + 1 < dpolicy->granularity)
>>>>>>>>  			break;
>>>>>>>>
>>>>>>>> -		if (i < DEFAULT_DISCARD_GRANULARITY && dpolicy->ordered)
>>>>>>>> +		if (i < MID_DISCARD_GRANULARITY && dpolicy->ordered)
>>>>>>>>  			return __issue_discard_cmd_orderly(sbi, dpolicy);
>>>>>>>>
>>>>>>>>  		pend_list = &dcc->pend_list[i];
>>>>>>>> @@ -1407,6 +1438,8 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>>>>>>>>  	if (!issued && io_interrupted)
>>>>>>>>  		issued = -1;
>>>>>>>>
>>>>>>>> +	dcc->io_interrupted = io_interrupted;
>>>>>>>> +
>>>>>>>>  	return issued;
>>>>>>>>  }
>>>>>>>>
>>>>>>>> @@ -1576,7 +1609,11 @@ static int issue_discard_thread(void *data)
>>>>>>>>  	struct f2fs_sb_info *sbi = data;
>>>>>>>>  	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
>>>>>>>>  	wait_queue_head_t *q = &dcc->discard_wait_queue;
>>>>>>>> -	struct discard_policy dpolicy;
>>>>>>>> +	struct discard_policy dpolicy = {
>>>>>>>> +		.min_interval = DEF_MIN_DISCARD_ISSUE_TIME,
>>>>>>>> +		.mid_interval = DEF_MID_DISCARD_ISSUE_TIME,
>>>>>>>> +		.max_interval = DEF_MAX_DISCARD_ISSUE_TIME,
>>>>>>>> +	};
>>>>>>>>  	unsigned int wait_ms = DEF_MIN_DISCARD_ISSUE_TIME;
>>>>>>>>  	int issued;
>>>>>>>>
>>>>>>>> @@ -1929,7 +1966,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
>>>>>>>>  	if (!dcc)
>>>>>>>>  		return -ENOMEM;
>>>>>>>>
>>>>>>>> -	dcc->discard_granularity = DEFAULT_DISCARD_GRANULARITY;
>>>>>>>> +	dcc->discard_granularity = MIN_DISCARD_GRANULARITY;
>>>>>>>>  	INIT_LIST_HEAD(&dcc->entry_list);
>>>>>>>>  	for (i = 0; i < MAX_PLIST_NUM; i++)
>>>>>>>>  		INIT_LIST_HEAD(&dcc->pend_list[i]);
>>>>>>>> @@ -1945,6 +1982,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
>>>>>>>>  	dcc->next_pos = 0;
>>>>>>>>  	dcc->root = RB_ROOT;
>>>>>>>>  	dcc->rbtree_check = false;
>>>>>>>> +	dcc->io_interrupted = false;
>>>>>>>>
>>>>>>>>  	init_waitqueue_head(&dcc->discard_wait_queue);
>>>>>>>>  	SM_I(sbi)->dcc_info = dcc;
>>>>>>>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
>>>>>>>> index 422b0ceb1eaa..63b4da72cd34 100644
>>>>>>>> --- a/fs/f2fs/segment.h
>>>>>>>> +++ b/fs/f2fs/segment.h
>>>>>>>> @@ -616,6 +616,15 @@ static inline int utilization(struct f2fs_sb_info *sbi)
>>>>>>>>  					sbi->user_block_count);
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +static inline int dev_utilization(struct f2fs_sb_info *sbi)
>>>>>>>> +{
>>>>>>>> +	unsigned int dev_blks;
>>>>>>>> +
>>>>>>>> +	dev_blks = valid_user_blocks(sbi) + SM_I(sbi)->dcc_info->undiscard_blks;
>>>>>>>> +	return div_u64((u64)dev_blks * 100,
>>>>>>>> +			MAIN_SEGS(sbi) << sbi->log_blocks_per_seg);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  /*
>>>>>>>>   * Sometimes f2fs may be better to drop out-of-place update policy.
>>>>>>>>   * And, users can control the policy through sysfs entries.
>>>>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>>>>> index b055f2ea77c5..55ed76daad23 100644
>>>>>>>> --- a/fs/f2fs/super.c
>>>>>>>> +++ b/fs/f2fs/super.c
>>>>>>>> @@ -2862,7 +2862,7 @@ static void f2fs_tuning_parameters(struct f2fs_sb_info *sbi)
>>>>>>>>  	/* adjust parameters according to the volume size */
>>>>>>>>  	if (sm_i->main_segments <= SMALL_VOLUME_SEGMENTS) {
>>>>>>>>  		F2FS_OPTION(sbi).alloc_mode = ALLOC_MODE_REUSE;
>>>>>>>> -		sm_i->dcc_info->discard_granularity = 1;
>>>>>>>> +		sm_i->dcc_info->discard_granularity = MIN_DISCARD_GRANULARITY;
>>>>>>>>  		sm_i->ipu_policy = 1 << F2FS_IPU_FORCE;
>>>>>>>>  	}
>>>>>>>>
>>>>>>>> --
>>>>>>>> 2.18.0.rc1
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>
>
> .
>
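
For reference, the interval scaling from item c. of the quoted commit message can be tried outside the kernel. The following is only a userspace sketch, not the patch itself: adjust_discard_interval() and dev_utilization_pct() are hypothetical names, the block counts are passed in directly instead of being read from struct f2fs_sb_info, and the clamp to 100% is an assumption of this sketch that the patch does not have.

```c
#include <assert.h>
#include <stdint.h>

/* Default intervals proposed in the quoted patch, in milliseconds. */
#define DEF_MIN_DISCARD_ISSUE_TIME	100
#define DEF_MID_DISCARD_ISSUE_TIME	1000
#define DEF_MAX_DISCARD_ISSUE_TIME	10000

/*
 * Hypothetical userspace version of the scaling rule in the patch's
 * __adjust_discard_speed(): the interval shrinks linearly from
 * def_interval (device empty) down to def_interval / 10 (device full).
 * Integer division truncates, as it does in the kernel code.
 */
static unsigned int adjust_discard_interval(unsigned int def_interval,
					    int dev_util)
{
	unsigned int base_interval = def_interval / 10;
	unsigned int total_interval = def_interval - base_interval;

	/* The formula is only meaningful for a percentage in [0, 100]. */
	assert(dev_util >= 0 && dev_util <= 100);

	return base_interval +
		total_interval * (unsigned int)(100 - dev_util) / 100;
}

/*
 * Hypothetical stand-in for the patch's dev_utilization(): percentage of
 * main-area blocks that are either valid or still waiting to be discarded.
 * main_blks must be non-zero. Clamping to 100 is an assumption made here
 * so that the result is always a valid input for the function above.
 */
static int dev_utilization_pct(uint64_t valid_blks, uint64_t undiscard_blks,
			       uint64_t main_blks)
{
	uint64_t pct = (valid_blks + undiscard_blks) * 100 / main_blks;

	return pct > 100 ? 100 : (int)pct;
}
```

With def_interval = 100 this gives the same values as the table in the commit message, for example adjust_discard_interval(100, 10) is 91 and adjust_discard_interval(100, 90) is 19. Case b) from earlier in the thread (40% filesystem + 25% undiscard blocks) gives dev_utilization_pct(40, 25, 100) = 65.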