Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp1507960ybb; Thu, 26 Mar 2020 02:02:45 -0700 (PDT) X-Google-Smtp-Source: ADFU+vvxEqn+modLVsqUXcNGviz8CJ3NKKUn99kOY7wTtOLwq8c1kTw1yUlzzcW49UZvEaCn6jy8 X-Received: by 2002:a54:4197:: with SMTP id 23mr551792oiy.87.1585213364877; Thu, 26 Mar 2020 02:02:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585213364; cv=none; d=google.com; s=arc-20160816; b=i8dNzR8Wsw/JWH3Y/DXKT6xGR0SutdiOPODz4DRgxMcur6hp/eW06kpco+XTLUfuIE IXk0WtkrMy8M+IOBvBgk8tH0wH+TWAOlGT+MEOdJllzLamHMKhlkfxD1p0/vGI+gsVUj 60k4yzUlbpEUnDk+nhWQqRAp/kR/xIfYa4nKowbKbQCbmzGO7uGN+Dn6N6oaicvWRL7V uLkvCtBAifppJGYTvIwOiZk8RHhfgf34OizTt8d+az6uBOF833+IloYky5Kom0o9tGBW m2LJzwtkmonZOG14P+mnT3ngZh3qn5yK20nFPJoUIjlggcl9gGLLM7VlEo4L/1/J0NWd Xntg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=rW++Q13cCwPehxgXk4yGkTZd2Ls4mIQqH1FDjBI0sNw=; b=T+abk+/qqvi7JpSS4t7p5P3BkPpj0HqA/NnLKNqvovMtGF5dpa8m0KOEC8fMSKMxwa sBkP9sOcOBMeQWOM46StUaA1SDFAv8REfYuf6j4BynQK3nM7mWEiwFMhsqDOIsfilfbc 9xI3CvULHec68uvYIrfZphoi7SamtcTwFCDni2bO0Kk5/kTF5J/ZdFoVFOzoveC7UXGr JhYreShxy3Ldae4FX5LEOTxy1N2cLvfgPV5QRE2q/Nc0CHHPhDAnBo7H2bwTC8Ds/ni7 yLfD9q8AFxFYTg17CyPrijtCOhZxjKjDrjhp99Tm2OzrtiAUirqL72jmt8j6zXQAtQpj dS6A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v17si752145otq.250.2020.03.26.02.01.38; Thu, 26 Mar 2020 02:02:44 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727690AbgCZJAe (ORCPT + 99 others); Thu, 26 Mar 2020 05:00:34 -0400 Received: from szxga07-in.huawei.com ([45.249.212.35]:36592 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726210AbgCZJAd (ORCPT ); Thu, 26 Mar 2020 05:00:33 -0400 Received: from DGGEMS406-HUB.china.huawei.com (unknown [172.30.72.59]) by Forcepoint Email with ESMTP id 901C1AF1DC8E9FC703B3; Thu, 26 Mar 2020 17:00:21 +0800 (CST) Received: from [10.134.22.195] (10.134.22.195) by smtp.huawei.com (10.3.19.206) with Microsoft SMTP Server (TLS) id 14.3.487.0; Thu, 26 Mar 2020 17:00:18 +0800 Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount To: Sahitya Tummala , Jaegeuk Kim , CC: References: <1584506689-5041-1-git-send-email-stummala@codeaurora.org> From: Chao Yu Message-ID: <29d4adc4-482d-3d92-1470-3405989ea231@huawei.com> Date: Thu, 26 Mar 2020 17:00:18 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <1584506689-5041-1-git-send-email-stummala@codeaurora.org> Content-Type: text/plain; charset="windows-1252" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.134.22.195] X-CFilter-Loop: Reflected Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Sahitya, On 2020/3/18 12:44, Sahitya Tummala wrote: > F2FS already has a default timeout of 5 secs for discards that > can be issued during umount, but it can take more than the 5 sec > timeout if the underlying UFS device queue is already full and there > are no more available free tags to be used. In that case, submit_bio() > will wait for the already queued discard requests to complete to get > a free tag, which can potentially take way more than 5 sec. > > Fix this by submitting the discard requests with REQ_NOWAIT > flags during umount. This will return -EAGAIN for UFS queue/tag full > scenario without waiting in the context of submit_bio(). The FS can > then handle these requests by retrying again within the stipulated > discard timeout period to avoid long latencies. > > Signed-off-by: Sahitya Tummala > --- > v2: > - Handle the case where a dc can have multiple bios associated with it > > fs/f2fs/f2fs.h | 1 + > fs/f2fs/segment.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++------- > 2 files changed, 74 insertions(+), 10 deletions(-) > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h > index 12a197e..67b8dcc 100644 > --- a/fs/f2fs/f2fs.h > +++ b/fs/f2fs/f2fs.h > @@ -340,6 +340,7 @@ struct discard_cmd_control { > struct list_head pend_list[MAX_PLIST_NUM];/* store pending entries */ > struct list_head wait_list; /* store on-flushing entries */ > struct list_head fstrim_list; /* in-flight discard from fstrim */ > + struct list_head retry_list; /* list of cmds to retry */ > wait_queue_head_t discard_wait_queue; /* waiting queue for wake-up */ > unsigned int discard_wake; /* to wake up discard thread */ > struct mutex cmd_lock; > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > index fb3e531..4162c76 100644 > --- a/fs/f2fs/segment.c > +++ b/fs/f2fs/segment.c > @@ -1029,13 +1029,16 @@ static void f2fs_submit_discard_endio(struct bio *bio) > struct discard_cmd *dc = (struct discard_cmd *)bio->bi_private; > unsigned long flags; > > - dc->error = blk_status_to_errno(bio->bi_status); > - > spin_lock_irqsave(&dc->lock, flags); > + if (!dc->error) > + dc->error = blk_status_to_errno(bio->bi_status); > + > dc->bio_ref--; > - if (!dc->bio_ref && dc->state == D_SUBMIT) { > - dc->state = D_DONE; > - complete_all(&dc->wait); > + if (!dc->bio_ref) { > + if (dc->error || dc->state == D_SUBMIT) { > + dc->state = D_DONE; > + complete_all(&dc->wait); > + } > } > spin_unlock_irqrestore(&dc->lock, flags); > bio_put(bio); > @@ -1124,10 +1127,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi, > struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; > struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ? > &(dcc->fstrim_list) : &(dcc->wait_list); > - int flag = dpolicy->sync ? REQ_SYNC : 0; > - block_t lstart, start, len, total_len; > + int flag; > + block_t lstart, start, len, total_len, orig_len; > int err = 0; > > + flag = dpolicy->sync ? REQ_SYNC : 0; > + flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0; > + > if (dc->state != D_PREP) > return 0; > > @@ -1139,7 +1145,7 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi, > lstart = dc->lstart; > start = dc->start; > len = dc->len; > - total_len = len; > + orig_len = total_len = len; > > dc->len = 0; > > @@ -1203,6 +1209,14 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi, > bio->bi_end_io = f2fs_submit_discard_endio; > bio->bi_opf |= flag; > submit_bio(bio); > + if (flag & REQ_NOWAIT) { > + if (dc->error == -EAGAIN) { > + dc->len = orig_len; > + list_move(&dc->list, &dcc->retry_list); > + err = dc->error; I encounter lots of dmesg, which should be printed by __remove_discard_cmd() F2FS-fs (dm-0): Issue discard(23552, 23552, 2) failed, ret: -11 This should happen only if we didn't handle all discard in 5 seconds during umount. So I doubt we failed to move dc to retry_list, because after submit_bio(), end_io() is not called synchronously as the bio was just pluged? Thanks, > + break; > + } > + } > > atomic_inc(&dcc->issued_discard); > > @@ -1463,6 +1477,40 @@ static unsigned int __issue_discard_cmd_orderly(struct f2fs_sb_info *sbi, > return issued; > } > > +static bool __should_discard_retry(struct f2fs_sb_info *sbi, > + struct discard_policy *dpolicy) > +{ > + struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info; > + struct discard_cmd *dc, *tmp; > + bool retry = false; > + unsigned long flags; > + > + if (dpolicy->type != DPOLICY_UMOUNT) > + f2fs_bug_on(sbi, 1); > + > + mutex_lock(&dcc->cmd_lock); > + list_for_each_entry_safe(dc, tmp, &(dcc->retry_list), list) { > + if (dpolicy->timeout != 0 && > + f2fs_time_over(sbi, dpolicy->timeout)) { > + retry = false; > + break; > + } > + > + spin_lock_irqsave(&dc->lock, flags); > + if (!dc->bio_ref) { > + dc->state = D_PREP; > + dc->error = 0; > + reinit_completion(&dc->wait); > + __relocate_discard_cmd(dcc, dc); > + retry = true; > + } > + spin_unlock_irqrestore(&dc->lock, flags); > + } > + mutex_unlock(&dcc->cmd_lock); > + > + return retry; > +} > + > static int __issue_discard_cmd(struct f2fs_sb_info *sbi, > struct discard_policy *dpolicy) > { > @@ -1470,12 +1518,13 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi, > struct list_head *pend_list; > struct discard_cmd *dc, *tmp; > struct blk_plug plug; > - int i, issued = 0; > + int i, err, issued = 0; > bool io_interrupted = false; > > if (dpolicy->timeout != 0) > f2fs_update_time(sbi, dpolicy->timeout); > > +retry: > for (i = MAX_PLIST_NUM - 1; i >= 0; i--) { > if (dpolicy->timeout != 0 && > f2fs_time_over(sbi, dpolicy->timeout)) > @@ -1509,7 +1558,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi, > break; > } > > - __submit_discard_cmd(sbi, dpolicy, dc, &issued); > + err = __submit_discard_cmd(sbi, dpolicy, dc, &issued); > + if (err == -EAGAIN) > + congestion_wait(BLK_RW_ASYNC, > + DEFAULT_IO_TIMEOUT); > > if (issued >= dpolicy->max_requests) > break; > @@ -1522,6 +1574,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi, > break; > } > > + if (!list_empty(&dcc->retry_list) && > + __should_discard_retry(sbi, dpolicy)) > + goto retry; > + > if (!issued && io_interrupted) > issued = -1; > > @@ -1613,6 +1669,12 @@ static unsigned int __wait_discard_cmd_range(struct f2fs_sb_info *sbi, > goto next; > } > > + if (dpolicy->type == DPOLICY_UMOUNT && > + !list_empty(&dcc->retry_list)) { > + wait_list = &dcc->retry_list; > + goto next; > + } > + > return trimmed; > } > > @@ -2051,6 +2113,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) > for (i = 0; i < MAX_PLIST_NUM; i++) > INIT_LIST_HEAD(&dcc->pend_list[i]); > INIT_LIST_HEAD(&dcc->wait_list); > + INIT_LIST_HEAD(&dcc->retry_list); > INIT_LIST_HEAD(&dcc->fstrim_list); > mutex_init(&dcc->cmd_lock); > atomic_set(&dcc->issued_discard, 0); >