Date: Mon, 16 Mar 2020 09:22:33 +0530
From: Sahitya Tummala
To: Chao Yu
Cc: Jaegeuk Kim, linux-f2fs-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org, stummala@codeaurora.org
Subject: Re: [PATCH] f2fs: fix long latency due to discard during umount
Message-ID: <20200316035233.GM20234@codeaurora.org>
References: <1584011671-20939-1-git-send-email-stummala@codeaurora.org> <20200313033912.GJ20234@codeaurora.org> <20200313110846.GL20234@codeaurora.org> <20d3b7ef-b216-6e46-58fd-7f1c96d4a8d3@huawei.com>
In-Reply-To: <20d3b7ef-b216-6e46-58fd-7f1c96d4a8d3@huawei.com>

Hi Chao,

On Mon, Mar 16, 2020 at 08:52:25AM +0800, Chao Yu wrote:
> On 2020/3/13 19:08, Sahitya Tummala wrote:
> > On Fri, Mar 13, 2020 at 02:30:55PM +0800, Chao Yu wrote:
> >> On 2020/3/13 11:39, Sahitya Tummala wrote:
> >>> On Fri, Mar 13, 2020 at 10:20:04AM +0800, Chao Yu wrote:
> >>>> On 2020/3/12 19:14, Sahitya Tummala wrote:
> >>>>> F2FS already has a default timeout of 5 secs for discards that
> >>>>> can be issued during umount, but it can take more than the 5 sec
> >>>>> timeout if the underlying UFS device queue is already full and there
> >>>>> are no more available free tags to be used. In that case, submit_bio()
> >>>>> will wait for the already queued discard requests to complete to get
> >>>>> a free tag, which can potentially take way more than 5 sec.
> >>>>>
> >>>>> Fix this by submitting the discard requests with REQ_NOWAIT
> >>>>> flags during umount. This will return -EAGAIN for UFS queue/tag full
> >>>>> scenario without waiting in the context of submit_bio(). The FS can
> >>>>> then handle these requests by retrying again within the stipulated
> >>>>> discard timeout period to avoid long latencies.
> >>>>>
> >>>>> Signed-off-by: Sahitya Tummala
> >>>>> ---
> >>>>>  fs/f2fs/segment.c | 14 +++++++++++++-
> >>>>>  1 file changed, 13 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >>>>> index fb3e531..a06bbac 100644
> >>>>> --- a/fs/f2fs/segment.c
> >>>>> +++ b/fs/f2fs/segment.c
> >>>>> @@ -1124,10 +1124,13 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> >>>>>          struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> >>>>>          struct list_head *wait_list = (dpolicy->type == DPOLICY_FSTRIM) ?
> >>>>>                                          &(dcc->fstrim_list) : &(dcc->wait_list);
> >>>>> -        int flag = dpolicy->sync ? REQ_SYNC : 0;
> >>>>> +        int flag;
> >>>>>          block_t lstart, start, len, total_len;
> >>>>>          int err = 0;
> >>>>>
> >>>>> +        flag = dpolicy->sync ? REQ_SYNC : 0;
> >>>>> +        flag |= dpolicy->type == DPOLICY_UMOUNT ? REQ_NOWAIT : 0;
> >>>>> +
> >>>>>          if (dc->state != D_PREP)
> >>>>>                  return 0;
> >>>>>
> >>>>> @@ -1203,6 +1206,11 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
> >>>>>                  bio->bi_end_io = f2fs_submit_discard_endio;
> >>>>>                  bio->bi_opf |= flag;
> >>>>>                  submit_bio(bio);
> >>>>> +                if ((flag & REQ_NOWAIT) && (dc->error == -EAGAIN)) {
> >>>>
> >>>> If we want to update dc->state, we need to cover it with dc->lock.
> >>>
> >>> Sure, will update it.
> >>>
> >>>>
> >>>>> +                        dc->state = D_PREP;
> >>>>
> >>>> BTW, one dc can be referenced by multiple bios, so dc->state could be updated to
> >>>> D_DONE later by f2fs_submit_discard_endio(), however we just relocate it to
> >>>> pending list... which is inconsistent status.
> >>>
> >>> In that case dc->bio_ref will reflect it and until it becomes 0, the dc->state
> >>> will not be updated to D_DONE in f2fs_submit_discard_endio()?
> >>
> >> __submit_discard_cmd()
> >>  lock()
> >>  dc->state = D_SUBMIT;
> >>  dc->bio_ref++;
> >>  unlock()
> >> ...
> >>  submit_bio()
> >>                                 f2fs_submit_discard_endio()
> >>                                  dc->error = -EAGAIN;
> >>                                  lock()
> >>                                  dc->bio_ref--;
> >>
> >>  dc->state = D_PREP;
> >>
> >>                                  dc->state = D_DONE;
> >>                                  unlock()
> >>
> >> So finally, dc's state is D_DONE, and it's in wait list, then will be relocated
> >> to pending list.
> >
> > In case of queue full, f2fs_submit_discard_endio() will not be called
>
> I guess the case is there are multiple bios related to one dc and partially callback
> of bio is called asynchronously and the other is called synchronously, so the race
> condition could happen.

You are right. Let me review that case and try to fix it.

Thanks,

>
> Thanks,
>
> > asynchronously. It will be called in the context of submit_bio() itself.
> > So by the time, submit_bio returns dc->error will be -EAGAIN and dc->state
> > will be D_DONE.
> >
> > submit_bio()
> > ->blk_mq_make_request
> > ->blk_mq_get_request()
> > ->bio_wouldblock_error() (called due to queue full)
> > ->bio_endio()
> >
> > Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>> +                        err = dc->error;
> >>>>> +                        break;
> >>>>> +                }
> >>>>>
> >>>>>                  atomic_inc(&dcc->issued_discard);
> >>>>>
> >>>>> @@ -1510,6 +1518,10 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
> >>>>>                          }
> >>>>>
> >>>>>                          __submit_discard_cmd(sbi, dpolicy, dc, &issued);
> >>>>> +                        if (dc->error == -EAGAIN) {
> >>>>> +                                congestion_wait(BLK_RW_ASYNC, HZ/50);
> >>>>> +                                __relocate_discard_cmd(dcc, dc);
> >>>>> +                        }
> >>>>>
> >>>>>                          if (issued >= dpolicy->max_requests)
> >>>>>                                  break;
> >>>>>
> >>>
> >

--
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.