Received: by 2002:a05:6a10:9afc:0:0:0:0 with SMTP id t28csp409629pxm; Wed, 2 Mar 2022 18:40:08 -0800 (PST) X-Google-Smtp-Source: ABdhPJx1lyndHIijwviquM3ij4cjxi9TO4K9kGRBvY/D2by/mT4mKYV2PJdMT16PSaUay/52KuJW X-Received: by 2002:a17:902:7109:b0:151:8311:d3b5 with SMTP id a9-20020a170902710900b001518311d3b5mr11435772pll.13.1646275208543; Wed, 02 Mar 2022 18:40:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1646275208; cv=none; d=google.com; s=arc-20160816; b=tW8J0uzexUzjaPIjcIbuygi0nOP6Hv725gQYFTrdXXZpLa3ApQnL0wlVddoE5czfGC ryUpJwybAdQNl1V5G80Qg9fAfy7NNrLG+GNLI+/75JF/BMg7k84hcAhBpNd1LgDB5J+Y JxKweJk24+ZZgSyzHqRMp+wvb8SqphvT6Bx3a5hDsd3dnLiemc3Enc3X3sLhi2kQPg9p R0rskJ0iTG7e6tOf3m0JKrXQU9Fb9UboVUKMQluBqogMTfjXEjUWWJhoxARQ+AqTk72i fNk3NUhVmRiAqzYKUoxBVloXUGv3B0BStPmtH3BinEJ/8bvBE3zkeO9mE5lO59rWBmjQ ZVCg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=/a69l5I2Ls9eBeHj6uFFad5+AKmafLffA9tQn0Pb6Hw=; b=oe5DkvEqArjU3Xpv0P7e+W/d5f54Jf3S2R5c4Irh8IbNKwk0QPaQGYEfFBxgw0PnT0 uEZpFQ0+zxt9JDqgjDNHesmuPLwgaRmLZSDvktL/xYxAUra8EHBcT5DfbTp2olg7tDA0 X7TqLl5CC1wGwPpiQIuLpuG+tXUQB+9WFXN+VvK7mXfq3JZJl9K7y9VA2xf0LJX+kCcZ 8ItatCho5ya7FJIOEdCQ1iKr1ih0m8webrDw5HL0UYbNqStuj7bAZO4jWchO/YpAEL8B YNpfauhcdgHVxsBBC/NkvyAVbjY3i3j1CG200/anu68ywuFlGyQYlQsV24eUzZGV3Zwh 6d2g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=f7Bvzjny; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id k4-20020aa788c4000000b004dff0e45b5fsi913582pff.138.2022.03.02.18.40.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 02 Mar 2022 18:40:08 -0800 (PST) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=f7Bvzjny; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 287C748893; Wed, 2 Mar 2022 18:32:52 -0800 (PST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231812AbiCCCdb (ORCPT + 99 others); Wed, 2 Mar 2022 21:33:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60888 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231802AbiCCCd3 (ORCPT ); Wed, 2 Mar 2022 21:33:29 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C2D8DFF4 for ; Wed, 2 Mar 2022 18:32:40 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6C08A61471 for ; Thu, 3 Mar 2022 02:32:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7B118C004E1; Thu, 3 Mar 2022 02:32:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1646274759; bh=5XSp38aU3KA0DzW361Z0nmko3KMEF0z6VS+NuXyW3jk=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=f7BvzjnyLcHFwCoIEdS7oU2KSZNJCS794vBPgHCRMwYx01ONB5bfZAbrlMr4AnYMP RmZrkn77KGiknktRBMqdvGQHE6WZVtRLaNtQZZnQbxoK8A3wWLPc2ke2fmVcIin6yI jYdhIbybqVG68eSfixoJGG8k0+Lvz0VVpSsoDBQCHF0Bt/C/qJSo4JdS9lKQnEZzWc 8pZVqjIIvHMFer1m1mSwKZEnsQ6ra6yzozFV3FDe+Ct8yokni32VRU/jaF1ubO2NIY 9zS3L4MTB+r7nTr+2zEWMaHJG0iue1DKTM8n6lyZSkj3YUwQ1wDAeY7yMitb1zgjwy fWeswTDxnOs/g== Message-ID: Date: Thu, 3 Mar 2022 10:32:33 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.6.1 Subject: Re: [f2fs-dev] [PATCH] f2fs: fix to avoid potential deadlock Content-Language: en-US To: Jaegeuk Kim Cc: Jing Xia , linux-f2fs-devel@lists.sourceforge.net, Zhiguo Niu , linux-kernel@vger.kernel.org References: <51be77f1-6e85-d46d-d0d3-c06d2055a190@kernel.org> <86a175d3-c438-505b-1dbc-4ef6e8b5adcb@kernel.org> <5b5e20d1-877f-b321-b341-c0f233ee976c@kernel.org> <51826b5f-e480-994a-4a72-39ff4572bb3f@kernel.org> From: Chao Yu In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI, NICE_REPLY_A,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2022/3/3 3:45, Jaegeuk Kim wrote: > On 03/02, Chao Yu wrote: >> On 2022/3/2 13:26, Jaegeuk Kim wrote: >>> On 03/02, Chao Yu wrote: >>>> ping, >>>> >>>> On 2022/2/25 11:02, Chao Yu wrote: >>>>> On 2022/2/3 22:57, Chao Yu wrote: >>>>>> On 2022/2/3 9:51, Jaegeuk Kim wrote: >>>>>>> On 01/29, Chao Yu wrote: >>>>>>>> On 2022/1/29 8:37, Jaegeuk Kim wrote: >>>>>>>>> On 01/28, Chao Yu wrote: >>>>>>>>>> On 2022/1/28 5:59, Jaegeuk Kim wrote: >>>>>>>>>>> On 01/27, Chao Yu wrote: >>>>>>>>>>>> Quoted from Jing Xia's report, there is a potential deadlock may happen >>>>>>>>>>>> between kworker and checkpoint as below: >>>>>>>>>>>> >>>>>>>>>>>> [T:writeback]                [T:checkpoint] >>>>>>>>>>>> - wb_writeback >>>>>>>>>>>>     - blk_start_plug >>>>>>>>>>>> bio contains NodeA was plugged in writeback threads >>>>>>>>>>> >>>>>>>>>>> I'm still trying to understand more precisely. So, how is it possible to >>>>>>>>>>> have bio having node write in this current context? >>>>>>>>>> >>>>>>>>>> IMO, after above blk_start_plug(), it may plug some inode's node page in kworker >>>>>>>>>> during writebacking node_inode's data page (which should be node page)? >>>>>>>>> >>>>>>>>> Wasn't that added into a different task->plug? >>>>>>>> >>>>>>>> I'm not sure I've got your concern correctly... >>>>>>>> >>>>>>>> Do you mean NodeA and other IOs from do_writepages() were plugged in >>>>>>>> different local plug variables? >>>>>>> >>>>>>> I think so. >>>>>> >>>>>> I guess block plug helper says it doesn't allow to use nested plug, so there >>>>>> is only one plug in kworker thread? >>> >>> Is there only one kworker thread that flushes node and inode pages? >> >> IIRC, =one kworker per block device? > > If there's one kworker only, f2fs_write_node_pages() should have flushed its > plug? No, f2fs_write_node_pages() failed to attach local plug into current->plug due to current has attached plug from wb_writeback(), and also, f2fs_write_node_pages() will fail to flush current->plug due to its local plug doesn't match current->plug. void blk_start_plug_nr_ios() { if (tsk->plug) return; ... } void blk_finish_plug(struct blk_plug *plug) { if (plug == current->plug) { __blk_flush_plug(plug, false); current->plug = NULL; } } Thanks, > >> >> Thanks, >> >>> >>>>>> >>>>>> void blk_start_plug_nr_ios(struct blk_plug *plug, unsigned short nr_ios) >>>>>> { >>>>>>      struct task_struct *tsk = current; >>>>>> >>>>>>      /* >>>>>>       * If this is a nested plug, don't actually assign it. >>>>>>       */ >>>>>>      if (tsk->plug) >>>>>>          return; >>>>>> ... >>>>>> } >>>>> >>>>> Any further comments? >>>>> >>>>> Thanks, >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>                     - do_writepages  -- sync write inodeB, inc wb_sync_req[DATA] >>>>>>>>>>>>                      - f2fs_write_data_pages >>>>>>>>>>>>                       - f2fs_write_single_data_page -- write last dirty page >>>>>>>>>>>>                        - f2fs_do_write_data_page >>>>>>>>>>>>                         - set_page_writeback  -- clear page dirty flag and >>>>>>>>>>>>                         PAGECACHE_TAG_DIRTY tag in radix tree >>>>>>>>>>>>                         - f2fs_outplace_write_data >>>>>>>>>>>>                          - f2fs_update_data_blkaddr >>>>>>>>>>>>                           - f2fs_wait_on_page_writeback -- wait NodeA to writeback here >>>>>>>>>>>>                        - inode_dec_dirty_pages >>>>>>>>>>>>     - writeback_sb_inodes >>>>>>>>>>>>      - writeback_single_inode >>>>>>>>>>>>       - do_writepages >>>>>>>>>>>>        - f2fs_write_data_pages -- skip writepages due to wb_sync_req[DATA] >>>>>>>>>>>>         - wbc->pages_skipped += get_dirty_pages() -- PAGECACHE_TAG_DIRTY is not set but get_dirty_pages() returns one >>>>>>>>>>>>      - requeue_inode -- requeue inode to wb->b_dirty queue due to non-zero.pages_skipped >>>>>>>>>>>>     - blk_finish_plug >>>>>>>>>>>> >>>>>>>>>>>> Let's try to avoid deadlock condition by forcing unplugging previous bio via >>>>>>>>>>>> blk_finish_plug(current->plug) once we'v skipped writeback in writepages() >>>>>>>>>>>> due to valid sbi->wb_sync_req[DATA/NODE]. >>>>>>>>>>>> >>>>>>>>>>>> Fixes: 687de7f1010c ("f2fs: avoid IO split due to mixed WB_SYNC_ALL and WB_SYNC_NONE") >>>>>>>>>>>> Signed-off-by: Zhiguo Niu >>>>>>>>>>>> Signed-off-by: Jing Xia >>>>>>>>>>>> Signed-off-by: Chao Yu >>>>>>>>>>>> --- >>>>>>>>>>>>     fs/f2fs/data.c | 6 +++++- >>>>>>>>>>>>     fs/f2fs/node.c | 6 +++++- >>>>>>>>>>>>     2 files changed, 10 insertions(+), 2 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c >>>>>>>>>>>> index 76d6fe7b0c8f..932a4c81acaf 100644 >>>>>>>>>>>> --- a/fs/f2fs/data.c >>>>>>>>>>>> +++ b/fs/f2fs/data.c >>>>>>>>>>>> @@ -3174,8 +3174,12 @@ static int __f2fs_write_data_pages(struct address_space *mapping, >>>>>>>>>>>>         /* to avoid spliting IOs due to mixed WB_SYNC_ALL and WB_SYNC_NONE */ >>>>>>>>>>>>         if (wbc->sync_mode == WB_SYNC_ALL) >>>>>>>>>>>>             atomic_inc(&sbi->wb_sync_req[DATA]); >>>>>>>>>>>> -    else if (atomic_read(&sbi->wb_sync_req[DATA])) >>>>>>>>>>>> +    else if (atomic_read(&sbi->wb_sync_req[DATA])) { >>>>>>>>>>>> +        /* to avoid potential deadlock */ >>>>>>>>>>>> +        if (current->plug) >>>>>>>>>>>> +            blk_finish_plug(current->plug); >>>>>>>>>>>>             goto skip_write; >>>>>>>>>>>> +    } >>>>>>>>>>>>         if (__should_serialize_io(inode, wbc)) { >>>>>>>>>>>>             mutex_lock(&sbi->writepages); >>>>>>>>>>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c >>>>>>>>>>>> index 556fcd8457f3..69c6bcaf5aae 100644 >>>>>>>>>>>> --- a/fs/f2fs/node.c >>>>>>>>>>>> +++ b/fs/f2fs/node.c >>>>>>>>>>>> @@ -2106,8 +2106,12 @@ static int f2fs_write_node_pages(struct address_space *mapping, >>>>>>>>>>>>         if (wbc->sync_mode == WB_SYNC_ALL) >>>>>>>>>>>>             atomic_inc(&sbi->wb_sync_req[NODE]); >>>>>>>>>>>> -    else if (atomic_read(&sbi->wb_sync_req[NODE])) >>>>>>>>>>>> +    else if (atomic_read(&sbi->wb_sync_req[NODE])) { >>>>>>>>>>>> +        /* to avoid potential deadlock */ >>>>>>>>>>>> +        if (current->plug) >>>>>>>>>>>> +            blk_finish_plug(current->plug); >>>>>>>>>>>>             goto skip_write; >>>>>>>>>>>> +    } >>>>>>>>>>>>         trace_f2fs_writepages(mapping->host, wbc, NODE); >>>>>>>>>>>> -- >>>>>>>>>>>> 2.32.0 >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Linux-f2fs-devel mailing list >>>>>> Linux-f2fs-devel@lists.sourceforge.net >>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel >>>>> >>>>> >>>>> _______________________________________________ >>>>> Linux-f2fs-devel mailing list >>>>> Linux-f2fs-devel@lists.sourceforge.net >>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel