Date: Wed, 2 Mar 2022 16:14:36 +0800
Subject: Re: [f2fs-dev] [PATCH] f2fs: fix to avoid potential deadlock
From: Chao Yu
To: Jaegeuk Kim
Cc: Jing Xia, linux-f2fs-devel@lists.sourceforge.net, Zhiguo Niu, linux-kernel@vger.kernel.org

On 2022/3/2 13:26, Jaegeuk Kim wrote:
> On 03/02, Chao Yu wrote:
>> ping,
>>
>> On 2022/2/25 11:02, Chao Yu wrote:
>>> On 2022/2/3 22:57, Chao Yu wrote:
>>>> On 2022/2/3 9:51, Jaegeuk Kim wrote:
>>>>> On 01/29, Chao Yu wrote:
>>>>>> On 2022/1/29 8:37, Jaegeuk Kim wrote:
>>>>>>> On 01/28, Chao Yu wrote:
>>>>>>>> On 2022/1/28 5:59, Jaegeuk Kim wrote:
>>>>>>>>> On 01/27, Chao Yu wrote:
>>>>>>>>>> Quoted from Jing Xia's report, a potential deadlock may happen
>>>>>>>>>> between kworker and checkpoint as below:
>>>>>>>>>>
>>>>>>>>>> [T:writeback]                [T:checkpoint]
>>>>>>>>>> - wb_writeback
>>>>>>>>>>     - blk_start_plug
>>>>>>>>>> bio contains NodeA was plugged in writeback threads
>>>>>>>>>
>>>>>>>>> I'm still trying to understand more precisely. So, how is it possible to
>>>>>>>>> have bio having node write in this current context?
>>>>>>>>
>>>>>>>> IMO, after above blk_start_plug(), it may plug some inode's node page in kworker
>>>>>>>> during writebacking node_inode's data page (which should be node page)?
>>>>>>>
>>>>>>> Wasn't that added into a different task->plug?
>>>>>>
>>>>>> I'm not sure I've got your concern correctly...
>>>>>>
>>>>>> Do you mean NodeA and other IOs from do_writepages() were plugged in
>>>>>> different local plug variables?
>>>>>
>>>>> I think so.
>>>>
>>>> I guess block plug helper says it doesn't allow to use nested plug, so there
>>>> is only one plug in kworker thread?
>
> Is there only one kworker thread that flushes node and inode pages?

IIRC, one kworker per block device?

Thanks,

>
>>>>
>>>> void blk_start_plug_nr_ios(struct blk_plug *plug, unsigned short nr_ios)
>>>> {
>>>>      struct task_struct *tsk = current;
>>>>
>>>>      /*
>>>>       * If this is a nested plug, don't actually assign it.
>>>>       */
>>>>      if (tsk->plug)
>>>>          return;
>>>> ...
>>>> }
>>>
>>> Any further comments?
>>>
>>> Thanks,
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>                     - do_writepages  -- sync write inodeB, inc wb_sync_req[DATA]
>>>>>>>>>>                      - f2fs_write_data_pages
>>>>>>>>>>                       - f2fs_write_single_data_page -- write last dirty page
>>>>>>>>>>                        - f2fs_do_write_data_page
>>>>>>>>>>                         - set_page_writeback  -- clear page dirty flag and
>>>>>>>>>>                         PAGECACHE_TAG_DIRTY tag in radix tree
>>>>>>>>>>                         - f2fs_outplace_write_data
>>>>>>>>>>                          - f2fs_update_data_blkaddr
>>>>>>>>>>                           - f2fs_wait_on_page_writeback -- wait NodeA to writeback here
>>>>>>>>>>                        - inode_dec_dirty_pages
>>>>>>>>>>     - writeback_sb_inodes
>>>>>>>>>>      - writeback_single_inode
>>>>>>>>>>       - do_writepages
>>>>>>>>>>        - f2fs_write_data_pages -- skip writepages due to wb_sync_req[DATA]
>>>>>>>>>>         - wbc->pages_skipped += get_dirty_pages() -- PAGECACHE_TAG_DIRTY is not set but get_dirty_pages() returns one
>>>>>>>>>>      - requeue_inode -- requeue inode to wb->b_dirty queue due to non-zero pages_skipped
>>>>>>>>>>     - blk_finish_plug
>>>>>>>>>>
>>>>>>>>>> Let's try to avoid deadlock condition by forcing unplugging previous bio via
>>>>>>>>>> blk_finish_plug(current->plug) once we've skipped writeback in writepages()
>>>>>>>>>> due to valid sbi->wb_sync_req[DATA/NODE].
>>>>>>>>>>
>>>>>>>>>> Fixes: 687de7f1010c ("f2fs: avoid IO split due to mixed WB_SYNC_ALL and WB_SYNC_NONE")
>>>>>>>>>> Signed-off-by: Zhiguo Niu
>>>>>>>>>> Signed-off-by: Jing Xia
>>>>>>>>>> Signed-off-by: Chao Yu
>>>>>>>>>> ---
>>>>>>>>>>     fs/f2fs/data.c | 6 +++++-
>>>>>>>>>>     fs/f2fs/node.c | 6 +++++-
>>>>>>>>>>     2 files changed, 10 insertions(+), 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>>>>>>>> index 76d6fe7b0c8f..932a4c81acaf 100644
>>>>>>>>>> --- a/fs/f2fs/data.c
>>>>>>>>>> +++ b/fs/f2fs/data.c
>>>>>>>>>> @@ -3174,8 +3174,12 @@ static int __f2fs_write_data_pages(struct address_space *mapping,
>>>>>>>>>>         /* to avoid spliting IOs due to mixed WB_SYNC_ALL and WB_SYNC_NONE */
>>>>>>>>>>         if (wbc->sync_mode == WB_SYNC_ALL)
>>>>>>>>>>             atomic_inc(&sbi->wb_sync_req[DATA]);
>>>>>>>>>> -    else if (atomic_read(&sbi->wb_sync_req[DATA]))
>>>>>>>>>> +    else if (atomic_read(&sbi->wb_sync_req[DATA])) {
>>>>>>>>>> +        /* to avoid potential deadlock */
>>>>>>>>>> +        if (current->plug)
>>>>>>>>>> +            blk_finish_plug(current->plug);
>>>>>>>>>>             goto skip_write;
>>>>>>>>>> +    }
>>>>>>>>>>         if (__should_serialize_io(inode, wbc)) {
>>>>>>>>>>             mutex_lock(&sbi->writepages);
>>>>>>>>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>>>>>>>>>> index 556fcd8457f3..69c6bcaf5aae 100644
>>>>>>>>>> --- a/fs/f2fs/node.c
>>>>>>>>>> +++ b/fs/f2fs/node.c
>>>>>>>>>> @@ -2106,8 +2106,12 @@ static int f2fs_write_node_pages(struct address_space *mapping,
>>>>>>>>>>         if (wbc->sync_mode == WB_SYNC_ALL)
>>>>>>>>>>             atomic_inc(&sbi->wb_sync_req[NODE]);
>>>>>>>>>> -    else if (atomic_read(&sbi->wb_sync_req[NODE]))
>>>>>>>>>> +    else if (atomic_read(&sbi->wb_sync_req[NODE])) {
>>>>>>>>>> +        /* to avoid potential deadlock */
>>>>>>>>>> +        if (current->plug)
>>>>>>>>>> +            blk_finish_plug(current->plug);
>>>>>>>>>>             goto skip_write;
>>>>>>>>>> +    }
>>>>>>>>>>         trace_f2fs_writepages(mapping->host, wbc, NODE);
>>>>>>>>>> --
>>>>>>>>>> 2.32.0
>>>>
>>>>
>>>> _______________________________________________
>>>> Linux-f2fs-devel mailing list
>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
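
For anyone following the nested-plug question above, here is a rough userspace sketch of the behaviour being discussed. It is not kernel code. Every model_* name is hypothetical and exists only for illustration. The "ignore a nested plug" check mirrors the blk_start_plug_nr_ios() snippet quoted in the thread; the finish-side check (flush only when the plug passed in is the one the task owns) is a modelling assumption, not something the thread states. Under those assumptions, a bio submitted inside a nested start/finish pair stays held on the outer plug until the outer finish runs.

```c
/* Userspace model only. All names are hypothetical stand-ins. */
#include <assert.h>
#include <stddef.h>

#define MAX_PLUGGED 8

struct model_plug {
	int pending[MAX_PLUGGED];	/* ids of bios held back by this plug */
	int nr_pending;
};

struct model_task {
	struct model_plug *plug;	/* models current->plug */
	int submitted;			/* bios that reached the "device" */
};

/* Mirrors the quoted check: a nested plug is never installed. */
static void model_start_plug(struct model_task *tsk, struct model_plug *plug)
{
	plug->nr_pending = 0;
	if (tsk->plug)
		return;
	tsk->plug = plug;
}

/* Hold the bio on the task's active plug if there is one, else submit it. */
static void model_submit_bio(struct model_task *tsk, int bio_id)
{
	struct model_plug *plug = tsk->plug;

	if (plug && plug->nr_pending < MAX_PLUGGED) {
		plug->pending[plug->nr_pending++] = bio_id;
		return;
	}
	tsk->submitted++;
}

/* Assumption: only the plug that the task actually owns gets flushed. */
static void model_finish_plug(struct model_task *tsk, struct model_plug *plug)
{
	if (tsk->plug != plug)
		return;
	tsk->submitted += plug->nr_pending;
	plug->nr_pending = 0;
	tsk->plug = NULL;
}

/*
 * Returns how many bios are still held on the outer plug after a nested
 * start/finish pair, i.e. just before the outer finish runs.
 */
static int model_nested_demo(struct model_task *tsk)
{
	struct model_plug outer, inner;
	int held;

	model_start_plug(tsk, &outer);	/* think: wb_writeback */
	model_submit_bio(tsk, 1);	/* "NodeA" lands on the outer plug */
	model_start_plug(tsk, &inner);	/* nested: ignored */
	assert(tsk->plug == &outer);
	model_submit_bio(tsk, 2);	/* also lands on the outer plug */
	model_finish_plug(tsk, &inner);	/* not the owner: nothing flushed */
	held = outer.nr_pending;
	model_finish_plug(tsk, &outer);	/* now both bios are submitted */
	return held;
}
```

In this model, calling model_finish_plug(tsk, tsk->plug) at the skip point flushes whatever the outer plug holds, which is the effect the patch aims for with blk_finish_plug(current->plug).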