Subject: Re: [f2fs-dev] [PATCH v3] f2fs: avoid race condition for shinker count
To: Jaegeuk Kim
CC: Light Hsieh
References: <20201109170012.2129411-1-jaegeuk@kernel.org> <20201112053414.GB3826485@google.com> <20201112054051.GA4092972@google.com>
From: Chao Yu
Date: Thu, 3 Dec 2020 14:55:50 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020/12/3 14:07, Jaegeuk Kim wrote:
> On 11/11, Jaegeuk Kim wrote:
>> Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
>> wrong do_shinker work. Let's avoid to get stale data by using nat_tree_lock.
>>
>> Reported-by: Light Hsieh
>> Signed-off-by: Jaegeuk Kim
>> ---
>> v3:
>>  - fix to use NM_I(sbi)
>>
>>  fs/f2fs/shrinker.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
>> index d66de5999a26..555712ba06d8 100644
>> --- a/fs/f2fs/shrinker.c
>> +++ b/fs/f2fs/shrinker.c
>> @@ -18,7 +18,11 @@ static unsigned int shrinker_run_no;
>>
>>  static unsigned long __count_nat_entries(struct f2fs_sb_info *sbi)
>>  {
>> -	long count = NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;
>> +	long count;
>> +
>> +	down_read(&NM_I(sbi)->nat_tree_lock);
>> +	count = NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;
>> +	up_read(&NM_I(sbi)->nat_tree_lock);
>
> I just found this can give kernel hang due to the following backtrace.
> f2fs_shrink_count
> shrink_slab
> shrink_node
> do_try_to_free_pages
> try_to_free_pages
> __alloc_pages_nodemask
> alloc_pages_current
> allocate_slab
> __slab_alloc
> __slab_alloc
> kmem_cache_alloc
> add_free_nid
> f2fs_flush_nat_entries
> f2fs_write_checkpoint

Oh, I missed that case.

>
> Let me just check like this.
>
> From 971163330224449d90aac90957ea38f77d494f0f Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim
> Date: Fri, 6 Nov 2020 13:22:05 -0800
> Subject: [PATCH] f2fs: avoid race condition for shrinker count
>
> Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
> wrong do_shinker work. Let's avoid to return insane overflowed value.
>
> Reported-by: Light Hsieh
> Signed-off-by: Jaegeuk Kim
> ---
>  fs/f2fs/shrinker.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
> index d66de5999a26..75b5b4aaed99 100644
> --- a/fs/f2fs/shrinker.c
> +++ b/fs/f2fs/shrinker.c
> @@ -18,9 +18,9 @@ static unsigned int shrinker_run_no;
>
>  static unsigned long __count_nat_entries(struct f2fs_sb_info *sbi)
>  {
> -	long count = NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;
> -
> -	return count > 0 ? count : 0;
> +	if (NM_I(sbi)->nat_cnt > NM_I(sbi)->dirty_nat_cnt)
> +		return NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;

Hmm.. in the case that we are not in checkpoint progress, nat_cnt and
dirty_nat_cnt are mutable, so how can we make sure the calculation is
non-negative after the check condition? :(

Thanks

> +	return 0;
>  }
>
>  static unsigned long __count_free_nids(struct f2fs_sb_info *sbi)
>
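
To illustrate the concern raised above: when the counters are read once for the comparison and again for the subtraction, a concurrent update between the two reads can still produce a wrapped-around result. One possible way to close that window, shown here only as a userspace sketch and not as the fix adopted in this thread, is to snapshot each counter once into a local and work on the copies. The struct `nm_counters` and the function `count_reclaimable()` are hypothetical stand-ins, and the counters are assumed to be unsigned; in kernel code the analogous single-read primitive would be READ_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the two NAT counters discussed above. */
struct nm_counters {
	atomic_uint nat_cnt;       /* total cached NAT entries (assumed unsigned) */
	atomic_uint dirty_nat_cnt; /* dirty NAT entries, not reclaimable */
};

/*
 * Read each counter exactly once, then compare and subtract the local
 * copies. A concurrent writer can still make the two snapshots mutually
 * inconsistent, but the comparison and the subtraction now see the same
 * values, so the result cannot wrap around.
 */
static unsigned long count_reclaimable(struct nm_counters *nm)
{
	assert(nm != NULL);

	unsigned int total =
		atomic_load_explicit(&nm->nat_cnt, memory_order_relaxed);
	unsigned int dirty =
		atomic_load_explicit(&nm->dirty_nat_cnt, memory_order_relaxed);

	return total > dirty ? (unsigned long)(total - dirty) : 0;
}
```

The sketch takes no lock, so it does not reintroduce the reclaim-time hang from the backtrace above. The trade-off is that the returned count is only an estimate, which is usually acceptable for a shrinker count callback.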