Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp1240547pxk; Thu, 10 Sep 2020 10:21:12 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwVOjBv4Ya9dxpfNx+rmNPFMVXMQyQR/F649pzJGyKIci0aRREuMewrLOvrtbcCtKle/G4F X-Received: by 2002:a50:9f22:: with SMTP id b31mr10538059edf.345.1599758471908; Thu, 10 Sep 2020 10:21:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1599758471; cv=none; d=google.com; s=arc-20160816; b=vESAh9J/13q7rVdbVCpmK7wFOzSl64SdIGDNw7B+ToTNJSq2fFiF/6bZQzLYh2VRzF teY7oilNDjvWs+YDak/ud7dGfPz1LvX7binAyQWnV5PjjVSYmDBlDTks+kaocKJBil2d /NsAfDR/7LjSlBgpLGO9yGPjugQNnexjG7mOhDD3o88Afw5CtSgc238HMU6riuXjHTO7 6SG1KR969YF/V3NXZesNAxuW0spzk/99J1ywdPLhYqVmVGO5/beClxsdtzjocz4imso+ q047xfhooNPIu66JH+yRzz/l4aruod4Can9uX3yVhAFDrjOmbMwlBmr4BhBBJfiG/Efs 547w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=vKCXoO1Y5o1NmvSCDa5quj7fnOryPGIsvZJ6HBmWb1I=; b=T8Qu7/K/8t3dh3qKkn8c7v2JqXKsCD+XVmiY2ZGiND+3+VjKbQ6h2oiIBGi0ddJnho mwj9/OiTVvoUA1Hk5O6LxhbmysGU53Cu504N4sn961CK2nSLYFAAh2Dx8dgg6NywuwAA SXGeAZuC+gtSHeIDztOTngX7a2Nsd248LSLVqix1bEO60lTdHivnqodmxCoc2hYndWIR OuPkOa5oIh0s7z+rTPBsyHXLQfkqR/O9YR/9qQMZ58vGQCSetKejIIT238XRMg19efLG ALdEOi79qe0//AKXOS8FHqlvaoavAmhuZ0zYeyNERzO56bLx8dN4tPHcpu1aZkCsyXMf rsbQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id dr13si4245226ejc.638.2020.09.10.10.20.42; Thu, 10 Sep 2020 10:21:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726879AbgIJRTn (ORCPT + 99 others); Thu, 10 Sep 2020 13:19:43 -0400 Received: from mx2.suse.de ([195.135.220.15]:56282 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726805AbgIJQSB (ORCPT ); Thu, 10 Sep 2020 12:18:01 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 1A51CB229; Thu, 10 Sep 2020 16:18:12 +0000 (UTC) Received: by quack2.suse.cz (Postfix, from userid 1000) id 7007B1E12EB; Thu, 10 Sep 2020 18:17:53 +0200 (CEST) Date: Thu, 10 Sep 2020 18:17:53 +0200 From: Jan Kara To: Ye Bin Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.com, linux-ext4@vger.kernel.org, Ritesh Harjani Subject: Re: [PATCH v3] ext4: Fix dead loop in ext4_mb_new_blocks Message-ID: <20200910161753.GB17362@quack2.suse.cz> References: <20200910091252.525346-1-yebin10@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200910091252.525346-1-yebin10@huawei.com> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On Thu 10-09-20 17:12:52, Ye Bin wrote: > As we test disk offline/online with running fsstress, we find fsstress > process is keeping running state. > kworker/u32:3-262 [004] ...1 140.787471: ext4_mb_discard_preallocations: dev 8,32 needed 114 > .... > kworker/u32:3-262 [004] ...1 140.787471: ext4_mb_discard_preallocations: dev 8,32 needed 114 > > ext4_mb_new_blocks > repeat: > ext4_mb_discard_preallocations_should_retry(sb, ac, &seq) > freed = ext4_mb_discard_preallocations > ext4_mb_discard_group_preallocations > this_cpu_inc(discard_pa_seq); > ---> freed == 0 > seq_retry = ext4_get_discard_pa_seq_sum > for_each_possible_cpu(__cpu) > __seq += per_cpu(discard_pa_seq, __cpu); > if (seq_retry != *seq) { > *seq = seq_retry; > ret = true; > } > > As we see seq_retry is sum of discard_pa_seq every cpu, if > ext4_mb_discard_group_preallocations return zero discard_pa_seq in this > cpu maybe increase one, so condition "seq_retry != *seq" have always > been met. > To Fix this problem, in ext4_mb_discard_group_preallocations function increase > discard_pa_seq only when it found preallocation to discard. > > Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling") > Signed-off-by: Ye Bin Thanks for the patch. One comment below. > --- > fs/ext4/mballoc.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > index f386fe62727d..fd55264dc3fe 100644 > --- a/fs/ext4/mballoc.c > +++ b/fs/ext4/mballoc.c > @@ -4191,7 +4191,6 @@ ext4_mb_discard_group_preallocations(struct super_block *sb, > INIT_LIST_HEAD(&list); > repeat: > ext4_lock_group(sb, group); > - this_cpu_inc(discard_pa_seq); > list_for_each_entry_safe(pa, tmp, > &grp->bb_prealloc_list, pa_group_list) { > spin_lock(&pa->pa_lock); > @@ -4233,6 +4232,9 @@ ext4_mb_discard_group_preallocations(struct super_block *sb, > goto out; > } > > + /* only increase when find reallocation to discard */ > + this_cpu_inc(discard_pa_seq); > + This is a good place to increment the counter but I think you also need to handle the case: if (free < needed && busy) { busy = 0; ext4_unlock_group(sb, group); cond_resched(); goto repeat; } We can unlock the group here after removing some preallocations and thus other processes checking discard_pa_seq could miss we did this. In fact I think the code is somewhat buggy here and we should also discard extents accumulated on "list" so far before unlocking the group. Ritesh? Honza -- Jan Kara SUSE Labs, CR