Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp2044897pxv; Fri, 2 Jul 2021 20:37:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJycXVCUmXXvMSR0kDGVYWIUsMUv5tRQCaE0qjgRZMgbknuOnZPyzsFQwg7GQP51+tOYlVVJ X-Received: by 2002:a17:906:1318:: with SMTP id w24mr2981246ejb.222.1625283476524; Fri, 02 Jul 2021 20:37:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1625283476; cv=none; d=google.com; s=arc-20160816; b=wgeUo3cJr/FNfwfJDvrZf4BRP6PrZugIXbcSvTnyWCw0OQR4OC8OmEOUYmhG1qlI+v a5w98jN45apqFeqGiBifo+IWFX4FMXMHt8S21yEX7jkmwzBBijWQWNhKwa8c4tUgW1l0 duD+Z1eqTG7w8vSXb/z+EokdJHZbweQqH83F6qlts8Jt/PpPa5cmN9YBPS1jJsoplDSE hrznxOtl89NWixmJbiTcV1jnooZlGg5f0EQcAqyd3P6sgNqLGbLn7DUFejKJf+5dH2K3 g9T6S8T3fGGqKJmmlgrzX9QNZ+WIzd6npBY442B2SfNzMh7GBpYGxeaT4tPE/+HpfRfx IDHA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject; bh=j9lfS3t3Q6vNDzpgOhBplyemhqYJLCES2sNjmPflqCo=; b=bEjOufnUXIpr2icmMAHp115/BsUNTlbt2e5yc5BzUJHDmFTFJlggfSS38VtvNWT4BT To+svpJY/4dF/RplMe7GLqIBLrtRJD5T6hZiRBPaS6nH0AjUQbpzmvmmgPiuEaDI6CPu eN4VqT+eFGrtDW6UXK6DkCNi+Y1WeOOG506ES3Rx4lhsGtimpB3XpKFIKcYMLzeZNPMq kgQ9u8YH1OqZadALbAlX4uziWnCF1F5KEwxgi01BN7TKL4gpihFfzR5GJiQS+9L8dmN6 qEwAS4ZWhYOt2s+kU/5h7BWpVPEsh65KBmUZnCSY1B14uKqNDp9U+6z+YNzSBs3m6ejX iKqw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id jy1si5340431ejc.592.2021.07.02.20.37.24; Fri, 02 Jul 2021 20:37:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230174AbhGCDjn (ORCPT + 99 others); Fri, 2 Jul 2021 23:39:43 -0400 Received: from szxga02-in.huawei.com ([45.249.212.188]:9449 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230051AbhGCDjn (ORCPT ); Fri, 2 Jul 2021 23:39:43 -0400 Received: from dggeme752-chm.china.huawei.com (unknown [172.30.72.54]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4GGyGv3vK7zZp4d; Sat, 3 Jul 2021 11:33:59 +0800 (CST) Received: from [10.174.178.134] (10.174.178.134) by dggeme752-chm.china.huawei.com (10.3.19.98) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Sat, 3 Jul 2021 11:37:07 +0800 Subject: Re: [powerpc][5.13.0-next-20210701] Kernel crash while running ltp(chdir01) tests To: Theodore Ts'o CC: Jan Kara , , Guoqing Jiang , Sachin Sant , Ext4 Developers List , "linux-fsdevel@vger.kernel.org" References: <26ACA75D-E13D-405B-9BFC-691B5FB64243@linux.vnet.ibm.com> <4cc87ab3-aaa6-ed87-b690-5e5b99de8380@huawei.com> <03f734bd-f36e-f55b-0448-485b8a0d5b75@huawei.com> From: Zhang Yi Message-ID: <9b81eb4e-9adb-991f-31be-f5ef0092c4b3@huawei.com> Date: Sat, 3 Jul 2021 11:37:07 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.174.178.134] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To dggeme752-chm.china.huawei.com (10.3.19.98) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On 2021/7/3 6:11, Theodore Ts'o wrote: > On Fri, Jul 02, 2021 at 12:11:54PM -0400, Theodore Ts'o wrote: >> So it probably makes more sense to keep jbd2_journal_unregister_shrinker() >> in jbd2_destroy_journal(), since arguably the fact that we are using a >> shrinker is an internal implementation detail, and the users of jbd2 >> ideally shouldn't need to be expected to know they have unregister >> jbd2's shirnkers. >> >> Similarly, perhaps we should be moving jbd2_journal_register_shirnker() >> into jbd2_journal_init_common(). We can un-export the register and >> unshrink register functions, and declare them as static functions internal >> to fs/jbd2/journal.c. >> >> What do you think? > > Like this... > > commit 8f9e16badb8fda3391e03146a47c93e76680efaf > Author: Theodore Ts'o > Date: Fri Jul 2 18:05:03 2021 -0400 > > ext4: fix doubled call to jbd2_journal_unregister_shrinker() > > On Power and ARM platforms this was causing kernel crash when > unmounting the file system, due to a percpu_counter getting destroyed > twice. > > Fix this by cleaning how the jbd2 shrinker is initialized and > uninitiazed by making it solely the responsibility of > fs/jbd2/journal.c. > > Fixes: 4ba3fcdde7e3 ("jbd2,ext4: add a shrinker to release checkpointed buffers") > Reported-by: Sachin Sant > Signed-off-by: Theodore Ts'o > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index b8ff0399e171..dfa09a277b56 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -1184,7 +1184,6 @@ static void ext4_put_super(struct super_block *sb) > ext4_unregister_sysfs(sb); > > if (sbi->s_journal) { > - jbd2_journal_unregister_shrinker(sbi->s_journal); > aborted = is_journal_aborted(sbi->s_journal); > err = jbd2_journal_destroy(sbi->s_journal); > sbi->s_journal = NULL; > @@ -5176,7 +5175,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) > sbi->s_ea_block_cache = NULL; > > if (sbi->s_journal) { > - jbd2_journal_unregister_shrinker(sbi->s_journal); > jbd2_journal_destroy(sbi->s_journal); > sbi->s_journal = NULL; > } > @@ -5502,12 +5500,6 @@ static int ext4_load_journal(struct super_block *sb, > ext4_commit_super(sb); > } > > - err = jbd2_journal_register_shrinker(journal); > - if (err) { > - EXT4_SB(sb)->s_journal = NULL; > - goto err_out; > - } > - > return 0; > > err_out: > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c > index 152880c298ca..2595703aca51 100644 > --- a/fs/jbd2/journal.c > +++ b/fs/jbd2/journal.c > @@ -99,6 +99,8 @@ EXPORT_SYMBOL(jbd2_journal_begin_ordered_truncate); > EXPORT_SYMBOL(jbd2_inode_cache); > > static int jbd2_journal_create_slab(size_t slab_size); > +static int jbd2_journal_register_shrinker(journal_t *journal); > +static void jbd2_journal_unregister_shrinker(journal_t *journal); > > #ifdef CONFIG_JBD2_DEBUG > void __jbd2_debug(int level, const char *file, const char *func, > @@ -2043,7 +2045,8 @@ int jbd2_journal_load(journal_t *journal) > goto recovery_error; > > journal->j_flags |= JBD2_LOADED; > - return 0; > + > + return jbd2_journal_register_shrinker(journal); > I check the ocfs2 code, if we register shrinker here, __ocfs2_recovery_thread()-> ocfs2_recover_node() seems will register and unregister a lot of unnecessary shrinkers. It depends on the lifetime of the shrinker and the journal, because of the jbd2_journal_destroy() destroy everything, it not a simple undo of jbd2_load_journal(), so it's not easy to add shrinker properly. But it doesn't seems like a real problem, just curious. Thanks, Yi. > recovery_error: > printk(KERN_WARNING "JBD2: recovery failed\n"); > @@ -2099,7 +2102,7 @@ static unsigned long jbd2_journal_shrink_count(struct shrinker *shrink, > * Init a percpu counter to record the checkpointed buffers on the checkpoint > * list and register a shrinker to release their journal_head. > */ > -int jbd2_journal_register_shrinker(journal_t *journal) > +static int jbd2_journal_register_shrinker(journal_t *journal) > { > int err; > > @@ -2122,7 +2125,6 @@ int jbd2_journal_register_shrinker(journal_t *journal) > > return 0; > } > -EXPORT_SYMBOL(jbd2_journal_register_shrinker); > > /** > * jbd2_journal_unregister_shrinker() > @@ -2130,12 +2132,13 @@ EXPORT_SYMBOL(jbd2_journal_register_shrinker); > * > * Unregister the checkpointed buffer shrinker and destroy the percpu counter. > */ > -void jbd2_journal_unregister_shrinker(journal_t *journal) > +static void jbd2_journal_unregister_shrinker(journal_t *journal) > { > - percpu_counter_destroy(&journal->j_jh_shrink_count); > - unregister_shrinker(&journal->j_shrinker); > + if (journal->j_shrinker.flags & SHRINKER_REGISTERED) { > + percpu_counter_destroy(&journal->j_jh_shrink_count); > + unregister_shrinker(&journal->j_shrinker); > + } > } > -EXPORT_SYMBOL(jbd2_journal_unregister_shrinker); > > /** > * jbd2_journal_destroy() - Release a journal_t structure. > diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h > index 6cc035321562..632afbe4b18f 100644 > --- a/include/linux/jbd2.h > +++ b/include/linux/jbd2.h > @@ -1556,8 +1556,6 @@ extern int jbd2_journal_set_features > (journal_t *, unsigned long, unsigned long, unsigned long); > extern void jbd2_journal_clear_features > (journal_t *, unsigned long, unsigned long, unsigned long); > -extern int jbd2_journal_register_shrinker(journal_t *journal); > -extern void jbd2_journal_unregister_shrinker(journal_t *journal); > extern int jbd2_journal_load (journal_t *journal); > extern int jbd2_journal_destroy (journal_t *); > extern int jbd2_journal_recover (journal_t *journal); > . >