Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp6510314rwd; Mon, 5 Jun 2023 20:20:17 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4wNzs/5cZ3uClIxonuuaCAuV7kS8bHYbjQnYQ1+YbwRu44bPSOZlTg+gTy2c2CmhjfEDOs X-Received: by 2002:aca:2303:0:b0:39a:abe8:afc3 with SMTP id e3-20020aca2303000000b0039aabe8afc3mr745545oie.38.1686021616872; Mon, 05 Jun 2023 20:20:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686021616; cv=none; d=google.com; s=arc-20160816; b=On6F3PcE7b5BPjzdVhncisb7xAdFiNmr/klziNDL2RpUVVKB3FSIUrXg9aPgk9Z5vZ LyoLAX4FQAOwXdJXcUhuX1uYUDCted1VAvjBjimDgLNlM5rTUpoqKki1duT3T/alnF7e ePrc8M1+52ahshJMO/CkCUJp/WWr/cDaVUdj+PhU0+LB1l7zkVisVLwnbPW5b5YeLvq6 FYZ8F0kyuFB8S9AzSkxoe9pMvStK1ndGWCIhUiSqa07Sts+vSpNyDiqllYctC9d8UbEW vsNotxDihIrzxC5U4IoLNS0MGVlA2GkiNTS6sjPg/6e+FQLzNnWpAt/VXq5AMuknoxCQ IYfQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:dkim-signature:date; bh=hHraN3hVCqrmJmmPSWWTuJqPN0R81IdA9hVV/jS2Z0A=; b=kceuUz2C/kGkr84tDvA9zK4JTJ5nPTKVkxXSm1wlsSJ+loB/BBhVkdpazHIA6TozVM ekCS18ABdxv0NUu7txQyT0UuWLtt8xmTpelgnB/bgaCpqEgj4hF2WOokw0Fkwfg7Cj8j fLfgwj9Nvgg862qVK6y1XGASAna5LEgJlY57q2dhP2+UMPl+bTtLoHyIFbtkDqpOqFoY Nde8TViPLH8CpuuJRNiAXANmks1zgfC63LTIjtYJ5SZlDF6TxfuvxJqEuCZ6FkWNw/nJ 7aEvUmX+fdldLcs8EEFGG40Uq/M9A5LXwBYMH7epTB8Bvd1sdg9PyXwYzkvhjHSX0XQy cViw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=trn4mw3J; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t23-20020a63b257000000b00528b78ddbdesi6666582pgo.63.2023.06.05.20.19.44; Mon, 05 Jun 2023 20:20:16 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=trn4mw3J; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234348AbjFFC5M (ORCPT + 99 others); Mon, 5 Jun 2023 22:57:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33038 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229660AbjFFC5L (ORCPT ); Mon, 5 Jun 2023 22:57:11 -0400 Received: from out-1.mta0.migadu.com (out-1.mta0.migadu.com [IPv6:2001:41d0:1004:224b::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E993102 for ; Mon, 5 Jun 2023 19:57:08 -0700 (PDT) Date: Mon, 5 Jun 2023 19:56:59 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1686020227; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=hHraN3hVCqrmJmmPSWWTuJqPN0R81IdA9hVV/jS2Z0A=; b=trn4mw3JuTr7EBl5wGrwvR/Ct5O4ZvVjVNTwDhjDBfcDV9kO1vxVvmXsBt951zKGGmIelG Q0QHBL+CPaEpXVjcfoB93wadslMSoongpSClmoZYKSEUPe+luWsNhzu6+gAKVLEQtg7zSP R/jXD+cKeHia3kebX1GigHBP9lI9Sik= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Roman Gushchin To: Dave Chinner Cc: Kirill Tkhai , akpm@linux-foundation.org, vbabka@suse.cz, viro@zeniv.linux.org.uk, brauner@kernel.org, djwong@kernel.org, hughd@google.com, paulmck@kernel.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, zhengqi.arch@bytedance.com Subject: Re: [PATCH v2 3/3] fs: Use delayed shrinker unregistration Message-ID: References: <168599103578.70911.9402374667983518835.stgit@pro.pro> <168599180526.70911.14606767590861123431.stgit@pro.pro> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jun 06, 2023 at 11:24:32AM +1000, Dave Chinner wrote: > On Mon, Jun 05, 2023 at 05:38:27PM -0700, Roman Gushchin wrote: > > On Mon, Jun 05, 2023 at 10:03:25PM +0300, Kirill Tkhai wrote: > > > Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec > > > test case caused by commit: f95bdb700bc6 ("mm: vmscan: make global slab > > > shrink lockless"). Qi Zheng investigated that the reason is in long SRCU's > > > synchronize_srcu() occuring in unregister_shrinker(). > > > > > > This patch fixes the problem by using new unregistration interfaces, > > > which split unregister_shrinker() in two parts. First part actually only > > > notifies shrinker subsystem about the fact of unregistration and it prevents > > > future shrinker methods calls. The second part completes the unregistration > > > and it insures, that struct shrinker is not used during shrinker chain > > > iteration anymore, so shrinker memory may be freed. Since the long second > > > part is called from delayed work asynchronously, it hides synchronize_srcu() > > > delay from a user. > > > > > > Signed-off-by: Kirill Tkhai > > > --- > > > fs/super.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/super.c b/fs/super.c > > > index 8d8d68799b34..f3e4f205ec79 100644 > > > --- a/fs/super.c > > > +++ b/fs/super.c > > > @@ -159,6 +159,7 @@ static void destroy_super_work(struct work_struct *work) > > > destroy_work); > > > int i; > > > > > > + unregister_shrinker_delayed_finalize(&s->s_shrink); > > > for (i = 0; i < SB_FREEZE_LEVELS; i++) > > > percpu_free_rwsem(&s->s_writers.rw_sem[i]); > > > kfree(s); > > > @@ -327,7 +328,7 @@ void deactivate_locked_super(struct super_block *s) > > > { > > > struct file_system_type *fs = s->s_type; > > > if (atomic_dec_and_test(&s->s_active)) { > > > - unregister_shrinker(&s->s_shrink); > > > + unregister_shrinker_delayed_initiate(&s->s_shrink); > > > > Hm, it makes the API more complex and easier to mess with. Like what will happen > > if the second part is never called? Or it's called without the first part being > > called first? > > Bad things. Agree. > Also, it doesn't fix the three other unregister_shrinker() calls in > the XFS unmount path, nor the three in the ext4/mbcache/jbd2 unmount > path. > > Those are just some of the unregister_shrinker() calls that have > dynamic contexts that would also need this same fix; I haven't > audited the 3 dozen other unregister_shrinker() calls around the > kernel to determine if any of them need similar treatment, too. > > IOWs, this patchset is purely a band-aid to fix the reported > regression, not an actual fix for the underlying problems caused by > moving the shrinker infrastructure to SRCU protection. This is why > I really want the SRCU changeover reverted. > > Not only are the significant changes the API being necessary, it's > put the entire shrinker paths under a SRCU critical section. AIUI, > this means while the shrinkers are running the RCU grace period > cannot expire and no RCU freed memory will actually get freed until > the srcu read lock is dropped by the shrinker. > > Given the superblock shrinkers are freeing dentry and inode objects > by RCU freeing, this is also a fairly significant change of > behaviour. i.e. cond_resched() in the shrinker processing loops no > longer allows RCU grace periods to expire and have memory freed with > the shrinkers are running. > > Are there problems this will cause? I don't know, but I'm pretty > sure they haven't even been considered until now.... > > > Isn't it possible to hide it from a user and call the second part from a work > > context automatically? > > Nope, because it has to be done before the struct shrinker is freed. > Those are embedded into other structures rather than being > dynamically allocated objects. This part we might consider to revisit, if it helps to solve other problems. Having an extra memory allocation (or two) per mount-point doesn't look that expensive. Again, iff it helps with more important problems. Thanks!