Date: Tue, 6 Jun 2023 11:24:32 +1000
From: Dave Chinner
To: Roman Gushchin
Cc: Kirill Tkhai, akpm@linux-foundation.org, vbabka@suse.cz,
    viro@zeniv.linux.org.uk, brauner@kernel.org, djwong@kernel.org,
    hughd@google.com, paulmck@kernel.org, muchun.song@linux.dev,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
    zhengqi.arch@bytedance.com
Subject: Re: [PATCH v2 3/3] fs: Use delayed shrinker unregistration
References: <168599103578.70911.9402374667983518835.stgit@pro.pro>
    <168599180526.70911.14606767590861123431.stgit@pro.pro>

On Mon, Jun 05, 2023 at 05:38:27PM -0700, Roman Gushchin wrote:
> On Mon, Jun 05, 2023 at 10:03:25PM +0300, Kirill Tkhai wrote:
> > Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
> > test case caused by commit: f95bdb700bc6 ("mm: vmscan: make global slab
> > shrink lockless"). Qi Zheng investigated that the reason is in long SRCU's
> > synchronize_srcu() occuring in unregister_shrinker().
> >
> > This patch fixes the problem by using new unregistration interfaces,
> > which split unregister_shrinker() in two parts.
> > First part actually only
> > notifies shrinker subsystem about the fact of unregistration and it prevents
> > future shrinker methods calls. The second part completes the unregistration
> > and it insures, that struct shrinker is not used during shrinker chain
> > iteration anymore, so shrinker memory may be freed. Since the long second
> > part is called from delayed work asynchronously, it hides synchronize_srcu()
> > delay from a user.
> >
> > Signed-off-by: Kirill Tkhai
> > ---
> >  fs/super.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/super.c b/fs/super.c
> > index 8d8d68799b34..f3e4f205ec79 100644
> > --- a/fs/super.c
> > +++ b/fs/super.c
> > @@ -159,6 +159,7 @@ static void destroy_super_work(struct work_struct *work)
> >  							destroy_work);
> >  	int i;
> >
> > +	unregister_shrinker_delayed_finalize(&s->s_shrink);
> >  	for (i = 0; i < SB_FREEZE_LEVELS; i++)
> >  		percpu_free_rwsem(&s->s_writers.rw_sem[i]);
> >  	kfree(s);
> > @@ -327,7 +328,7 @@ void deactivate_locked_super(struct super_block *s)
> >  {
> >  	struct file_system_type *fs = s->s_type;
> >  	if (atomic_dec_and_test(&s->s_active)) {
> > -		unregister_shrinker(&s->s_shrink);
> > +		unregister_shrinker_delayed_initiate(&s->s_shrink);
>
> Hm, it makes the API more complex and easier to mess with. Like what will happen
> if the second part is never called? Or it's called without the first part being
> called first?

Bad things.

Also, it doesn't fix the three other unregister_shrinker() calls in
the XFS unmount path, nor the three in the ext4/mbcache/jbd2 unmount
path.

Those are just some of the unregister_shrinker() calls that have
dynamic contexts that would also need this same fix; I haven't
audited the 3 dozen other unregister_shrinker() calls around the
kernel to determine if any of them need similar treatment, too.

IOWs, this patchset is purely a band-aid to fix the reported
regression, not an actual fix for the underlying problems caused by
moving the shrinker infrastructure to SRCU protection. This is why
I really want the SRCU changeover reverted.

Not only does it make significant changes to the API necessary, it
has also put the entire shrinker path under an SRCU critical section.
AIUI, this means that while the shrinkers are running the RCU grace
period cannot expire and no RCU freed memory will actually get freed
until the srcu read lock is dropped by the shrinker.

Given the superblock shrinkers are freeing dentry and inode objects
by RCU freeing, this is also a fairly significant change of
behaviour. i.e. cond_resched() in the shrinker processing loops no
longer allows RCU grace periods to expire and have memory freed
while the shrinkers are running.

Are there problems this will cause? I don't know, but I'm pretty
sure they haven't even been considered until now....

> Isn't it possible to hide it from a user and call the second part from a work
> context automatically?

Nope, because it has to be done before the struct shrinker is freed.
Those are embedded into other structures rather than being
dynamically allocated objects. Hence the synchronize_srcu() has to
complete before the structure the shrinker is embedded in is freed.

Now, this can be dealt with by having register_shrinker() return an
allocated struct shrinker and the callers only keep a pointer, but
that's an even bigger API change. But, IMO, it is an API change that
should have been done before SRCU was introduced precisely because
it allows decoupling of shrinker execution and completion from
the owning structure.

Then we can stop shrinker execution, wait for it to complete and
prevent future execution in unregister_shrinker(), then punt the
expensive shrinker list removal to background work where processing
delays just don't matter for dead shrinker instances. It doesn't
need SRCU at all...

-Dave.
-- 
Dave Chinner
david@fromorbit.com
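
The decoupling described in the last two paragraphs of the reply can be modelled in a few lines of userspace C. This is only a rough, single-threaded sketch under stated assumptions. It is not kernel code and not the author's implementation. Every name in it (model_shrinker, model_register, model_unregister, model_reap_dead, model_shrink_all, model_count_scan) is hypothetical. The "wait for in-flight callbacks" step is reduced to an assertion, because nothing runs concurrently in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_shrinker {
	long (*scan)(void *owner_data);	/* reclaim callback */
	void *owner_data;		/* borrowed pointer into the owner */
	bool dead;			/* set at unregister; blocks new calls */
	int in_flight;			/* callbacks currently executing */
	struct model_shrinker *next;	/* registry list linkage */
};

/* Registry of all shrinkers, including dead ones not yet reaped. */
static struct model_shrinker *registry;

/* Allocate the shrinker separately; the owner keeps only the pointer. */
struct model_shrinker *model_register(long (*scan)(void *), void *owner_data)
{
	struct model_shrinker *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->scan = scan;
	s->owner_data = owner_data;
	s->next = registry;
	registry = s;
	return s;
}

/* Run every live shrinker once and return the total reported as freed. */
long model_shrink_all(void)
{
	long freed = 0;

	for (struct model_shrinker *s = registry; s; s = s->next) {
		if (s->dead)
			continue;	/* unregistered: never call it again */
		s->in_flight++;
		freed += s->scan(s->owner_data);
		s->in_flight--;
	}
	return freed;
}

/*
 * Cheap, synchronous part: stop future execution and drop the link to
 * the owner. A concurrent implementation would have to wait here for
 * in_flight to reach zero; this single-threaded model only asserts it.
 */
void model_unregister(struct model_shrinker *s)
{
	s->dead = true;
	assert(s->in_flight == 0);
	s->owner_data = NULL;	/* owner may now be freed independently */
}

/* Deferred part: unlink and free dead entries; returns how many. */
int model_reap_dead(void)
{
	int reaped = 0;
	struct model_shrinker **pp = &registry;

	while (*pp) {
		struct model_shrinker *s = *pp;

		if (s->dead) {
			*pp = s->next;
			free(s);
			reaped++;
		} else {
			pp = &s->next;
		}
	}
	return reaped;
}

/* Example callback: counts invocations and claims one object freed. */
long model_count_scan(void *owner_data)
{
	++*(int *)owner_data;
	return 1;
}
```

In this model the owner can be torn down as soon as model_unregister() returns, because the registry entry no longer points at it. The list removal and free in model_reap_dead() can then run whenever background work gets to it.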