Date: Tue, 8 Aug 2023 12:24:09 +1000
From: Dave Chinner
To: Qi Zheng
Cc: akpm@linux-foundation.org, tkhai@ya.ru, vbabka@suse.cz,
	roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org,
	paulmck@kernel.org, tytso@mit.edu, steven.price@arm.com,
	cel@kernel.org, senozhatsky@chromium.org, yujie.liu@intel.com,
	gregkh@linuxfoundation.org, muchun.song@linux.dev,
	simon.horman@corigine.com, dlemoal@kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	kvm@vger.kernel.org, xen-devel@lists.xenproject.org,
	linux-erofs@lists.ozlabs.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
	linux-nfs@vger.kernel.org, linux-mtd@lists.infradead.org,
	rcu@vger.kernel.org, netdev@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org,
	dm-devel@redhat.com, linux-raid@vger.kernel.org,
	linux-bcache@vger.kernel.org,
	virtualization@lists.linux-foundation.org,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 45/48] mm: shrinker: make global slab shrink lockless
References: <20230807110936.21819-1-zhengqi.arch@bytedance.com>
	<20230807110936.21819-46-zhengqi.arch@bytedance.com>
In-Reply-To: <20230807110936.21819-46-zhengqi.arch@bytedance.com>
X-Mailing-List: linux-ext4@vger.kernel.org

On Mon, Aug 07, 2023 at 07:09:33PM +0800, Qi Zheng wrote:
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index eb342994675a..f06225f18531 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -4,6 +4,8 @@
>
>  #include
>  #include
> +#include
> +#include
>
>  #define SHRINKER_UNIT_BITS BITS_PER_LONG
>
> @@ -87,6 +89,10 @@ struct shrinker {
>  	int seeks;	/* seeks to recreate an obj */
>  	unsigned flags;
>
> +	refcount_t refcount;
> +	struct completion done;
> +	struct rcu_head rcu;

Documentation, please. What does the refcount protect, what does the
completion provide, etc.

> +
>  	void *private_data;
>
>  	/* These are for internal use */
> @@ -120,6 +126,17 @@ struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...);
>  void shrinker_register(struct shrinker *shrinker);
>  void shrinker_free(struct shrinker *shrinker);
>
> +static inline bool shrinker_try_get(struct shrinker *shrinker)
> +{
> +	return refcount_inc_not_zero(&shrinker->refcount);
> +}
> +
> +static inline void shrinker_put(struct shrinker *shrinker)
> +{
> +	if (refcount_dec_and_test(&shrinker->refcount))
> +		complete(&shrinker->done);
> +}
> +
>  #ifdef CONFIG_SHRINKER_DEBUG
>  extern int __printf(2, 3) shrinker_debugfs_rename(struct shrinker *shrinker,
>  						  const char *fmt, ...);
> diff --git a/mm/shrinker.c b/mm/shrinker.c
> index 1911c06b8af5..d318f5621862 100644
> --- a/mm/shrinker.c
> +++ b/mm/shrinker.c
> @@ -2,6 +2,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>
>  #include "internal.h"
> @@ -577,33 +578,42 @@ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
>  	if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg))
>  		return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
>
> -	if (!down_read_trylock(&shrinker_rwsem))
> -		goto out;
> -
> -	list_for_each_entry(shrinker, &shrinker_list, list) {
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>  		struct shrink_control sc = {
>  			.gfp_mask = gfp_mask,
>  			.nid = nid,
>  			.memcg = memcg,
>  		};
>
> +		if (!shrinker_try_get(shrinker))
> +			continue;
> +
> +		/*
> +		 * We can safely unlock the RCU lock here since we already
> +		 * hold the refcount of the shrinker.
> +		 */
> +		rcu_read_unlock();
> +
>  		ret = do_shrink_slab(&sc, shrinker, priority);
>  		if (ret == SHRINK_EMPTY)
>  			ret = 0;
>  		freed += ret;
> +
>  		/*
> -		 * Bail out if someone want to register a new shrinker to
> -		 * prevent the registration from being stalled for long periods
> -		 * by parallel ongoing shrinking.
> +		 * This shrinker may be deleted from shrinker_list and freed
> +		 * after the shrinker_put() below, but this shrinker is still
> +		 * used for the next traversal. So it is necessary to hold the
> +		 * RCU lock first to prevent this shrinker from being freed,
> +		 * which also ensures that the next shrinker that is traversed
> +		 * will not be freed (even if it is deleted from shrinker_list
> +		 * at the same time).
>  		 */

This needs to be moved to the head of the function, and document the
whole list walk, get, put and completion parts of the algorithm that
make it safe. There's more to this than "we hold a reference count",
especially the tricky "we might see the shrinker before it is fully
initialised" case....

.....

>  void shrinker_free(struct shrinker *shrinker)
>  {
>  	struct dentry *debugfs_entry = NULL;
> @@ -686,9 +712,18 @@ void shrinker_free(struct shrinker *shrinker)
>  	if (!shrinker)
>  		return;
>
> +	if (shrinker->flags & SHRINKER_REGISTERED) {
> +		shrinker_put(shrinker);
> +		wait_for_completion(&shrinker->done);
> +	}

Needs a comment explaining why we need to wait here...

> +
>  	down_write(&shrinker_rwsem);
>  	if (shrinker->flags & SHRINKER_REGISTERED) {
> -		list_del(&shrinker->list);
> +		/*
> +		 * Lookups on the shrinker are over and will fail in the future,
> +		 * so we can now remove it from the lists and free it.
> +		 */

.... rather than here after the wait has been done and provided the
guarantee that no shrinker is running or will run again...

-Dave.
-- 
Dave Chinner
david@fromorbit.com
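
Illustrative aside on the handshake reviewed above: a minimal userspace
sketch in C of the try-get / put / wait-for-completion lifetime pattern,
assuming C11 atomics and pthreads. It is not the kernel implementation.
`struct obj`, `obj_try_get()`, `obj_put()`, `demo()` and the pthread-based
`struct completion` are hypothetical stand-ins for `struct shrinker`,
`shrinker_try_get()`, `shrinker_put()` and the kernel completion API, and
the RCU-protected list walk is not modelled at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical object; the count stays 0 until it is "registered". */
struct obj {
	atomic_int refcount;
	struct completion done;
};

/* Analogue of refcount_inc_not_zero(): never resurrects a zero count. */
static bool obj_try_get(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Whoever drops the last reference signals the waiter. */
static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) == 1)
		complete(&o->done);
}

static void *walker(void *arg)
{
	/* A callback would run here, holding only the reference. */
	obj_put(arg);
	return NULL;
}

/* Walks through the lifetime once and returns the final refcount. */
int demo(void)
{
	struct obj o;
	pthread_t t;

	atomic_init(&o.refcount, 0);
	init_completion(&o.done);
	assert(!obj_try_get(&o));	/* visible but not registered: get fails */

	atomic_store(&o.refcount, 1);	/* "register": initial reference */
	assert(obj_try_get(&o));	/* walker pins the object */
	assert(pthread_create(&t, NULL, walker, &o) == 0);

	obj_put(&o);			/* teardown drops the initial reference */
	wait_for_completion(&o.done);	/* returns once the last put has run */
	pthread_join(t, NULL);

	assert(!obj_try_get(&o));	/* count is zero: later lookups fail */
	return atomic_load(&o.refcount);
}
```

Calling `demo()` from a `main()` built with `-pthread` should return 0.
The point of the sketch is the ordering: the teardown path drops the
initial reference and then waits, so whichever side drops the last
reference signals the completion, and only after that would it be safe
to unlink and free the object.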