Message-ID: <93c36097-5266-4fc5-84a8-d770ab344361@bytedance.com>
Date: Wed, 6 Dec 2023 15:55:28 +0800
Subject: Re: [PATCH v6 42/45] mm: shrinker: make global slab shrink lockless
From: Qi Zheng
To: Lai Jiangshan
Cc: akpm@linux-foundation.org, paulmck@kernel.org, david@fromorbit.com,
    tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org,
    brauner@kernel.org, tytso@mit.edu, steven.price@arm.com, cel@kernel.org,
    senozhatsky@chromium.org, yujie.liu@intel.com, gregkh@linuxfoundation.org,
    muchun.song@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org
References: <20230911094444.68966-1-zhengqi.arch@bytedance.com>
    <20230911094444.68966-43-zhengqi.arch@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 2023/12/6 15:47, Lai Jiangshan wrote:
> On Tue, Sep 12, 2023 at 9:57 PM Qi Zheng wrote:
>
>> -       if (!down_read_trylock(&shrinker_rwsem))
>> -               goto out;
>> -
>> -       list_for_each_entry(shrinker, &shrinker_list, list) {
>> +       /*
>> +        * lockless algorithm of global shrink.
>> +        *
>> +        * In the unregistration setp, the shrinker will be freed asynchronously
>> +        * via RCU after its refcount reaches 0. So both rcu_read_lock() and
>> +        * shrinker_try_get() can be used to ensure the existence of the shrinker.
>> +        *
>> +        * So in the global shrink:
>> +        *  step 1: use rcu_read_lock() to guarantee existence of the shrinker
>> +        *          and the validity of the shrinker_list walk.
>> +        *  step 2: use shrinker_try_get() to try get the refcount, if successful,
>> +        *          then the existence of the shrinker can also be guaranteed,
>> +        *          so we can release the RCU lock to do do_shrink_slab() that
>> +        *          may sleep.
>> +        *  step 3: *MUST* to reacquire the RCU lock before calling shrinker_put(),
>> +        *          which ensures that neither this shrinker nor the next shrinker
>> +        *          will be freed in the next traversal operation.
>
> Hello, Qi, Andrew, Paul,
>
> I wonder know how RCU can ensure the lifespan of the next shrinker.
> it seems it is diverged from the common pattern usage of RCU+reference.
>
> cpu1:
> rcu_read_lock();
> shrinker_try_get(this_shrinker);
> rcu_read_unlock();
>     cpu2: shrinker_free(this_shrinker);
>     cpu2: shrinker_free(next_shrinker); and free the memory of next_shrinker
>     cpu2: when shrinker_free(next_shrinker), no one updates this_shrinker's next
>     cpu2: since this_shrinker has been removed first.

No, this_shrinker will not be removed from the shrinker_list until the
last refcount is released. See below:

> rcu_read_lock();
> shrinker_put(this_shrinker);

        CPU 1                                      CPU 2

   --> if (refcount_dec_and_test(&shrinker->refcount))
                complete(&shrinker->done);

                                        wait_for_completion(&shrinker->done);
                                        list_del_rcu(&shrinker->list);

> travel to the freed next_shrinker.
>
> a quick simple fix:
>
> // called with other references other than RCU (i.e. refcount)
> static inline rcu_list_deleted(struct list_head *entry)
> {
>     // something like this:
>     return entry->prev == LIST_POISON2;
> }
>
> // in the loop
> if (rcu_list_deleted(&shrinker->list)) {
>     shrinker_put(shrinker);
>     goto restart;
> }
> rcu_read_lock();
> shrinker_put(shrinker);
>
> Thanks
> Lai
>
>> +        *  step 4: do shrinker_put() paired with step 2 to put the refcount,
>> +        *          if the refcount reaches 0, then wake up the waiter in
>> +        *          shrinker_free() by calling complete().
>> +        */
>> +       rcu_read_lock();
>> +       list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>>                 struct shrink_control sc = {
>>                         .gfp_mask = gfp_mask,
>>                         .nid = nid,
>>                         .memcg = memcg,
>>                 };
>>
>> +               if (!shrinker_try_get(shrinker))
>> +                       continue;
>> +
>> +               rcu_read_unlock();
>> +
>>                 ret = do_shrink_slab(&sc, shrinker, priority);
>>                 if (ret == SHRINK_EMPTY)
>>                         ret = 0;
>>                 freed += ret;
>> -               /*
>> -                * Bail out if someone want to register a new shrinker to
>> -                * prevent the registration from being stalled for long periods
>> -                * by parallel ongoing shrinking.
>> -                */
>> -               if (rwsem_is_contended(&shrinker_rwsem)) {
>> -                       freed = freed ? : 1;
>> -                       break;
>> -               }
>> +
>> +               rcu_read_lock();
>> +               shrinker_put(shrinker);
>>         }
>>
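
To make the ordering in the CPU 1 / CPU 2 diagram above easier to follow,
here is a rough single-threaded user-space model of it. This is only a
sketch under assumptions: struct model_shrinker and all model_*() names are
made up for illustration and are not kernel APIs, the blocking
wait_for_completion() is modelled as a check that simply fails while a
reference is still held, and RCU grace periods and memory ordering are
ignored entirely. It only shows that an entry pinned by the walker cannot
be unlinked, so its next pointer is still updated when the following entry
is removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct shrinker; not the kernel type. */
struct model_shrinker {
	int refcount;                /* starts at 1: the registration reference */
	bool done;                   /* models complete(&shrinker->done) */
	bool on_list;                /* cleared when the modelled list_del_rcu() runs */
	struct model_shrinker *next; /* models shrinker->list.next */
};

/* Models shrinker_try_get(): fails once the refcount has reached 0. */
static bool model_try_get(struct model_shrinker *s)
{
	if (s->refcount == 0)
		return false;
	s->refcount++;
	return true;
}

/* Models shrinker_put(): the final put signals the waiter. */
static void model_put(struct model_shrinker *s)
{
	if (--s->refcount == 0)
		s->done = true;
}

/* First half of the modelled shrinker_free(): drop the registration ref. */
static void model_free_begin(struct model_shrinker *s)
{
	model_put(s);
}

/*
 * Second half of the modelled shrinker_free(). The real code sleeps in
 * wait_for_completion(); here we only report whether it could proceed.
 * Returns true if the entry was unlinked from its predecessor.
 */
static bool model_free_finish(struct model_shrinker *prev,
			      struct model_shrinker *s)
{
	if (!s->done)
		return false;    /* still waiting for the last reference */
	prev->next = s->next;    /* like list_del_rcu(): s->next is left intact */
	s->on_list = false;
	return true;
}

/* Walks head -> a -> b -> c while a and b are being unregistered. */
bool model_walk_vs_free(void)
{
	struct model_shrinker c = { .refcount = 1, .on_list = true, .next = NULL };
	struct model_shrinker b = { .refcount = 1, .on_list = true, .next = &c };
	struct model_shrinker a = { .refcount = 1, .on_list = true, .next = &b };
	struct model_shrinker head = { .refcount = 1, .on_list = true, .next = &a };

	/* Walker, step 2: pin a, then leave the RCU read-side section. */
	if (!model_try_get(&a))
		return false;

	/* Unregister a: cannot unlink while the walker holds its reference. */
	model_free_begin(&a);
	bool a_unlinked_early = model_free_finish(&head, &a);

	/* Unregister b: nobody holds it, so it is unlinked and a.next is fixed up. */
	model_free_begin(&b);
	bool b_unlinked = model_free_finish(&a, &b);
	assert(!b.on_list);

	/* Walker, steps 3 and 4: put a, then advance to the next entry. */
	model_put(&a);
	struct model_shrinker *next = a.next;

	/* Only now can the unregistration of a complete. */
	bool a_unlinked_late = model_free_finish(&head, &a);

	return !a_unlinked_early && b_unlinked && next == &c && a_unlinked_late;
}
```

Calling model_walk_vs_free() from a small main() should return true: the
walker moves from a to c, never to the already removed b. In the real code
the put and the step to the next entry both happen inside the RCU read-side
section reacquired in step 3, which the model does not represent.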