Subject: Re: [PATCH] cgroup/cpuset: update parent subparts cpumask while holding css refcnt
From: Miaohe Lin
To: Waiman Long, Michal Koutný
Date: Tue, 11 Jul 2023 10:52:02 +0800
Message-ID: <30b1f809-a11b-efe8-289c-04a801f20207@huawei.com>
In-Reply-To: <74f1906e-fe58-c745-a851-b160374f7acf@redhat.com>
References: <20230701065049.1758266-1-linmiaohe@huawei.com> <74f1906e-fe58-c745-a851-b160374f7acf@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2023/7/10 23:40, Waiman Long wrote:
> On 7/10/23 11:11, Michal Koutný wrote:
>> Hello.
>>
>> On Sat, Jul 01, 2023 at 02:50:49PM +0800, Miaohe Lin wrote:
>>> --- a/kernel/cgroup/cpuset.c
>>> +++ b/kernel/cgroup/cpuset.c
>>> @@ -1806,9 +1806,12 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
>>>           cpuset_for_each_child(cp, css, parent)
>>>               if (is_partition_valid(cp) &&
>>>                   cpumask_intersects(trialcs->cpus_allowed, cp->cpus_allowed)) {
>>> +                if (!css_tryget_online(&cp->css))
>>> +                    continue;
>>>                   rcu_read_unlock();
>>>                   update_parent_subparts_cpumask(cp, partcmd_invalidate, NULL, &tmp);
>>>                   rcu_read_lock();
>>> +                css_put(&cp->css);
>> Apologies for a possibly noob question -- why is RCU read lock
>> temporarily dropped within the loop?
>> (Is it only because of callback_lock or cgroup_file_kn_lock (via
>> notify_partition_change()) on PREEMPT_RT?)
>>
>> [
>> OT question:
>>     cpuset_for_each_child(cp, css, parent)                (1)
>>         if (is_partition_valid(cp) &&
>>             cpumask_intersects(trialcs->cpus_allowed, cp->cpus_allowed)) {
>>             if (!css_tryget_online(&cp->css))
>>                 continue;
>>             rcu_read_unlock();
>>             update_parent_subparts_cpumask(cp, partcmd_invalidate, NULL, &tmp);
>>               ...
>>               update_tasks_cpumask(cp->parent)
>>                 ...
>>                 css_task_iter_start(&cp->parent->css, 0, &it);    (2)
>>                   ...
>>             rcu_read_lock();
>>             css_put(&cp->css);
>>         }
>>
>> May this touch each task same number of times as its depth within
>> hierarchy?
>
> I believe the primary reason is because update_parent_subparts_cpumask()
> can potentially run for quite a while. So we don't want to hold the
> rcu_read_lock for too long. There may also be a potential that schedule()
> may be called.

IMHO, the reason should be the same as for the commit below:

commit 2bdfd2825c9662463371e6691b1a794e97fa36b4
Author: Waiman Long
Date:   Wed Feb 2 22:31:03 2022 -0500

    cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning

    It was found that a "suspicious RCU usage" lockdep warning was issued
    with the rcu_read_lock() call in update_sibling_cpumasks(). It is
    because the update_cpumasks_hier() function may sleep. So we have
    to release the RCU lock, call update_cpumasks_hier() and reacquire
    it afterward.

    Also add a percpu_rwsem_assert_held() in update_sibling_cpumasks()
    instead of stating that in the comment.

Thanks both.
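
For anyone reading along, the pin / unlock / slow call / relock / unpin ordering discussed above can be sketched in plain userspace C. This is only an illustration under loud assumptions: struct node, node_tryget_online(), node_put(), read_lock(), read_unlock(), slow_update() and update_children() are hypothetical stand-ins invented here, not kernel APIs. The "lock" is a single-threaded nesting counter rather than real RCU, and the refcount is a plain int rather than a percpu_ref.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a css: a refcount plus an online flag. */
struct node {
	int refcnt;	/* references currently held (plain int: single-threaded sketch) */
	bool online;	/* false once teardown has started */
	int visits;	/* number of times slow_update() ran on this node */
};

/* Stand-in for rcu_read_lock()/rcu_read_unlock(): only tracks nesting depth. */
static int read_lock_depth;

static void read_lock(void)
{
	read_lock_depth++;
}

static void read_unlock(void)
{
	assert(read_lock_depth > 0);
	read_lock_depth--;
}

/* Analogue of css_tryget_online(): pin the node only if it is still online. */
static bool node_tryget_online(struct node *n)
{
	if (!n->online || n->refcnt <= 0)
		return false;
	n->refcnt++;
	return true;
}

/* Analogue of css_put(): drop a reference taken by node_tryget_online(). */
static void node_put(struct node *n)
{
	assert(n->refcnt > 0);
	n->refcnt--;
}

/* Stand-in for a callee that may sleep, so it must run outside the read-side section. */
static void slow_update(struct node *n)
{
	assert(read_lock_depth == 0);	/* caller dropped the read lock */
	assert(n->refcnt >= 2);		/* base reference plus the caller's pin */
	n->visits++;
}

/*
 * Walk the children under the read lock. For each online child: pin it,
 * drop the lock, do the slow work, retake the lock, then unpin.
 * Returns the number of children that were updated.
 */
int update_children(struct node *children, size_t count)
{
	int updated = 0;

	read_lock();
	for (size_t i = 0; i < count; i++) {
		struct node *n = &children[i];

		if (!node_tryget_online(n))
			continue;	/* going away: skip without unlocking */
		read_unlock();
		slow_update(n);
		read_lock();
		node_put(n);
		updated++;
	}
	read_unlock();
	return updated;
}
```

The point of the ordering is that the pin is taken while still inside the read-side section and released only after the lock has been retaken, so the node being iterated cannot be freed during the window in which the lock is dropped. Calling update_children() on an array that contains an offline node skips that node and leaves its refcount untouched.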