Message-ID: <67eeb47c-ae23-1389-bb52-f9cfb3206741@arm.com>
Date: Thu, 30 Mar 2023 15:34:02 +0200
Subject: Re: [PATCH 6/7] cgroup/cpuset: Protect DL BW data against parallel cpuset_attach()
To: Waiman Long, Juri Lelli, Peter Zijlstra, Ingo Molnar, Qais Yousef,
 Tejun Heo, Zefan Li, Johannes Weiner, Hao Luo
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Steven Rostedt,
 luca.abeni@santannapisa.it, claudio@evidence.eu.com,
 tommaso.cucinotta@santannapisa.it, bristot@redhat.com,
 mathieu.poirier@linaro.org, Vincent Guittot, Wei Wang, Rick Yiu,
 Quentin Perret, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
 Sudeep Holla
References: <20230329125558.255239-1-juri.lelli@redhat.com> <20230329160240.2093277-1-longman@redhat.com>
From: Dietmar Eggemann
In-Reply-To: <20230329160240.2093277-1-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 29/03/2023 18:02, Waiman Long wrote:
> It is possible to have parallel attach operations to the same cpuset in
> progress. To avoid possible corruption of single set of DL BW data in
> the cpuset structure, we have to disallow parallel attach operations if
> DL tasks are present. Attach operations can still proceed in parallel
> as long as no DL tasks are involved.
>
> This patch also stores the CPU where DL BW is allocated and free that BW
> back to the same CPU in case cpuset_can_attach() is called.
>
> Signed-off-by: Waiman Long
> ---
>  kernel/cgroup/cpuset.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 05c0a1255218..555a6b1a2b76 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -199,6 +199,7 @@ struct cpuset {
>  	 */
>  	int nr_deadline_tasks;
>  	int nr_migrate_dl_tasks;
> +	int dl_bw_cpu;

Like I mentioned in

https://lkml.kernel.org/r/cdede77a-5dc5-8933-a444-a2046b074b12@arm.com

IMHO any CPU of the cpuset is fine since exclusive cpuset and related
root_domain (as the container for DL BW accounting data) are congruent
in terms of cpumask.

>  	u64 sum_migrate_dl_bw;
>
>  	/* Invalid partition error code, not lock protected */
> @@ -2502,6 +2503,16 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>  	if (cpumask_empty(cs->effective_cpus))
>  		goto out_unlock;
>
> +	/*
> +	 * If there is another parallel attach operations in progress for
> +	 * the same cpuset, the single set of DL data there may get
> +	 * incorrectly overwritten. So parallel operations are not allowed
> +	 * if DL tasks are present.
> +	 */
> +	ret = -EBUSY;
> +	if (cs->nr_migrate_dl_tasks)
> +		goto out_unlock;

(1)

>  	cgroup_taskset_for_each(task, css, tset) {
>  		ret = task_can_attach(task);
>  		if (ret)
> @@ -2511,6 +2522,9 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
>  			goto out_unlock;
>
>  		if (dl_task(task)) {
> +			if (cs->attach_in_progress)
> +				goto out_unlock;

(2)

Just to check if I get this right, 2 bail-out conditions are necessary
because:

(1) is to prevent any new cs attach if there is already a DL cs attach

and

(2) is to prevent a new DL cs attach if there is already a non-DL cs
    attach.

>  			cs->nr_migrate_dl_tasks++;
>  			cs->sum_migrate_dl_bw += task->dl.dl_bw;
>  		}

[...]
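
To make the reading of (1) and (2) concrete, here is a minimal userspace
sketch of the two bail-out conditions. It is not kernel code: struct
cpuset_model, model_can_attach() and model_attach_done() are hypothetical
names, and it assumes that attach_in_progress is incremented at the end of
a successful can_attach and that the DL counter is cleared once the attach
finishes, since that part of the patch is elided above. Locking is left out
entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the two struct cpuset fields the checks look at. */
struct cpuset_model {
	int attach_in_progress;		/* attaches admitted but not yet finished */
	int nr_migrate_dl_tasks;	/* DL tasks counted by a pending attach */
};

/*
 * Model of the admission checks. is_dl[i] says whether task i of the
 * set being attached is a DL task. Unlike the patch, DL tasks are counted
 * locally and committed only on success, so no rollback path is needed.
 */
static int model_can_attach(struct cpuset_model *cs, const bool *is_dl, size_t n)
{
	int nr_dl = 0;

	/* (1): a DL attach is already pending, reject any new attach. */
	if (cs->nr_migrate_dl_tasks)
		return -EBUSY;

	for (size_t i = 0; i < n; i++) {
		if (!is_dl[i])
			continue;
		/* (2): some attach is already pending, reject a new DL attach. */
		if (cs->attach_in_progress)
			return -EBUSY;
		nr_dl++;
	}

	cs->nr_migrate_dl_tasks = nr_dl;
	cs->attach_in_progress++;	/* assumption: done on success */
	return 0;
}

/* Model of attach (or cancel) completing for one admitted operation. */
static void model_attach_done(struct cpuset_model *cs)
{
	assert(cs->attach_in_progress > 0);
	cs->attach_in_progress--;
	/* A DL attach is exclusive by (1) and (2), so clearing is safe here. */
	cs->nr_migrate_dl_tasks = 0;
}
```

With this model, two non-DL attaches can be admitted back to back, a DL
attach arriving while either is pending gets -EBUSY via (2), and once a DL
attach is admitted every further attach gets -EBUSY via (1) until it
completes.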