From: Waiman Long
Date: Tue, 2 May 2023 17:26:17 -0400
Subject: Re: [RFC PATCH 0/5] cgroup/cpuset: A new "isolcpus" paritition
To: Michal Koutný
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
    Juri Lelli, Valentin Schneider, Frederic Weisbecker

On 5/2/23 14:01, Michal Koutný wrote:
> Hello.
>
> The previous thread arrived incomplete to me, so I respond to the last
> message only. Point me to a message URL if it was covered.
>
> On Fri, Apr 14, 2023 at 03:06:27PM -0400, Waiman Long wrote:
>> Below is a draft of the new cpuset.cpus.reserve cgroupfs file:
>>
>>   cpuset.cpus.reserve
>>         A read-write multiple values file which exists on all
>>         cpuset-enabled cgroups.
>>
>>         It lists the reserved CPUs to be used for the creation of
>>         child partitions.  See the section on "cpuset.cpus.partition"
>>         below for more information on cpuset partition.  These reserved
>>         CPUs should be a subset of "cpuset.cpus" and will be mutually
>>         exclusive of "cpuset.cpus.effective" when used since these
>>         reserved CPUs cannot be used by tasks in the current cgroup.
>>
>>         There are two modes for partition CPUs reservation -
>>         auto or manual.  The system starts up in auto mode where
>>         "cpuset.cpus.reserve" will be set automatically when valid
>>         child partitions are created and users don't need to touch the
>>         file at all.  This mode has the limitation that the parent of a
>>         partition must be a partition root itself.  So child partition
>>         has to be created one-by-one from the cgroup root down.
>>
>>         To enable the creation of a partition down in the hierarchy
>>         without the intermediate cgroups to be partition roots,
> Why would be this needed? Owning a CPU (a resource) must logically be
> passed all the way from root to the target cgroup, i.e. this is
> expressed by valid partitioning down to given level.
>
>> one
>>         has to turn on the manual reservation mode by writing directly
>>         to "cpuset.cpus.reserve" with a value different from its
>>         current value.
>>         By distributing the reserve CPUs down the cgroup
>>         hierarchy to the parent of the target cgroup, this target cgroup
>>         can be switched to become a partition root if its "cpuset.cpus"
>>         is a subset of the set of valid reserve CPUs in its parent.
> level n
> `- level n+1
>    cpuset.cpus           // these are actually configured by "owner" of level n
>    cpuset.cpus.partition // similarly here, level n decides if child is a partition
>
> I.e. what would be level n/cpuset.cpus.reserve good for when it can
> directly control level n+1/cpuset.cpus?

In the new scheme, the available CPUs are still passed directly down to a
descendant cgroup. However, isolated CPUs (or, more generally, CPUs
dedicated to a partition) have to be exclusive. What cpuset.cpus.reserve
does is identify those exclusive CPUs so that they can be excluded from
the effective_cpus of the parent cgroups before they are claimed by a
child partition. Currently this is done automatically when a child
partition is created off a parent partition root. The new scheme breaks
it into two separate steps (reserving the CPUs, then claiming them for a
partition) without the requirement that the parent of a partition has to
be a partition root itself.

Cheers,
Longman
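
For illustration only, the two-step scheme described above can be sketched as plain set arithmetic. This is a toy model, not kernel code: the Cgroup class, its method names, and the rule that a non-root cgroup's reserve must come out of its parent's reserve are assumptions made for the example, and sibling exclusivity checks are left out.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Cgroup:
    """Toy stand-in for a cpuset-enabled cgroup (hypothetical, not a kernel type)."""
    name: str
    cpus: Set[int]                                   # models "cpuset.cpus"
    parent: Optional["Cgroup"] = None
    reserve: Set[int] = field(default_factory=set)   # models "cpuset.cpus.reserve"
    is_partition_root: bool = False

    def set_reserve(self, cpus: Set[int]) -> None:
        """Step 1: set CPUs aside for future child partitions."""
        if not cpus <= self.cpus:
            raise ValueError(f"{self.name}: reserve must be a subset of cpuset.cpus")
        # Assumption: below the root, reserved CPUs are handed down from
        # the parent's own reserve.
        if self.parent is not None and not cpus <= self.parent.reserve:
            raise ValueError(f"{self.name}: reserve must come from the parent's reserve")
        self.reserve = set(cpus)

    def make_partition_root(self) -> None:
        """Step 2: claim CPUs out of the parent's reserve."""
        if self.parent is None or not self.cpus:
            raise ValueError(f"{self.name}: needs a parent and a non-empty cpuset.cpus")
        if not self.cpus <= self.parent.reserve:
            raise ValueError(f"{self.name}: cpuset.cpus not within the parent's reserve")
        self.is_partition_root = True

    def effective(self) -> Set[int]:
        """Models "cpuset.cpus.effective": usable CPUs minus the reserved ones."""
        if self.parent is None or self.is_partition_root:
            base = set(self.cpus)
        else:
            base = self.cpus & self.parent.effective()
        return base - self.reserve


# root -> mid -> leaf; "mid" never becomes a partition root itself.
root = Cgroup("root", set(range(8)))
mid = Cgroup("mid", set(range(8)), parent=root)
leaf = Cgroup("leaf", {6, 7}, parent=mid)

root.set_reserve({6, 7})    # step 1: hand the reserve down the hierarchy
mid.set_reserve({6, 7})
leaf.make_partition_root()  # step 2: leaf claims CPUs 6-7

print(sorted(root.effective()))  # [0, 1, 2, 3, 4, 5]
print(sorted(mid.effective()))   # [0, 1, 2, 3, 4, 5]
print(sorted(leaf.effective()))  # [6, 7]
```

In this model CPUs 6-7 drop out of the effective set of every ancestor as soon as they are reserved, and the leaf can claim them even though the intermediate cgroup is not a partition root.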