From: Waiman Long
To: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Juri Lelli, Valentin Schneider, Frederic Weisbecker, Waiman Long
Subject: [RFC PATCH 0/5] cgroup/cpuset: A new "isolcpus" partition
Date: Wed, 12 Apr 2023 11:37:53 -0400
Message-Id: <20230412153758.3088111-1-longman@redhat.com>

This patch series adds a new "isolcpus" partition type to the existing list of {member, root, isolated} types. The primary reason for adding this new "isolcpus" partition is to facilitate the distribution of isolated CPUs down the cgroup v2 hierarchy. The other non-member partition types have the limitation that their parents have to be valid partitions too, which makes it hard to create a partition a few layers down the hierarchy.

It is relatively rare to have applications that require creation of a separate scheduling domain (root).
However, it is more common to have applications that require the use of isolated CPUs (isolated), e.g. DPDK. One can use the "isolcpus" or "nohz_full" boot command line options to get that statically. Of course, the "isolated" partition is another way to achieve that dynamically.

Modern container orchestration tools like Kubernetes use the cgroup hierarchy to manage different containers. If a container needs to use isolated CPUs, it is hard to get those with the existing set of cpuset partition types. With this patch series, a new "isolcpus" partition can be created to hold a set of isolated CPUs that can be pulled into other "isolated" partitions.

The "isolcpus" partition is special in that there can be at most one instance of it in a system. It serves as a pool for isolated CPUs and cannot hold tasks or sub-cpusets underneath it. It is also not cpu-exclusive, so the isolated CPUs can be distributed down the sibling hierarchies, though those isolated CPUs will not be usable until the partition type becomes "isolated".

Once isolated CPUs are needed in a cgroup, the administrator can write a list of isolated CPUs into its "cpuset.cpus" and change its partition type to "isolated" to pull in those isolated CPUs from the "isolcpus" partition and use them in that cgroup. That will make the distribution of isolated CPUs to cgroups that need them much easier.

In the future, we may be able to extend this special "isolcpus" partition type to support other isolation attributes like those that can be specified with the "isolcpus" boot command line and related options.
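As a rough illustration of the workflow described above, here is a minimal Python sketch of the two administrative steps: setting up the pool and pulling CPUs from it. It assumes that the new type is selected by writing "isolcpus" to the existing "cpuset.cpus.partition" file, in the same way as the current partition types; that detail is an assumption of this sketch, not something stated in this cover letter. The helper names and the cgroup directories are hypothetical.

```python
from pathlib import Path


def write_cgroup_file(cgroup_dir: Path, name: str, value: str) -> None:
    """Write a single value to a cgroup control file such as cpuset.cpus."""
    (cgroup_dir / name).write_text(value + "\n")


def create_isolcpus_pool(pool_dir: Path, cpus: str) -> None:
    """Turn pool_dir into the system-wide pool of isolated CPUs."""
    write_cgroup_file(pool_dir, "cpuset.cpus", cpus)
    # ASSUMPTION: "isolcpus" is accepted by cpuset.cpus.partition once this
    # series is applied, like "member", "root" and "isolated" are today.
    write_cgroup_file(pool_dir, "cpuset.cpus.partition", "isolcpus")


def pull_isolated_cpus(cgroup_dir: Path, cpus: str) -> None:
    """Pull CPUs from the pool into cgroup_dir, as the text describes."""
    # Step 1: write the list of isolated CPUs into cpuset.cpus.
    write_cgroup_file(cgroup_dir, "cpuset.cpus", cpus)
    # Step 2: change the partition type to "isolated".
    write_cgroup_file(cgroup_dir, "cpuset.cpus.partition", "isolated")
```

On a system with cgroup v2 mounted at /sys/fs/cgroup, a call such as `create_isolcpus_pool(Path("/sys/fs/cgroup/isolpool"), "8-15")` followed by `pull_isolated_cpus(Path("/sys/fs/cgroup/some/nested/cgroup"), "8-11")` would correspond to the steps above; both paths and CPU lists are made up for illustration.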
Waiman Long (5):
  cgroup/cpuset: Extract out CS_CPU_EXCLUSIVE & CS_SCHED_LOAD_BALANCE handling
  cgroup/cpuset: Add a new "isolcpus" partition root state
  cgroup/cpuset: Make isolated partition pull CPUs from isolcpus partition
  cgroup/cpuset: Documentation update for the new "isolcpus" partition
  cgroup/cpuset: Extend test_cpuset_prs.sh to test isolcpus partition

 Documentation/admin-guide/cgroup-v2.rst       |  89 ++-
 kernel/cgroup/cpuset.c                        | 548 +++++++++++++++---
 .../selftests/cgroup/test_cpuset_prs.sh       | 376 ++++++++----
 3 files changed, 789 insertions(+), 224 deletions(-)

-- 
2.31.1