Message-ID: <563fd5e1-650a-e329-8aab-2fa1953a9f49@redhat.com>
Date: Tue, 6 Jun 2023 16:11:02 -0400
Subject: Re: [RFC PATCH 0/5] cgroup/cpuset: A new "isolcpus" paritition
From: Waiman Long
To: Tejun Heo
Cc: Michal Koutný, Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org,
    Juri Lelli, Valentin Schneider, Frederic Weisbecker, Mrunal Patel,
    Ryan Phillips, Brent Rowsell, Peter Hunt, Phil Auld
References: <759603dd-7538-54ad-e63d-bb827b618ae3@redhat.com>
    <405b2805-538c-790b-5bf8-e90d3660f116@redhat.com>
    <18793f4a-fd39-2e71-0b77-856afb01547b@redhat.com>

On 6/6/23 15:58, Tejun Heo wrote:
> Hello, Waiman.
>
> On Mon, Jun 05, 2023 at 10:47:08PM -0400, Waiman Long wrote:
> ...
>> I had a different idea on the semantics of the cpuset.cpus.exclusive at the
>> beginning. My original thinking is that it was the actual exclusive CPUs
>> that are allocated to the cgroup. Now if we treat this as a hint of what
>> exclusive CPUs should be used and it becomes valid only if the cgroup can
> I wouldn't call it a hint. It's still hard allocation of the CPUs to the
> cgroups that own them. Setting up a partition requires exclusive CPUs and
> thus would depend on exclusive allocations set up accordingly.
>
>> become a valid partition. I can see it as a value that can be hierarchically
>> set throughout the whole cpuset hierarchy.
>>
>> So a transition to a valid partition is possible iff
>>
>> 1) cpuset.cpus.exclusive is a subset of cpuset.cpus and is a subset of
>> cpuset.cpus.exclusive of all its ancestors.
> Yes.
>
>> 2) If its parent is not a partition root, none of the CPUs in
>> cpuset.cpus.exclusive are currently allocated to other partitions. This the
> Not just that, the CPUs aren't available to cgroups which don't have them
> set in the .exclusive file. IOW, if a CPU is in cpus.exclusive of some
> cgroups, it shouldn't appear in cpus.effective of cgroups which don't have
> the CPU in their cpus.exclusive.
>
> So, .exclusive explicitly establishes exclusive ownership of CPUs and
> partitions depend on that with an implicit "turn CPUs exclusive" behavior in
> case the parent is a partition root for backward compatibility.

The current CPU exclusive behavior is limited to sibling cgroups only.
Because of the hierarchical nature of cpu distribution, the set of
exclusive CPUs has to appear in all of its ancestors. When a partition is
enabled, we do a sibling exclusivity test at that point to verify that it
is exclusive. It looks like you want to do an exclusivity test even when
the partition isn't active. I can certainly do that when the file is being
updated. However, it will fail the write if the exclusivity test fails,
just like the v1 cpuset.cpu_exclusive flag, if you are OK with that.

>
>> same remote partition concept in my v2 patch. If its parent is a partition
>> root, part of its exclusive CPUs will be distributed to this child partition
>> like the current behavior of cpuset partition.
> Yes, similar in a sense. Please do away with the "once .reserve is used, the
> behavior is switched" part.

That behavior is already gone in my v2 patch.

> Instead, it can be sth like "if the parent is a
> partition root, cpuset implicitly tries to set all CPUs in its cpus file in
> its cpus.exclusive file" so that user-visible behavior stays unchanged
> depending on past history.

If the parent is a partition root, auto-reservation will be done and
cpus.exclusive will be set automatically, just like before. So existing
applications using partitions will not be affected.

Cheers,
Longman
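
[Editor's note: the following is a minimal C sketch of the two validity
rules discussed above, not the kernel implementation. Plain 64-bit masks
stand in for cpumasks, and all struct and function names here are
hypothetical illustration only.]

/*
 * Hypothetical illustration only -- not kernel code.
 * One bit per CPU; a set bit means the CPU belongs to the mask.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct cs {
	struct cs *parent;
	uint64_t cpus;		/* cpuset.cpus */
	uint64_t excl;		/* proposed cpuset.cpus.exclusive */
	bool partition_root;	/* currently a valid partition root */
};

static bool subset(uint64_t a, uint64_t b)
{
	return (a & ~b) == 0;
}

/*
 * Rule 1: the exclusive CPUs must be covered by cpuset.cpus and by the
 * exclusive set of every ancestor. As a simplifying assumption, an
 * ancestor with an empty exclusive set is treated as unconstrained here.
 */
static bool excl_hierarchy_ok(const struct cs *c)
{
	const struct cs *p;

	if (!subset(c->excl, c->cpus))
		return false;
	for (p = c->parent; p; p = p->parent)
		if (p->excl && !subset(c->excl, p->excl))
			return false;
	return true;
}

/*
 * Rule 2: if the parent is not a partition root (the remote partition
 * case), none of the exclusive CPUs may already belong to another valid
 * partition; otherwise the CPUs are simply taken from the parent.
 */
static bool can_become_partition(const struct cs *c,
				 const struct cs *const *partitions, size_t n)
{
	size_t i;

	if (!excl_hierarchy_ok(c))
		return false;
	if (c->parent && c->parent->partition_root)
		return true;
	for (i = 0; i < n; i++)
		if (partitions[i] != c && partitions[i]->partition_root &&
		    (partitions[i]->excl & c->excl))
			return false;
	return true;
}

[The sibling-only exclusivity test described in the reply corresponds to
restricting the rule 2 scan to sibling cgroups; the sketch scans all
existing partitions to cover the remote-partition case discussed in the
thread.]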