From: Waiman Long
Subject: Re: [PATCH v2 2/6] cgroup/cpuset: Clarify the use of invalid partition root
To: Waiman Long, Tejun Heo
Cc: Zefan Li, Johannes Weiner, Jonathan Corbet, Shuah Khan, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Andrew Morton, Roman Gushchin, Phil Auld, Peter Zijlstra, Juri Lelli
References: <20210621184924.27493-1-longman@redhat.com> <20210621184924.27493-3-longman@redhat.com> <6ea1ac38-73e1-3f78-a5d2-a4c23bcd8dd1@redhat.com>
Message-ID: <1bb119a1-d94a-6707-beac-e3ae5c03fae5@redhat.com>
Date: Fri, 16 Jul 2021 14:59:44 -0400
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/16/21 2:44 PM, Waiman Long wrote:
> On 7/5/21 1:51 PM, Tejun Heo wrote:
>> Hello, Waiman.
>>
>> On Mon, Jun 28, 2021 at 09:06:50AM -0400, Waiman Long wrote:
>>> The main reason for doing this is because normal cpuset control file
>>> actions are under the direct control of the cpuset code. So it is up
>>> to us to decide whether to grant it or deny it. Hotplug, on the other
>>> hand, is not under the control of cpuset code. It can't deny a hotplug
>>> operation. This is the main reason why the partition root error state
>>> was added in the first place.
>> I have a difficult time convincing myself that this difference
>> justifies the behavior difference and it keeps bothering me that there
>> is a state which can be reached through one path but rejected by the
>> other. I'll continue below.
>>
>>> Normally, users can set cpuset.cpus to whatever value they want even
>>> though they are not actually granted. However, turning on partition
>>> root is under more strict control. You can't turn on partition root if
>>> the CPUs requested cannot actually be granted. The problem with
>>> setting the state to just partition error is that users may not be
>>> aware that the partition creation operation fails. We can't assume all
>>> users will do the proper error checking. I would rather let them know
>>> the operation fails rather than relying on them doing the proper check
>>> afterward.
>>>
>>> Yes, I agree that it is a different philosophy than the original
>>> cpuset code, but I thought one reason of doing cgroup v2 is to
>>> simplify the interface and make it a bit more error-proof. Since
>>> partition root creation is a relatively rare operation, we can afford
>>> to make it more strict than the other operations.
>> So, IMO, one of the reasons why cgroup1 interface was such a mess was
>> because each piece of interaction was designed ad-hoc without regard to
>> the overall consistency. One person feels a particular way of
>> interacting with the interface is "correct" and does it that way and
>> another person does another part in a different way. In the end, we
>> ended up with a messy patchwork.
>>
>> One problematic aspect of cpuset in cgroup1 was the handling of failure
>> modes, which was caused by the same exact approach - we wanted the
>> interface to reject invalid configurations outright even though we
>> didn't have the ability to prevent those configurations from occurring
>> through other paths, which makes the failure mode more subtle by
>> further obscuring them.
>>
>> I think a better approach would be having a clear signal and mechanism
>> to watch the state and explicitly requiring users to verify and monitor
>> the state transitions.
>
> Sorry for the late reply as I was busy with other works.
>
> I agree with you on principle. However, the reason why there are more
> restrictions on enabling partition is because I want to avoid forcing
> the users to always read back cpuset.partition.type to see if the
> operation succeeds instead of just getting an error from the
> operation. The former approach is more error prone. If you don't want
> changes in existing behavior, I can relax the checking and allow them
> to become an invalid partition if an illegal operation happens.
>
> Also there is now another cpuset patch to extend cpu isolation to
> cgroup v1 [1]. I think it is better suit to the cgroup v2 partition
> scheme, but cgroup v1 is still quite heavily out there.
>
> Please let me know what you want me to do and I will send out a v3
> version.

Note that the current cpuset partition implementation already imposes
some restrictions on when a partition can be enabled. However, I missed
some corner cases in the original implementation that allow certain
cpuset operations to make a partition invalid. I tried to plug those
holes in this patchset. If maintaining backward compatibility is more
important, though, I can leave those holes as they are and update the
documentation to tell people to check cpuset.partition.type to confirm
that their operation succeeded.

Cheers,
Longman
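
For illustration only, the "write, then read back" check discussed in
this message could be sketched in Python as below. This is a minimal
sketch under stated assumptions, not the patchset's behavior: it assumes
the partition state file is named cpuset.cpus.partition, as in the cgroup
v2 documentation (this thread refers to it as cpuset.partition.type), and
that an invalid partition reads back as something other than plain
"root", for example "root invalid". The function names and any cgroup
directory passed in are hypothetical.

```python
import os

# Assumption: file name taken from the cgroup v2 documentation; this
# thread calls it "cpuset.partition.type".
PARTITION_FILE = "cpuset.cpus.partition"


def read_partition_state(cgroup_dir):
    """Return the current partition state string, stripped of whitespace."""
    with open(os.path.join(cgroup_dir, PARTITION_FILE)) as f:
        return f.read().strip()


def request_partition_root(cgroup_dir):
    """Ask for a partition root, then verify the resulting state.

    Returns (ok, detail). ok is True only when the write was accepted
    and the state read back is exactly "root".
    """
    path = os.path.join(cgroup_dir, PARTITION_FILE)
    try:
        # The with-block closes (and flushes) the file inside the try,
        # so an error reported at flush time is caught as well.
        with open(path, "w") as f:
            f.write("root")
    except OSError as err:
        # Strict model: the write itself fails and the caller sees it.
        return False, "write rejected: %s" % err
    # Relaxed model: the write may succeed while the partition ends up
    # invalid (assumed to read back as e.g. "root invalid"), so the
    # state has to be checked explicitly.
    state = read_partition_state(cgroup_dir)
    return state == "root", state
```

A caller would pass the cgroup directory, for example
request_partition_root("/sys/fs/cgroup/mygroup") (hypothetical path), and
treat anything other than (True, "root") as a failed request. The sketch
covers both models debated in the thread: an immediate write error, and a
write that is accepted but leaves the partition invalid.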