Received: by 2002:a05:6358:bb9e:b0:b9:5105:a5b4 with SMTP id df30csp4166640rwb; Tue, 6 Sep 2022 03:42:40 -0700 (PDT) X-Google-Smtp-Source: AA6agR6Gc5+JnuJatAdl5dG/KNvpK2M+Nz+KXO2KQaBW3A6UD8QaTImwBy9If7ghkxHrStcxVC0p X-Received: by 2002:a17:902:f68f:b0:176:c42d:4fb9 with SMTP id l15-20020a170902f68f00b00176c42d4fb9mr5160693plg.2.1662460959788; Tue, 06 Sep 2022 03:42:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662460959; cv=none; d=google.com; s=arc-20160816; b=M3CsJ2WXpDuSvjqNiZFiwvWhd1NkTg2XpAJYuPQlO3wi4LEOSsDhFyahK3bz7mhXid m5xA6EWhEpq1bQD3EYoqe1x9Jo4yO5skLQHOQgRHKHSrl1xMQxbVN9RKg7nS4E5rQpcr 8I9KLP/GlqVZb32RqeD1hNJMfDDC1u4TC+h71clxc0f8gKAEu/QHIsO/3Ge1IwlPFRKZ NZUAi7ZbAcrOOLxsDZ0LdkMz6RmyNdCfgkYo/xvBAIBjTc4JVWXw/fw6QdtpKji+5KTW 2emBXiHzumuhEhbeXmzAMLbKWZ5+M8A8Nb8ka+BhkP2e3t9dNxIxVbYWfWSijAyyhps7 MSAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:subject:user-agent:mime-version:date:message-id :dkim-signature; bh=Xz8bdPgR7exYL3R9SL9LcFFqWqgFZVil8ZYBWFtqEUU=; b=iNz0CYf5YRNqSgJ1epGyPZ8o9R8IL8GLtwGQhSFY95SpScJYcSs3xB+veyUj/bXWoz 2arIXVHWlvwbQHWHaL6iCMvm8ODq+LRtcVJwqsZTDO5qzpf8Egauf5QcRLl2VxSnKlUB 3/XoOWQDsdAw4cJKZaX4Ye2oU/apYa2ZGMbMTv1L7JRUwBcWoJo+drPGSnVHkh4LUGXE gwdGqoJ/SspDzgSAQardaXiVodOeKgvAQgeAupeQqGRkfRUSf90QXoL9vfo6D47QyuX+ ZYq1htZVn6ttky7F4BLnZ+94lmC2d7XswhQaQT2eSjxW2n+cg0GSLr4nC77iUPpj5t/d 5svw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance-com.20210112.gappssmtp.com header.s=20210112 header.b=cA7Hl5ls; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=bytedance.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p18-20020a170902e75200b0016d9e2902aesi14601579plf.293.2022.09.06.03.42.27; Tue, 06 Sep 2022 03:42:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance-com.20210112.gappssmtp.com header.s=20210112 header.b=cA7Hl5ls; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239160AbiIFKh6 (ORCPT + 99 others); Tue, 6 Sep 2022 06:37:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50402 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233801AbiIFKhz (ORCPT ); Tue, 6 Sep 2022 06:37:55 -0400 Received: from mail-pj1-x1032.google.com (mail-pj1-x1032.google.com [IPv6:2607:f8b0:4864:20::1032]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C3C950048 for ; Tue, 6 Sep 2022 03:37:54 -0700 (PDT) Received: by mail-pj1-x1032.google.com with SMTP id q3so10796959pjg.3 for ; Tue, 06 Sep 2022 03:37:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:from:to:cc:subject:date; bh=Xz8bdPgR7exYL3R9SL9LcFFqWqgFZVil8ZYBWFtqEUU=; b=cA7Hl5ls5U/OscRp4DobHKMQjNA6D5C4jKC+7Av/PRDOkYb+XEGwmPKNz+0wLOS61X fhy0qVVTI5ldKJ0RhCmSDZqq/RQePtu4iAEe6umgDYn0MBNIb9KyPYXriWh+6o/CF8Ev 0SP2kzY3jn3Ut7Uwr3OOVvEv5KLxDMgFehbIVflowPFwUBRRMOLtK6Poj76c5WL10QlO UZZU0VV5LbhyEHUDIUhN0ptJM5SMq1MRRmLd4NfX9PAtqxwkwlH2NCOb6SDzLvD0tb4k HBIsZ8wCBCeGVl7izNloEowlXHHdNdvPd5dTbNnKUop2/NhG/cwTOuW0nFYVuh7pSAbQ i+6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date; bh=Xz8bdPgR7exYL3R9SL9LcFFqWqgFZVil8ZYBWFtqEUU=; b=oB2ReD9I8mu7V2Be5FsG78mSyD87DpKhPtdFlMOMnBGtZlEAtc48qiaZg7gsqsO8KX IJJpF1++GExWW3clOeoqbicIxNHoNEEb9kZVNJ6Ol2FWbTNVMIMfO/fSC1aDxbxApwB8 kD+8CP/PC45KDZA0R45KhXyNtHbbbRPHw/fENYcQt9N6dbWNresL1Jl17ukC0n2owniG 2kEPWU0tPlm0/WDr/EOmt6o1Et7Ft+uo1ux05/zAPzZPUK5jEvES3OUIFzIjr5fo3y8B qjXseQLuEv/E22Iv/O05rwh4StQdvZWHuuN3UJh8AzVTEwbM8+7pMp4mvMEP+AkS9p1F /fIQ== X-Gm-Message-State: ACgBeo1EZ5EeTx6awPO1T04+eheqQtZhOYY4y7QtyniOZvfyuYrQcU+U QNEkYJI51To2+LLg/rC7P8hy1g== X-Received: by 2002:a17:903:228c:b0:16e:df74:34e5 with SMTP id b12-20020a170903228c00b0016edf7434e5mr53043766plh.49.1662460673860; Tue, 06 Sep 2022 03:37:53 -0700 (PDT) Received: from [10.4.233.171] ([139.177.225.241]) by smtp.gmail.com with ESMTPSA id a16-20020a170902ecd000b00176db67ce98sm442737plh.3.2022.09.06.03.37.50 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 06 Sep 2022 03:37:53 -0700 (PDT) Message-ID: Date: Tue, 6 Sep 2022 18:37:40 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.13.0 Subject: Re: [External] Re: [PATCH] cgroup/cpuset: Add a new isolated mems.policy type. To: Michal Hocko Cc: hannes@cmpxchg.org, roman.gushchin@linux.dev, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, lizefan.x@bytedance.com, wuyun.abel@bytedance.com References: <20220904040241.1708-1-hezhongkun.hzk@bytedance.com> <0e5f380b-9201-0f56-9144-ce8449491fc8@bytedance.com> From: Zhongkun He In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3.7 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > On Mon 05-09-22 18:30:55, Zhongkun He wrote: >> Hi Michal, thanks for your reply. >> >> The current 'mempolicy' is hierarchically independent. The default value of >> the child is to inherit from the parent. The modification of the child >> policy will not be restricted by the parent. > > This breaks cgroup fundamental property of hierarchical enforcement of > each property. And as such it is a no go. > >> Of course, there are other options, such as the child's policy mode must be >> the same as the parent's. node can be the subset of parent's, but the >> interleave type will be complicated, that's why hierarchy independence is >> used. It would be better if you have other suggestions? > > Honestly, I am not really sure cgroup cpusets is a great fit for this > usecase. It would be probably better to elaborate some more what are the > existing shortcomings and what you would like to achieve. Just stating > the syscall is a hard to use interface is not quite clear on its own. > > Btw. have you noticed this question? > >>> What is the hierarchical behavior of the policy? Say parent has a >>> stronger requirement (say bind) than a child (prefer)? >>>> How to use the mempolicy interface: >>>> echo prefer:2 > /sys/fs/cgroup/zz/cpuset.mems.policy >>>> echo bind:1-3 > /sys/fs/cgroup/zz/cpuset.mems.policy >>>> echo interleave:0,1,2,3 >/sys/fs/cgroup/zz/cpuset.mems.policy >>> >>> Am I just confused or did you really mean to combine all these >>> together? > Hi Michal, thanks for your reply. >>Say parent has a stronger requirement (say bind) than a child(prefer)? Yes, combine all these together. The parent's task will use 'bind', child's use 'prefer'.This is the current implementation, and we can discuss and modify it together if there are other suggestions. 1:Existing shortcomings In our use case, the application and the control plane are two separate systems. When the application is created, it doesn't know how to use memory, and it doesn't care. The control plane will decide the memory usage policy based on different reasons (the attributes of the application itself, the priority, the remaining resources of the system). Currently, numactl is used to set it at program startup, and the child process will inherit the mempolicy. But we can't dynamically adjust the memory policy, except restart, the memory policy will not change. 2:Our goals For the above reasons, we want to create a mempolicy at the cgroup level. Usually processes under a cgroup have the same priority and attributes, and we can dynamically adjust the memory allocation strategy according to the remaining resources of the system. For example, a low-priority cgroup uses the 'bind:2-3' policy, and a high-priority cgroup uses bind:0-1. When resources are insufficient, it will be changed to bind:3, bind:0-2 by control plane, etc.Further more, more mempolicy can be extended, such as allocating memory according to node weight, etc. Thanks.