From: Wei Xu
Date: Thu, 7 Apr 2022 21:10:33 -0700
Subject: Re: [PATCH resend] memcg: introduce per-memcg reclaim interface
In-Reply-To: <87y20gtgpf.fsf@yhuang6-desk2.ccr.corp.intel.com>
To: "Huang, Ying"
Cc: Tim Chen, Michal Hocko, Yosry Ahmed, Johannes Weiner, Shakeel Butt,
 Andrew Morton, David Rientjes, Tejun Heo, Zefan Li, Roman Gushchin,
 Cgroups, "open list:DOCUMENTATION", Linux Kernel Mailing List, Linux MM,
 Jonathan Corbet, Yu Zhao, Dave Hansen, Greg Thelen

On Thu, Apr 7, 2022 at 8:08 PM Huang, Ying wrote:
>
> Wei Xu writes:
>
> > On Thu, Apr 7, 2022 at 4:11 PM Tim Chen wrote:
> >>
> >> On Thu, 2022-04-07 at 15:12 -0700, Wei Xu wrote:
> >>
> >> >
> >> > (resending in plain-text, sorry).
> >> >
> >> > memory.demote can work with any level of memory tiers if a nodemask
> >> > argument (or a tier argument if there is a more-explicitly defined,
> >> > userspace visible tiering representation) is provided. The semantics
> >> > can be to demote X bytes from these nodes to their next tier.
> >> >
> >>
> >> We do need some kind of userspace visible tiering representation.
> >> Will be nice if I can tell the memory type, nodemask of nodes in tier Y with
> >>
> >> cat memory.tier_Y
> >>
> >>
> >> > memory_dram/memory_pmem assumes the hardware for a particular memory
> >> > tier, which is undesirable. For example, it is entirely possible that
> >> > a slow memory tier is implemented by a lower-cost/lower-performance
> >> > DDR device connected via CXL.mem, not by PMEM. It is better for this
> >> > interface to speak in either the NUMA node abstraction or a new tier
> >> > abstraction.
> >>
> >> Just from the perspective of memory.reclaim and memory.demote, I think
> >> they could work with nodemask. For ease of management,
> >> some kind of abstraction of tier information like nodemask, memory type
> >> and expected performance should be readily accessible by user space.
> >>
> >
> > I agree. The tier information should be provided at the system level.
> > One suggestion is to have a new directory "/sys/devices/system/tier/"
> > for tiers, e.g.:
> >
> > /sys/devices/system/tier/tier0/memlist: all memory nodes in tier 0.
> > /sys/devices/system/tier/tier1/memlist: all memory nodes in tier 1.
>
> I think that it may be sufficient to make tier an attribute of "node".
> Some thing like,
>
> /sys/devices/system/node/nodeX/memory_tier
>

This works. If we want additional information about each tier, we can
then add a tier-specific subtree.

In addition, it would be good to also expose the demotion target nodes
(node_demotion[]) via sysfs, e.g.:

/sys/devices/system/node/nodeX/demotion_path

which returns node_demotion[X].

> Best Regards,
> Huang, Ying
>
> > We can discuss this tier representation in a new thread.
> >
> >> Tim
> >>
> >> >
> >> > It is also desirable to make this interface stateless, i.e. not to
> >> > require the setting of memory_dram.reclaim_policy. Any policy can be
> >> > specified as arguments to the request itself and should only affect
> >> > that particular request.
> >> >
> >> > Wei
> >> >
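
For illustration only, here is a rough userspace sketch of how a management
tool might consume the per-node attributes discussed in this thread. It
assumes that nodeX/memory_tier and nodeX/demotion_path exist and that each
holds a single integer; neither the files nor that format is settled here,
and the helper names (read_node_attr, nodes_by_tier) are made up for the
example.

```python
import os
import re


def read_node_attr(attr, node_root="/sys/devices/system/node"):
    """Return {node_id: int value} for one per-node sysfs attribute.

    The attribute names used below are the ones proposed in this thread
    (hypothetical). Nodes where the attribute is missing, or does not
    contain a single integer, are skipped.
    """
    values = {}
    try:
        entries = os.listdir(node_root)
    except OSError:
        return values  # no node directory on this system
    for entry in entries:
        match = re.fullmatch(r"node(\d+)", entry)
        if match is None:
            continue  # skip non-node entries such as "possible" or "online"
        try:
            with open(os.path.join(node_root, entry, attr)) as f:
                values[int(match.group(1))] = int(f.read().strip())
        except (OSError, ValueError):
            continue
    return values


def nodes_by_tier(node_root="/sys/devices/system/node"):
    """Group node IDs by the value of the proposed memory_tier attribute."""
    tiers = {}
    for node, tier in read_node_attr("memory_tier", node_root).items():
        tiers.setdefault(tier, []).append(node)
    return {tier: sorted(nodes) for tier, nodes in sorted(tiers.items())}


if __name__ == "__main__":
    print("tiers:", nodes_by_tier())
    print("demotion targets:", read_node_attr("demotion_path"))
```

With tier as a node attribute, the per-tier node list from the earlier
/sys/devices/system/tier/tierN/memlist suggestion can still be derived in
userspace by one pass over the node directories, as nodes_by_tier() does.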