From: "Huang, Ying"
To: Wei Xu
Cc: Aneesh Kumar K V, Johannes Weiner, Linux MM, Andrew Morton, Yang Shi,
 Davidlohr Bueso, Tim C Chen, Michal Hocko, Linux Kernel Mailing List,
 Hesham Almatary, Dave Hansen, Jonathan Cameron, Alistair Popple,
 Dan Williams, jvgediya.oss@gmail.com, Bharata B Rao, Greg Thelen
Subject: Re: [PATCH v3 updated] mm/demotion: Expose memory tier details via sysfs
In-Reply-To: (Wei Xu's message of "Thu, 1 Sep 2022 22:09:13 -0700")
References: <20220830081736.119281-1-aneesh.kumar@linux.ibm.com>
 <87tu5rzigc.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <87pmgezkhp.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Fri, 02 Sep 2022 13:15:04 +0800
Message-ID: <87k06mz7af.fsf@yhuang6-desk2.ccr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Wei Xu writes:

> On Thu, Sep 1, 2022 at 5:33 PM Huang, Ying wrote:
>>
>> Aneesh Kumar K V writes:
>>
>> > On 9/1/22 12:31 PM, Huang, Ying wrote:
>> >> "Aneesh Kumar K.V" writes:
>> >>
>> >>> This patch adds /sys/devices/virtual/memory_tiering/ where all memory tier
>> >>> related details can be found.
>> >>> All allocated memory tiers will be listed
>> >>> there as /sys/devices/virtual/memory_tiering/memory_tierN/
>> >>>
>> >>> The nodes which are part of a specific memory tier can be listed via
>> >>> /sys/devices/virtual/memory_tiering/memory_tierN/nodes
>> >>
>> >> I think "memory_tier" is a better subsystem/bus name than
>> >> memory_tiering, because we have a set of memory_tierN devices inside.
>> >> "memory_tier" sounds more natural. I know this is subjective, just my
>> >> preference.
>> >>
>> >>>
>> >>> A directory hierarchy looks like
>> >>> :/sys/devices/virtual/memory_tiering$ tree memory_tier4/
>> >>> memory_tier4/
>> >>> ├── nodes
>> >>> ├── subsystem -> ../../../../bus/memory_tiering
>> >>> └── uevent
>> >>>
>> >>> All toptier nodes are listed via
>> >>> /sys/devices/virtual/memory_tiering/toptier_nodes
>> >>>
>> >>> :/sys/devices/virtual/memory_tiering$ cat toptier_nodes
>> >>> 0,2
>> >>> :/sys/devices/virtual/memory_tiering$ cat memory_tier4/nodes
>> >>> 0,2
>> >>
>> >> I don't think that it is a good idea to show toptier information in the
>> >> user space interface, because it is just an in-kernel implementation
>> >> detail. Now, we only promote pages from !toptier to toptier. But
>> >> there may be multiple memory tiers in toptier and !toptier, and we may
>> >> change the implementation in the future. For example, we may promote
>> >> pages from DRAM to HBM in the future.
>> >>
>> >
>> >
>> > In the case you describe above and others, we will always have a list of
>> > NUMA nodes from which memory promotion is not done.
>> > /sys/devices/virtual/memory_tiering/toptier_nodes shows that list.
>>
>> I don't think we will need that interface if we don't restrict promotion
>> in the future. For example, one can just check the memory tier with the
>> smallest number.
>>
>> TBH, I don't know why we need that interface. What is it for?
>> We don't want to expose unnecessary information that would restrict our
>> in-kernel implementation in the future.
>>
>> So, please remove that interface, at least until we have discussed it
>> thoroughly.
>
> I have asked for this interface to allow userspace to query a list
> of top-tier nodes as the targets of userspace-driven promotions. The
> idea is that demotion can gradually go down tier by tier, but we
> promote hot pages directly to the top-tier and bypass the immediate
> tiers.
>
> Certainly, this can be viewed as a policy choice.

Yes. It's possible for us to change this in the future.

> Given that now we have a clearly defined memory tier hierarchy in
> sysfs and the toptier_nodes content can be constructed from this
> memory tier hierarchy and other information from the node sysfs
> interfaces, I am fine if we want to remove toptier_nodes and keep the
> current memory tier sysfs interfaces to a minimum.

Thanks!

Best Regards,
Huang, Ying

>> >> Do we need a way to show the default memory tier in sysfs? That is, the
>> >> memory tier that the DRAM nodes belong to.
>> >>
>> >
>> > I will hold off on adding that until we have support for modifying memory
>> > tier details from userspace. That is when userspace would want to know
>> > about the default memory tier.
>> >
>> > For now, the user interface is a simpler hierarchy of memory tiers, its
>> > associated nodes and the list of nodes from which promotion is not done.
>>
>> OK.
>>
>> Best Regards,
>> Huang, Ying
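
For illustration only, here is a rough userspace sketch of the point agreed on above: the toptier_nodes content could be derived from the memory_tierN/nodes files alone. It assumes the layout proposed in this patch version (a given kernel may expose different paths or file names), assumes the "nodes" file uses the usual comma-and-range list format such as "0,2" or "0-3", and assumes that the smallest tier number is the top tier, as suggested in the thread. The function names are made up for this example.

```python
import os
import re


def parse_nodelist(text):
    """Parse a node list such as "0,2" or "0-3,8" into a sorted list of ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            nodes.update(range(int(lo), int(hi) + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


def read_memory_tiers(root):
    """Return {tier_number: [node, ...]} for each memory_tierN directory under root."""
    tiers = {}
    for entry in os.listdir(root):
        match = re.fullmatch(r"memory_tier(\d+)", entry)
        if not match:
            continue  # skip uevent, toptier_nodes and anything else
        with open(os.path.join(root, entry, "nodes")) as f:
            tiers[int(match.group(1))] = parse_nodelist(f.read())
    return tiers


def top_tier_nodes(tiers):
    """Assumption from the thread: the lowest-numbered tier is the top tier."""
    if not tiers:
        return []
    return tiers[min(tiers)]
```

On a system with this patch applied, one would call top_tier_nodes(read_memory_tiers("/sys/devices/virtual/memory_tiering")). Whether "top tier" should stay tied to the smallest tier number is exactly the policy question raised above, so the last function is the part most likely to change.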