From: "Huang, Ying"
To: "Yasunori Gotou (Fujitsu)"
Cc: Andrew Morton, Greg Kroah-Hartman, "rafael@kernel.org", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "Zhijian Li (Fujitsu)"
Subject: Re: [PATCH RFC 3/4] mm/vmstat: rename pgdemote_* to pgdemote_dst_* and add pgdemote_src_*
In-Reply-To: (Yasunori Gotou's message of "Thu, 2 Nov 2023 07:38:19 +0000")
References: <20231102025648.1285477-1-lizhijian@fujitsu.com> <20231102025648.1285477-4-lizhijian@fujitsu.com> <87r0l81zfd.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Thu, 02 Nov 2023 15:46:13 +0800
Message-ID: <871qd81ttm.fsf@yhuang6-desk2.ccr.corp.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain; charset=ascii
X-Mailing-List: linux-kernel@vger.kernel.org

"Yasunori Gotou (Fujitsu)" writes:

> Hello,
>
>> On 02/11/2023 13:45, Huang, Ying wrote:
>> > Li Zhijian writes:
>> >
>> >> pgdemote_src_*: pages demoted from this node.
>> >> pgdemote_dst_*: pages demoted to this node.
>> >>
>> >> So that we are able to know their demotion per-node stats by checking this.
>> >>
>> >> In the environment, node0 and node1 are DRAM, node3 is PMEM.
>> >>
>> >> Global stats:
>> >> $ grep -E 'demote' /proc/vmstat
>> >> pgdemote_src_kswapd 130155
>> >> pgdemote_src_direct 113497
>> >> pgdemote_src_khugepaged 0
>> >> pgdemote_dst_kswapd 130155
>> >> pgdemote_dst_direct 113497
>> >> pgdemote_dst_khugepaged 0
>> >>
>> >> Per-node stats:
>> >> $ grep demote /sys/devices/system/node/node0/vmstat
>> >> pgdemote_src_kswapd 68454
>> >> pgdemote_src_direct 83431
>> >> pgdemote_src_khugepaged 0
>> >> pgdemote_dst_kswapd 0
>> >> pgdemote_dst_direct 0
>> >> pgdemote_dst_khugepaged 0
>> >>
>> >> $ grep demote /sys/devices/system/node/node1/vmstat
>> >> pgdemote_src_kswapd 185834
>> >> pgdemote_src_direct 30066
>> >> pgdemote_src_khugepaged 0
>> >> pgdemote_dst_kswapd 0
>> >> pgdemote_dst_direct 0
>> >> pgdemote_dst_khugepaged 0
>> >>
>> >> $ grep demote /sys/devices/system/node/node3/vmstat
>> >> pgdemote_src_kswapd 0
>> >> pgdemote_src_direct 0
>> >> pgdemote_src_khugepaged 0
>> >> pgdemote_dst_kswapd 254288
>> >> pgdemote_dst_direct 113497
>> >> pgdemote_dst_khugepaged 0
>> >>
>> >> From above stats, we know node3 is the demotion destination which one
>> >> the node0 and node1 will demote to.
>> >
>> > Why do we need these information?  Do you have some use case?
>>
>> I recall our customers have mentioned that they want to know how much the
>> memory is demoted to the CXL memory device in a specific period.
>
> I'll mention about it more.
>
> I had a conversation with one of our customers.
> He expressed a desire for more detailed
> profile information to analyze the behavior of demotion (and promotion) when
> his workloads are executed.
> If the results are not satisfactory for his workloads, he wants to tune his servers for his workloads
> with these profiles.
> Additionally, depending on the results, he may want to change his server configuration.
> For example, he may want to buy more expensive DDR memories rather than cheaper CXL memory.
>
> In my impression, our customers seem to think that CXL memory is NOT as reliable as DDR memory yet.
> Therefore, they want to prepare for the new world that CXL will bring, and want to have a method
> for the preparation by profiling information as much as possible.
>
> Is this enough for your question?

I would like some more detailed information about how these stats are
used.  Why isn't the per-node pgdemote_xxx counter enough?

--
Best Regards,
Huang, Ying

> Thanks,
>
>>
>>
>> >>> mod_node_page_state(NODE_DATA(target_nid),
>> >>> -                    PGDEMOTE_KSWAPD + reclaimer_offset(), nr_succeeded);
>> >>> +                    PGDEMOTE_DST_KSWAPD + reclaimer_offset(), nr_succeeded);
>>
>> But if the *target_nid* only indicates the preferred node, this accounting
>> may not be accurate.
>>
>>
>> Thanks
>> Zhijian
>>
>> >
>> > --
>> > Best Regards,
>> > Huang, Ying
>> >
>> >> Signed-off-by: Li Zhijian
>> >> ---
>> >> RFC: their names are open to discussion, maybe pgdemote_from/to_*
>> >> Another defect of this patch is that, SUM(pgdemote_src_*) is always same
>> >> as SUM(pgdemote_dst_*) in the global stats, shall we hide one of them.
>> >> ---
>> >>  include/linux/mmzone.h |  9 ++++++---
>> >>  mm/vmscan.c            | 13 ++++++++++---
>> >>  mm/vmstat.c            |  9 ++++++---
>> >>  3 files changed, 22 insertions(+), 9 deletions(-)
>> >>
>> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> >> index ad0309eea850..a6140d894bec 100644
>> >> --- a/include/linux/mmzone.h
>> >> +++ b/include/linux/mmzone.h
>> >> @@ -207,9 +207,12 @@ enum node_stat_item {
>> >>      PGPROMOTE_SUCCESS,      /* promote successfully */
>> >>      PGPROMOTE_CANDIDATE,    /* candidate pages to promote */
>> >>      /* PGDEMOTE_*: pages demoted */
>> >> -    PGDEMOTE_KSWAPD,
>> >> -    PGDEMOTE_DIRECT,
>> >> -    PGDEMOTE_KHUGEPAGED,
>> >> +    PGDEMOTE_SRC_KSWAPD,
>> >> +    PGDEMOTE_SRC_DIRECT,
>> >> +    PGDEMOTE_SRC_KHUGEPAGED,
>> >> +    PGDEMOTE_DST_KSWAPD,
>> >> +    PGDEMOTE_DST_DIRECT,
>> >> +    PGDEMOTE_DST_KHUGEPAGED,
>> >>  #endif
>> >>      NR_VM_NODE_STAT_ITEMS
>> >>  };
>> >> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> >> index 2f1fb4ec3235..55d2287d7150 100644
>> >> --- a/mm/vmscan.c
>> >> +++ b/mm/vmscan.c
>> >> @@ -1111,13 +1111,18 @@ void drop_slab(void)
>> >>  static int reclaimer_offset(void)
>> >>  {
>> >>      BUILD_BUG_ON(PGSTEAL_DIRECT - PGSTEAL_KSWAPD !=
>> >> -            PGDEMOTE_DIRECT - PGDEMOTE_KSWAPD);
>> >> +            PGDEMOTE_SRC_DIRECT - PGDEMOTE_SRC_KSWAPD);
>> >>      BUILD_BUG_ON(PGSTEAL_DIRECT - PGSTEAL_KSWAPD !=
>> >>              PGSCAN_DIRECT - PGSCAN_KSWAPD);
>> >>      BUILD_BUG_ON(PGSTEAL_KHUGEPAGED - PGSTEAL_KSWAPD !=
>> >> -            PGDEMOTE_KHUGEPAGED - PGDEMOTE_KSWAPD);
>> >> +            PGDEMOTE_SRC_KHUGEPAGED - PGDEMOTE_SRC_KSWAPD);
>> >>      BUILD_BUG_ON(PGSTEAL_KHUGEPAGED - PGSTEAL_KSWAPD !=
>> >>              PGSCAN_KHUGEPAGED - PGSCAN_KSWAPD);
>> >> +    BUILD_BUG_ON(PGDEMOTE_SRC_DIRECT - PGDEMOTE_SRC_KSWAPD !=
>> >> +            PGDEMOTE_DST_DIRECT - PGDEMOTE_DST_KSWAPD);
>> >> +    BUILD_BUG_ON(PGDEMOTE_SRC_KHUGEPAGED - PGDEMOTE_SRC_KSWAPD !=
>> >> +            PGDEMOTE_DST_KHUGEPAGED - PGDEMOTE_DST_KSWAPD);
>> >> +
>> >>
>> >>      if (current_is_kswapd())
>> >>          return 0;
>> >> @@ -1678,8 +1683,10 @@ static unsigned int demote_folio_list(struct list_head *demote_folios,
>> >>                (unsigned long)&mtc, MIGRATE_ASYNC, MR_DEMOTION,
>> >>                &nr_succeeded);
>> >>
>> >> +    mod_node_page_state(pgdat,
>> >> +                PGDEMOTE_SRC_KSWAPD + reclaimer_offset(), nr_succeeded);
>> >>      mod_node_page_state(NODE_DATA(target_nid),
>> >> -                PGDEMOTE_KSWAPD + reclaimer_offset(), nr_succeeded);
>> >> +                PGDEMOTE_DST_KSWAPD + reclaimer_offset(), nr_succeeded);
>> >>
>> >>      return nr_succeeded;
>> >>  }
>> >> diff --git a/mm/vmstat.c b/mm/vmstat.c
>> >> index f141c48c39e4..63f106a5e008 100644
>> >> --- a/mm/vmstat.c
>> >> +++ b/mm/vmstat.c
>> >> @@ -1244,9 +1244,12 @@ const char * const vmstat_text[] = {
>> >>  #ifdef CONFIG_NUMA_BALANCING
>> >>      "pgpromote_success",
>> >>      "pgpromote_candidate",
>> >> -    "pgdemote_kswapd",
>> >> -    "pgdemote_direct",
>> >> -    "pgdemote_khugepaged",
>> >> +    "pgdemote_src_kswapd",
>> >> +    "pgdemote_src_direct",
>> >> +    "pgdemote_src_khugepaged",
>> >> +    "pgdemote_dst_kswapd",
>> >> +    "pgdemote_dst_direct",
>> >> +    "pgdemote_dst_khugepaged",
>> >>  #endif
>> >>
>> >>  /* enum writeback_stat_item counters */
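
For reference, here is a minimal userspace sketch of how the per-node counters
quoted above might be consumed. It assumes the pgdemote_src_*/pgdemote_dst_*
names proposed in this RFC, which may never exist in a released kernel. The
helper names (parse_demote_counters, sum_by_direction) are hypothetical, and
the sample text is copied from the per-node output in the patch description
rather than read from a live system.

```python
# Illustrative sketch only: the pgdemote_src_*/pgdemote_dst_* counter names
# come from the RFC patch quoted above; the helpers below are hypothetical.

def parse_demote_counters(vmstat_text):
    """Return {counter_name: value} for the pgdemote_* lines of a vmstat dump."""
    counters = {}
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("pgdemote_"):
            counters[fields[0]] = int(fields[1])
    return counters


def sum_by_direction(per_node):
    """Sum the src and dst counters over all nodes, keyed by reclaimer name."""
    totals = {"src": {}, "dst": {}}
    for counters in per_node.values():
        for name, value in counters.items():
            parts = name.split("_", 2)  # e.g. ["pgdemote", "src", "kswapd"]
            if len(parts) != 3 or parts[1] not in totals:
                continue  # skip names without a src/dst component
            bucket = totals[parts[1]]
            bucket[parts[2]] = bucket.get(parts[2], 0) + value
    return totals


# Per-node samples copied from the patch description (node0, node1, node3).
SAMPLES = {
    0: "pgdemote_src_kswapd 68454\npgdemote_src_direct 83431\n"
       "pgdemote_src_khugepaged 0\npgdemote_dst_kswapd 0\n"
       "pgdemote_dst_direct 0\npgdemote_dst_khugepaged 0\n",
    1: "pgdemote_src_kswapd 185834\npgdemote_src_direct 30066\n"
       "pgdemote_src_khugepaged 0\npgdemote_dst_kswapd 0\n"
       "pgdemote_dst_direct 0\npgdemote_dst_khugepaged 0\n",
    3: "pgdemote_src_kswapd 0\npgdemote_src_direct 0\n"
       "pgdemote_src_khugepaged 0\npgdemote_dst_kswapd 254288\n"
       "pgdemote_dst_direct 113497\npgdemote_dst_khugepaged 0\n",
}

per_node = {nid: parse_demote_counters(text) for nid, text in SAMPLES.items()}
totals = sum_by_direction(per_node)
print("src totals:", totals["src"])
print("dst totals:", totals["dst"])
```

On a live system the same parser could be fed the contents of
/sys/devices/system/node/node<N>/vmstat instead of the embedded samples. For
the per-node figures quoted in the patch description, the source totals over
node0 and node1 equal the destination totals on node3.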