From: "Huang, Ying"
To: "Yasunori Gotou (Fujitsu)"
Cc: Andrew Morton, Greg Kroah-Hartman, "rafael@kernel.org",
    "linux-mm@kvack.org", "linux-kernel@vger.kernel.org",
    "Zhijian Li (Fujitsu)"
Subject: Re: [PATCH RFC 3/4] mm/vmstat: rename pgdemote_* to pgdemote_dst_* and add pgdemote_src_*
In-Reply-To: (Yasunori Gotou's message of "Thu, 2 Nov 2023 09:45:38 +0000")
References: <20231102025648.1285477-1-lizhijian@fujitsu.com>
    <20231102025648.1285477-4-lizhijian@fujitsu.com>
    <87r0l81zfd.fsf@yhuang6-desk2.ccr.corp.intel.com>
    <871qd81ttm.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date: Fri, 03 Nov 2023 14:14:21 +0800
Message-ID: <87sf5nz7lu.fsf@yhuang6-desk2.ccr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

"Yasunori Gotou (Fujitsu)" writes:

>> > Hello,
>> >
>> >> On 02/11/2023 13:45, Huang, Ying wrote:
>> >> > Li Zhijian writes:
>> >> >
>> >> >> pgdemote_src_*: pages demoted from this node.
>> >> >> pgdemote_dst_*: pages demoted to this node.
>> >> >>
>> >> >> So that we are able to know their demotion per-node stats by checking this.
>> >> >>
>> >> >> In the environment, node0 and node1 are DRAM, node3 is PMEM.
>> >> >>
>> >> >> Global stats:
>> >> >> $ grep -E 'demote' /proc/vmstat
>> >> >> pgdemote_src_kswapd 130155
>> >> >> pgdemote_src_direct 113497
>> >> >> pgdemote_src_khugepaged 0
>> >> >> pgdemote_dst_kswapd 130155
>> >> >> pgdemote_dst_direct 113497
>> >> >> pgdemote_dst_khugepaged 0
>> >> >>
>> >> >> Per-node stats:
>> >> >> $ grep demote /sys/devices/system/node/node0/vmstat
>> >> >> pgdemote_src_kswapd 68454
>> >> >> pgdemote_src_direct 83431
>> >> >> pgdemote_src_khugepaged 0
>> >> >> pgdemote_dst_kswapd 0
>> >> >> pgdemote_dst_direct 0
>> >> >> pgdemote_dst_khugepaged 0
>> >> >>
>> >> >> $ grep demote /sys/devices/system/node/node1/vmstat
>> >> >> pgdemote_src_kswapd 185834
>> >> >> pgdemote_src_direct 30066
>> >> >> pgdemote_src_khugepaged 0
>> >> >> pgdemote_dst_kswapd 0
>> >> >> pgdemote_dst_direct 0
>> >> >> pgdemote_dst_khugepaged 0
>> >> >>
>> >> >> $ grep demote /sys/devices/system/node/node3/vmstat
>> >> >> pgdemote_src_kswapd 0
>> >> >> pgdemote_src_direct 0
>> >> >> pgdemote_src_khugepaged 0
>> >> >> pgdemote_dst_kswapd 254288
>> >> >> pgdemote_dst_direct 113497
>> >> >> pgdemote_dst_khugepaged 0
>> >> >>
>> >> >> From above stats, we know node3 is the demotion destination which
>> >> >> one the node0 and node1 will demote to.
>> >> >
>> >> > Why do we need these information?  Do you have some use case?
>> >>
>> >> I recall our customers have mentioned that they want to know how much
>> >> the memory is demoted to the CXL memory device in a specific period.
>> >
>> > I'll mention about it more.
>> >
>> > I had a conversation with one of our customers. He expressed a desire
>> > for more detailed profile information to analyze the behavior of
>> > demotion (and promotion) when his workloads are executed.
>> > If the results are not satisfactory for his workloads, he wants to
>> > tune his servers for his workloads with these profiles.
>> > Additionally, depending on the results, he may want to change his server configuration.
>> > For example, he may want to buy more expensive DDR memories rather than cheaper CXL memory.
>> >
>> > In my impression, our customers seem to think that CXL memory is NOT as reliable as DDR memory yet.
>> > Therefore, they want to prepare for the new world that CXL will bring,
>> > and want to have a method for the preparation by profiling information as much as possible.
>> >
>> > Is this enough for your question?
>>
>> I want some more detailed information about how these stats are used?
>> Why isn't per-node pgdemote_xxx counter enough?
>
> I rechecked the customer's original request.
>
> - If a memory area is demoted to a CXL memory node, he wanted to analyze how it affects performance
>   of their workload, such as latency. He wanted to use CXL Node memory usage as basic
>   information for the analysis.
>
> - If he notices that demotion occurs well on a server and CXL memories are used 85% constantly, he
>   may want to add DDR DRAM or select some other ways to avoid demotion.
>   (His image is likely Swap free/used.)
>   IIRC, demotion target is not spread to all of the CXL memory node, right?
>   Then, he needs to know how CXL memory is occupied by demoted memory.
>
> If I misunderstand something, or you have any better idea,
> please let us know. I'll talk with him again. (It will be next week...)

To check CXL memory usage, /proc/PID/numa_maps,
/sys/fs/cgroup/CGROUP/memory.numa_stat, and
/sys/devices/system/node/nodeN/meminfo can be used for process, cgroup,
and NUMA node respectively.  Is this enough?

--
Best Regards,
Huang, Ying

>> >
>> >>
>> >>
>> >> >>> mod_node_page_state(NODE_DATA(target_nid),
>> >> >>> -                    PGDEMOTE_KSWAPD + reclaimer_offset(), nr_succeeded);
>> >> >>> +                    PGDEMOTE_DST_KSWAPD + reclaimer_offset(), nr_succeeded);
>> >>
>> >> But if the *target_nid* is only indicate the preferred node, this
>> >> accounting maybe not accurate.
>> >>
>> >> [snip]
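
For readers who want to try the per-node check suggested in the reply, the following is a minimal sketch and not part of the original thread. It assumes the usual sysfs layout (/sys/devices/system/node/nodeN/meminfo and .../vmstat) and collects any counter whose name starts with "pgdemote". The pgdemote_src_*/pgdemote_dst_* names come from the RFC quoted above and may not exist on a given kernel. The function names (parse_node_meminfo, used_percent, parse_demote_counters, read_node) are hypothetical helpers introduced only for illustration.

```python
import re
from pathlib import Path

# Assumed sysfs location of per-node files; adjust if your system differs.
NODE_ROOT = Path("/sys/devices/system/node")


def parse_node_meminfo(text):
    """Parse 'Node N Key:  value kB' lines into {key: value_in_kB}."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"Node\s+\d+\s+(\w+):\s+(\d+)", line.strip())
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def used_percent(fields):
    """Return used memory as a percentage of MemTotal (0.0 if unknown)."""
    total = fields.get("MemTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - fields.get("MemFree", 0)) / total


def parse_demote_counters(text):
    """Collect 'name value' lines whose name starts with 'pgdemote'."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("pgdemote"):
            counters[parts[0]] = int(parts[1])
    return counters


def read_node(nid, root=NODE_ROOT):
    """Read usage and demotion counters for one NUMA node from sysfs."""
    base = Path(root) / f"node{nid}"
    mem = parse_node_meminfo((base / "meminfo").read_text())
    demote = parse_demote_counters((base / "vmstat").read_text())
    return used_percent(mem), demote
```

Calling read_node(3) on the machine described in the patch would return the usage of node3 together with whatever pgdemote counters that kernel exposes. Sampling it twice and subtracting the counters gives the number of pages demoted in that interval, which is roughly the "specific period" view the customer asked about.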