From: "Huang, Ying"
To: Mel Gorman
Cc: Arjan Van De Ven, Sudeep Holla, Andrew Morton, Vlastimil Babka,
 David Hildenbrand, Johannes Weiner, Dave Hansen, Michal Hocko,
 Pavel Tatashin, Matthew Wilcox, Christoph Lameter
Subject: Re: [PATCH 02/10] cacheinfo: calculate per-CPU data cache size
References: <20230920061856.257597-1-ying.huang@intel.com>
 <20230920061856.257597-3-ying.huang@intel.com>
 <20231011122027.pw3uw32sdxxqjsrq@techsingularity.net>
 <87h6mwf3gf.fsf@yhuang6-desk2.ccr.corp.intel.com>
 <20231012125253.fpeehd6362c5v2sj@techsingularity.net>
Date: Thu, 12 Oct 2023 21:12:00 +0800
In-Reply-To: <20231012125253.fpeehd6362c5v2sj@techsingularity.net> (Mel Gorman's message of "Thu, 12 Oct 2023 13:52:53 +0100")
Message-ID: <87v8bcdly7.fsf@yhuang6-desk2.ccr.corp.intel.com>

Mel Gorman writes:

> On Thu, Oct 12, 2023 at 08:08:32PM +0800, Huang, Ying wrote:
>> Mel Gorman writes:
>>
>> > On Wed, Sep 20, 2023 at 02:18:48PM +0800, Huang Ying wrote:
>> >> Per-CPU data cache size is useful information. For example, it can be
>> >> used to determine per-CPU cache size. So, in this patch, the data
>> >> cache size for each CPU is calculated via data_cache_size /
>> >> shared_cpu_weight.
>> >>
>> >> A brute-force algorithm to iterate all online CPUs is used to avoid
>> >> to allocate an extra cpumask, especially in offline callback.
>> >>
>> >> Signed-off-by: "Huang, Ying"
>> >
>> > It's not necessarily relevant to the patch, but at least the scheduler
>> > also stores some per-cpu topology information such as sd_llc_size -- the
>> > number of CPUs sharing the same last-level-cache as this CPU. It may be
>> > worth unifying this at some point if it's common that per-cpu
>> > information is too fine and per-zone or per-node information is too
>> > coarse. This would be particularly true when considering locking
>> > granularity,
>> >
>> >> Cc: Sudeep Holla
>> >> Cc: Andrew Morton
>> >> Cc: Mel Gorman
>> >> Cc: Vlastimil Babka
>> >> Cc: David Hildenbrand
>> >> Cc: Johannes Weiner
>> >> Cc: Dave Hansen
>> >> Cc: Michal Hocko
>> >> Cc: Pavel Tatashin
>> >> Cc: Matthew Wilcox
>> >> Cc: Christoph Lameter
>> >> ---
>> >>  drivers/base/cacheinfo.c  | 42 ++++++++++++++++++++++++++++++++++++++-
>> >>  include/linux/cacheinfo.h |  1 +
>> >>  2 files changed, 42 insertions(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>> >> index cbae8be1fe52..3e8951a3fbab 100644
>> >> --- a/drivers/base/cacheinfo.c
>> >> +++ b/drivers/base/cacheinfo.c
>> >> @@ -898,6 +898,41 @@ static int cache_add_dev(unsigned int cpu)
>> >>  	return rc;
>> >>  }
>> >>
>> >> +static void update_data_cache_size_cpu(unsigned int cpu)
>> >> +{
>> >> +	struct cpu_cacheinfo *ci;
>> >> +	struct cacheinfo *leaf;
>> >> +	unsigned int i, nr_shared;
>> >> +	unsigned int size_data = 0;
>> >> +
>> >> +	if (!per_cpu_cacheinfo(cpu))
>> >> +		return;
>> >> +
>> >> +	ci = ci_cacheinfo(cpu);
>> >> +	for (i = 0; i < cache_leaves(cpu); i++) {
>> >> +		leaf = per_cpu_cacheinfo_idx(cpu, i);
>> >> +		if (leaf->type != CACHE_TYPE_DATA &&
>> >> +		    leaf->type != CACHE_TYPE_UNIFIED)
>> >> +			continue;
>> >> +		nr_shared = cpumask_weight(&leaf->shared_cpu_map);
>> >> +		if (!nr_shared)
>> >> +			continue;
>> >> +		size_data += leaf->size / nr_shared;
>> >> +	}
>> >> +	ci->size_data = size_data;
>> >> +}
>> >
>> > This needs comments.
>> >
>> > It would be nice to add a comment on top describing the limitation of
>> > CACHE_TYPE_UNIFIED here in the context of
>> > update_data_cache_size_cpu().
>>
>> Sure. Will do that.
>>
>
> Thanks.
>
>> > The L2 cache could be unified but much smaller than a L3 or other
>> > last-level-cache. It's not clear from the code what level of cache is being
>> > used due to a lack of familiarity of the cpu_cacheinfo code but size_data
>> > is not the size of a cache, it appears to be the share of a cache a CPU
>> > would have under ideal circumstances.
>>
>> Yes. And it isn't for one specific level of cache. It's sum of per-CPU
>> shares of all levels of cache. But the calculation is inaccurate. More
>> details are in the below reply.
>>
>> > However, as it appears to also be
>> > iterating hierarchy then this may not be accurate. Caches may or may not
>> > allow data to be duplicated between levels so the value may be inaccurate.
>>
>> Thank you very much for pointing this out! The cache can be inclusive
>> or not. So, we cannot calculate the per-CPU slice of all-level caches
>> via adding them together blindly. I will change this in a follow-on
>> patch.
>>
>
> Please do, I would strongly suggest basing this on LLC only because it's
> the only value you can be sure of. This change is the only change that may
> warrant a respin of the series as the history will be somewhat confusing
> otherwise.

I am still checking whether it's possible to get cache inclusiveness
information via CPUID.

If there's no reliable way to do that, we can use the maximum of the
per-CPU shares of each level of cache. For an inclusive cache hierarchy,
that is simply the per-CPU share of the LLC. For a non-inclusive
hierarchy, the value will be more accurate than the LLC share alone. For
example, on Intel Sapphire Rapids, the L2 cache is 2 MB per core, while
the LLC is 1.875 MB per core according to [1].

[1] https://www.intel.com/content/www/us/en/developer/articles/technical/fourth-generation-xeon-scalable-family-overview.html
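
To make that concrete, here is a minimal, untested sketch of the
max-based calculation. It reuses the helpers from the original patch
plus the kernel's max() macro (include/linux/minmax.h); only the
accumulation changes, so the brute-force iteration still avoids
allocating an extra cpumask:

static void update_data_cache_size_cpu(unsigned int cpu)
{
	struct cpu_cacheinfo *ci;
	struct cacheinfo *leaf;
	unsigned int i, nr_shared;
	unsigned int size_data = 0;

	if (!per_cpu_cacheinfo(cpu))
		return;

	ci = ci_cacheinfo(cpu);
	for (i = 0; i < cache_leaves(cpu); i++) {
		leaf = per_cpu_cacheinfo_idx(cpu, i);
		if (leaf->type != CACHE_TYPE_DATA &&
		    leaf->type != CACHE_TYPE_UNIFIED)
			continue;
		nr_shared = cpumask_weight(&leaf->shared_cpu_map);
		if (!nr_shared)
			continue;
		/*
		 * Take the maximum per-CPU share across cache levels
		 * rather than the sum: for an inclusive hierarchy this
		 * yields the LLC share; for a non-inclusive one it is a
		 * conservative lower bound that is still closer to
		 * reality than the LLC share alone.
		 */
		size_data = max(size_data, leaf->size / nr_shared);
	}
	ci->size_data = size_data;
}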

I will respin the series.

Thanks a lot for the review!

--
Best Regards,
Huang, Ying