Received: by 2002:a05:6358:9144:b0:117:f937:c515 with SMTP id r4csp856822rwr; Wed, 26 Apr 2023 07:14:17 -0700 (PDT) X-Google-Smtp-Source: AKy350ZUAkxlRiQa48H3/2MNzAfKVKPSJA+WV7E7fZh6E4VlTCNV0MN3FqTwbcY52uzOeeVXV6F3 X-Received: by 2002:a17:90a:8a17:b0:246:9bad:2354 with SMTP id w23-20020a17090a8a1700b002469bad2354mr21073958pjn.43.1682518457243; Wed, 26 Apr 2023 07:14:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1682518457; cv=none; d=google.com; s=arc-20160816; b=RcgGHd/3s97IHL+ZB3+n+P5CEfdiricxU06Qsx/a6PDCYtO2Jmh6aXI8gxUUFVd5k5 f4bduvI3ud4JOyj0Ttk+fdMHJVj3PuwKlVfUYuQQLAQdIdIW7iDWD5CKeo/wipKpymf3 k3koVTyTprlSkLojeLYSHIEQey2muXofPFXqkjSRf+RjhWZ+k7cTckyN8IfzjFOlGZPe sywZIb35WrlvJE8ykplMkBG15aH7Q0Suy8F3LOSnqUbmkhSoVIxbe/BTx4xWIYO+LQhv KQI2cgmEpO1MHho7S82mWAt3MwypxpJpJabn+ovjy9/9b2XobhbtJTQxKj46iQZu1ruv 1TAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:message-id:in-reply-to :subject:cc:to:from:date:dkim-signature; bh=10kazP3wc+64gesUGgYbpydsVe/PEQJ2K3Snt+Sp17U=; b=HD7HyzTaYAq02v4C/MrqO1W7/ePFZxBAcoadHzQ9VrYnhzB38kCtx2kn1YPgrlmVHw x/SKRcrdiFCq0mTajIbo77e9M8+5adtQV6Tn2obgImuQ7TIvMy4kg1/oBRD15c3b46+S EiAAxJftf5ycKEAlFMVWG7e7PkX/TkmCkcfk0eU46tok/DlDJtDB/3oZn5Vcp+utkhce 5Xyo0JHi+iq01gK5SB+rKhU8Vi3CTv6maApGj2vo4choAY5jKTdkLm4/RjltuoR4nhFC 1yVpw/qXnvb+KKZHGx6xB5TnPLMauiOWPg/KQTr0QGko0bfQB3NGvoLX9Fn3GjELVx3G WJfg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iPko3ZLp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x1-20020a17090a46c100b002477fae9d73si15901767pjg.140.2023.04.26.07.14.05; Wed, 26 Apr 2023 07:14:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iPko3ZLp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241234AbjDZN7B (ORCPT + 99 others); Wed, 26 Apr 2023 09:59:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47708 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240570AbjDZN67 (ORCPT ); Wed, 26 Apr 2023 09:58:59 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F4D06A52; Wed, 26 Apr 2023 06:58:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1682517537; x=1714053537; h=date:from:to:cc:subject:in-reply-to:message-id: references:mime-version; bh=9SgBTnJblGwgDkEyMnQDPGqR7X3brA77OnxVHEByyUU=; b=iPko3ZLpPgl9bsAX2itFXhtp/fIrsspk/f9WmqGfxvp5YVop1cCdy7Yg a80tnePixVfY+/rwLz5+t5FWeqpwsLHnyUE5VhgpBHRMslAkKuThVfR7c Ks2MiqPny6iNNLZLnKQWRyHQnS+4WC5rE0afT0ksTuEMj3ClAKhpp6cLf alNVTv5zpGF6loZxN5C1sXsIWgYsooHLdWPWwhnMd2h/ZcmPHMkwUfjCM KelKpi1Zhb37xlQlaftM7Vyzr16BG9LlYALtMvufrCXhKWCFBisegTUS7 vQB+0EayZFzA42chqp81HF9H7nLwpn80UE5TTbHLNdUDjYOxuLHLjwUIs A==; X-IronPort-AV: E=McAfee;i="6600,9927,10691"; a="410107019" X-IronPort-AV: E=Sophos;i="5.99,228,1677571200"; d="scan'208";a="410107019" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Apr 2023 06:58:51 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10691"; a="940222385" X-IronPort-AV: E=Sophos;i="5.99,228,1677571200"; d="scan'208";a="940222385" Received: from dancaspi-mobl1.ger.corp.intel.com ([10.252.60.3]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Apr 2023 06:58:49 -0700 Date: Wed, 26 Apr 2023 16:58:42 +0300 (EEST) From: =?ISO-8859-15?Q?Ilpo_J=E4rvinen?= To: Reinette Chatre cc: linux-kselftest@vger.kernel.org, Fenghua Yu , Shuah Khan , LKML , Shaopeng Tan Subject: Re: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test In-Reply-To: Message-ID: References: <20230418114506.46788-1-ilpo.jarvinen@linux.intel.com> <20230418114506.46788-25-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="8323329-289570517-1682517530=:2068" X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --8323329-289570517-1682517530=:2068 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT On Fri, 21 Apr 2023, Reinette Chatre wrote: > Hi Ilpo, > > On 4/18/2023 4:45 AM, Ilpo Järvinen wrote: > > CAT test spawns two processes into two different control groups with > > exclusive schemata. Both the processes alloc a buffer from memory > > matching their allocated LLC block size and flush the entire buffer out > > of caches. Since the processes are reading through the buffer only once > > during the measurement and initially all the buffer was flushed, the > > test isn't testing CAT. > > > > Rewrite the CAT test to allocated a buffer sized to half of LLC. Then > > "allocated a buffer" -> "allocate a buffer" ? > > > perform a sequence of tests with different LLC alloc sizes starting > > from half of the CBM bits down to 1-bit CBM. Flush the buffer before > > each test and read the buffer twice. Observe the LLC misses on the > > second read through the buffer. As the allocated LLC block gets smaller > > and smaller, the LLC misses will become larger and larger giving a > > strong signal on CAT working properly. > > Since the changelog starts by describing the CAT test needing two > processes I think it would help to highlight that this test uses a > single process. I think it would also help to describing how the cache > is used by the rest while this test is running. Sure, good points, I'll add the info. > > Suggested-by: Reinette Chatre > > Signed-off-by: Ilpo Järvinen > > --- > > tools/testing/selftests/resctrl/cache.c | 20 +- > > tools/testing/selftests/resctrl/cat_test.c | 204 +++++++++------------ > > 2 files changed, 97 insertions(+), 127 deletions(-) > > > > diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c > > index 7970239413da..64f08ba5edc2 100644 > > --- a/tools/testing/selftests/resctrl/cache.c > > +++ b/tools/testing/selftests/resctrl/cache.c > > @@ -224,10 +224,10 @@ int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid) > > */ > > int cat_val(struct resctrl_val_param *param) > > { > > - int memflush = 1, operation = 0, ret = 0; > > char *resctrl_val = param->resctrl_val; > > unsigned long llc_perf_miss = 0; > > pid_t bm_pid; > > + int ret; > > > > if (strcmp(param->filename, "") == 0) > > sprintf(param->filename, "stdio"); > > @@ -245,6 +245,10 @@ int cat_val(struct resctrl_val_param *param) > > if (ret) > > return ret; > > > > + ret = alloc_buffer(param->span, 1); > > + if (ret) > > + return ret; > > + > > initialize_llc_perf(); > > > > /* Test runs until the callback setup() tells the test to stop. */ > > @@ -256,17 +260,15 @@ int cat_val(struct resctrl_val_param *param) > > } > > if (ret < 0) > > break; > > + > > + flush_buffer(param->span); > > + use_buffer(param->span, 0, true); > > + > > ret = reset_enable_llc_perf(bm_pid, param->cpu_no); > > if (ret) > > break; > > > > - if (run_fill_buf(param->span, memflush, operation, true)) { > > - fprintf(stderr, "Error-running fill buffer\n"); > > - ret = -1; > > - break; > > - } > > - > > - sleep(1); > > + use_buffer(param->span, 0, true); > > > > /* Measure cache miss from perf */ > > ret = get_llc_perf(&llc_perf_miss); > > @@ -279,6 +281,8 @@ int cat_val(struct resctrl_val_param *param) > > break; > > } > > > > + free_buffer(); > > + > > return ret; > > } > > > > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c > > index 4b505fdb35d7..85053829b9c5 100644 > > --- a/tools/testing/selftests/resctrl/cat_test.c > > +++ b/tools/testing/selftests/resctrl/cat_test.c > > @@ -11,11 +11,12 @@ > > #include "resctrl.h" > > #include > > > > -#define RESULT_FILE_NAME1 "result_cat1" > > -#define RESULT_FILE_NAME2 "result_cat2" > > -#define NUM_OF_RUNS 5 > > -#define MAX_DIFF_PERCENT 4 > > -#define MAX_DIFF 1000000 > > +#define RESULT_FILE_NAME "result_cat" > > +#define NUM_OF_RUNS 5 > > +#define MIN_DIFF_PERCENT_PER_BIT 2 > > Could you please start a new trend that adds documentation > that explains what this constant means and how it was chosen? I can try although that particular 2 was a bit handwavy that just seems to work with the tests I performed. > > +static unsigned long current_mask; > > +static long prev_avg_llc_val; > > > > /* > > * Change schemata. Write schemata to specified > > @@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p) > > int ret = 0; > > > > /* Run NUM_OF_RUNS times */ > > - if (p->num_of_runs >= NUM_OF_RUNS) > > - return END_OF_TESTS; > > + if (p->num_of_runs >= NUM_OF_RUNS) { > > + /* Remove one bit from the consecutive block */ > > + current_mask &= current_mask >> 1; > > + if (!current_mask) > > + return END_OF_TESTS; > > + > > + p->num_of_runs = 0; > > This seems like a workaround to get the schemata to be written. It is > problematic since now p->num_of_runs no longer accurately reflects the > number of test runs. This is already the case. MBA test works around this very same problem by using a custom static variable (runs_per_allocation) which is reset to 0 every NUM_OF_RUNS tests and not keeping ->num_of_runs at all. If MBA test would replace runs_per_allocation with use of ->num_of_runs, it would match what the new CAT test does. Nothing currently relies on ->num_of_runs counting across the different "tests" that are run inside CAT and MBA tests. And I don't have anything immediately around the corner that would require ->num_of_runs to count total number of repetitions that were ran. I guess it would be possible to attempt to consolidate that second layer MBA and the rewritten CAT tests need somehow into resctrl_val_param. But IMHO that too is low-prio refactor as nothing is broken as is. > I was expecting this mask manipulation to be > in cat_val() so that it is clear how test works instead of part > of the logic handled here. That seems to be moving into opposite direction from how things are currently handled. Doing it in cat_val() would be relying less on ->setup(). If that's the preferred direction, then the question becomes, should CAT test do anything in ->setup() because also the schemata writing could be done in directly cat_val(). What I would prefer not to do is to have a rule which says: if there's a test-specific function, don't use ->setup() but do any setup directly in the test-specific function but, otherwise use ->setup(). Such an inconsistency would make things hard to track. > > + } > > > > if (p->num_of_runs == 0) { > > - sprintf(schemata, "%lx", p->mask); > > - ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, > > - p->resctrl_val); > > + snprintf(schemata, sizeof(schemata), "%lx", p->mask & ~current_mask); > > + ret = write_schemata("", schemata, p->cpu_no, p->resctrl_val); > > + if (ret) > > + return ret; > > + snprintf(schemata, sizeof(schemata), "%lx", current_mask); > > + ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, p->resctrl_val); > > + if (ret) > > + return ret; > > } > > p->num_of_runs++; > > > > ... > > > @@ -126,7 +162,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) > > ret = get_mask_no_shareable(cache_type, &long_mask); > > if (ret) > > return ret; > > - count_of_bits = count_consecutive_bits(long_mask, NULL); > > + count_of_bits = count_consecutive_bits(long_mask, &start); > > > > /* Get L3/L2 cache size */ > > ret = get_cache_size(cpu_no, cache_type, &cache_size); > > @@ -143,99 +179,29 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type) > > count_of_bits - 1); > > return -1; > > } > > - > > - /* Get core id from same socket for running another thread */ > > - sibling_cpu_no = get_core_sibling(cpu_no); > > Do any users of get_core_sibling() remain after this? Correct observation, there seems to be no other users after this is removed. -- i. --8323329-289570517-1682517530=:2068--