2023-10-24 09:28:31

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 00/24] selftests/resctrl: CAT test improvements & generalized test framework

Hi all,

Here's a series to improve resctrl selftests. It contains following
improvements:

- Excludes shareable bits from CAT test allocation to avoid interference
- Alters read pattern to defeat HW prefetcher optimizations
- Rewrites CAT test to make the CAT test reliable and truly measure
if CAT is working or not
- Introduces generalized test framework making easier to add new tests
- Adds L2 CAT test
- Lots of other cleanups & refactoring

The patches up to CAT test rewrite have been earlier on the mailing list.
I've tried to address all the comments made against them back then.

This series have been tested across a large number of systems from
different generations.

Ilpo Järvinen (24):
selftests/resctrl: Split fill_buf to allow tests finer-grained control
selftests/resctrl: Refactor fill_buf functions
selftests/resctrl: Refactor get_cbm_mask()
selftests/resctrl: Mark get_cache_size() cache_type const
selftests/resctrl: Create cache_size() helper
selftests/resctrl: Exclude shareable bits from schemata in CAT test
selftests/resctrl: Split measure_cache_vals() function
selftests/resctrl: Split show_cache_info() to test specific and
generic parts
selftests/resctrl: Remove unnecessary __u64 -> unsigned long
conversion
selftests/resctrl: Remove nested calls in perf event handling
selftests/resctrl: Consolidate naming of perf event related things
selftests/resctrl: Improve perf init
selftests/resctrl: Convert perf related globals to locals
selftests/resctrl: Move cat_val() to cat_test.c and rename to
cat_test()
selftests/resctrl: Read in less obvious order to defeat prefetch
optimizations
selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test
selftests/resctrl: Create struct for input parameter
selftests/resctrl: Introduce generalized test framework
selftests/resctrl: Pass write_schemata() resource instead of test name
selftests/resctrl: Add helper to convert L2/3 to integer
selftests/resctrl: Get resource id from cache id
selftests/resctrl: Add test groups and name L3 CAT test L3_CAT
selftests/resctrl: Add L2 CAT test
selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits

tools/testing/selftests/resctrl/cache.c | 263 +++---------
tools/testing/selftests/resctrl/cat_test.c | 386 ++++++++++++------
tools/testing/selftests/resctrl/cmt_test.c | 72 +++-
tools/testing/selftests/resctrl/fill_buf.c | 114 +++---
tools/testing/selftests/resctrl/mba_test.c | 24 +-
tools/testing/selftests/resctrl/mbm_test.c | 26 +-
tools/testing/selftests/resctrl/resctrl.h | 102 ++++-
.../testing/selftests/resctrl/resctrl_tests.c | 202 ++++-----
tools/testing/selftests/resctrl/resctrl_val.c | 6 +-
tools/testing/selftests/resctrl/resctrlfs.c | 234 +++++++----
10 files changed, 807 insertions(+), 622 deletions(-)

--
2.30.2


2023-10-24 09:28:33

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 05/24] selftests/resctrl: Create cache_size() helper

CAT and CMT tests calculate the span size from the n-bits cache
allocation on their own.

Add cache_size() helper which calculates size of the cache portion for
the given number of bits and use it to replace the existing span
calculations. This also prepares for the new CAT test that will need to
determine the size of the cache portion also during results processing.

cache_size local variables were renamed out of the way to
cache_total_size.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 10 +++++-----
tools/testing/selftests/resctrl/cmt_test.c | 8 ++++----
tools/testing/selftests/resctrl/resctrl.h | 14 ++++++++++++++
3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 4852bbda2e71..80861c362a53 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -91,7 +91,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
{
unsigned long l_mask, l_mask_1;
int ret, pipefd[2], sibling_cpu_no;
- unsigned long cache_size = 0;
+ unsigned long cache_total_size = 0;
unsigned long long_mask;
int count_of_bits;
char pipe_message;
@@ -103,10 +103,10 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return ret;

/* Get L3/L2 cache size */
- ret = get_cache_size(cpu_no, cache_type, &cache_size);
+ ret = get_cache_size(cpu_no, cache_type, &cache_total_size);
if (ret)
return ret;
- ksft_print_msg("Cache size :%lu\n", cache_size);
+ ksft_print_msg("Cache size :%lu\n", cache_total_size);

/* Get max number of bits from default-cabm mask */
count_of_bits = count_bits(long_mask);
@@ -138,7 +138,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
/* Set param values for parent thread which will be allocated bitmask
* with (max_bits - n) bits
*/
- span = cache_size * (count_of_bits - n) / count_of_bits;
+ span = cache_size(cache_total_size, l_mask, long_mask);
strcpy(param.ctrlgrp, "c2");
strcpy(param.mongrp, "m2");
strcpy(param.filename, RESULT_FILE_NAME2);
@@ -160,7 +160,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
param.mask = l_mask_1;
strcpy(param.ctrlgrp, "c1");
strcpy(param.mongrp, "m1");
- span = cache_size * n / count_of_bits;
+ span = cache_size(cache_total_size, l_mask_1, long_mask);
strcpy(param.filename, RESULT_FILE_NAME1);
param.num_of_runs = 0;
param.cpu_no = sibling_cpu_no;
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index a6c79edc33cd..e8997ff5bc04 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -72,7 +72,7 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)
{
const char * const *cmd = benchmark_cmd;
const char *new_cmd[BENCHMARK_ARGS];
- unsigned long cache_size = 0;
+ unsigned long cache_total_size = 0;
unsigned long long_mask;
char *span_str = NULL;
int count_of_bits;
@@ -83,10 +83,10 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)
if (ret)
return ret;

- ret = get_cache_size(cpu_no, "L3", &cache_size);
+ ret = get_cache_size(cpu_no, "L3", &cache_total_size);
if (ret)
return ret;
- ksft_print_msg("Cache size :%lu\n", cache_size);
+ ksft_print_msg("Cache size :%lu\n", cache_total_size);

count_of_bits = count_bits(long_mask);

@@ -107,7 +107,7 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)
.setup = cmt_setup,
};

- span = cache_size * n / count_of_bits;
+ span = cache_size(cache_total_size, param.mask, long_mask);

if (strcmp(cmd[0], "fill_buf") == 0) {
/* Duplicate the command to be able to replace span in it */
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 2f3f0ee439d8..da06b2d492f9 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -117,4 +117,18 @@ int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
unsigned long max_diff_percent, unsigned long num_of_runs,
bool platform, bool cmt);

+/*
+ * cache_size - Calculate the size of a cache portion
+ * @cache_size: Cache size in bytes
+ * @mask: Cache portion mask
+ * @cache_mask: Full bitmask for the cache
+ *
+ * Return: The size of the cache portion in bytes.
+ */
+static inline int cache_size(unsigned long cache_size, unsigned long mask,
+ unsigned long cache_mask)
+{
+ return cache_size * count_bits(mask) / count_bits(cache_mask);
+}
+
#endif /* RESCTRL_H */
--
2.30.2

2023-10-24 09:28:56

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 07/24] selftests/resctrl: Split measure_cache_vals() function

The measure_cache_vals() function does a different thing depending on
the test case that called it:
- For CAT, it measures LLC perf misses.
- For CMT, it measures LLC occupancy through resctrl.

Split these two functionalities into own functions the CAT and CMT
tests can call directly.

Co-developed-by: Fenghua Yu <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cache.c | 37 ++++++++++---------
tools/testing/selftests/resctrl/resctrl.h | 2 +-
tools/testing/selftests/resctrl/resctrl_val.c | 2 +-
3 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index bcbca356d56a..299d9508221f 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -170,35 +170,36 @@ static int print_results_cache(char *filename, int bm_pid,
return 0;
}

-int measure_cache_vals(struct resctrl_val_param *param, int bm_pid)
+static int measure_llc_perf(struct resctrl_val_param *param, int bm_pid)
{
- unsigned long llc_perf_miss = 0, llc_occu_resc = 0, llc_value = 0;
+ unsigned long llc_perf_miss = 0;
int ret;

/*
* Measure cache miss from perf.
*/
- if (!strncmp(param->resctrl_val, CAT_STR, sizeof(CAT_STR))) {
- ret = get_llc_perf(&llc_perf_miss);
- if (ret < 0)
- return ret;
- llc_value = llc_perf_miss;
- }
+ ret = get_llc_perf(&llc_perf_miss);
+ if (ret < 0)
+ return ret;
+
+ ret = print_results_cache(param->filename, bm_pid, llc_perf_miss);
+ return ret;
+}
+
+int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid)
+{
+ unsigned long llc_occu_resc = 0;
+ int ret;

/*
* Measure llc occupancy from resctrl.
*/
- if (!strncmp(param->resctrl_val, CMT_STR, sizeof(CMT_STR))) {
- ret = get_llc_occu_resctrl(&llc_occu_resc);
- if (ret < 0)
- return ret;
- llc_value = llc_occu_resc;
- }
- ret = print_results_cache(param->filename, bm_pid, llc_value);
- if (ret)
+ ret = get_llc_occu_resctrl(&llc_occu_resc);
+ if (ret < 0)
return ret;

- return 0;
+ ret = print_results_cache(param->filename, bm_pid, llc_occu_resc);
+ return ret;
}

/*
@@ -253,7 +254,7 @@ int cat_val(struct resctrl_val_param *param, size_t span)
}

sleep(1);
- ret = measure_cache_vals(param, bm_pid);
+ ret = measure_llc_perf(param, bm_pid);
if (ret)
goto pe_close;
}
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 10fd3161e63a..56afdc190727 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -114,7 +114,7 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd);
unsigned int count_bits(unsigned long n);
void cmt_test_cleanup(void);
int get_core_sibling(int cpu_no);
-int measure_cache_vals(struct resctrl_val_param *param, int bm_pid);
+int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid);
int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
size_t cache_span, unsigned long max_diff,
unsigned long max_diff_percent, unsigned long num_of_runs,
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 88789678917b..43ca026c6e0f 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -830,7 +830,7 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
break;
} else if (!strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
sleep(1);
- ret = measure_cache_vals(param, bm_pid);
+ ret = measure_llc_resctrl(param, bm_pid);
if (ret)
break;
}
--
2.30.2

2023-10-24 09:28:57

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 06/24] selftests/resctrl: Exclude shareable bits from schemata in CAT test

CAT test doesn't take shareable bits into account, i.e., the test might
be sharing cache with some devices (e.g., graphics).

Introduce get_mask_no_shareable() and use it to provision an
environment for CAT test where the allocated LLC is isolated better.
Excluding shareable_bits may create hole(s) into the cbm_mask, thus add
a new helper count_contiguous_bits() to find the longest contiguous set
of CBM bits.

create_bit_mask() is needed by an upcoming CAT test rewrite so make it
available in resctrl.h right away.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 12 ++-
tools/testing/selftests/resctrl/resctrl.h | 3 +
tools/testing/selftests/resctrl/resctrlfs.c | 84 +++++++++++++++++++++
3 files changed, 95 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 80861c362a53..e5861e7cba7e 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -92,13 +92,17 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
unsigned long l_mask, l_mask_1;
int ret, pipefd[2], sibling_cpu_no;
unsigned long cache_total_size = 0;
- unsigned long long_mask;
+ unsigned long full_cache_mask, long_mask;
int count_of_bits;
char pipe_message;
size_t span;

/* Get default cbm mask for L3/L2 cache */
- ret = get_cbm_mask(cache_type, &long_mask);
+ ret = get_cbm_mask(cache_type, &full_cache_mask);
+ if (ret)
+ return ret;
+ /* Get the exclusive portion of the cache */
+ ret = get_mask_no_shareable(cache_type, &long_mask);
if (ret)
return ret;

@@ -138,7 +142,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
/* Set param values for parent thread which will be allocated bitmask
* with (max_bits - n) bits
*/
- span = cache_size(cache_total_size, l_mask, long_mask);
+ span = cache_size(cache_total_size, l_mask, full_cache_mask);
strcpy(param.ctrlgrp, "c2");
strcpy(param.mongrp, "m2");
strcpy(param.filename, RESULT_FILE_NAME2);
@@ -160,7 +164,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
param.mask = l_mask_1;
strcpy(param.ctrlgrp, "c1");
strcpy(param.mongrp, "m1");
- span = cache_size(cache_total_size, l_mask_1, long_mask);
+ span = cache_size(cache_total_size, l_mask_1, full_cache_mask);
strcpy(param.filename, RESULT_FILE_NAME1);
param.num_of_runs = 0;
param.cpu_no = sibling_cpu_no;
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index da06b2d492f9..10fd3161e63a 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -99,7 +99,10 @@ void tests_cleanup(void);
void mbm_test_cleanup(void);
int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd);
void mba_test_cleanup(void);
+unsigned long create_bit_mask(unsigned int start, unsigned int len);
int get_cbm_mask(const char *cache_type, unsigned long *mask);
+int get_shareable_mask(const char *cache_type, unsigned long *shareable_mask);
+int get_mask_no_shareable(const char *cache_type, unsigned long *mask);
int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size);
void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
int signal_handler_register(void);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 46fb0441818d..02b04878121f 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -228,6 +228,44 @@ static int get_bit_mask(const char *filename, unsigned long *mask)
return 0;
}

+/*
+ * create_bit_mask- Create bit mask from start,len pair
+ * @start: LSB of the mask
+ * @len Number of bits in the mask
+ */
+unsigned long create_bit_mask(unsigned int start, unsigned int len)
+{
+ return ((1UL << len) - 1UL) << start;
+}
+
+/*
+ * count_contiguous_bits - Returns the longest train of bits in a bit mask
+ * @val A bit mask
+ * @start The location of the least-significant bit of the longest train
+ *
+ * Return: The length of the contiguous bits in the longest train of bits
+ */
+static unsigned int count_contiguous_bits(unsigned long val, unsigned int *start)
+{
+ unsigned long last_val;
+ int count = 0;
+
+ while (val) {
+ last_val = val;
+ val &= (val >> 1);
+ count++;
+ }
+
+ if (start) {
+ if (count)
+ *start = ffsl(last_val) - 1;
+ else
+ *start = 0;
+ }
+
+ return count;
+}
+
/*
* get_cbm_mask - Get cbm bit mask
* @cache_type: Cache level L2/L3
@@ -253,6 +291,52 @@ int get_cbm_mask(const char *cache_type, unsigned long *mask)
return 0;
}

+/*
+ * get_shareable_mask - Get shareable mask from shareable_bits for given cache
+ * @cache_type: Cache level L2/L3
+ * @shareable_mask: shareable mask returned as unsigned long
+ *
+ * Return: = 0 on success, < 0 on failure.
+ */
+int get_shareable_mask(const char *cache_type, unsigned long *shareable_mask)
+{
+ char mask_path[1024];
+
+ if (!cache_type)
+ return -1;
+
+ snprintf(mask_path, sizeof(mask_path), "%s/%s/shareable_bits",
+ INFO_PATH, cache_type);
+
+ return get_bit_mask(mask_path, shareable_mask);
+}
+
+/*
+ * get_mask_no_shareable - Get CBM mask without shareable_bits for given cache
+ * @cache_type: Cache level L2/L3
+ * @mask: mask returned as unsigned long
+ *
+ * Return: = 0 on success, < 0 on failure.
+ */
+int get_mask_no_shareable(const char *cache_type, unsigned long *mask)
+{
+ unsigned long full_mask, shareable_mask;
+ unsigned int start, len;
+
+ if (get_cbm_mask(cache_type, &full_mask) < 0)
+ return -1;
+ if (get_shareable_mask(cache_type, &shareable_mask) < 0)
+ return -1;
+
+ len = count_contiguous_bits(full_mask & ~shareable_mask, &start);
+ if (!len)
+ return -1;
+
+ *mask = create_bit_mask(start, len);
+
+ return 0;
+}
+
/*
* get_core_sibling - Get sibling core id from the same socket for given CPU
* @cpu_no: CPU number
--
2.30.2

2023-10-24 09:29:36

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 11/24] selftests/resctrl: Consolidate naming of perf event related things

Naming for perf event related functions, types, and variables is
currently inconsistent.

Make struct read_format and all functions related to perf events start
with perf_event. Adjust variable names towards the same direction
but use shorter names for variables where appropriate (pe prefix).

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cache.c | 46 ++++++++++++-------------
1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 208af1ecae28..a70ace82e76e 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -3,7 +3,7 @@
#include <stdint.h>
#include "resctrl.h"

-struct read_format {
+struct perf_event_read {
__u64 nr; /* The number of events */
struct {
__u64 value; /* The value of the event */
@@ -11,11 +11,11 @@ struct read_format {
};

static struct perf_event_attr pea_llc_miss;
-static struct read_format rf_cqm;
-static int fd_lm;
+static struct perf_event_read pe_read;
+static int pe_fd;
char llc_occup_path[1024];

-static void initialize_perf_event_attr(void)
+static void perf_event_attr_initialize(void)
{
pea_llc_miss.type = PERF_TYPE_HARDWARE;
pea_llc_miss.size = sizeof(struct perf_event_attr);
@@ -29,31 +29,31 @@ static void initialize_perf_event_attr(void)
pea_llc_miss.disabled = 1;
}

-static void initialize_llc_perf(void)
+static void perf_event_initialize(void)
{
memset(&pea_llc_miss, 0, sizeof(struct perf_event_attr));
- memset(&rf_cqm, 0, sizeof(struct read_format));
+ memset(&pe_read, 0, sizeof(struct perf_event_read));

/* Initialize perf_event_attr structures for HW_CACHE_MISSES */
- initialize_perf_event_attr();
+ perf_event_attr_initialize();

pea_llc_miss.config = PERF_COUNT_HW_CACHE_MISSES;

- rf_cqm.nr = 1;
+ pe_read.nr = 1;
}

-static int reset_enable_llc_perf(pid_t pid, int cpu_no)
+static int perf_event_reset_enable(pid_t pid, int cpu_no)
{
- fd_lm = perf_event_open(&pea_llc_miss, pid, cpu_no, -1, PERF_FLAG_FD_CLOEXEC);
- if (fd_lm == -1) {
+ pe_fd = perf_event_open(&pea_llc_miss, pid, cpu_no, -1, PERF_FLAG_FD_CLOEXEC);
+ if (pe_fd == -1) {
perror("Error opening leader");
ctrlc_handler(0, NULL, NULL);
return -1;
}

/* Start counters to log values */
- ioctl(fd_lm, PERF_EVENT_IOC_RESET, 0);
- ioctl(fd_lm, PERF_EVENT_IOC_ENABLE, 0);
+ ioctl(pe_fd, PERF_EVENT_IOC_RESET, 0);
+ ioctl(pe_fd, PERF_EVENT_IOC_ENABLE, 0);

return 0;
}
@@ -122,28 +122,28 @@ static int print_results_cache(char *filename, int bm_pid, __u64 llc_value)
}

/*
- * measure_llc_perf: measure perf events
+ * perf_event_measure: measure perf events
* @bm_pid: child pid that runs benchmark
*
* Measure things like cache misses from perf events.
*
* Return: =0 on success. <0 on failure.
*/
-static int measure_llc_perf(struct resctrl_val_param *param, int bm_pid)
+static int perf_event_measure(struct resctrl_val_param *param, int bm_pid)
{
int ret;

/* Stop counters after one span to get miss rate */
- ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
+ ioctl(pe_fd, PERF_EVENT_IOC_DISABLE, 0);

- ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
- close(fd_lm);
+ ret = read(pe_fd, &pe_read, sizeof(struct perf_event_read));
+ close(pe_fd);
if (ret == -1) {
perror("Could not get perf value");
return -1;
}

- return print_results_cache(param->filename, bm_pid, rf_cqm.values[0].value);
+ return print_results_cache(param->filename, bm_pid, pe_read.values[0].value);
}

int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid)
@@ -192,7 +192,7 @@ int cat_val(struct resctrl_val_param *param, size_t span)
if (ret)
return ret;

- initialize_llc_perf();
+ perf_event_initialize();

/* Test runs until the callback setup() tells the test to stop. */
while (1) {
@@ -203,7 +203,7 @@ int cat_val(struct resctrl_val_param *param, size_t span)
}
if (ret < 0)
break;
- ret = reset_enable_llc_perf(bm_pid, param->cpu_no);
+ ret = perf_event_reset_enable(bm_pid, param->cpu_no);
if (ret)
break;

@@ -214,7 +214,7 @@ int cat_val(struct resctrl_val_param *param, size_t span)
}

sleep(1);
- ret = measure_llc_perf(param, bm_pid);
+ ret = perf_event_measure(param, bm_pid);
if (ret)
goto pe_close;
}
@@ -222,7 +222,7 @@ int cat_val(struct resctrl_val_param *param, size_t span)
return ret;

pe_close:
- close(fd_lm);
+ close(pe_fd);
return ret;
}

--
2.30.2

2023-10-24 09:29:38

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 10/24] selftests/resctrl: Remove nested calls in perf event handling

Perf event handling has functions that are the sole caller of another
perf event handling related function:
- reset_enable_llc_perf() calls perf_event_open_llc_miss()
- reset_enable_llc_perf() calls ioctl_perf_event_ioc_reset_enable()
- measure_llc_perf() calls get_llc_perf()

Remove the extra layer of calls to make the code easier to follow by
moving the code into the calling function.

In addition, converts print_results_cache() unsigned long parameter to
__u64 that matches the type coming from perf.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cache.c | 86 +++++++------------------
1 file changed, 25 insertions(+), 61 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index d39ef4eebc37..208af1ecae28 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -29,25 +29,6 @@ static void initialize_perf_event_attr(void)
pea_llc_miss.disabled = 1;
}

-static void ioctl_perf_event_ioc_reset_enable(void)
-{
- ioctl(fd_lm, PERF_EVENT_IOC_RESET, 0);
- ioctl(fd_lm, PERF_EVENT_IOC_ENABLE, 0);
-}
-
-static int perf_event_open_llc_miss(pid_t pid, int cpu_no)
-{
- fd_lm = perf_event_open(&pea_llc_miss, pid, cpu_no, -1,
- PERF_FLAG_FD_CLOEXEC);
- if (fd_lm == -1) {
- perror("Error opening leader");
- ctrlc_handler(0, NULL, NULL);
- return -1;
- }
-
- return 0;
-}
-
static void initialize_llc_perf(void)
{
memset(&pea_llc_miss, 0, sizeof(struct perf_event_attr));
@@ -63,42 +44,16 @@ static void initialize_llc_perf(void)

static int reset_enable_llc_perf(pid_t pid, int cpu_no)
{
- int ret = 0;
-
- ret = perf_event_open_llc_miss(pid, cpu_no);
- if (ret < 0)
- return ret;
-
- /* Start counters to log values */
- ioctl_perf_event_ioc_reset_enable();
-
- return 0;
-}
-
-/*
- * get_llc_perf: llc cache miss through perf events
- * @llc_perf_miss: LLC miss counter that is filled on success
- *
- * Perf events like HW_CACHE_MISSES could be used to validate number of
- * cache lines allocated.
- *
- * Return: =0 on success. <0 on failure.
- */
-static int get_llc_perf(__u64 *llc_perf_miss)
-{
- int ret;
-
- /* Stop counters after one span to get miss rate */
-
- ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
-
- ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
- if (ret == -1) {
- perror("Could not get llc misses through perf");
+ fd_lm = perf_event_open(&pea_llc_miss, pid, cpu_no, -1, PERF_FLAG_FD_CLOEXEC);
+ if (fd_lm == -1) {
+ perror("Error opening leader");
+ ctrlc_handler(0, NULL, NULL);
return -1;
}

- *llc_perf_miss = rf_cqm.values[0].value;
+ /* Start counters to log values */
+ ioctl(fd_lm, PERF_EVENT_IOC_RESET, 0);
+ ioctl(fd_lm, PERF_EVENT_IOC_ENABLE, 0);

return 0;
}
@@ -166,20 +121,29 @@ static int print_results_cache(char *filename, int bm_pid, __u64 llc_value)
return 0;
}

+/*
+ * measure_llc_perf: measure perf events
+ * @bm_pid: child pid that runs benchmark
+ *
+ * Measure things like cache misses from perf events.
+ *
+ * Return: =0 on success. <0 on failure.
+ */
static int measure_llc_perf(struct resctrl_val_param *param, int bm_pid)
{
- __u64 llc_perf_miss = 0;
int ret;

- /*
- * Measure cache miss from perf.
- */
- ret = get_llc_perf(&llc_perf_miss);
- if (ret < 0)
- return ret;
+ /* Stop counters after one span to get miss rate */
+ ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);

- ret = print_results_cache(param->filename, bm_pid, llc_perf_miss);
- return ret;
+ ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
+ close(fd_lm);
+ if (ret == -1) {
+ perror("Could not get perf value");
+ return -1;
+ }
+
+ return print_results_cache(param->filename, bm_pid, rf_cqm.values[0].value);
}

int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid)
--
2.30.2

2023-10-24 09:29:39

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 04/24] selftests/resctrl: Mark get_cache_size() cache_type const

get_cache_size() does not modify cache_type so it could be const.

Mark cache_type const so that const char * can be passed to it. This
prevents warnings once many of the test parameters are marked const.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/resctrl.h | 2 +-
tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index e95121a113f3..2f3f0ee439d8 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -100,7 +100,7 @@ void mbm_test_cleanup(void);
int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd);
void mba_test_cleanup(void);
int get_cbm_mask(const char *cache_type, unsigned long *mask);
-int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size);
+int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size);
void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
int signal_handler_register(void);
void signal_handler_unregister(void);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 220dc83748ca..46fb0441818d 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -138,7 +138,7 @@ int get_resource_id(int cpu_no, int *resource_id)
*
* Return: = 0 on success, < 0 on failure.
*/
-int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size)
+int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size)
{
char cache_path[1024], cache_str[64];
int length, i, cache_num;
--
2.30.2

2023-10-24 09:29:42

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 12/24] selftests/resctrl: Improve perf init

struct perf_event_attr initialization is spread into
perf_event_initialize() and perf_event_attr_initialize() and setting
->config is hardcoded by the deepest level.

perf_event_attr init belongs to perf_event_attr_initialize() so move it
entirely there. Rename the other function
perf_event_initialized_read_format().

Call each init function directly from the test as they will take
different parameters (especially tue after the perf related global
variables are moved to local variables).

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cache.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index a70ace82e76e..a84679a4ac0c 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -15,8 +15,9 @@ static struct perf_event_read pe_read;
static int pe_fd;
char llc_occup_path[1024];

-static void perf_event_attr_initialize(void)
+static void perf_event_attr_initialize(__u64 config)
{
+ memset(&pea_llc_miss, 0, sizeof(struct perf_event_attr));
pea_llc_miss.type = PERF_TYPE_HARDWARE;
pea_llc_miss.size = sizeof(struct perf_event_attr);
pea_llc_miss.read_format = PERF_FORMAT_GROUP;
@@ -27,18 +28,12 @@ static void perf_event_attr_initialize(void)
pea_llc_miss.inherit = 1;
pea_llc_miss.exclude_guest = 1;
pea_llc_miss.disabled = 1;
+ pea_llc_miss.config = config;
}

-static void perf_event_initialize(void)
+static void perf_event_initialize_read_format(void)
{
- memset(&pea_llc_miss, 0, sizeof(struct perf_event_attr));
memset(&pe_read, 0, sizeof(struct perf_event_read));
-
- /* Initialize perf_event_attr structures for HW_CACHE_MISSES */
- perf_event_attr_initialize();
-
- pea_llc_miss.config = PERF_COUNT_HW_CACHE_MISSES;
-
pe_read.nr = 1;
}

@@ -192,7 +187,8 @@ int cat_val(struct resctrl_val_param *param, size_t span)
if (ret)
return ret;

- perf_event_initialize();
+ perf_event_attr_initialize(PERF_COUNT_HW_CACHE_MISSES);
+ perf_event_initialize_read_format();

/* Test runs until the callback setup() tells the test to stop. */
while (1) {
--
2.30.2

2023-10-24 09:29:43

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 08/24] selftests/resctrl: Split show_cache_info() to test specific and generic parts

show_cache_info() calculates results and provides generic cache
information. This makes it hard to alter pass/fail conditions.

Separate the test specific checks into CAT and CMT test files and
leave only the generic information part into show_cache_info().

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cache.c | 40 ++++------------------
tools/testing/selftests/resctrl/cat_test.c | 30 ++++++++++++++--
tools/testing/selftests/resctrl/cmt_test.c | 32 +++++++++++++++--
tools/testing/selftests/resctrl/resctrl.h | 6 ++--
4 files changed, 65 insertions(+), 43 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 299d9508221f..95489d4b42b7 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -267,43 +267,17 @@ int cat_val(struct resctrl_val_param *param, size_t span)
}

/*
- * show_cache_info: show cache test result information
- * @sum_llc_val: sum of LLC cache result data
+ * show_cache_info: show generic cache test information
* @no_of_bits: number of bits
- * @cache_span: cache span in bytes for CMT or in lines for CAT
- * @max_diff: max difference
- * @max_diff_percent: max difference percentage
- * @num_of_runs: number of runs
- * @platform: show test information on this platform
- * @cmt: CMT test or CAT test
- *
- * Return: 0 on success. non-zero on failure.
+ * @avg_llc_val: avg of LLC cache result data
+ * @cache_span: cache span
+ * @lines: cache span in lines or bytes
*/
-int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
- size_t cache_span, unsigned long max_diff,
- unsigned long max_diff_percent, unsigned long num_of_runs,
- bool platform, bool cmt)
+void show_cache_info(int no_of_bits, unsigned long avg_llc_val,
+ size_t cache_span, bool lines)
{
- unsigned long avg_llc_val = 0;
- float diff_percent;
- long avg_diff = 0;
- int ret;
-
- avg_llc_val = sum_llc_val / num_of_runs;
- avg_diff = (long)abs(cache_span - avg_llc_val);
- diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
-
- ret = platform && abs((int)diff_percent) > max_diff_percent &&
- (cmt ? (abs(avg_diff) > max_diff) : true);
-
- ksft_print_msg("%s Check cache miss rate within %lu%%\n",
- ret ? "Fail:" : "Pass:", max_diff_percent);
-
- ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
ksft_print_msg("Number of bits: %d\n", no_of_bits);
ksft_print_msg("Average LLC val: %lu\n", avg_llc_val);
- ksft_print_msg("Cache span (%s): %zu\n", cmt ? "bytes" : "lines",
+ ksft_print_msg("Cache span (%s): %zu\n", !lines ? "bytes" : "lines",
cache_span);
-
- return ret;
}
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index e5861e7cba7e..32f6d612a3e7 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -41,6 +41,30 @@ static int cat_setup(struct resctrl_val_param *p)
return ret;
}

+static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
+ unsigned long cache_span, unsigned long max_diff,
+ unsigned long max_diff_percent, unsigned long num_of_runs,
+ bool platform)
+{
+ unsigned long avg_llc_val = 0;
+ float diff_percent;
+ int ret;
+
+ avg_llc_val = sum_llc_val / num_of_runs;
+ diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
+
+ ret = platform && abs((int)diff_percent) > max_diff_percent;
+
+ ksft_print_msg("%s Check cache miss rate within %lu%%\n",
+ ret ? "Fail:" : "Pass:", max_diff_percent);
+
+ ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
+
+ show_cache_info(no_of_bits, avg_llc_val, cache_span, true);
+
+ return ret;
+}
+
static int check_results(struct resctrl_val_param *param, size_t span)
{
char *token_array[8], temp[512];
@@ -76,9 +100,9 @@ static int check_results(struct resctrl_val_param *param, size_t span)
fclose(fp);
no_of_bits = count_bits(param->mask);

- return show_cache_info(sum_llc_perf_miss, no_of_bits, span / 64,
- MAX_DIFF, MAX_DIFF_PERCENT, runs - 1,
- get_vendor() == ARCH_INTEL, false);
+ return show_results_info(sum_llc_perf_miss, no_of_bits, span / 64,
+ MAX_DIFF, MAX_DIFF_PERCENT, runs - 1,
+ get_vendor() == ARCH_INTEL);
}

void cat_test_cleanup(void)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index e8997ff5bc04..702ea87cd473 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -27,6 +27,33 @@ static int cmt_setup(struct resctrl_val_param *p)
return 0;
}

+static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
+ unsigned long cache_span, unsigned long max_diff,
+ unsigned long max_diff_percent, unsigned long num_of_runs,
+ bool platform)
+{
+ unsigned long avg_llc_val = 0;
+ float diff_percent;
+ long avg_diff = 0;
+ int ret;
+
+ avg_llc_val = sum_llc_val / num_of_runs;
+ avg_diff = (long)abs(cache_span - avg_llc_val);
+ diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
+
+ ret = platform && abs((int)diff_percent) > max_diff_percent &&
+ abs(avg_diff) > max_diff;
+
+ ksft_print_msg("%s Check cache miss rate within %lu%%\n",
+ ret ? "Fail:" : "Pass:", max_diff_percent);
+
+ ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
+
+ show_cache_info(no_of_bits, avg_llc_val, cache_span, false);
+
+ return ret;
+}
+
static int check_results(struct resctrl_val_param *param, size_t span, int no_of_bits)
{
char *token_array[8], temp[512];
@@ -58,9 +85,8 @@ static int check_results(struct resctrl_val_param *param, size_t span, int no_of
}
fclose(fp);

- return show_cache_info(sum_llc_occu_resc, no_of_bits, span,
- MAX_DIFF, MAX_DIFF_PERCENT, runs - 1,
- true, true);
+ return show_results_info(sum_llc_occu_resc, no_of_bits, span,
+ MAX_DIFF, MAX_DIFF_PERCENT, runs - 1, true);
}

void cmt_test_cleanup(void)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 56afdc190727..c7655714b23f 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -115,10 +115,8 @@ unsigned int count_bits(unsigned long n);
void cmt_test_cleanup(void);
int get_core_sibling(int cpu_no);
int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid);
-int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
- size_t cache_span, unsigned long max_diff,
- unsigned long max_diff_percent, unsigned long num_of_runs,
- bool platform, bool cmt);
+void show_cache_info(int no_of_bits, unsigned long avg_llc_val,
+ size_t cache_span, bool lines);

/*
* cache_size - Calculate the size of a cache portion
--
2.30.2

2023-10-24 09:29:48

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 13/24] selftests/resctrl: Convert perf related globals to locals

Perf related variables pea_llc_miss, pe_read, and pe_fd are globals in
cache.c.

Convert them to locals for better scoping and make pea_llc_miss simpler
by renaming it to pea. Make close(pe_fd) handling easier to understand
by doing it inside cat_val().

Make also sizeof()s use safer way determine the right struct.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cache.c | 69 ++++++++++++++-----------
1 file changed, 38 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index a84679a4ac0c..a65e2e35c33c 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -10,36 +10,35 @@ struct perf_event_read {
} values[2];
};

-static struct perf_event_attr pea_llc_miss;
-static struct perf_event_read pe_read;
-static int pe_fd;
char llc_occup_path[1024];

-static void perf_event_attr_initialize(__u64 config)
+static void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
{
- memset(&pea_llc_miss, 0, sizeof(struct perf_event_attr));
- pea_llc_miss.type = PERF_TYPE_HARDWARE;
- pea_llc_miss.size = sizeof(struct perf_event_attr);
- pea_llc_miss.read_format = PERF_FORMAT_GROUP;
- pea_llc_miss.exclude_kernel = 1;
- pea_llc_miss.exclude_hv = 1;
- pea_llc_miss.exclude_idle = 1;
- pea_llc_miss.exclude_callchain_kernel = 1;
- pea_llc_miss.inherit = 1;
- pea_llc_miss.exclude_guest = 1;
- pea_llc_miss.disabled = 1;
- pea_llc_miss.config = config;
+ memset(pea, 0, sizeof(*pea));
+ pea->type = PERF_TYPE_HARDWARE;
+ pea->size = sizeof(struct perf_event_attr);
+ pea->read_format = PERF_FORMAT_GROUP;
+ pea->exclude_kernel = 1;
+ pea->exclude_hv = 1;
+ pea->exclude_idle = 1;
+ pea->exclude_callchain_kernel = 1;
+ pea->inherit = 1;
+ pea->exclude_guest = 1;
+ pea->disabled = 1;
+ pea->config = config;
}

-static void perf_event_initialize_read_format(void)
+static void perf_event_initialize_read_format(struct perf_event_read *pe_read)
{
- memset(&pe_read, 0, sizeof(struct perf_event_read));
- pe_read.nr = 1;
+ memset(pe_read, 0, sizeof(*pe_read));
+ pe_read->nr = 1;
}

-static int perf_event_reset_enable(pid_t pid, int cpu_no)
+static int perf_event_reset_enable(struct perf_event_attr *pea, pid_t pid, int cpu_no)
{
- pe_fd = perf_event_open(&pea_llc_miss, pid, cpu_no, -1, PERF_FLAG_FD_CLOEXEC);
+ int pe_fd;
+
+ pe_fd = perf_event_open(pea, pid, cpu_no, -1, PERF_FLAG_FD_CLOEXEC);
if (pe_fd == -1) {
perror("Error opening leader");
ctrlc_handler(0, NULL, NULL);
@@ -50,7 +49,7 @@ static int perf_event_reset_enable(pid_t pid, int cpu_no)
ioctl(pe_fd, PERF_EVENT_IOC_RESET, 0);
ioctl(pe_fd, PERF_EVENT_IOC_ENABLE, 0);

- return 0;
+ return pe_fd;
}

/*
@@ -124,21 +123,21 @@ static int print_results_cache(char *filename, int bm_pid, __u64 llc_value)
*
* Return: =0 on success. <0 on failure.
*/
-static int perf_event_measure(struct resctrl_val_param *param, int bm_pid)
+static int perf_event_measure(int pe_fd, struct perf_event_read *pe_read,
+ struct resctrl_val_param *param, int bm_pid)
{
int ret;

/* Stop counters after one span to get miss rate */
ioctl(pe_fd, PERF_EVENT_IOC_DISABLE, 0);

- ret = read(pe_fd, &pe_read, sizeof(struct perf_event_read));
- close(pe_fd);
+ ret = read(pe_fd, pe_read, sizeof(*pe_read));
if (ret == -1) {
perror("Could not get perf value");
return -1;
}

- return print_results_cache(param->filename, bm_pid, pe_read.values[0].value);
+ return print_results_cache(param->filename, bm_pid, pe_read->values[0].value);
}

int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid)
@@ -169,7 +168,10 @@ int cat_val(struct resctrl_val_param *param, size_t span)
{
int memflush = 1, operation = 0, ret = 0;
char *resctrl_val = param->resctrl_val;
+ static struct perf_event_read pe_read;
+ struct perf_event_attr pea;
pid_t bm_pid;
+ int pe_fd;

if (strcmp(param->filename, "") == 0)
sprintf(param->filename, "stdio");
@@ -187,8 +189,8 @@ int cat_val(struct resctrl_val_param *param, size_t span)
if (ret)
return ret;

- perf_event_attr_initialize(PERF_COUNT_HW_CACHE_MISSES);
- perf_event_initialize_read_format();
+ perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
+ perf_event_initialize_read_format(&pe_read);

/* Test runs until the callback setup() tells the test to stop. */
while (1) {
@@ -199,9 +201,12 @@ int cat_val(struct resctrl_val_param *param, size_t span)
}
if (ret < 0)
break;
- ret = perf_event_reset_enable(bm_pid, param->cpu_no);
- if (ret)
+
+ pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
+ if (pe_fd < 0) {
+ ret = -1;
break;
+ }

if (run_fill_buf(span, memflush, operation, true)) {
fprintf(stderr, "Error-running fill buffer\n");
@@ -210,9 +215,11 @@ int cat_val(struct resctrl_val_param *param, size_t span)
}

sleep(1);
- ret = perf_event_measure(param, bm_pid);
+ ret = perf_event_measure(pe_fd, &pe_read, param, bm_pid);
if (ret)
goto pe_close;
+
+ close(pe_fd);
}

return ret;
--
2.30.2

2023-10-24 09:29:58

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 14/24] selftests/resctrl: Move cat_val() to cat_test.c and rename to cat_test()

The main CAT test function is called cat_val() and resides in cache.c
which is illogical.

Rename the function to cat_test() and move it into cat_test.c.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cache.c | 90 ++--------------------
tools/testing/selftests/resctrl/cat_test.c | 73 +++++++++++++++++-
tools/testing/selftests/resctrl/resctrl.h | 14 +++-
3 files changed, 90 insertions(+), 87 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index a65e2e35c33c..c4cb3cb8c2c9 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -3,16 +3,9 @@
#include <stdint.h>
#include "resctrl.h"

-struct perf_event_read {
- __u64 nr; /* The number of events */
- struct {
- __u64 value; /* The value of the event */
- } values[2];
-};
-
char llc_occup_path[1024];

-static void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
+void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
{
memset(pea, 0, sizeof(*pea));
pea->type = PERF_TYPE_HARDWARE;
@@ -28,13 +21,13 @@ static void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config
pea->config = config;
}

-static void perf_event_initialize_read_format(struct perf_event_read *pe_read)
+void perf_event_initialize_read_format(struct perf_event_read *pe_read)
{
memset(pe_read, 0, sizeof(*pe_read));
pe_read->nr = 1;
}

-static int perf_event_reset_enable(struct perf_event_attr *pea, pid_t pid, int cpu_no)
+int perf_event_reset_enable(struct perf_event_attr *pea, pid_t pid, int cpu_no)
{
int pe_fd;

@@ -123,8 +116,8 @@ static int print_results_cache(char *filename, int bm_pid, __u64 llc_value)
*
* Return: =0 on success. <0 on failure.
*/
-static int perf_event_measure(int pe_fd, struct perf_event_read *pe_read,
- struct resctrl_val_param *param, int bm_pid)
+int perf_event_measure(int pe_fd, struct perf_event_read *pe_read,
+ struct resctrl_val_param *param, int bm_pid)
{
int ret;

@@ -156,79 +149,6 @@ int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid)
return ret;
}

-/*
- * cache_val: execute benchmark and measure LLC occupancy resctrl
- * and perf cache miss for the benchmark
- * @param: parameters passed to cache_val()
- * @span: buffer size for the benchmark
- *
- * Return: 0 on success. non-zero on failure.
- */
-int cat_val(struct resctrl_val_param *param, size_t span)
-{
- int memflush = 1, operation = 0, ret = 0;
- char *resctrl_val = param->resctrl_val;
- static struct perf_event_read pe_read;
- struct perf_event_attr pea;
- pid_t bm_pid;
- int pe_fd;
-
- if (strcmp(param->filename, "") == 0)
- sprintf(param->filename, "stdio");
-
- bm_pid = getpid();
-
- /* Taskset benchmark to specified cpu */
- ret = taskset_benchmark(bm_pid, param->cpu_no);
- if (ret)
- return ret;
-
- /* Write benchmark to specified con_mon grp, mon_grp in resctrl FS*/
- ret = write_bm_pid_to_resctrl(bm_pid, param->ctrlgrp, param->mongrp,
- resctrl_val);
- if (ret)
- return ret;
-
- perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
- perf_event_initialize_read_format(&pe_read);
-
- /* Test runs until the callback setup() tells the test to stop. */
- while (1) {
- ret = param->setup(param);
- if (ret == END_OF_TESTS) {
- ret = 0;
- break;
- }
- if (ret < 0)
- break;
-
- pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
- if (pe_fd < 0) {
- ret = -1;
- break;
- }
-
- if (run_fill_buf(span, memflush, operation, true)) {
- fprintf(stderr, "Error-running fill buffer\n");
- ret = -1;
- goto pe_close;
- }
-
- sleep(1);
- ret = perf_event_measure(pe_fd, &pe_read, param, bm_pid);
- if (ret)
- goto pe_close;
-
- close(pe_fd);
- }
-
- return ret;
-
-pe_close:
- close(pe_fd);
- return ret;
-}
-
/*
* show_cache_info: show generic cache test information
* @no_of_bits: number of bits
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 2106cc3601d9..e71690a9bbb3 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -111,6 +111,77 @@ void cat_test_cleanup(void)
remove(RESULT_FILE_NAME2);
}

+/*
+ * cat_test: execute CAT benchmark and measure LLC cache misses
+ * @param: parameters passed to cat_test()
+ * @span: buffer size for the benchmark
+ *
+ * Return: 0 on success. non-zero on failure.
+ */
+static int cat_test(struct resctrl_val_param *param, size_t span)
+{
+ int memflush = 1, operation = 0, ret = 0;
+ char *resctrl_val = param->resctrl_val;
+ static struct perf_event_read pe_read;
+ struct perf_event_attr pea;
+ pid_t bm_pid;
+ int pe_fd;
+
+ if (strcmp(param->filename, "") == 0)
+ sprintf(param->filename, "stdio");
+
+ bm_pid = getpid();
+
+ /* Taskset benchmark to specified cpu */
+ ret = taskset_benchmark(bm_pid, param->cpu_no);
+ if (ret)
+ return ret;
+
+ /* Write benchmark to specified con_mon grp, mon_grp in resctrl FS*/
+ ret = write_bm_pid_to_resctrl(bm_pid, param->ctrlgrp, param->mongrp,
+ resctrl_val);
+ if (ret)
+ return ret;
+
+ perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
+ perf_event_initialize_read_format(&pe_read);
+
+ /* Test runs until the callback setup() tells the test to stop. */
+ while (1) {
+ ret = param->setup(param);
+ if (ret == END_OF_TESTS) {
+ ret = 0;
+ break;
+ }
+ if (ret < 0)
+ break;
+ pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
+ if (pe_fd < 0) {
+ ret = -1;
+ break;
+ }
+
+ if (run_fill_buf(span, memflush, operation, true)) {
+ fprintf(stderr, "Error-running fill buffer\n");
+ ret = -1;
+ goto pe_close;
+ }
+
+ sleep(1);
+ ret = perf_event_measure(pe_fd, &pe_read, param, bm_pid);
+ if (ret)
+ goto pe_close;
+
+ close(pe_fd);
+ }
+
+ return ret;
+
+pe_close:
+ close(pe_fd);
+ return ret;
+}
+
int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
{
unsigned long l_mask, l_mask_1;
@@ -196,7 +267,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)

remove(param.filename);

- ret = cat_val(&param, span);
+ ret = cat_test(&param, span);
if (ret == 0)
ret = check_results(&param, span);

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 033c49784581..ee3cee74a69c 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -67,6 +67,13 @@ struct resctrl_val_param {
int (*setup)(struct resctrl_val_param *param);
};

+struct perf_event_read {
+ __u64 nr; /* The number of events */
+ struct {
+ __u64 value; /* The value of the event */
+ } values[2];
+};
+
#define MBM_STR "mbm"
#define MBA_STR "mba"
#define CMT_STR "cmt"
@@ -107,13 +114,18 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size
void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
int signal_handler_register(void);
void signal_handler_unregister(void);
-int cat_val(struct resctrl_val_param *param, size_t span);
void cat_test_cleanup(void);
int cat_perf_miss_val(int cpu_no, int no_of_bits, char *cache_type);
int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd);
unsigned int count_bits(unsigned long n);
void cmt_test_cleanup(void);
int get_core_sibling(int cpu_no);
+
+void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config);
+void perf_event_initialize_read_format(struct perf_event_read *pe_read);
+int perf_event_reset_enable(struct perf_event_attr *pea, pid_t pid, int cpu_no);
+int perf_event_measure(int pe_fd, struct perf_event_read *pe_read,
+ struct resctrl_val_param *param, int bm_pid);
int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid);
void show_cache_info(int no_of_bits, __u64 avg_llc_val, size_t cache_span, bool lines);

--
2.30.2

2023-10-24 09:31:19

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 15/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations

When reading memory in order, HW prefetching optimizations will
interfere with measuring how caches and memory are being accessed. This
adds noise into the results.

Change the fill_buf reading loop to not use an obvious in-order access
using multiply by a prime and modulo.

Using a prime multiplier with modulo ensures the entire buffer is
eventually read. 23 is small enough that the reads are spread out but
wrapping does not occur very frequently (wrapping too often can trigger
L2 hits more frequently which causes noise to the test because getting
the data from LLC is not required).

It was discovered that not all primes work equally well and some can
cause wildly unstable results (e.g., in an earlier version of this
patch, the reads were done in reversed order and 59 was used as the
prime resulting in unacceptably high and unstable results in MBA and
MBM test on some architectures).

Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B499@TYAPR01MB6330.jpnprd01.prod.outlook.com/
Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/fill_buf.c | 38 +++++++++++++++++-----
1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
index 9d0b0bf4b85a..326d530425d0 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -51,16 +51,38 @@ static void mem_flush(unsigned char *buf, size_t buf_size)
sb();
}

+/*
+ * Buffer index step advance to workaround HW prefetching interfering with
+ * the measurements.
+ *
+ * Must be a prime to step through all indexes of the buffer.
+ *
+ * Some primes work better than others on some architectures (from MBA/MBM
+ * result stability point of view).
+ */
+#define FILL_IDX_MULT 23
+
static int fill_one_span_read(unsigned char *buf, size_t buf_size)
{
- unsigned char *end_ptr = buf + buf_size;
- unsigned char sum, *p;
-
- sum = 0;
- p = buf;
- while (p < end_ptr) {
- sum += *p;
- p += (CL_SIZE / 2);
+ unsigned int size = buf_size / (CL_SIZE / 2);
+ unsigned int i, idx = 0;
+ unsigned char sum = 0;
+
+ /*
+ * Read the buffer in an order that is unexpected by HW prefetching
+ * optimizations to prevent them interfering with the caching pattern.
+ *
+ * The read order is (in terms of halves of cachelines):
+ * i * FILL_IDX_MULT % size
+ * The formula is open-coded below to avoiding modulo inside the loop
+ * as it improves MBA/MBM result stability on some architectures.
+ */
+ for (i = 0; i < size; i++) {
+ sum += buf[idx * (CL_SIZE / 2)];
+
+ idx += FILL_IDX_MULT;
+ while (idx >= size)
+ idx -= size;
}

return sum;
--
2.30.2

2023-10-24 09:31:23

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

CAT test spawns two processes into two different control groups with
exclusive schemata. Both the processes alloc a buffer from memory
matching their allocated LLC block size and flush the entire buffer out
of caches. Since the processes are reading through the buffer only once
during the measurement and initially all the buffer was flushed, the
test isn't testing CAT.

Rewrite the CAT test to allocate a buffer sized to half of LLC. Then
perform a sequence of tests with different LLC alloc sizes starting
from half of the CBM bits down to 1-bit CBM. Flush the buffer before
each test and read the buffer twice. Observe the LLC misses on the
second read through the buffer. As the allocated LLC block gets smaller
and smaller, the LLC misses will become larger and larger giving a
strong signal on CAT working properly.

The new CAT test is using only a single process because it relies on
measured effect against another run of itself rather than another
process adding noise. The rest of the system is allocated the CBM bits
not used by the CAT test to keep the test isolated.

Replace count_bits() with count_contiguous_bits() to get the first bit
position in order to be able to calculate masks based on it.

This change has been tested with a number of systems from different
generations.

Suggested-by: Reinette Chatre <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 286 +++++++++-----------
tools/testing/selftests/resctrl/fill_buf.c | 6 +-
tools/testing/selftests/resctrl/resctrl.h | 5 +-
tools/testing/selftests/resctrl/resctrlfs.c | 44 +--
4 files changed, 137 insertions(+), 204 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index e71690a9bbb3..7518c520c5cc 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -11,65 +11,68 @@
#include "resctrl.h"
#include <unistd.h>

-#define RESULT_FILE_NAME1 "result_cat1"
-#define RESULT_FILE_NAME2 "result_cat2"
+#define RESULT_FILE_NAME "result_cat"
#define NUM_OF_RUNS 5
-#define MAX_DIFF_PERCENT 4
-#define MAX_DIFF 1000000

/*
- * Change schemata. Write schemata to specified
- * con_mon grp, mon_grp in resctrl FS.
- * Run 5 times in order to get average values.
+ * Minimum difference in LLC misses between a test with n+1 bits CBM mask to
+ * the test with n bits. With e.g. 5 vs 4 bits in the CBM mask, the minimum
+ * difference must be at least MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
+ *
+ * The relationship between number of used CBM bits and difference in LLC
+ * misses is not expected to be linear. With a small number of bits, the
+ * margin is smaller than with larger number of bits. For selftest purposes,
+ * however, linear approach is enough because ultimately only pass/fail
+ * decision has to be made and distinction between strong and stronger
+ * signal is irrelevant.
*/
-static int cat_setup(struct resctrl_val_param *p)
-{
- char schemata[64];
- int ret = 0;
-
- /* Run NUM_OF_RUNS times */
- if (p->num_of_runs >= NUM_OF_RUNS)
- return END_OF_TESTS;
-
- if (p->num_of_runs == 0) {
- sprintf(schemata, "%lx", p->mask);
- ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no,
- p->resctrl_val);
- }
- p->num_of_runs++;
-
- return ret;
-}
+#define MIN_DIFF_PERCENT_PER_BIT 1

static int show_results_info(__u64 sum_llc_val, int no_of_bits,
- unsigned long cache_span, unsigned long max_diff,
- unsigned long max_diff_percent, unsigned long num_of_runs,
- bool platform)
+ unsigned long cache_span, long min_diff_percent,
+ unsigned long num_of_runs, bool platform,
+ __s64 *prev_avg_llc_val)
{
__u64 avg_llc_val = 0;
- float diff_percent;
- int ret;
+ float avg_diff;
+ int ret = 0;

avg_llc_val = sum_llc_val / num_of_runs;
- diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
+ if (*prev_avg_llc_val) {
+ float delta = (__s64)(avg_llc_val - *prev_avg_llc_val);

- ret = platform && abs((int)diff_percent) > max_diff_percent;
+ avg_diff = delta / *prev_avg_llc_val;
+ ret = platform && (avg_diff * 100) < (float)min_diff_percent;

- ksft_print_msg("%s Check cache miss rate within %lu%%\n",
- ret ? "Fail:" : "Pass:", max_diff_percent);
+ ksft_print_msg("%s Check cache miss rate changed more than %.1f%%\n",
+ ret ? "Fail:" : "Pass:", (float)min_diff_percent);

- ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
+ ksft_print_msg("Percent diff=%.1f\n", avg_diff * 100);
+ }
+ *prev_avg_llc_val = avg_llc_val;

show_cache_info(no_of_bits, avg_llc_val, cache_span, true);

return ret;
}

-static int check_results(struct resctrl_val_param *param, size_t span)
+/* Remove one bit from the consecutive cbm mask */
+static unsigned long next_mask(unsigned long current_mask)
+{
+ return current_mask & (current_mask >> 1);
+}
+
+static int check_results(struct resctrl_val_param *param, const char *cache_type,
+ unsigned long cache_total_size, unsigned long full_cache_mask,
+ unsigned long current_mask)
{
char *token_array[8], temp[512];
__u64 sum_llc_perf_miss = 0;
- int runs = 0, no_of_bits = 0;
+ unsigned long alloc_size;
+ __s64 prev_avg_llc_val = 0;
+ int runs = 0;
+ int fail = 0;
+ int ret;
FILE *fp;

ksft_print_msg("Checking for pass/fail\n");
@@ -83,49 +86,71 @@ static int check_results(struct resctrl_val_param *param, size_t span)
while (fgets(temp, sizeof(temp), fp)) {
char *token = strtok(temp, ":\t");
int fields = 0;
+ int bits;

while (token) {
token_array[fields++] = token;
token = strtok(NULL, ":\t");
}
- /*
- * Discard the first value which is inaccurate due to monitoring
- * setup transition phase.
- */
- if (runs > 0)
- sum_llc_perf_miss += strtoull(token_array[3], NULL, 0);
+
+ sum_llc_perf_miss += strtoull(token_array[3], NULL, 0);
runs++;
+
+ if (runs < NUM_OF_RUNS)
+ continue;
+
+ if (!current_mask) {
+ ksft_print_msg("Unexpected empty cache mask\n");
+ break;
+ }
+
+ alloc_size = cache_size(cache_total_size, current_mask, full_cache_mask);
+
+ bits = count_bits(current_mask);
+
+ ret = show_results_info(sum_llc_perf_miss, bits,
+ alloc_size / 64,
+ MIN_DIFF_PERCENT_PER_BIT * (bits - 1), runs,
+ get_vendor() == ARCH_INTEL,
+ &prev_avg_llc_val);
+ if (ret)
+ fail = 1;
+
+ runs = 0;
+ sum_llc_perf_miss = 0;
+ current_mask = next_mask(current_mask);
}

fclose(fp);
- no_of_bits = count_bits(param->mask);

- return show_results_info(sum_llc_perf_miss, no_of_bits, span / 64,
- MAX_DIFF, MAX_DIFF_PERCENT, runs - 1,
- get_vendor() == ARCH_INTEL);
+ return fail;
}

void cat_test_cleanup(void)
{
- remove(RESULT_FILE_NAME1);
- remove(RESULT_FILE_NAME2);
+ remove(RESULT_FILE_NAME);
}

/*
* cat_test: execute CAT benchmark and measure LLC cache misses
* @param: parameters passed to cat_test()
* @span: buffer size for the benchmark
+ * @current_mask start mask for the first iteration
+ *
+ * Run CAT test, bits are removed one-by-one from the current_mask for each
+ * subsequent test.
*
- * Return: 0 on success. non-zero on failure.
+ * Return: 0 on success. non-zero on failure.
*/
-static int cat_test(struct resctrl_val_param *param, size_t span)
+static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long current_mask)
{
- int memflush = 1, operation = 0, ret = 0;
char *resctrl_val = param->resctrl_val;
static struct perf_event_read pe_read;
struct perf_event_attr pea;
+ unsigned char *buf;
+ char schemata[64];
+ int ret, i, pe_fd;
pid_t bm_pid;
- int pe_fd;

if (strcmp(param->filename, "") == 0)
sprintf(param->filename, "stdio");
@@ -143,54 +168,64 @@ static int cat_test(struct resctrl_val_param *param, size_t span)
if (ret)
return ret;

+ buf = alloc_buffer(span, 1);
+ if (buf == NULL)
+ return -1;
+
perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
perf_event_initialize_read_format(&pe_read);

- /* Test runs until the callback setup() tells the test to stop. */
- while (1) {
- ret = param->setup(param);
- if (ret == END_OF_TESTS) {
- ret = 0;
- break;
- }
- if (ret < 0)
- break;
- pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
- if (pe_fd < 0) {
- ret = -1;
- break;
- }
+ while (current_mask) {
+ snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
+ ret = write_schemata("", schemata, param->cpu_no, param->resctrl_val);
+ if (ret)
+ goto free_buf;
+ snprintf(schemata, sizeof(schemata), "%lx", current_mask);
+ ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, param->resctrl_val);
+ if (ret)
+ goto free_buf;
+
+ for (i = 0; i < NUM_OF_RUNS; i++) {
+ mem_flush(buf, span);
+ ret = fill_cache_read(buf, span, true);
+ if (ret)
+ goto free_buf;
+
+ pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
+ if (pe_fd < 0) {
+ ret = -1;
+ goto free_buf;
+ }

- if (run_fill_buf(span, memflush, operation, true)) {
- fprintf(stderr, "Error-running fill buffer\n");
- ret = -1;
- goto pe_close;
- }
+ fill_cache_read(buf, span, true);

- sleep(1);
- ret = perf_event_measure(pe_fd, &pe_read, param, bm_pid);
- if (ret)
- goto pe_close;
+ ret = perf_event_measure(pe_fd, &pe_read, param, bm_pid);
+ if (ret)
+ goto pe_close;

- close(pe_fd);
+ close(pe_fd);
+ }
+ current_mask = next_mask(current_mask);
}

+free_buf:
+ free(buf);
+
return ret;

pe_close:
close(pe_fd);
- return ret;
+ goto free_buf;
}

int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
{
- unsigned long l_mask, l_mask_1;
- int ret, pipefd[2], sibling_cpu_no;
+ unsigned long long_mask, start_mask, full_cache_mask;
unsigned long cache_total_size = 0;
- unsigned long full_cache_mask, long_mask;
+ unsigned int start;
int count_of_bits;
- char pipe_message;
size_t span;
+ int ret;

/* Get default cbm mask for L3/L2 cache */
ret = get_cbm_mask(cache_type, &full_cache_mask);
@@ -207,8 +242,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return ret;
ksft_print_msg("Cache size :%lu\n", cache_total_size);

- /* Get max number of bits from default-cabm mask */
- count_of_bits = count_bits(long_mask);
+ count_of_bits = count_contiguous_bits(long_mask, &start);

if (!n)
n = count_of_bits / 2;
@@ -219,88 +253,26 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
count_of_bits - 1);
return -1;
}
-
- /* Get core id from same socket for running another thread */
- sibling_cpu_no = get_core_sibling(cpu_no);
- if (sibling_cpu_no < 0)
- return -1;
+ start_mask = create_bit_mask(start, n);

struct resctrl_val_param param = {
.resctrl_val = CAT_STR,
.cpu_no = cpu_no,
- .setup = cat_setup,
+ .ctrlgrp = "c1",
+ .filename = RESULT_FILE_NAME,
+ .num_of_runs = 0,
};
-
- l_mask = long_mask >> n;
- l_mask_1 = ~l_mask & long_mask;
-
- /* Set param values for parent thread which will be allocated bitmask
- * with (max_bits - n) bits
- */
- span = cache_size(cache_total_size, l_mask, full_cache_mask);
- strcpy(param.ctrlgrp, "c2");
- strcpy(param.mongrp, "m2");
- strcpy(param.filename, RESULT_FILE_NAME2);
- param.mask = l_mask;
- param.num_of_runs = 0;
-
- if (pipe(pipefd)) {
- perror("# Unable to create pipe");
- return errno;
- }
-
- fflush(stdout);
- bm_pid = fork();
-
- /* Set param values for child thread which will be allocated bitmask
- * with n bits
- */
- if (bm_pid == 0) {
- param.mask = l_mask_1;
- strcpy(param.ctrlgrp, "c1");
- strcpy(param.mongrp, "m1");
- span = cache_size(cache_total_size, l_mask_1, full_cache_mask);
- strcpy(param.filename, RESULT_FILE_NAME1);
- param.num_of_runs = 0;
- param.cpu_no = sibling_cpu_no;
- }
+ param.mask = long_mask;
+ span = cache_size(cache_total_size, start_mask, full_cache_mask);

remove(param.filename);

- ret = cat_test(&param, span);
- if (ret == 0)
- ret = check_results(&param, span);
-
- if (bm_pid == 0) {
- /* Tell parent that child is ready */
- close(pipefd[0]);
- pipe_message = 1;
- if (write(pipefd[1], &pipe_message, sizeof(pipe_message)) <
- sizeof(pipe_message))
- /*
- * Just print the error message.
- * Let while(1) run and wait for itself to be killed.
- */
- perror("# failed signaling parent process");
-
- close(pipefd[1]);
- while (1)
- ;
- } else {
- /* Parent waits for child to be ready. */
- close(pipefd[1]);
- pipe_message = 0;
- while (pipe_message != 1) {
- if (read(pipefd[0], &pipe_message,
- sizeof(pipe_message)) < sizeof(pipe_message)) {
- perror("# failed reading from child process");
- break;
- }
- }
- close(pipefd[0]);
- kill(bm_pid, SIGKILL);
- }
+ ret = cat_test(&param, span, start_mask);
+ if (ret)
+ goto out;

+ ret = check_results(&param, cache_type, cache_total_size, full_cache_mask, start_mask);
+out:
cat_test_cleanup();

return ret;
diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
index 326d530425d0..3dbb71371715 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -38,7 +38,7 @@ static void cl_flush(void *p)
#endif
}

-static void mem_flush(unsigned char *buf, size_t buf_size)
+void mem_flush(unsigned char *buf, size_t buf_size)
{
unsigned char *cp = buf;
size_t i = 0;
@@ -100,7 +100,7 @@ static void fill_one_span_write(unsigned char *buf, size_t buf_size)
}
}

-static int fill_cache_read(unsigned char *buf, size_t buf_size, bool once)
+int fill_cache_read(unsigned char *buf, size_t buf_size, bool once)
{
int ret = 0;
FILE *fp;
@@ -134,7 +134,7 @@ static int fill_cache_write(unsigned char *buf, size_t buf_size, bool once)
return 0;
}

-static unsigned char *alloc_buffer(size_t buf_size, int memflush)
+unsigned char *alloc_buffer(size_t buf_size, int memflush)
{
void *p = NULL;
uint64_t *p64;
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index ee3cee74a69c..927f696e0ab7 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -99,6 +99,9 @@ int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, char *mongrp,
char *resctrl_val);
int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu,
int group_fd, unsigned long flags);
+unsigned char *alloc_buffer(size_t buf_size, int memflush);
+void mem_flush(unsigned char *buf, size_t buf_size);
+int fill_cache_read(unsigned char *buf, size_t buf_size, bool once);
int run_fill_buf(size_t buf_size, int memflush, int op, bool once);
int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *param);
int mbm_bw_change(int cpu_no, const char * const *benchmark_cmd);
@@ -107,6 +110,7 @@ void mbm_test_cleanup(void);
int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd);
void mba_test_cleanup(void);
unsigned long create_bit_mask(unsigned int start, unsigned int len);
+unsigned int count_contiguous_bits(unsigned long val, unsigned int *start);
int get_cbm_mask(const char *cache_type, unsigned long *mask);
int get_shareable_mask(const char *cache_type, unsigned long *shareable_mask);
int get_mask_no_shareable(const char *cache_type, unsigned long *mask);
@@ -119,7 +123,6 @@ int cat_perf_miss_val(int cpu_no, int no_of_bits, char *cache_type);
int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd);
unsigned int count_bits(unsigned long n);
void cmt_test_cleanup(void);
-int get_core_sibling(int cpu_no);

void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config);
void perf_event_initialize_read_format(struct perf_event_read *pe_read);
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 02b04878121f..c8fbbd96311d 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -245,7 +245,7 @@ unsigned long create_bit_mask(unsigned int start, unsigned int len)
*
* Return: The length of the contiguous bits in the longest train of bits
*/
-static unsigned int count_contiguous_bits(unsigned long val, unsigned int *start)
+unsigned int count_contiguous_bits(unsigned long val, unsigned int *start)
{
unsigned long last_val;
int count = 0;
@@ -337,48 +337,6 @@ int get_mask_no_shareable(const char *cache_type, unsigned long *mask)
return 0;
}

-/*
- * get_core_sibling - Get sibling core id from the same socket for given CPU
- * @cpu_no: CPU number
- *
- * Return: > 0 on success, < 0 on failure.
- */
-int get_core_sibling(int cpu_no)
-{
- char core_siblings_path[1024], cpu_list_str[64];
- int sibling_cpu_no = -1;
- FILE *fp;
-
- sprintf(core_siblings_path, "%s%d/topology/core_siblings_list",
- CORE_SIBLINGS_PATH, cpu_no);
-
- fp = fopen(core_siblings_path, "r");
- if (!fp) {
- perror("Failed to open core siblings path");
-
- return -1;
- }
- if (fscanf(fp, "%s", cpu_list_str) <= 0) {
- perror("Could not get core_siblings list");
- fclose(fp);
-
- return -1;
- }
- fclose(fp);
-
- char *token = strtok(cpu_list_str, "-,");
-
- while (token) {
- sibling_cpu_no = atoi(token);
- /* Skipping core 0 as we don't want to run test on core 0 */
- if (sibling_cpu_no != 0 && sibling_cpu_no != cpu_no)
- break;
- token = strtok(NULL, "-,");
- }
-
- return sibling_cpu_no;
-}
-
/*
* taskset_benchmark - Taskset PID (i.e. benchmark) to a specified cpu
* @bm_pid: PID that should be binded
--
2.30.2

2023-10-24 09:31:25

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 17/24] selftests/resctrl: Create struct for input parameter

resctrl_tests reads a set of parameters and passes them individually
for each tests. The way the parameters passed varies between tests.

Add struct input_params to hold are input parameters. It can be easily
passed to every test without varying the call signature.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 7 ++-
tools/testing/selftests/resctrl/cmt_test.c | 13 ++--
tools/testing/selftests/resctrl/mba_test.c | 6 +-
tools/testing/selftests/resctrl/mbm_test.c | 6 +-
tools/testing/selftests/resctrl/resctrl.h | 20 ++++--
.../testing/selftests/resctrl/resctrl_tests.c | 61 +++++++++++--------
6 files changed, 67 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 7518c520c5cc..1a70c69e5f7c 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -218,10 +218,11 @@ static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long
goto free_buf;
}

-int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
+int cat_perf_miss_val(const struct user_params *uparams, char *cache_type)
{
unsigned long long_mask, start_mask, full_cache_mask;
unsigned long cache_total_size = 0;
+ int n = uparams->bits;
unsigned int start;
int count_of_bits;
size_t span;
@@ -237,7 +238,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
return ret;

/* Get L3/L2 cache size */
- ret = get_cache_size(cpu_no, cache_type, &cache_total_size);
+ ret = get_cache_size(uparams->cpu, cache_type, &cache_total_size);
if (ret)
return ret;
ksft_print_msg("Cache size :%lu\n", cache_total_size);
@@ -257,7 +258,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)

struct resctrl_val_param param = {
.resctrl_val = CAT_STR,
- .cpu_no = cpu_no,
+ .cpu_no = uparams->cpu,
.ctrlgrp = "c1",
.filename = RESULT_FILE_NAME,
.num_of_runs = 0,
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index 702ea87cd473..f5561b79629f 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -94,11 +94,12 @@ void cmt_test_cleanup(void)
remove(RESULT_FILE_NAME);
}

-int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)
+int cmt_resctrl_val(const struct user_params *uparams)
{
- const char * const *cmd = benchmark_cmd;
+ const char * const *cmd = uparams->benchmark_cmd;
const char *new_cmd[BENCHMARK_ARGS];
unsigned long cache_total_size = 0;
+ int n = uparams->bits ? : 5;
unsigned long long_mask;
char *span_str = NULL;
int count_of_bits;
@@ -109,7 +110,7 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)
if (ret)
return ret;

- ret = get_cache_size(cpu_no, "L3", &cache_total_size);
+ ret = get_cache_size(uparams->cpu, "L3", &cache_total_size);
if (ret)
return ret;
ksft_print_msg("Cache size :%lu\n", cache_total_size);
@@ -126,7 +127,7 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)
.resctrl_val = CMT_STR,
.ctrlgrp = "c1",
.mongrp = "m1",
- .cpu_no = cpu_no,
+ .cpu_no = uparams->cpu,
.filename = RESULT_FILE_NAME,
.mask = ~(long_mask << n) & long_mask,
.num_of_runs = 0,
@@ -137,8 +138,8 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)

if (strcmp(cmd[0], "fill_buf") == 0) {
/* Duplicate the command to be able to replace span in it */
- for (i = 0; benchmark_cmd[i]; i++)
- new_cmd[i] = benchmark_cmd[i];
+ for (i = 0; uparams->benchmark_cmd[i]; i++)
+ new_cmd[i] = uparams->benchmark_cmd[i];
new_cmd[i] = NULL;

ret = asprintf(&span_str, "%zu", span);
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index d3bf4368341e..5157a3f74fee 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -141,13 +141,13 @@ void mba_test_cleanup(void)
remove(RESULT_FILE_NAME);
}

-int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
+int mba_schemata_change(const struct user_params *uparams)
{
struct resctrl_val_param param = {
.resctrl_val = MBA_STR,
.ctrlgrp = "c1",
.mongrp = "m1",
- .cpu_no = cpu_no,
+ .cpu_no = uparams->cpu,
.filename = RESULT_FILE_NAME,
.bw_report = "reads",
.setup = mba_setup
@@ -156,7 +156,7 @@ int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)

remove(RESULT_FILE_NAME);

- ret = resctrl_val(benchmark_cmd, &param);
+ ret = resctrl_val(uparams->benchmark_cmd, &param);
if (ret)
goto out;

diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 741533f2b075..98df9d151941 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -109,13 +109,13 @@ void mbm_test_cleanup(void)
remove(RESULT_FILE_NAME);
}

-int mbm_bw_change(int cpu_no, const char * const *benchmark_cmd)
+int mbm_bw_change(const struct user_params *uparams)
{
struct resctrl_val_param param = {
.resctrl_val = MBM_STR,
.ctrlgrp = "c1",
.mongrp = "m1",
- .cpu_no = cpu_no,
+ .cpu_no = uparams->cpu,
.filename = RESULT_FILE_NAME,
.bw_report = "reads",
.setup = mbm_setup
@@ -124,7 +124,7 @@ int mbm_bw_change(int cpu_no, const char * const *benchmark_cmd)

remove(RESULT_FILE_NAME);

- ret = resctrl_val(benchmark_cmd, &param);
+ ret = resctrl_val(uparams->benchmark_cmd, &param);
if (ret)
goto out;

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 927f696e0ab7..ec6efd36f60a 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -45,6 +45,18 @@
exit(EXIT_FAILURE); \
} while (0)

+/*
+ * user_params: User supplied parameters
+ * @cpu: CPU number to which the benchmark will be bound to
+ * @bits: Number of bits used for cache allocation size
+ * @benchmark_cmd: Benchmark command to run during (some of the) tests
+ */
+struct user_params {
+ int cpu;
+ int bits;
+ const char *benchmark_cmd[BENCHMARK_ARGS];
+};
+
/*
* resctrl_val_param: resctrl test parameters
* @resctrl_val: Resctrl feature (Eg: mbm, mba.. etc)
@@ -104,10 +116,10 @@ void mem_flush(unsigned char *buf, size_t buf_size);
int fill_cache_read(unsigned char *buf, size_t buf_size, bool once);
int run_fill_buf(size_t buf_size, int memflush, int op, bool once);
int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *param);
-int mbm_bw_change(int cpu_no, const char * const *benchmark_cmd);
+int mbm_bw_change(const struct user_params *uparams);
void tests_cleanup(void);
void mbm_test_cleanup(void);
-int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd);
+int mba_schemata_change(const struct user_params *uparams);
void mba_test_cleanup(void);
unsigned long create_bit_mask(unsigned int start, unsigned int len);
unsigned int count_contiguous_bits(unsigned long val, unsigned int *start);
@@ -119,8 +131,8 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
int signal_handler_register(void);
void signal_handler_unregister(void);
void cat_test_cleanup(void);
-int cat_perf_miss_val(int cpu_no, int no_of_bits, char *cache_type);
-int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd);
+int cat_perf_miss_val(const struct user_params *uparams, char *cache_type);
+int cmt_resctrl_val(const struct user_params *uparams);
unsigned int count_bits(unsigned long n);
void cmt_test_cleanup(void);

diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 2bbe3045a018..8e00ccc2b2f6 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -92,7 +92,7 @@ static void test_cleanup(void)
signal_handler_unregister();
}

-static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
+static void run_mbm_test(const struct user_params *uparams)
{
int res;

@@ -110,7 +110,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
goto cleanup;
}

- res = mbm_bw_change(cpu_no, benchmark_cmd);
+ res = mbm_bw_change(uparams);
ksft_test_result(!res, "MBM: bw change\n");
if ((get_vendor() == ARCH_INTEL) && res)
ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
@@ -119,7 +119,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
test_cleanup();
}

-static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
+static void run_mba_test(const struct user_params *uparams)
{
int res;

@@ -137,14 +137,14 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
goto cleanup;
}

- res = mba_schemata_change(cpu_no, benchmark_cmd);
+ res = mba_schemata_change(uparams);
ksft_test_result(!res, "MBA: schemata change\n");

cleanup:
test_cleanup();
}

-static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
+static void run_cmt_test(const struct user_params *uparams)
{
int res;

@@ -161,7 +161,7 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
goto cleanup;
}

- res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
+ res = cmt_resctrl_val(uparams);
ksft_test_result(!res, "CMT: test\n");
if ((get_vendor() == ARCH_INTEL) && res)
ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
@@ -170,7 +170,7 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
test_cleanup();
}

-static void run_cat_test(int cpu_no, int no_of_bits)
+static void run_cat_test(const struct user_params *uparams)
{
int res;

@@ -186,22 +186,29 @@ static void run_cat_test(int cpu_no, int no_of_bits)
goto cleanup;
}

- res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
+ res = cat_perf_miss_val(uparams, "L3");
ksft_test_result(!res, "CAT: test\n");

cleanup:
test_cleanup();
}

+static void init_user_params(struct user_params *uparams)
+{
+ uparams->cpu = 1;
+ uparams->bits = 0;
+}
+
int main(int argc, char **argv)
{
bool mbm_test = true, mba_test = true, cmt_test = true;
- const char *benchmark_cmd[BENCHMARK_ARGS] = {};
- int c, cpu_no = 1, i, no_of_bits = 0;
+ struct user_params uparams = {};
char *span_str = NULL;
bool cat_test = true;
int tests = 0;
- int ret;
+ int ret, c, i;
+
+ init_user_params(&uparams);

while ((c = getopt(argc, argv, "ht:b:n:p:")) != -1) {
char *token;
@@ -219,8 +226,8 @@ int main(int argc, char **argv)

/* Extract benchmark command from command line. */
for (i = 0; i < argc - optind; i++)
- benchmark_cmd[i] = argv[i + optind];
- benchmark_cmd[i] = NULL;
+ uparams.benchmark_cmd[i] = argv[i + optind];
+ uparams.benchmark_cmd[i] = NULL;

goto last_arg;
case 't':
@@ -252,11 +259,11 @@ int main(int argc, char **argv)
}
break;
case 'p':
- cpu_no = atoi(optarg);
+ uparams.cpu = atoi(optarg);
break;
case 'n':
- no_of_bits = atoi(optarg);
- if (no_of_bits <= 0) {
+ uparams.bits = atoi(optarg);
+ if (uparams.bits <= 0) {
printf("Bail out! invalid argument for no_of_bits\n");
return -1;
}
@@ -291,32 +298,32 @@ int main(int argc, char **argv)

filter_dmesg();

- if (!benchmark_cmd[0]) {
+ if (!uparams.benchmark_cmd[0]) {
/* If no benchmark is given by "-b" argument, use fill_buf. */
- benchmark_cmd[0] = "fill_buf";
+ uparams.benchmark_cmd[0] = "fill_buf";
ret = asprintf(&span_str, "%u", DEFAULT_SPAN);
if (ret < 0)
ksft_exit_fail_msg("Out of memory!\n");
- benchmark_cmd[1] = span_str;
- benchmark_cmd[2] = "1";
- benchmark_cmd[3] = "0";
- benchmark_cmd[4] = "false";
- benchmark_cmd[5] = NULL;
+ uparams.benchmark_cmd[1] = span_str;
+ uparams.benchmark_cmd[2] = "1";
+ uparams.benchmark_cmd[3] = "0";
+ uparams.benchmark_cmd[4] = "false";
+ uparams.benchmark_cmd[5] = NULL;
}

ksft_set_plan(tests ? : 4);

if (mbm_test)
- run_mbm_test(benchmark_cmd, cpu_no);
+ run_mbm_test(&uparams);

if (mba_test)
- run_mba_test(benchmark_cmd, cpu_no);
+ run_mba_test(&uparams);

if (cmt_test)
- run_cmt_test(benchmark_cmd, cpu_no);
+ run_cmt_test(&uparams);

if (cat_test)
- run_cat_test(cpu_no, no_of_bits);
+ run_cat_test(&uparams);

free(span_str);
ksft_finished();
--
2.30.2

2023-10-24 09:31:26

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 18/24] selftests/resctrl: Introduce generalized test framework

Each test currently has a "run test" function in per test file and
another resctrl_tests.c. The functions in resctrl_tests.c are almost
identical.

Generalize the one in resctrl_tests.c such that it can be shared
between all of the tests. It makes adding new tests easier and removes
the per test if () forests.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 18 +-
tools/testing/selftests/resctrl/cmt_test.c | 17 +-
tools/testing/selftests/resctrl/mba_test.c | 16 +-
tools/testing/selftests/resctrl/mbm_test.c | 18 +-
tools/testing/selftests/resctrl/resctrl.h | 31 +++-
.../testing/selftests/resctrl/resctrl_tests.c | 160 ++++++------------
tools/testing/selftests/resctrl/resctrlfs.c | 5 +
7 files changed, 144 insertions(+), 121 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 1a70c69e5f7c..aa16fb36d0d4 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -218,7 +218,7 @@ static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long
goto free_buf;
}

-int cat_perf_miss_val(const struct user_params *uparams, char *cache_type)
+static int cat_run_test(const struct resctrl_test *test, const struct user_params *uparams)
{
unsigned long long_mask, start_mask, full_cache_mask;
unsigned long cache_total_size = 0;
@@ -229,16 +229,16 @@ int cat_perf_miss_val(const struct user_params *uparams, char *cache_type)
int ret;

/* Get default cbm mask for L3/L2 cache */
- ret = get_cbm_mask(cache_type, &full_cache_mask);
+ ret = get_cbm_mask(test->resource, &full_cache_mask);
if (ret)
return ret;
/* Get the exclusive portion of the cache */
- ret = get_mask_no_shareable(cache_type, &long_mask);
+ ret = get_mask_no_shareable(test->resource, &long_mask);
if (ret)
return ret;

/* Get L3/L2 cache size */
- ret = get_cache_size(uparams->cpu, cache_type, &cache_total_size);
+ ret = get_cache_size(uparams->cpu, test->resource, &cache_total_size);
if (ret)
return ret;
ksft_print_msg("Cache size :%lu\n", cache_total_size);
@@ -272,9 +272,17 @@ int cat_perf_miss_val(const struct user_params *uparams, char *cache_type)
if (ret)
goto out;

- ret = check_results(&param, cache_type, cache_total_size, full_cache_mask, start_mask);
+ ret = check_results(&param, test->resource,
+ cache_total_size, full_cache_mask, start_mask);
out:
cat_test_cleanup();

return ret;
}
+
+struct resctrl_test l3_cat_test = {
+ .name = "CAT",
+ .resource = "L3",
+ .feature_check = test_resource_feature_check,
+ .run_test = cat_run_test,
+};
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index f5561b79629f..353c4bae2cfe 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -94,7 +94,7 @@ void cmt_test_cleanup(void)
remove(RESULT_FILE_NAME);
}

-int cmt_resctrl_val(const struct user_params *uparams)
+static int cmt_run_test(const struct resctrl_test *test, const struct user_params *uparams)
{
const char * const *cmd = uparams->benchmark_cmd;
const char *new_cmd[BENCHMARK_ARGS];
@@ -156,6 +156,8 @@ int cmt_resctrl_val(const struct user_params *uparams)
goto out;

ret = check_results(&param, span, n);
+ if (ret && (get_vendor() == ARCH_INTEL))
+ ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");

out:
cmt_test_cleanup();
@@ -163,3 +165,16 @@ int cmt_resctrl_val(const struct user_params *uparams)

return ret;
}
+
+static bool cmt_feature_check(const struct resctrl_test *test)
+{
+ return validate_resctrl_feature_request("L3_MON", "llc_occupancy") &&
+ validate_resctrl_feature_request("L3", NULL);
+}
+
+struct resctrl_test cmt_test = {
+ .name = "CMT",
+ .resource = "L3",
+ .feature_check = cmt_feature_check,
+ .run_test = cmt_run_test,
+};
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 5157a3f74fee..722f94013cb9 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -141,7 +141,7 @@ void mba_test_cleanup(void)
remove(RESULT_FILE_NAME);
}

-int mba_schemata_change(const struct user_params *uparams)
+static int mba_run_test(const struct resctrl_test *test, const struct user_params *uparams)
{
struct resctrl_val_param param = {
.resctrl_val = MBA_STR,
@@ -167,3 +167,17 @@ int mba_schemata_change(const struct user_params *uparams)

return ret;
}
+
+static bool mba_feature_check(const struct resctrl_test *test)
+{
+ return test_resource_feature_check(test) &&
+ validate_resctrl_feature_request("L3_MON", "mbm_local_bytes");
+}
+
+struct resctrl_test mba_test = {
+ .name = "MBA",
+ .resource = "MB",
+ .vendor_specific = ARCH_INTEL,
+ .feature_check = mba_feature_check,
+ .run_test = mba_run_test,
+};
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 98df9d151941..943f4f14a499 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -109,7 +109,7 @@ void mbm_test_cleanup(void)
remove(RESULT_FILE_NAME);
}

-int mbm_bw_change(const struct user_params *uparams)
+static int mbm_run_test(const struct resctrl_test *test, const struct user_params *uparams)
{
struct resctrl_val_param param = {
.resctrl_val = MBM_STR,
@@ -129,9 +129,25 @@ int mbm_bw_change(const struct user_params *uparams)
goto out;

ret = check_results(DEFAULT_SPAN);
+ if (ret && (get_vendor() == ARCH_INTEL))
+ ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");

out:
mbm_test_cleanup();

return ret;
}
+
+static bool mbm_feature_check(const struct resctrl_test *test)
+{
+ return validate_resctrl_feature_request("L3_MON", "mbm_total_bytes") &&
+ validate_resctrl_feature_request("L3_MON", "mbm_local_bytes");
+}
+
+struct resctrl_test mbm_test = {
+ .name = "MBM",
+ .resource = "MB",
+ .vendor_specific = ARCH_INTEL,
+ .feature_check = mbm_feature_check,
+ .run_test = mbm_run_test,
+};
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index ec6efd36f60a..e017adf1390d 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -37,6 +37,8 @@

#define DEFAULT_SPAN (250 * MB)

+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+
#define PARENT_EXIT(err_msg) \
do { \
perror(err_msg); \
@@ -57,6 +59,25 @@ struct user_params {
const char *benchmark_cmd[BENCHMARK_ARGS];
};

+/*
+ * resctrl_test: resctrl test definition
+ * @name: Test name
+ * @resource: Resource to test (e.g., MB, L3, L2, etc.)
+ * @vendor_specific: Bitmask for vendor-specific tests (can be 0 for universal tests)
+ * @disabled: Test is disabled
+ * @feature_check: Callback to check required resctrl features
+ * @run_test: Callback to run the test
+ */
+struct resctrl_test {
+ const char *name;
+ const char *resource;
+ unsigned int vendor_specific;
+ bool disabled;
+ bool (*feature_check)(const struct resctrl_test *test);
+ int (*run_test)(const struct resctrl_test *test,
+ const struct user_params *uparams);
+};
+
/*
* resctrl_val_param: resctrl test parameters
* @resctrl_val: Resctrl feature (Eg: mbm, mba.. etc)
@@ -103,6 +124,7 @@ int mount_resctrlfs(void);
int umount_resctrlfs(void);
int validate_bw_report_request(char *bw_report);
bool validate_resctrl_feature_request(const char *resource, const char *feature);
+bool test_resource_feature_check(const struct resctrl_test *test);
char *fgrep(FILE *inf, const char *str);
int taskset_benchmark(pid_t bm_pid, int cpu_no);
int write_schemata(char *ctrlgrp, char *schemata, int cpu_no,
@@ -116,10 +138,8 @@ void mem_flush(unsigned char *buf, size_t buf_size);
int fill_cache_read(unsigned char *buf, size_t buf_size, bool once);
int run_fill_buf(size_t buf_size, int memflush, int op, bool once);
int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *param);
-int mbm_bw_change(const struct user_params *uparams);
void tests_cleanup(void);
void mbm_test_cleanup(void);
-int mba_schemata_change(const struct user_params *uparams);
void mba_test_cleanup(void);
unsigned long create_bit_mask(unsigned int start, unsigned int len);
unsigned int count_contiguous_bits(unsigned long val, unsigned int *start);
@@ -131,8 +151,6 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
int signal_handler_register(void);
void signal_handler_unregister(void);
void cat_test_cleanup(void);
-int cat_perf_miss_val(const struct user_params *uparams, char *cache_type);
-int cmt_resctrl_val(const struct user_params *uparams);
unsigned int count_bits(unsigned long n);
void cmt_test_cleanup(void);

@@ -158,4 +176,9 @@ static inline int cache_size(unsigned long cache_size, unsigned long mask,
return cache_size * count_bits(mask) / count_bits(cache_mask);
}

+extern struct resctrl_test mbm_test;
+extern struct resctrl_test mba_test;
+extern struct resctrl_test cmt_test;
+extern struct resctrl_test l3_cat_test;
+
#endif /* RESCTRL_H */
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 8e00ccc2b2f6..7846260e3f68 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -10,6 +10,13 @@
*/
#include "resctrl.h"

+static struct resctrl_test *resctrl_tests[] = {
+ &mbm_test,
+ &mba_test,
+ &cmt_test,
+ &l3_cat_test,
+};
+
static int detect_vendor(void)
{
FILE *inf = fopen("/proc/cpuinfo", "r");
@@ -49,11 +56,16 @@ int get_vendor(void)

static void cmd_help(void)
{
+ int i;
+
printf("usage: resctrl_tests [-h] [-t test list] [-n no_of_bits] [-b benchmark_cmd [option]...]\n");
printf("\t-b benchmark_cmd [option]...: run specified benchmark for MBM, MBA and CMT\n");
printf("\t default benchmark is builtin fill_buf\n");
printf("\t-t test list: run tests specified in the test list, ");
printf("e.g. -t mbm,mba,cmt,cat\n");
+ printf("\t\tSupported tests:\n");
+ for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++)
+ printf("\t\t\t%s\n", resctrl_tests[i]->name);
printf("\t-n no_of_bits: run cache tests using specified no of bits in cache bit mask\n");
printf("\t-p cpu_no: specify CPU number to run the test. 1 is default\n");
printf("\t-h: help\n");
@@ -92,102 +104,41 @@ static void test_cleanup(void)
signal_handler_unregister();
}

-static void run_mbm_test(const struct user_params *uparams)
+static bool test_vendor_specific_check(const struct resctrl_test *test)
{
- int res;
-
- ksft_print_msg("Starting MBM BW change ...\n");
-
- if (test_prepare()) {
- ksft_exit_fail_msg("Abnormal failure when preparing for the test\n");
- return;
- }
-
- if (!validate_resctrl_feature_request("L3_MON", "mbm_total_bytes") ||
- !validate_resctrl_feature_request("L3_MON", "mbm_local_bytes") ||
- (get_vendor() != ARCH_INTEL)) {
- ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
- goto cleanup;
- }
-
- res = mbm_bw_change(uparams);
- ksft_test_result(!res, "MBM: bw change\n");
- if ((get_vendor() == ARCH_INTEL) && res)
- ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
+ if (!test->vendor_specific)
+ return true;

-cleanup:
- test_cleanup();
+ return get_vendor() & test->vendor_specific;
}

-static void run_mba_test(const struct user_params *uparams)
+static void run_single_test(const struct resctrl_test *test, const struct user_params *uparams)
{
- int res;
-
- ksft_print_msg("Starting MBA Schemata change ...\n");
+ int ret;

- if (test_prepare()) {
- ksft_exit_fail_msg("Abnormal failure when preparing for the test\n");
+ if (test->disabled)
return;
- }

- if (!validate_resctrl_feature_request("MB", NULL) ||
- !validate_resctrl_feature_request("L3_MON", "mbm_local_bytes") ||
- (get_vendor() != ARCH_INTEL)) {
- ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
- goto cleanup;
- }
-
- res = mba_schemata_change(uparams);
- ksft_test_result(!res, "MBA: schemata change\n");
-
-cleanup:
- test_cleanup();
-}
-
-static void run_cmt_test(const struct user_params *uparams)
-{
- int res;
-
- ksft_print_msg("Starting CMT test ...\n");
-
- if (test_prepare()) {
- ksft_exit_fail_msg("Abnormal failure when preparing for the test\n");
+ if (!test_vendor_specific_check(test)) {
+ ksft_test_result_skip("Hardware does not support %s\n", test->name);
return;
}

- if (!validate_resctrl_feature_request("L3_MON", "llc_occupancy") ||
- !validate_resctrl_feature_request("L3", NULL)) {
- ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
- goto cleanup;
- }
-
- res = cmt_resctrl_val(uparams);
- ksft_test_result(!res, "CMT: test\n");
- if ((get_vendor() == ARCH_INTEL) && res)
- ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
-
-cleanup:
- test_cleanup();
-}
-
-static void run_cat_test(const struct user_params *uparams)
-{
- int res;
-
- ksft_print_msg("Starting CAT test ...\n");
+ ksft_print_msg("Starting %s test ...\n", test->name);

if (test_prepare()) {
ksft_exit_fail_msg("Abnormal failure when preparing for the test\n");
return;
}

- if (!validate_resctrl_feature_request("L3", NULL)) {
- ksft_test_result_skip("Hardware does not support CAT or CAT is disabled\n");
+ if (!test->feature_check(test)) {
+ ksft_test_result_skip("Hardware does not support %s or %s is disabled\n",
+ test->name, test->name);
goto cleanup;
}

- res = cat_perf_miss_val(uparams, "L3");
- ksft_test_result(!res, "CAT: test\n");
+ ret = test->run_test(test, uparams);
+ ksft_test_result(!ret, "%s: test\n", test->name);

cleanup:
test_cleanup();
@@ -201,11 +152,10 @@ static void init_user_params(struct user_params *uparams)

int main(int argc, char **argv)
{
- bool mbm_test = true, mba_test = true, cmt_test = true;
+ int tests = ARRAY_SIZE(resctrl_tests);
struct user_params uparams = {};
+ bool test_param_seen = false;
char *span_str = NULL;
- bool cat_test = true;
- int tests = 0;
int ret, c, i;

init_user_params(&uparams);
@@ -233,25 +183,26 @@ int main(int argc, char **argv)
case 't':
token = strtok(optarg, ",");

- mbm_test = false;
- mba_test = false;
- cmt_test = false;
- cat_test = false;
+ if (!test_param_seen) {
+ for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++)
+ resctrl_tests[i]->disabled = true;
+ tests = 0;
+ test_param_seen = true;
+ }
while (token) {
- if (!strncmp(token, MBM_STR, sizeof(MBM_STR))) {
- mbm_test = true;
- tests++;
- } else if (!strncmp(token, MBA_STR, sizeof(MBA_STR))) {
- mba_test = true;
- tests++;
- } else if (!strncmp(token, CMT_STR, sizeof(CMT_STR))) {
- cmt_test = true;
- tests++;
- } else if (!strncmp(token, CAT_STR, sizeof(CAT_STR))) {
- cat_test = true;
- tests++;
- } else {
- printf("invalid argument\n");
+ bool found = false;
+
+ for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) {
+ if (!strcasecmp(token, resctrl_tests[i]->name)) {
+ if (resctrl_tests[i]->disabled)
+ tests++;
+ resctrl_tests[i]->disabled = false;
+ found = true;
+ }
+ }
+
+ if (!found) {
+ printf("invalid test: %s\n", token);

return -1;
}
@@ -311,19 +262,10 @@ int main(int argc, char **argv)
uparams.benchmark_cmd[5] = NULL;
}

- ksft_set_plan(tests ? : 4);
-
- if (mbm_test)
- run_mbm_test(&uparams);
-
- if (mba_test)
- run_mba_test(&uparams);
-
- if (cmt_test)
- run_cmt_test(&uparams);
+ ksft_set_plan(tests);

- if (cat_test)
- run_cat_test(&uparams);
+ for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++)
+ run_single_test(resctrl_tests[i], &uparams);

free(span_str);
ksft_finished();
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index c8fbbd96311d..2851ffe64b56 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -673,6 +673,11 @@ bool validate_resctrl_feature_request(const char *resource, const char *feature)
return !!res;
}

+bool test_resource_feature_check(const struct resctrl_test *test)
+{
+ return validate_resctrl_feature_request(test->resource, NULL);
+}
+
int filter_dmesg(void)
{
char line[1024];
--
2.30.2

2023-10-24 09:31:28

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 19/24] selftests/resctrl: Pass write_schemata() resource instead of test name

write_schemata() currently takes the test name as an argument and
determines the relevant resource based on the test name. L2 CAT test
needs to set schemata for both L3 and L2 CAT which would get
complicated using the current approach.

Pass a resource instead of test name to write_schemata() to allow more
than one resource be set per test name.

While touching the sprintf(), move the unnecessary %c that is always
'=' directly into the format string.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 9 +++++----
tools/testing/selftests/resctrl/cmt_test.c | 1 +
tools/testing/selftests/resctrl/mba_test.c | 4 ++--
tools/testing/selftests/resctrl/mbm_test.c | 4 ++--
tools/testing/selftests/resctrl/resctrl.h | 5 +++--
tools/testing/selftests/resctrl/resctrlfs.c | 22 +++++----------------
6 files changed, 18 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index aa16fb36d0d4..1ef047cadf4c 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -142,7 +142,8 @@ void cat_test_cleanup(void)
*
* Return: 0 on success. non-zero on failure.
*/
-static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long current_mask)
+static int cat_test(struct resctrl_val_param *param, const char *resource,
+ size_t span, unsigned long current_mask)
{
char *resctrl_val = param->resctrl_val;
static struct perf_event_read pe_read;
@@ -177,11 +178,11 @@ static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long

while (current_mask) {
snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
- ret = write_schemata("", schemata, param->cpu_no, param->resctrl_val);
+ ret = write_schemata("", schemata, param->cpu_no, resource);
if (ret)
goto free_buf;
snprintf(schemata, sizeof(schemata), "%lx", current_mask);
- ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, param->resctrl_val);
+ ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, resource);
if (ret)
goto free_buf;

@@ -268,7 +269,7 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param

remove(param.filename);

- ret = cat_test(&param, span, start_mask);
+ ret = cat_test(&param, test->resource, span, start_mask);
if (ret)
goto out;

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index 353c4bae2cfe..b0825d654dcf 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -125,6 +125,7 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param

struct resctrl_val_param param = {
.resctrl_val = CMT_STR,
+ .resource = "L3",
.ctrlgrp = "c1",
.mongrp = "m1",
.cpu_no = uparams->cpu,
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index 722f94013cb9..a14b7f4466e5 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -40,8 +40,7 @@ static int mba_setup(struct resctrl_val_param *p)

sprintf(allocation_str, "%d", allocation);

- ret = write_schemata(p->ctrlgrp, allocation_str, p->cpu_no,
- p->resctrl_val);
+ ret = write_schemata(p->ctrlgrp, allocation_str, p->cpu_no, p->resource);
if (ret < 0)
return ret;

@@ -145,6 +144,7 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param
{
struct resctrl_val_param param = {
.resctrl_val = MBA_STR,
+ .resource = "MB",
.ctrlgrp = "c1",
.mongrp = "m1",
.cpu_no = uparams->cpu,
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 943f4f14a499..1ad2c1a7fddb 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -96,8 +96,7 @@ static int mbm_setup(struct resctrl_val_param *p)

/* Set up shemata with 100% allocation on the first run. */
if (p->num_of_runs == 0 && validate_resctrl_feature_request("MB", NULL))
- ret = write_schemata(p->ctrlgrp, "100", p->cpu_no,
- p->resctrl_val);
+ ret = write_schemata(p->ctrlgrp, "100", p->cpu_no, p->resource);

p->num_of_runs++;

@@ -113,6 +112,7 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param
{
struct resctrl_val_param param = {
.resctrl_val = MBM_STR,
+ .resource = "MB",
.ctrlgrp = "c1",
.mongrp = "m1",
.cpu_no = uparams->cpu,
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index e017adf1390d..99fbce4794bc 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -81,6 +81,7 @@ struct resctrl_test {
/*
* resctrl_val_param: resctrl test parameters
* @resctrl_val: Resctrl feature (Eg: mbm, mba.. etc)
+ * @resource: Resource to test (e.g., MB, L3, L2, etc.)
* @ctrlgrp: Name of the control monitor group (con_mon grp)
* @mongrp: Name of the monitor group (mon grp)
* @cpu_no: CPU number to which the benchmark would be binded
@@ -90,6 +91,7 @@ struct resctrl_test {
*/
struct resctrl_val_param {
char *resctrl_val;
+ char *resource;
char ctrlgrp[64];
char mongrp[64];
int cpu_no;
@@ -127,8 +129,7 @@ bool validate_resctrl_feature_request(const char *resource, const char *feature)
bool test_resource_feature_check(const struct resctrl_test *test);
char *fgrep(FILE *inf, const char *str);
int taskset_benchmark(pid_t bm_pid, int cpu_no);
-int write_schemata(char *ctrlgrp, char *schemata, int cpu_no,
- char *resctrl_val);
+int write_schemata(char *ctrlgrp, char *schemata, int cpu_no, const char *resource);
int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, char *mongrp,
char *resctrl_val);
int perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu,
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 2851ffe64b56..7d72bea3ed58 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -496,23 +496,17 @@ int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, char *mongrp,
* @ctrlgrp: Name of the con_mon grp
* @schemata: Schemata that should be updated to
* @cpu_no: CPU number that the benchmark PID is binded to
- * @resctrl_val: Resctrl feature (Eg: mbm, mba.. etc)
+ * @resource: Resctrl resource (Eg: MB, L3, L2, etc.)
*
* Update schemata of a con_mon grp *only* if requested resctrl feature is
* allocation type
*
* Return: 0 on success, non-zero on failure
*/
-int write_schemata(char *ctrlgrp, char *schemata, int cpu_no, char *resctrl_val)
+int write_schemata(char *ctrlgrp, char *schemata, int cpu_no, const char *resource)
{
char controlgroup[1024], reason[128], schema[1024] = {};
- int resource_id, fd, schema_len = -1, ret = 0;
-
- if (strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR)) &&
- strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) &&
- strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR)) &&
- strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR)))
- return -ENOENT;
+ int resource_id, fd, schema_len, ret = 0;

if (!schemata) {
ksft_print_msg("Skipping empty schemata update\n");
@@ -532,14 +526,8 @@ int write_schemata(char *ctrlgrp, char *schemata, int cpu_no, char *resctrl_val)
else
sprintf(controlgroup, "%s/schemata", RESCTRL_PATH);

- if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR)) ||
- !strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR)))
- schema_len = snprintf(schema, sizeof(schema), "%s%d%c%s\n",
- "L3:", resource_id, '=', schemata);
- if (!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR)) ||
- !strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)))
- schema_len = snprintf(schema, sizeof(schema), "%s%d%c%s\n",
- "MB:", resource_id, '=', schemata);
+ schema_len = snprintf(schema, sizeof(schema), "%s:%d=%s\n",
+ resource, resource_id, schemata);
if (schema_len < 0 || schema_len >= sizeof(schema)) {
snprintf(reason, sizeof(reason),
"snprintf() failed with return value : %d", schema_len);
--
2.30.2

2023-10-24 09:32:13

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 20/24] selftests/resctrl: Add helper to convert L2/3 to integer

"L2"/"L3" conversion to integer in embedded into get_cache_size()
which prevents reuse.

Create a helper for the cache string to integer conversion to make
it reusable.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/resctrlfs.c | 28 +++++++++++++++------
1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 7d72bea3ed58..b98d88f7cbc4 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -94,6 +94,23 @@ int umount_resctrlfs(void)
return 0;
}

+/*
+ * get_cache_level - Convert cache level from string to integer
+ * @cache_type: Cache level as string
+ *
+ * Return: cache level as integer or -1 if @cache_type is invalid.
+ */
+static int get_cache_level(const char *cache_type)
+{
+ if (!strcmp(cache_type, "L3"))
+ return 3;
+ if (!strcmp(cache_type, "L2"))
+ return 2;
+
+ perror("Invalid cache level");
+ return -1;
+}
+
/*
* get_resource_id - Get socket number/l3 id for a specified CPU
* @cpu_no: CPU number
@@ -144,14 +161,9 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size
int length, i, cache_num;
FILE *fp;

- if (!strcmp(cache_type, "L3")) {
- cache_num = 3;
- } else if (!strcmp(cache_type, "L2")) {
- cache_num = 2;
- } else {
- perror("Invalid cache level");
- return -1;
- }
+ cache_num = get_cache_level(cache_type);
+ if (cache_num < 0)
+ return cache_num;

sprintf(cache_path, "/sys/bus/cpu/devices/cpu%d/cache/index%d/size",
cpu_no, cache_num);
--
2.30.2

2023-10-24 09:32:40

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 22/24] selftests/resctrl: Add test groups and name L3 CAT test L3_CAT

To select test to run -t parameter can be used. However, -t cat
currently maps to L3 CAT test which is confusing after L2 CAT test is
be the next change.

Allow selecting tests as groups and call L3 CAT test L3_CAT, "CAT"
group will enable to both L3 and L2 CAT tests.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 3 ++-
tools/testing/selftests/resctrl/resctrl.h | 2 ++
tools/testing/selftests/resctrl/resctrl_tests.c | 16 +++++++++++-----
3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 1ef047cadf4c..48a96acd9e31 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -282,7 +282,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
}

struct resctrl_test l3_cat_test = {
- .name = "CAT",
+ .name = "L3_CAT",
+ .group = "CAT",
.resource = "L3",
.feature_check = test_resource_feature_check,
.run_test = cat_run_test,
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index c148998bf6ad..f9a4cfd981f8 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -62,6 +62,7 @@ struct user_params {
/*
* resctrl_test: resctrl test definition
* @name: Test name
+ * @group: Test group (e.g., L2 and L3 CAT test belong to CAT group), can be NULL
* @resource: Resource to test (e.g., MB, L3, L2, etc.)
* @vendor_specific: Bitmask for vendor-specific tests (can be 0 for universal tests)
* @disabled: Test is disabled
@@ -70,6 +71,7 @@ struct user_params {
*/
struct resctrl_test {
const char *name;
+ const char *group;
const char *resource;
unsigned int vendor_specific;
bool disabled;
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 7846260e3f68..d89179541d7b 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -61,11 +61,15 @@ static void cmd_help(void)
printf("usage: resctrl_tests [-h] [-t test list] [-n no_of_bits] [-b benchmark_cmd [option]...]\n");
printf("\t-b benchmark_cmd [option]...: run specified benchmark for MBM, MBA and CMT\n");
printf("\t default benchmark is builtin fill_buf\n");
- printf("\t-t test list: run tests specified in the test list, ");
+ printf("\t-t test list: run tests/groups specified by the list, ");
printf("e.g. -t mbm,mba,cmt,cat\n");
- printf("\t\tSupported tests:\n");
- for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++)
- printf("\t\t\t%s\n", resctrl_tests[i]->name);
+ printf("\t\tSupported tests (group):\n");
+ for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) {
+ if (resctrl_tests[i]->group)
+ printf("\t\t\t%s (%s)\n", resctrl_tests[i]->name, resctrl_tests[i]->group);
+ else
+ printf("\t\t\t%s\n", resctrl_tests[i]->name);
+ }
printf("\t-n no_of_bits: run cache tests using specified no of bits in cache bit mask\n");
printf("\t-p cpu_no: specify CPU number to run the test. 1 is default\n");
printf("\t-h: help\n");
@@ -193,7 +197,9 @@ int main(int argc, char **argv)
bool found = false;

for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) {
- if (!strcasecmp(token, resctrl_tests[i]->name)) {
+ if (!strcasecmp(token, resctrl_tests[i]->name) ||
+ (resctrl_tests[i]->group &&
+ !strcasecmp(token, resctrl_tests[i]->group))) {
if (resctrl_tests[i]->disabled)
tests++;
resctrl_tests[i]->disabled = false;
--
2.30.2

2023-10-24 09:32:40

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 09/24] selftests/resctrl: Remove unnecessary __u64 -> unsigned long conversion

Perf counters are __u64 but the code converts them to unsigned long
before printing them out.

Remove unnecessary type conversion and the potential loss of meaningful
bits due to different sizes of types.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cache.c | 21 ++++++++-------------
tools/testing/selftests/resctrl/cat_test.c | 8 ++++----
tools/testing/selftests/resctrl/resctrl.h | 3 +--
3 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 95489d4b42b7..d39ef4eebc37 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -84,9 +84,8 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
*
* Return: =0 on success. <0 on failure.
*/
-static int get_llc_perf(unsigned long *llc_perf_miss)
+static int get_llc_perf(__u64 *llc_perf_miss)
{
- __u64 total_misses;
int ret;

/* Stop counters after one span to get miss rate */
@@ -99,8 +98,7 @@ static int get_llc_perf(unsigned long *llc_perf_miss)
return -1;
}

- total_misses = rf_cqm.values[0].value;
- *llc_perf_miss = total_misses;
+ *llc_perf_miss = rf_cqm.values[0].value;

return 0;
}
@@ -148,14 +146,12 @@ static int get_llc_occu_resctrl(unsigned long *llc_occupancy)
*
* Return: 0 on success. non-zero on failure.
*/
-static int print_results_cache(char *filename, int bm_pid,
- unsigned long llc_value)
+static int print_results_cache(char *filename, int bm_pid, __u64 llc_value)
{
FILE *fp;

if (strcmp(filename, "stdio") == 0 || strcmp(filename, "stderr") == 0) {
- printf("Pid: %d \t LLC_value: %lu\n", bm_pid,
- llc_value);
+ printf("Pid: %d \t LLC_value: %llu\n", bm_pid, llc_value);
} else {
fp = fopen(filename, "a");
if (!fp) {
@@ -163,7 +159,7 @@ static int print_results_cache(char *filename, int bm_pid,

return errno;
}
- fprintf(fp, "Pid: %d \t llc_value: %lu\n", bm_pid, llc_value);
+ fprintf(fp, "Pid: %d \t llc_value: %llu\n", bm_pid, llc_value);
fclose(fp);
}

@@ -172,7 +168,7 @@ static int print_results_cache(char *filename, int bm_pid,

static int measure_llc_perf(struct resctrl_val_param *param, int bm_pid)
{
- unsigned long llc_perf_miss = 0;
+ __u64 llc_perf_miss = 0;
int ret;

/*
@@ -273,11 +269,10 @@ int cat_val(struct resctrl_val_param *param, size_t span)
* @cache_span: cache span
* @lines: cache span in lines or bytes
*/
-void show_cache_info(int no_of_bits, unsigned long avg_llc_val,
- size_t cache_span, bool lines)
+void show_cache_info(int no_of_bits, __u64 avg_llc_val, size_t cache_span, bool lines)
{
ksft_print_msg("Number of bits: %d\n", no_of_bits);
- ksft_print_msg("Average LLC val: %lu\n", avg_llc_val);
+ ksft_print_msg("Average LLC val: %llu\n", avg_llc_val);
ksft_print_msg("Cache span (%s): %zu\n", !lines ? "bytes" : "lines",
cache_span);
}
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 32f6d612a3e7..2106cc3601d9 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -41,12 +41,12 @@ static int cat_setup(struct resctrl_val_param *p)
return ret;
}

-static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
+static int show_results_info(__u64 sum_llc_val, int no_of_bits,
unsigned long cache_span, unsigned long max_diff,
unsigned long max_diff_percent, unsigned long num_of_runs,
bool platform)
{
- unsigned long avg_llc_val = 0;
+ __u64 avg_llc_val = 0;
float diff_percent;
int ret;

@@ -68,7 +68,7 @@ static int show_results_info(unsigned long sum_llc_val, int no_of_bits,
static int check_results(struct resctrl_val_param *param, size_t span)
{
char *token_array[8], temp[512];
- unsigned long sum_llc_perf_miss = 0;
+ __u64 sum_llc_perf_miss = 0;
int runs = 0, no_of_bits = 0;
FILE *fp;

@@ -93,7 +93,7 @@ static int check_results(struct resctrl_val_param *param, size_t span)
* setup transition phase.
*/
if (runs > 0)
- sum_llc_perf_miss += strtoul(token_array[3], NULL, 0);
+ sum_llc_perf_miss += strtoull(token_array[3], NULL, 0);
runs++;
}

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index c7655714b23f..033c49784581 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -115,8 +115,7 @@ unsigned int count_bits(unsigned long n);
void cmt_test_cleanup(void);
int get_core_sibling(int cpu_no);
int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid);
-void show_cache_info(int no_of_bits, unsigned long avg_llc_val,
- size_t cache_span, bool lines);
+void show_cache_info(int no_of_bits, __u64 avg_llc_val, size_t cache_span, bool lines);

/*
* cache_size - Calculate the size of a cache portion
--
2.30.2

2023-10-24 09:32:41

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 24/24] selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits

L2 CAT test with low number of bits tends to occasionally fail because
of what seems random variation. The margin is quite small to begin with
for <= 2 bits in CBM. At times, the result can even become negative.
While it would be possible to allow negative values for those cases, it
would be more confusing to user.

Ignore failures from the tests where <= 2 were used to avoid false
negative results.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index a9c72022bb5a..bc88eb891f35 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -28,7 +28,7 @@
*/
#define MIN_DIFF_PERCENT_PER_BIT 1

-static int show_results_info(__u64 sum_llc_val, int no_of_bits,
+static int show_results_info(__u64 sum_llc_val, int no_of_bits, bool ignore_fail,
unsigned long cache_span, long min_diff_percent,
unsigned long num_of_runs, bool platform,
__s64 *prev_avg_llc_val)
@@ -40,12 +40,18 @@ static int show_results_info(__u64 sum_llc_val, int no_of_bits,
avg_llc_val = sum_llc_val / num_of_runs;
if (*prev_avg_llc_val) {
float delta = (__s64)(avg_llc_val - *prev_avg_llc_val);
+ char *res_str;

avg_diff = delta / *prev_avg_llc_val;
ret = platform && (avg_diff * 100) < (float)min_diff_percent;

+ res_str = ret ? "Fail:" : "Pass:";
+ if (ret && ignore_fail) {
+ res_str = "Pass (failure ignored):";
+ ret = 0;
+ }
ksft_print_msg("%s Check cache miss rate changed more than %.1f%%\n",
- ret ? "Fail:" : "Pass:", (float)min_diff_percent);
+ res_str, (float)min_diff_percent);

ksft_print_msg("Percent diff=%.1f\n", avg_diff * 100);
}
@@ -85,6 +91,7 @@ static int check_results(struct resctrl_val_param *param, const char *cache_type

while (fgets(temp, sizeof(temp), fp)) {
char *token = strtok(temp, ":\t");
+ bool ignore_fail = false;
int fields = 0;
int bits;

@@ -108,7 +115,15 @@ static int check_results(struct resctrl_val_param *param, const char *cache_type

bits = count_bits(current_mask);

- ret = show_results_info(sum_llc_perf_miss, bits,
+ /*
+ * L2 CAT test with low number of bits has too small margin to
+ * always remain positive. As negative values would be confusing
+ * for the user, ignore failure instead.
+ */
+ if (bits <= 2 && !strcmp(cache_type, "L2"))
+ ignore_fail = true;
+
+ ret = show_results_info(sum_llc_perf_miss, bits, ignore_fail,
alloc_size / 64,
MIN_DIFF_PERCENT_PER_BIT * (bits - 1), runs,
get_vendor() == ARCH_INTEL,
--
2.30.2

2023-10-24 09:32:52

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 21/24] selftests/resctrl: Get resource id from cache id

Resource id is acquired differently depending on CPU. AMD tests use id
from L3 cache, whereas CPUs from other vendors base the id on topology
package id. In order to support L2 CAT test, this has to be
generalized.

The driver side code seems to get the resource ids from cache ids so
the approach used by the AMD branch seems to match the kernel-side
code. It will also work with L2 resource IDs as long as the cache level
is generalized.

Using the topology id was always fragile due to mismatch with the
kernel-side way to acquire the resource id. It got incorrect resource
id, e.g., when Cluster-on-Die (CoD) is enabled for CPU (but CoD is not
well suited for resctrl in the first place so it has not been a big
issues if test don't work correctly with it).

Taking all the above into account, generalize the resource id
calculation by taking it from the cache id and do not hard-code the
cache level.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/resctrl.h | 2 +-
tools/testing/selftests/resctrl/resctrl_val.c | 4 +--
tools/testing/selftests/resctrl/resctrlfs.c | 31 ++++++++++++-------
3 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 99fbce4794bc..c148998bf6ad 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -121,7 +121,7 @@ extern char llc_occup_path[1024];
int get_vendor(void);
bool check_resctrlfs_support(void);
int filter_dmesg(void);
-int get_resource_id(int cpu_no, int *resource_id);
+int get_resource_id(const char *resource, int cpu_no, int *resource_id);
int mount_resctrlfs(void);
int umount_resctrlfs(void);
int validate_bw_report_request(char *bw_report);
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 43ca026c6e0f..d210daafc5af 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -415,7 +415,7 @@ static void initialize_mem_bw_resctrl(const char *ctrlgrp, const char *mongrp,
{
int resource_id;

- if (get_resource_id(cpu_no, &resource_id) < 0) {
+ if (get_resource_id("MB", cpu_no, &resource_id) < 0) {
perror("Could not get resource_id");
return;
}
@@ -584,7 +584,7 @@ static void initialize_llc_occu_resctrl(const char *ctrlgrp, const char *mongrp,
{
int resource_id;

- if (get_resource_id(cpu_no, &resource_id) < 0) {
+ if (get_resource_id("L3", cpu_no, &resource_id) < 0) {
perror("# Unable to resource_id");
return;
}
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index b98d88f7cbc4..8e79b646d7cb 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -111,24 +111,33 @@ static int get_cache_level(const char *cache_type)
return -1;
}

+static int get_resource_cache_level(const char *resource)
+{
+ /* "MB" use L3 (LLC) resource id */
+ if (!strcmp(resource, "MB"))
+ return 3;
+ return get_cache_level(resource);
+}
+
/*
- * get_resource_id - Get socket number/l3 id for a specified CPU
- * @cpu_no: CPU number
- * @resource_id: Socket number or l3_id
+ * get_resource_id - Get resctrl resource id for a specified CPU
+ * @resource: resource
+ * @cpu_no: CPU number
+ * @resource_id: resource id (cache id; for MB, L3 cache id)
*
* Return: >= 0 on success, < 0 on failure.
*/
-int get_resource_id(int cpu_no, int *resource_id)
+int get_resource_id(const char *resource, int cpu_no, int *resource_id)
{
char phys_pkg_path[1024];
+ int cache_num;
FILE *fp;

- if (get_vendor() == ARCH_AMD)
- sprintf(phys_pkg_path, "%s%d/cache/index3/id",
- PHYS_ID_PATH, cpu_no);
- else
- sprintf(phys_pkg_path, "%s%d/topology/physical_package_id",
- PHYS_ID_PATH, cpu_no);
+ cache_num = get_resource_cache_level(resource);
+ if (cache_num < 0)
+ return cache_num;
+
+ sprintf(phys_pkg_path, "%s%d/cache/index%d/id", PHYS_ID_PATH, cpu_no, cache_num);

fp = fopen(phys_pkg_path, "r");
if (!fp) {
@@ -526,7 +535,7 @@ int write_schemata(char *ctrlgrp, char *schemata, int cpu_no, const char *resour
return -1;
}

- if (get_resource_id(cpu_no, &resource_id) < 0) {
+ if (get_resource_id(resource, cpu_no, &resource_id) < 0) {
sprintf(reason, "Failed to get resource id");
ret = -1;

--
2.30.2

2023-10-24 09:36:28

by Ilpo Järvinen

[permalink] [raw]
Subject: [PATCH 23/24] selftests/resctrl: Add L2 CAT test

CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
support.

Add L2 CAT selftest. As measuring L2 misses is not easily available
with perf, use L3 accesses as a proxy for L2 CAT working or not.

Signed-off-by: Ilpo Järvinen <[email protected]>
---
tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
tools/testing/selftests/resctrl/resctrl.h | 1 +
.../testing/selftests/resctrl/resctrl_tests.c | 1 +
3 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 48a96acd9e31..a9c72022bb5a 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -131,8 +131,47 @@ void cat_test_cleanup(void)
remove(RESULT_FILE_NAME);
}

+/*
+ * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
+ * because perf cannot directly provide the number of L2 misses (there are
+ * only platform specific ways to get the number of L2 misses).
+ *
+ * This function sets up L3 CAT to reduce noise from other processes during
+ * L2 CAT test.
+ */
+int l3_proxy_prepare(const struct resctrl_test *test, struct resctrl_val_param *param, int cpu)
+{
+ unsigned long l3_mask, split_mask;
+ unsigned int start;
+ int count_of_bits;
+ char schemata[64];
+ int n, ret;
+
+ if (!validate_resctrl_feature_request("L3", NULL)) {
+ ksft_print_msg("%s test results may contain noise because L3 CAT is not available!\n",
+ test->name);
+ return 0;
+ }
+
+ ret = get_mask_no_shareable("L3", &l3_mask);
+ if (ret)
+ return ret;
+ count_of_bits = count_contiguous_bits(l3_mask, &start);
+ n = count_of_bits / 2;
+ split_mask = create_bit_mask(start, n);
+
+ snprintf(schemata, sizeof(schemata), "%lx", l3_mask & ~split_mask);
+ ret = write_schemata("", schemata, cpu, "L3");
+ if (ret)
+ return ret;
+
+ snprintf(schemata, sizeof(schemata), "%lx", split_mask);
+ return write_schemata(param->ctrlgrp, schemata, cpu, "L3");
+}
+
/*
* cat_test: execute CAT benchmark and measure LLC cache misses
+ * @test: test information structure
* @param: parameters passed to cat_test()
* @span: buffer size for the benchmark
* @current_mask start mask for the first iteration
@@ -142,9 +181,10 @@ void cat_test_cleanup(void)
*
* Return: 0 on success. non-zero on failure.
*/
-static int cat_test(struct resctrl_val_param *param, const char *resource,
+static int cat_test(const struct resctrl_test *test, struct resctrl_val_param *param,
size_t span, unsigned long current_mask)
{
+ __u64 pea_config = PERF_COUNT_HW_CACHE_MISSES;
char *resctrl_val = param->resctrl_val;
static struct perf_event_read pe_read;
struct perf_event_attr pea;
@@ -169,20 +209,26 @@ static int cat_test(struct resctrl_val_param *param, const char *resource,
if (ret)
return ret;

+ if (!strcmp(test->resource, "L2")) {
+ ret = l3_proxy_prepare(test, param, param->cpu_no);
+ if (ret)
+ return ret;
+ pea_config = PERF_COUNT_HW_CACHE_REFERENCES;
+ }
+ perf_event_attr_initialize(&pea, pea_config);
+ perf_event_initialize_read_format(&pe_read);
+
buf = alloc_buffer(span, 1);
if (buf == NULL)
return -1;

- perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
- perf_event_initialize_read_format(&pe_read);
-
while (current_mask) {
snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
- ret = write_schemata("", schemata, param->cpu_no, resource);
+ ret = write_schemata("", schemata, param->cpu_no, test->resource);
if (ret)
goto free_buf;
snprintf(schemata, sizeof(schemata), "%lx", current_mask);
- ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, resource);
+ ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, test->resource);
if (ret)
goto free_buf;

@@ -269,7 +315,7 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param

remove(param.filename);

- ret = cat_test(&param, test->resource, span, start_mask);
+ ret = cat_test(test, &param, span, start_mask);
if (ret)
goto out;

@@ -288,3 +334,11 @@ struct resctrl_test l3_cat_test = {
.feature_check = test_resource_feature_check,
.run_test = cat_run_test,
};
+
+struct resctrl_test l2_cat_test = {
+ .name = "L2_CAT",
+ .group = "CAT",
+ .resource = "L2",
+ .feature_check = test_resource_feature_check,
+ .run_test = cat_run_test,
+};
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index f9a4cfd981f8..fffeb442c173 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -183,5 +183,6 @@ extern struct resctrl_test mbm_test;
extern struct resctrl_test mba_test;
extern struct resctrl_test cmt_test;
extern struct resctrl_test l3_cat_test;
+extern struct resctrl_test l2_cat_test;

#endif /* RESCTRL_H */
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index d89179541d7b..9e254bca6c25 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -15,6 +15,7 @@ static struct resctrl_test *resctrl_tests[] = {
&mba_test,
&cmt_test,
&l3_cat_test,
+ &l2_cat_test,
};

static int detect_vendor(void)
--
2.30.2

2023-10-27 11:46:30

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 12/24] selftests/resctrl: Improve perf init

On 2023-10-24 at 12:26:22 +0300, Ilpo J?rvinen wrote:
>struct perf_event_attr initialization is spread into
>perf_event_initialize() and perf_event_attr_initialize() and setting
>->config is hardcoded by the deepest level.
>
>perf_event_attr init belongs to perf_event_attr_initialize() so move it
>entirely there. Rename the other function
>perf_event_initialized_read_format().
>
>Call each init function directly from the test as they will take
>different parameters (especially tue after the perf related global

"tue" -> ""?

>variables are moved to local variables).
>
>Signed-off-by: Ilpo J?rvinen <[email protected]>

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 11:48:29

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 13/24] selftests/resctrl: Convert perf related globals to locals

On 2023-10-24 at 12:26:23 +0300, Ilpo J?rvinen wrote:
>Perf related variables pea_llc_miss, pe_read, and pe_fd are globals in
>cache.c.
>
>Convert them to locals for better scoping and make pea_llc_miss simpler
>by renaming it to pea. Make close(pe_fd) handling easier to understand
>by doing it inside cat_val().
>
>Make also sizeof()s use safer way determine the right struct.

"use safer way determine" -> "use safer way to determine"?

>
>Signed-off-by: Ilpo J?rvinen <[email protected]>

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 11:52:03

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 14/24] selftests/resctrl: Move cat_val() to cat_test.c and rename to cat_test()

On 2023-10-24 at 12:26:24 +0300, Ilpo J?rvinen wrote:
>diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>index 2106cc3601d9..e71690a9bbb3 100644
>--- a/tools/testing/selftests/resctrl/cat_test.c
>+++ b/tools/testing/selftests/resctrl/cat_test.c
>@@ -111,6 +111,77 @@ void cat_test_cleanup(void)
> remove(RESULT_FILE_NAME2);
> }
>
>+/*
>+ * cat_test: execute CAT benchmark and measure LLC cache misses
>+ * @param: parameters passed to cat_test()
>+ * @span: buffer size for the benchmark
>+ *
>+ * Return: 0 on success. non-zero on failure.
>+ */
>+static int cat_test(struct resctrl_val_param *param, size_t span)
>+{
>+ int memflush = 1, operation = 0, ret = 0;
>+ char *resctrl_val = param->resctrl_val;
>+ static struct perf_event_read pe_read;

Is there a reason why this struct is declared as static?

>+ struct perf_event_attr pea;
>+ pid_t bm_pid;
>+ int pe_fd;
>+
>+ if (strcmp(param->filename, "") == 0)
>+ sprintf(param->filename, "stdio");
>+
>+ bm_pid = getpid();
>+
>+ /* Taskset benchmark to specified cpu */
>+ ret = taskset_benchmark(bm_pid, param->cpu_no);
>+ if (ret)
>+ return ret;
>+
>+ /* Write benchmark to specified con_mon grp, mon_grp in resctrl FS*/
>+ ret = write_bm_pid_to_resctrl(bm_pid, param->ctrlgrp, param->mongrp,
>+ resctrl_val);
>+ if (ret)
>+ return ret;
>+
>+ perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
>+ perf_event_initialize_read_format(&pe_read);
>+
>+ /* Test runs until the callback setup() tells the test to stop. */
>+ while (1) {
>+ ret = param->setup(param);
>+ if (ret == END_OF_TESTS) {
>+ ret = 0;
>+ break;
>+ }
>+ if (ret < 0)
>+ break;
>+ pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
>+ if (pe_fd < 0) {
>+ ret = -1;
>+ break;
>+ }
>+
>+ if (run_fill_buf(span, memflush, operation, true)) {
>+ fprintf(stderr, "Error-running fill buffer\n");
>+ ret = -1;
>+ goto pe_close;
>+ }
>+
>+ sleep(1);
>+ ret = perf_event_measure(pe_fd, &pe_read, param, bm_pid);
>+ if (ret)
>+ goto pe_close;
>+
>+ close(pe_fd);
>+ }
>+
>+ return ret;
>+
>+pe_close:
>+ close(pe_fd);
>+ return ret;
>+}
>+
> int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> {
> unsigned long l_mask, l_mask_1;

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 12:05:58

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

On 2023-10-24 at 12:26:26 +0300, Ilpo J?rvinen wrote:
>CAT test spawns two processes into two different control groups with
>exclusive schemata. Both the processes alloc a buffer from memory
>matching their allocated LLC block size and flush the entire buffer out
>of caches. Since the processes are reading through the buffer only once
>during the measurement and initially all the buffer was flushed, the
>test isn't testing CAT.
>
>Rewrite the CAT test to allocate a buffer sized to half of LLC. Then
>perform a sequence of tests with different LLC alloc sizes starting
>from half of the CBM bits down to 1-bit CBM. Flush the buffer before
>each test and read the buffer twice. Observe the LLC misses on the
>second read through the buffer. As the allocated LLC block gets smaller
>and smaller, the LLC misses will become larger and larger giving a
>strong signal on CAT working properly.
>
>The new CAT test is using only a single process because it relies on
>measured effect against another run of itself rather than another
>process adding noise. The rest of the system is allocated the CBM bits
>not used by the CAT test to keep the test isolated.
>
>Replace count_bits() with count_contiguous_bits() to get the first bit
>position in order to be able to calculate masks based on it.
>
>This change has been tested with a number of systems from different
>generations.
>
>Suggested-by: Reinette Chatre <[email protected]>
>Signed-off-by: Ilpo J?rvinen <[email protected]>
>---
> tools/testing/selftests/resctrl/cat_test.c | 286 +++++++++-----------
> tools/testing/selftests/resctrl/fill_buf.c | 6 +-
> tools/testing/selftests/resctrl/resctrl.h | 5 +-
> tools/testing/selftests/resctrl/resctrlfs.c | 44 +--
> 4 files changed, 137 insertions(+), 204 deletions(-)
>
>diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>index e71690a9bbb3..7518c520c5cc 100644
>--- a/tools/testing/selftests/resctrl/cat_test.c
>+++ b/tools/testing/selftests/resctrl/cat_test.c
>@@ -11,65 +11,68 @@
> #include "resctrl.h"
> #include <unistd.h>
>
>-#define RESULT_FILE_NAME1 "result_cat1"
>-#define RESULT_FILE_NAME2 "result_cat2"
>+#define RESULT_FILE_NAME "result_cat"
> #define NUM_OF_RUNS 5
>-#define MAX_DIFF_PERCENT 4
>-#define MAX_DIFF 1000000
>
> /*
>- * Change schemata. Write schemata to specified
>- * con_mon grp, mon_grp in resctrl FS.
>- * Run 5 times in order to get average values.
>+ * Minimum difference in LLC misses between a test with n+1 bits CBM mask to
>+ * the test with n bits. With e.g. 5 vs 4 bits in the CBM mask, the minimum
>+ * difference must be at least MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
>+ *
>+ * The relationship between number of used CBM bits and difference in LLC
>+ * misses is not expected to be linear. With a small number of bits, the
>+ * margin is smaller than with larger number of bits. For selftest purposes,
>+ * however, linear approach is enough because ultimately only pass/fail
>+ * decision has to be made and distinction between strong and stronger
>+ * signal is irrelevant.
> */
>-static int cat_setup(struct resctrl_val_param *p)
>-{
>- char schemata[64];
>- int ret = 0;
>-
>- /* Run NUM_OF_RUNS times */
>- if (p->num_of_runs >= NUM_OF_RUNS)
>- return END_OF_TESTS;
>-
>- if (p->num_of_runs == 0) {
>- sprintf(schemata, "%lx", p->mask);
>- ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no,
>- p->resctrl_val);
>- }
>- p->num_of_runs++;
>-
>- return ret;
>-}
>+#define MIN_DIFF_PERCENT_PER_BIT 1
>
> static int show_results_info(__u64 sum_llc_val, int no_of_bits,
>- unsigned long cache_span, unsigned long max_diff,
>- unsigned long max_diff_percent, unsigned long num_of_runs,
>- bool platform)
>+ unsigned long cache_span, long min_diff_percent,
>+ unsigned long num_of_runs, bool platform,
>+ __s64 *prev_avg_llc_val)
> {
> __u64 avg_llc_val = 0;
>- float diff_percent;
>- int ret;
>+ float avg_diff;
>+ int ret = 0;
>
> avg_llc_val = sum_llc_val / num_of_runs;
>- diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
>+ if (*prev_avg_llc_val) {
>+ float delta = (__s64)(avg_llc_val - *prev_avg_llc_val);
>
>- ret = platform && abs((int)diff_percent) > max_diff_percent;
>+ avg_diff = delta / *prev_avg_llc_val;
>+ ret = platform && (avg_diff * 100) < (float)min_diff_percent;
>
>- ksft_print_msg("%s Check cache miss rate within %lu%%\n",
>- ret ? "Fail:" : "Pass:", max_diff_percent);
>+ ksft_print_msg("%s Check cache miss rate changed more than %.1f%%\n",
>+ ret ? "Fail:" : "Pass:", (float)min_diff_percent);

Shouldn't "Fail" and "Pass" be flipped in the ternary operator? Or the condition
sign above "<" should be ">"?

Now it looks like if (avg_diff * 100) is smaller than the min_diff_percent the
test is supposed to fail but the text suggests it's the other way around.

I also ran this selftest and that's the output:

# Pass: Check cache miss rate changed more than 3.0%
# Percent diff=45.8
# Number of bits: 4
# Average LLC val: 322489
# Cache span (lines): 294912
# Pass: Check cache miss rate changed more than 2.0%
# Percent diff=38.0
# Number of bits: 3
# Average LLC val: 445005
# Cache span (lines): 221184
# Pass: Check cache miss rate changed more than 1.0%
# Percent diff=27.2
# Number of bits: 2
# Average LLC val: 566145
# Cache span (lines): 147456
# Pass: Check cache miss rate changed more than 0.0%
# Percent diff=18.3
# Number of bits: 1
# Average LLC val: 669657
# Cache span (lines): 73728
ok 1 CAT: test

The diff percentages are much larger than the thresholds they're supposed to
be within and the test is passed.

>- ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
>+ ksft_print_msg("Percent diff=%.1f\n", avg_diff * 100);
>+ }
>+ *prev_avg_llc_val = avg_llc_val;
>
> show_cache_info(no_of_bits, avg_llc_val, cache_span, true);
>
> return ret;
> }
>
>@@ -143,54 +168,64 @@ static int cat_test(struct resctrl_val_param *param, size_t span)
> if (ret)
> return ret;
>
>+ buf = alloc_buffer(span, 1);
>+ if (buf == NULL)

Similiar to patch 01/24, wouldn't this:
if (!buf)
be better?

>+ return -1;
>+

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 12:08:36

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 17/24] selftests/resctrl: Create struct for input parameter

On 2023-10-24 at 12:26:27 +0300, Ilpo J?rvinen wrote:
>resctrl_tests reads a set of parameters and passes them individually
>for each tests. The way the parameters passed varies between tests.

"the parameters passed" -> "the parameters are passed"?

>
>Add struct input_params to hold are input parameters. It can be easily

"to hold are input parameters" -> "to hold input parameters"?

>passed to every test without varying the call signature.
>
>Signed-off-by: Ilpo J?rvinen <[email protected]>

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 12:09:43

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 20/24] selftests/resctrl: Add helper to convert L2/3 to integer

On 2023-10-24 at 12:26:30 +0300, Ilpo J?rvinen wrote:
>"L2"/"L3" conversion to integer in embedded into get_cache_size()

"in embedded" -> "is embedded"?

>which prevents reuse.
>
>Create a helper for the cache string to integer conversion to make
>it reusable.
>
>Signed-off-by: Ilpo J?rvinen <[email protected]>

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 12:18:32

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 14/24] selftests/resctrl: Move cat_val() to cat_test.c and rename to cat_test()

On Fri, 27 Oct 2023, Maciej Wiecz?r-Retman wrote:

> On 2023-10-24 at 12:26:24 +0300, Ilpo J?rvinen wrote:
> >diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> >index 2106cc3601d9..e71690a9bbb3 100644
> >--- a/tools/testing/selftests/resctrl/cat_test.c
> >+++ b/tools/testing/selftests/resctrl/cat_test.c
> >@@ -111,6 +111,77 @@ void cat_test_cleanup(void)
> > remove(RESULT_FILE_NAME2);
> > }
> >
> >+/*
> >+ * cat_test: execute CAT benchmark and measure LLC cache misses
> >+ * @param: parameters passed to cat_test()
> >+ * @span: buffer size for the benchmark
> >+ *
> >+ * Return: 0 on success. non-zero on failure.
> >+ */
> >+static int cat_test(struct resctrl_val_param *param, size_t span)
> >+{
> >+ int memflush = 1, operation = 0, ret = 0;
> >+ char *resctrl_val = param->resctrl_val;
> >+ static struct perf_event_read pe_read;
>
> Is there a reason why this struct is declared as static?

Good catch.

I'll change the earlier patch which made the global -> local var move and
failed to remove the static keyword.

--
i.

2023-10-27 12:33:27

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

On Fri, 27 Oct 2023, Maciej Wiecz?r-Retman wrote:

> On 2023-10-24 at 12:26:26 +0300, Ilpo J?rvinen wrote:
> >CAT test spawns two processes into two different control groups with
> >exclusive schemata. Both the processes alloc a buffer from memory
> >matching their allocated LLC block size and flush the entire buffer out
> >of caches. Since the processes are reading through the buffer only once
> >during the measurement and initially all the buffer was flushed, the
> >test isn't testing CAT.
> >
> >Rewrite the CAT test to allocate a buffer sized to half of LLC. Then
> >perform a sequence of tests with different LLC alloc sizes starting
> >from half of the CBM bits down to 1-bit CBM. Flush the buffer before
> >each test and read the buffer twice. Observe the LLC misses on the
> >second read through the buffer. As the allocated LLC block gets smaller
> >and smaller, the LLC misses will become larger and larger giving a
> >strong signal on CAT working properly.
> >
> >The new CAT test is using only a single process because it relies on
> >measured effect against another run of itself rather than another
> >process adding noise. The rest of the system is allocated the CBM bits
> >not used by the CAT test to keep the test isolated.
> >
> >Replace count_bits() with count_contiguous_bits() to get the first bit
> >position in order to be able to calculate masks based on it.
> >
> >This change has been tested with a number of systems from different
> >generations.
> >
> >Suggested-by: Reinette Chatre <[email protected]>
> >Signed-off-by: Ilpo J?rvinen <[email protected]>
> >---
> > tools/testing/selftests/resctrl/cat_test.c | 286 +++++++++-----------
> > tools/testing/selftests/resctrl/fill_buf.c | 6 +-
> > tools/testing/selftests/resctrl/resctrl.h | 5 +-
> > tools/testing/selftests/resctrl/resctrlfs.c | 44 +--
> > 4 files changed, 137 insertions(+), 204 deletions(-)
> >
> >diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> >index e71690a9bbb3..7518c520c5cc 100644
> >--- a/tools/testing/selftests/resctrl/cat_test.c
> >+++ b/tools/testing/selftests/resctrl/cat_test.c
> >@@ -11,65 +11,68 @@
> > #include "resctrl.h"
> > #include <unistd.h>
> >
> >-#define RESULT_FILE_NAME1 "result_cat1"
> >-#define RESULT_FILE_NAME2 "result_cat2"
> >+#define RESULT_FILE_NAME "result_cat"
> > #define NUM_OF_RUNS 5
> >-#define MAX_DIFF_PERCENT 4
> >-#define MAX_DIFF 1000000
> >
> > /*
> >- * Change schemata. Write schemata to specified
> >- * con_mon grp, mon_grp in resctrl FS.
> >- * Run 5 times in order to get average values.
> >+ * Minimum difference in LLC misses between a test with n+1 bits CBM mask to
> >+ * the test with n bits. With e.g. 5 vs 4 bits in the CBM mask, the minimum
> >+ * difference must be at least MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
> >+ *
> >+ * The relationship between number of used CBM bits and difference in LLC
> >+ * misses is not expected to be linear. With a small number of bits, the
> >+ * margin is smaller than with larger number of bits. For selftest purposes,
> >+ * however, linear approach is enough because ultimately only pass/fail
> >+ * decision has to be made and distinction between strong and stronger
> >+ * signal is irrelevant.
> > */
> >-static int cat_setup(struct resctrl_val_param *p)
> >-{
> >- char schemata[64];
> >- int ret = 0;
> >-
> >- /* Run NUM_OF_RUNS times */
> >- if (p->num_of_runs >= NUM_OF_RUNS)
> >- return END_OF_TESTS;
> >-
> >- if (p->num_of_runs == 0) {
> >- sprintf(schemata, "%lx", p->mask);
> >- ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no,
> >- p->resctrl_val);
> >- }
> >- p->num_of_runs++;
> >-
> >- return ret;
> >-}
> >+#define MIN_DIFF_PERCENT_PER_BIT 1
> >
> > static int show_results_info(__u64 sum_llc_val, int no_of_bits,
> >- unsigned long cache_span, unsigned long max_diff,
> >- unsigned long max_diff_percent, unsigned long num_of_runs,
> >- bool platform)
> >+ unsigned long cache_span, long min_diff_percent,
> >+ unsigned long num_of_runs, bool platform,
> >+ __s64 *prev_avg_llc_val)
> > {
> > __u64 avg_llc_val = 0;
> >- float diff_percent;
> >- int ret;
> >+ float avg_diff;
> >+ int ret = 0;
> >
> > avg_llc_val = sum_llc_val / num_of_runs;
> >- diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
> >+ if (*prev_avg_llc_val) {
> >+ float delta = (__s64)(avg_llc_val - *prev_avg_llc_val);
> >
> >- ret = platform && abs((int)diff_percent) > max_diff_percent;
> >+ avg_diff = delta / *prev_avg_llc_val;
> >+ ret = platform && (avg_diff * 100) < (float)min_diff_percent;
> >
> >- ksft_print_msg("%s Check cache miss rate within %lu%%\n",
> >- ret ? "Fail:" : "Pass:", max_diff_percent);
> >+ ksft_print_msg("%s Check cache miss rate changed more than %.1f%%\n",
> >+ ret ? "Fail:" : "Pass:", (float)min_diff_percent);
>
> Shouldn't "Fail" and "Pass" be flipped in the ternary operator? Or the condition
> sign above "<" should be ">"?

I must not touch ret ? "Fail:" : "Pass:" logic, it's the correct way
around. If I'd touch it, it'd break what the calling code assumes about
the return value.

(More explanation below).

> Now it looks like if (avg_diff * 100) is smaller than the min_diff_percent the
> test is supposed to fail but the text suggests it's the other way around.
>
> I also ran this selftest and that's the output:
>
> # Pass: Check cache miss rate changed more than 3.0%
> # Percent diff=45.8
> # Number of bits: 4
> # Average LLC val: 322489
> # Cache span (lines): 294912
> # Pass: Check cache miss rate changed more than 2.0%
> # Percent diff=38.0
> # Number of bits: 3
> # Average LLC val: 445005
> # Cache span (lines): 221184
> # Pass: Check cache miss rate changed more than 1.0%
> # Percent diff=27.2
> # Number of bits: 2
> # Average LLC val: 566145
> # Cache span (lines): 147456
> # Pass: Check cache miss rate changed more than 0.0%
> # Percent diff=18.3
> # Number of bits: 1
> # Average LLC val: 669657
> # Cache span (lines): 73728
> ok 1 CAT: test
>
> The diff percentages are much larger than the thresholds they're supposed to
> be within and the test is passed.

No, the whole test logic is changed dramatically by this patch and
failure logic is reverse now because of it. Note how I also altered these
things:

- MAX_DIFF_PERCENT -> MIN_DIFF_PERCENT_PER_BIT
- max_diff_percent -> min_diff_percent
- "cache miss rate within" -> "cache miss rate changed more than"

The new CAT test measures the # of cache misses (or in case of L2 CAT
test, LLC accesses which is used as a proxy for L2 misses). Then it takes
one bit away from the allocation mask and repeats the measurement.

If the # of LLC misses changes more than min_diff_precent when the
number of bits in the allocation was changed, it is a strong indicator CAT
is working like it should. Based on your numbers above, I'm extremely
confident CAT works as expected!

I know for a fact that when the selftest is bound to a wrong resource id
(which actually occurs on broadwell's with CoD enabled without one of the
later patches in this series), this test is guaranteed to fail 100%,
there's no noticeable difference measured in LLC misses in that case.

> >@@ -143,54 +168,64 @@ static int cat_test(struct resctrl_val_param *param, size_t span)
> > if (ret)
> > return ret;
> >
> >+ buf = alloc_buffer(span, 1);
> >+ if (buf == NULL)
>
> Similiar to patch 01/24, wouldn't this:
> if (!buf)
> be better?

I've already changed this based on the comment you made against 1/24 :-).

--
i.

2023-10-27 12:42:55

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 21/24] selftests/resctrl: Get resource id from cache id

On 2023-10-24 at 12:26:31 +0300, Ilpo J?rvinen wrote:
>Resource id is acquired differently depending on CPU. AMD tests use id
>from L3 cache, whereas CPUs from other vendors base the id on topology
>package id. In order to support L2 CAT test, this has to be
>generalized.
>
>The driver side code seems to get the resource ids from cache ids so
>the approach used by the AMD branch seems to match the kernel-side
>code. It will also work with L2 resource IDs as long as the cache level
>is generalized.
>
>Using the topology id was always fragile due to mismatch with the
>kernel-side way to acquire the resource id. It got incorrect resource
>id, e.g., when Cluster-on-Die (CoD) is enabled for CPU (but CoD is not
>well suited for resctrl in the first place so it has not been a big
>issues if test don't work correctly with it).

"not been a big issues" -> "not been a big issue"?

"if test don't work correctly" -> "if test doesn't work correctly" / "if tests
don't work correctly"?

>
>Taking all the above into account, generalize the resource id
>calculation by taking it from the cache id and do not hard-code the
>cache level.

In x86/resctrl files group of CPUs sharing a resctrl resource is called a
domain. Because of that I think "resource id" terms should be substituted with
"domain id" to match core resctrl code.

>
>Signed-off-by: Ilpo J?rvinen <[email protected]>
>---
> tools/testing/selftests/resctrl/resctrl.h | 2 +-
> tools/testing/selftests/resctrl/resctrl_val.c | 4 +--
> tools/testing/selftests/resctrl/resctrlfs.c | 31 ++++++++++++-------
> 3 files changed, 23 insertions(+), 14 deletions(-)
>

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 12:43:20

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 22/24] selftests/resctrl: Add test groups and name L3 CAT test L3_CAT

On 2023-10-24 at 12:26:32 +0300, Ilpo J?rvinen wrote:
>To select test to run -t parameter can be used. However, -t cat
>currently maps to L3 CAT test which is confusing after L2 CAT test is
>be the next change.
>
>Allow selecting tests as groups and call L3 CAT test L3_CAT, "CAT"
>group will enable to both L3 and L2 CAT tests.

"enable to both" -> "enable both"?

>
>Signed-off-by: Ilpo J?rvinen <[email protected]>

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 12:43:49

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 17/24] selftests/resctrl: Create struct for input parameter

On Fri, 27 Oct 2023, Maciej Wiecz?r-Retman wrote:

> On 2023-10-24 at 12:26:27 +0300, Ilpo J?rvinen wrote:
> >resctrl_tests reads a set of parameters and passes them individually
> >for each tests. The way the parameters passed varies between tests.
>
> "the parameters passed" -> "the parameters are passed"?

I ended up rewriting this sentence completely to make it more obvious
it's about call signature variations.

> >Add struct input_params to hold are input parameters. It can be easily
>
> "to hold are input parameters" -> "to hold input parameters"?

I intended to write "to hold all input parameters".

--
i.

2023-10-27 12:47:51

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 23/24] selftests/resctrl: Add L2 CAT test

On 2023-10-24 at 12:26:33 +0300, Ilpo J?rvinen wrote:
>CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
>support.

Is there some some defined line since what CPU model is L2 CAT supported?

In my opinion, from the perspective of someone digging up this commit a couple
years from now it could be handy to have something more specific instead of
"some newer CPUs".

>
>Add L2 CAT selftest. As measuring L2 misses is not easily available
>with perf, use L3 accesses as a proxy for L2 CAT working or not.
>
>Signed-off-by: Ilpo J?rvinen <[email protected]>

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 12:48:39

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 24/24] selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits

On 2023-10-24 at 12:26:34 +0300, Ilpo J?rvinen wrote:
>L2 CAT test with low number of bits tends to occasionally fail because
>of what seems random variation. The margin is quite small to begin with
>for <= 2 bits in CBM. At times, the result can even become negative.
>While it would be possible to allow negative values for those cases, it
>would be more confusing to user.

"to user" -> "to the user"?

>
>Ignore failures from the tests where <= 2 were used to avoid false
>negative results.
>
>Signed-off-by: Ilpo J?rvinen <[email protected]>

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-27 13:32:45

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 21/24] selftests/resctrl: Get resource id from cache id

On Fri, 27 Oct 2023, Maciej Wiecz?r-Retman wrote:

> On 2023-10-24 at 12:26:31 +0300, Ilpo J?rvinen wrote:
> >Resource id is acquired differently depending on CPU. AMD tests use id
> >from L3 cache, whereas CPUs from other vendors base the id on topology
> >package id. In order to support L2 CAT test, this has to be
> >generalized.
> >
> >The driver side code seems to get the resource ids from cache ids so
> >the approach used by the AMD branch seems to match the kernel-side
> >code. It will also work with L2 resource IDs as long as the cache level
> >is generalized.
> >
> >Using the topology id was always fragile due to mismatch with the
> >kernel-side way to acquire the resource id. It got incorrect resource
> >id, e.g., when Cluster-on-Die (CoD) is enabled for CPU (but CoD is not
> >well suited for resctrl in the first place so it has not been a big
> >issues if test don't work correctly with it).
>
> "not been a big issues" -> "not been a big issue"?
>
> "if test don't work correctly" -> "if test doesn't work correctly" / "if tests
> don't work correctly"?
>
> >
> >Taking all the above into account, generalize the resource id
> >calculation by taking it from the cache id and do not hard-code the
> >cache level.
>
> In x86/resctrl files group of CPUs sharing a resctrl resource is called a
> domain. Because of that I think "resource id" terms should be substituted with
> "domain id" to match core resctrl code.

Okay, I was just using the terminology which pre-existed in the selftest
code.

I've really tried to look how I should call it but things were quite
inconsistent. The selftest uses resource id and reads it from cache id.
Documentation uses "instances of that resource" or "cache id" or "domain
x".


Documentation/arch/x86/resctrl.rst is very misleading btw and should be
IMHO updated:

Memory bandwidth Allocation (default mode)
------------------------------------------

Memory b/w domain is L3 cache.
::

MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...

It claims "domain" is L3 cache (and the value we're talking about is
called "cache_id" here which is what you just called "domain id"). I
suppose it should say "b/w resource is L3 cache"?


--
i.

2023-10-31 07:24:33

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

On 2023-10-27 at 15:32:58 +0300, Ilpo J?rvinen wrote:
>On Fri, 27 Oct 2023, Maciej Wiecz?r-Retman wrote:
>> On 2023-10-24 at 12:26:26 +0300, Ilpo J?rvinen wrote:
>> >- ksft_print_msg("%s Check cache miss rate within %lu%%\n",
>> >- ret ? "Fail:" : "Pass:", max_diff_percent);
>> >+ ksft_print_msg("%s Check cache miss rate changed more than %.1f%%\n",
>> >+ ret ? "Fail:" : "Pass:", (float)min_diff_percent);
>>
>> Shouldn't "Fail" and "Pass" be flipped in the ternary operator? Or the condition
>> sign above "<" should be ">"?
>
>I must not touch ret ? "Fail:" : "Pass:" logic, it's the correct way
>around. If I'd touch it, it'd break what the calling code assumes about
>the return value.
>
>(More explanation below).
>
>> Now it looks like if (avg_diff * 100) is smaller than the min_diff_percent the
>> test is supposed to fail but the text suggests it's the other way around.
>>
>> I also ran this selftest and that's the output:
>>
>> # Pass: Check cache miss rate changed more than 3.0%
>> # Percent diff=45.8
>> # Number of bits: 4
>> # Average LLC val: 322489
>> # Cache span (lines): 294912
>> # Pass: Check cache miss rate changed more than 2.0%
>> # Percent diff=38.0
>> # Number of bits: 3
>> # Average LLC val: 445005
>> # Cache span (lines): 221184
>> # Pass: Check cache miss rate changed more than 1.0%
>> # Percent diff=27.2
>> # Number of bits: 2
>> # Average LLC val: 566145
>> # Cache span (lines): 147456
>> # Pass: Check cache miss rate changed more than 0.0%
>> # Percent diff=18.3
>> # Number of bits: 1
>> # Average LLC val: 669657
>> # Cache span (lines): 73728
>> ok 1 CAT: test
>>
>> The diff percentages are much larger than the thresholds they're supposed to
>> be within and the test is passed.
>
>No, the whole test logic is changed dramatically by this patch and
>failure logic is reverse now because of it. Note how I also altered these
>things:
>
>- MAX_DIFF_PERCENT -> MIN_DIFF_PERCENT_PER_BIT
>- max_diff_percent -> min_diff_percent
>- "cache miss rate within" -> "cache miss rate changed more than"
>
>The new CAT test measures the # of cache misses (or in case of L2 CAT
>test, LLC accesses which is used as a proxy for L2 misses). Then it takes
>one bit away from the allocation mask and repeats the measurement.
>
>If the # of LLC misses changes more than min_diff_precent when the
>number of bits in the allocation was changed, it is a strong indicator CAT
>is working like it should. Based on your numbers above, I'm extremely
>confident CAT works as expected!
>
>I know for a fact that when the selftest is bound to a wrong resource id
>(which actually occurs on broadwell's with CoD enabled without one of the
>later patches in this series), this test is guaranteed to fail 100%,
>there's no noticeable difference measured in LLC misses in that case.

Thanks for explaining. Looking at it again the patch makes sense and seems very
coherent.

--
Kind regards
Maciej Wiecz?r-Retman

2023-10-31 07:59:24

by Maciej Wieczor-Retman

[permalink] [raw]
Subject: Re: [PATCH 21/24] selftests/resctrl: Get resource id from cache id

On 2023-10-27 at 16:30:38 +0300, Ilpo J?rvinen wrote:
>On Fri, 27 Oct 2023, Maciej Wiecz?r-Retman wrote:
>
>> On 2023-10-24 at 12:26:31 +0300, Ilpo J?rvinen wrote:
>> >Resource id is acquired differently depending on CPU. AMD tests use id
>> >from L3 cache, whereas CPUs from other vendors base the id on topology
>> >package id. In order to support L2 CAT test, this has to be
>> >generalized.
>> >
>> >The driver side code seems to get the resource ids from cache ids so
>> >the approach used by the AMD branch seems to match the kernel-side
>> >code. It will also work with L2 resource IDs as long as the cache level
>> >is generalized.
>> >
>> >Using the topology id was always fragile due to mismatch with the
>> >kernel-side way to acquire the resource id. It got incorrect resource
>> >id, e.g., when Cluster-on-Die (CoD) is enabled for CPU (but CoD is not
>> >well suited for resctrl in the first place so it has not been a big
>> >issues if test don't work correctly with it).
>>
>> "not been a big issues" -> "not been a big issue"?
>>
>> "if test don't work correctly" -> "if test doesn't work correctly" / "if tests
>> don't work correctly"?
>>
>> >
>> >Taking all the above into account, generalize the resource id
>> >calculation by taking it from the cache id and do not hard-code the
>> >cache level.
>>
>> In x86/resctrl files group of CPUs sharing a resctrl resource is called a
>> domain. Because of that I think "resource id" terms should be substituted with
>> "domain id" to match core resctrl code.
>
>Okay, I was just using the terminology which pre-existed in the selftest
>code.
>
>I've really tried to look how I should call it but things were quite
>inconsistent. The selftest uses resource id and reads it from cache id.
>Documentation uses "instances of that resource" or "cache id" or "domain
>x".
>
>
>Documentation/arch/x86/resctrl.rst is very misleading btw and should be
>IMHO updated:
>
>Memory bandwidth Allocation (default mode)
>------------------------------------------
>
>Memory b/w domain is L3 cache.
>::
>
> MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
>
>It claims "domain" is L3 cache (and the value we're talking about is
>called "cache_id" here which is what you just called "domain id"). I
>suppose it should say "b/w resource is L3 cache"?
>

I believe so. After some looking around it looks like the "L3 domain" naming is
also present in rdtgroup.c:mkdir_mondata_all() and "mon_data" description in the
docs:

Docs:
"mon_data":
This contains a set of files organized by L3 domain and by
RDT event. E.g. on a system with two L3 domains there will

rdtgroup.c:
/*
* This creates a directory mon_data which contains the monitored data.
*
* mon_data has one directory for each domain which are named
* in the format mon_<domain_name>_<domain_id>. For ex: A mon_data
* with L3 domain looks as below:
* ./mon_data:
* mon_L3_00
* mon_L3_01
* mon_L3_02
* ...
*
* Each domain directory has one file per event:
* ./mon_L3_00/:
* llc_occupancy
*
*/
static int mkdir_mondata_all(struct kernfs_node *parent_kn,
struct rdtgroup *prgrp,
struct kernfs_node **dest_kn)

From the reading I've done on resctrl my understanding was that L2/L3 are
resources that can be shared between domains.

--
Kind regards
Maciej Wiecz?r-Retman

2023-11-02 17:48:17

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 05/24] selftests/resctrl: Create cache_size() helper

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> CAT and CMT tests calculate the span size from the n-bits cache
> allocation on their own.
>
> Add cache_size() helper which calculates size of the cache portion for
> the given number of bits and use it to replace the existing span
> calculations. This also prepares for the new CAT test that will need to
> determine the size of the cache portion also during results processing.
>
> cache_size local variables were renamed out of the way to
> cache_total_size.

Please do stick to imperative mood ... "Rename cache_size local
variables ..."


...

> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index 2f3f0ee439d8..da06b2d492f9 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -117,4 +117,18 @@ int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
> unsigned long max_diff_percent, unsigned long num_of_runs,
> bool platform, bool cmt);
>
> +/*
> + * cache_size - Calculate the size of a cache portion
> + * @cache_size: Cache size in bytes
> + * @mask: Cache portion mask
> + * @cache_mask: Full bitmask for the cache
> + *
> + * Return: The size of the cache portion in bytes.
> + */
> +static inline int cache_size(unsigned long cache_size, unsigned long mask,
> + unsigned long cache_mask)
> +{
> + return cache_size * count_bits(mask) / count_bits(cache_mask);
> +}
> +
> #endif /* RESCTRL_H */


The get_cache_size() and cache_size() naming appears similar enough to me
to cause confusion. Considering the "portion" term above, what do you think
of "cache_portion_size()" or even "cache_portion_bytes()"?

Reinette

2023-11-02 17:48:18

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 06/24] selftests/resctrl: Exclude shareable bits from schemata in CAT test

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> CAT test doesn't take shareable bits into account, i.e., the test might
> be sharing cache with some devices (e.g., graphics).
>
> Introduce get_mask_no_shareable() and use it to provision an
> environment for CAT test where the allocated LLC is isolated better.
> Excluding shareable_bits may create hole(s) into the cbm_mask, thus add
> a new helper count_contiguous_bits() to find the longest contiguous set
> of CBM bits.
>
> create_bit_mask() is needed by an upcoming CAT test rewrite so make it
> available in resctrl.h right away.
>
> Signed-off-by: Ilpo Järvinen <[email protected]>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 12 ++-
> tools/testing/selftests/resctrl/resctrl.h | 3 +
> tools/testing/selftests/resctrl/resctrlfs.c | 84 +++++++++++++++++++++
> 3 files changed, 95 insertions(+), 4 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 80861c362a53..e5861e7cba7e 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -92,13 +92,17 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> unsigned long l_mask, l_mask_1;
> int ret, pipefd[2], sibling_cpu_no;
> unsigned long cache_total_size = 0;
> - unsigned long long_mask;
> + unsigned long full_cache_mask, long_mask;

Please do maintain reverse fir order.

> int count_of_bits;
> char pipe_message;
> size_t span;
>
> /* Get default cbm mask for L3/L2 cache */
> - ret = get_cbm_mask(cache_type, &long_mask);
> + ret = get_cbm_mask(cache_type, &full_cache_mask);
> + if (ret)
> + return ret;
> + /* Get the exclusive portion of the cache */
> + ret = get_mask_no_shareable(cache_type, &long_mask);
> if (ret)
> return ret;
>
> +
> +/*
> + * count_contiguous_bits - Returns the longest train of bits in a bit mask
> + * @val A bit mask
> + * @start The location of the least-significant bit of the longest train
> + *
> + * Return: The length of the contiguous bits in the longest train of bits
> + */
> +static unsigned int count_contiguous_bits(unsigned long val, unsigned int *start)
> +{
> + unsigned long last_val;
> + int count = 0;

could count be unsigned int?

> +
> + while (val) {
> + last_val = val;
> + val &= (val >> 1);
> + count++;
> + }
> +
> + if (start) {
> + if (count)
> + *start = ffsl(last_val) - 1;
> + else
> + *start = 0;
> + }
> +
> + return count;
> +}
> +
> /*
> * get_cbm_mask - Get cbm bit mask
> * @cache_type: Cache level L2/L3
> @@ -253,6 +291,52 @@ int get_cbm_mask(const char *cache_type, unsigned long *mask)
> return 0;
> }
>
> +/*
> + * get_shareable_mask - Get shareable mask from shareable_bits for given cache
> + * @cache_type: Cache level L2/L3
> + * @shareable_mask: shareable mask returned as unsigned long

A nitpick but please be consistent in starting sentences with capital letter.

> + *
> + * Return: = 0 on success, < 0 on failure.
> + */
> +int get_shareable_mask(const char *cache_type, unsigned long *shareable_mask)
> +{
> + char mask_path[1024];
> +
> + if (!cache_type)
> + return -1;

Same question as earlier about error checking.

> +
> + snprintf(mask_path, sizeof(mask_path), "%s/%s/shareable_bits",
> + INFO_PATH, cache_type);
> +
> + return get_bit_mask(mask_path, shareable_mask);
> +}
> +
> +/*
> + * get_mask_no_shareable - Get CBM mask without shareable_bits for given cache
> + * @cache_type: Cache level L2/L3
> + * @mask: mask returned as unsigned long

Apart from comment about capital letter this comment does not seem to
reveal more about what code does.

> + *
> + * Return: = 0 on success, < 0 on failure.
> + */
> +int get_mask_no_shareable(const char *cache_type, unsigned long *mask)
> +{
> + unsigned long full_mask, shareable_mask;
> + unsigned int start, len;
> +
> + if (get_cbm_mask(cache_type, &full_mask) < 0)
> + return -1;
> + if (get_shareable_mask(cache_type, &shareable_mask) < 0)
> + return -1;
> +
> + len = count_contiguous_bits(full_mask & ~shareable_mask, &start);
> + if (!len)
> + return -1;
> +
> + *mask = create_bit_mask(start, len);
> +
> + return 0;
> +}

Nicely done, thanks.

> +
> /*
> * get_core_sibling - Get sibling core id from the same socket for given CPU
> * @cpu_no: CPU number


Reinette

2023-11-02 17:48:41

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 07/24] selftests/resctrl: Split measure_cache_vals() function

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> The measure_cache_vals() function does a different thing depending on

No need to say "function" when using "()". The above can just be:
"measure_cache_vals() does a different ..."

> the test case that called it:
> - For CAT, it measures LLC perf misses.
> - For CMT, it measures LLC occupancy through resctrl.
>
> Split these two functionalities into own functions the CAT and CMT
> tests can call directly.
>
> Co-developed-by: Fenghua Yu <[email protected]>
> Signed-off-by: Fenghua Yu <[email protected]>
> Signed-off-by: Ilpo Järvinen <[email protected]>
> ---
> tools/testing/selftests/resctrl/cache.c | 37 ++++++++++---------
> tools/testing/selftests/resctrl/resctrl.h | 2 +-
> tools/testing/selftests/resctrl/resctrl_val.c | 2 +-
> 3 files changed, 21 insertions(+), 20 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> index bcbca356d56a..299d9508221f 100644
> --- a/tools/testing/selftests/resctrl/cache.c
> +++ b/tools/testing/selftests/resctrl/cache.c
> @@ -170,35 +170,36 @@ static int print_results_cache(char *filename, int bm_pid,
> return 0;
> }
>
> -int measure_cache_vals(struct resctrl_val_param *param, int bm_pid)
> +static int measure_llc_perf(struct resctrl_val_param *param, int bm_pid)
> {
> - unsigned long llc_perf_miss = 0, llc_occu_resc = 0, llc_value = 0;
> + unsigned long llc_perf_miss = 0;
> int ret;
>
> /*
> * Measure cache miss from perf.
> */

I'd expect these comments to move as part of a change like this,
but seems like this is merged with other changes in patch 10. Does not
seem like an issue done in this way.

> - if (!strncmp(param->resctrl_val, CAT_STR, sizeof(CAT_STR))) {
> - ret = get_llc_perf(&llc_perf_miss);
> - if (ret < 0)
> - return ret;
> - llc_value = llc_perf_miss;
> - }
> + ret = get_llc_perf(&llc_perf_miss);
> + if (ret < 0)
> + return ret;
> +
> + ret = print_results_cache(param->filename, bm_pid, llc_perf_miss);
> + return ret;
> +}
> +
> +int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid)
> +{
> + unsigned long llc_occu_resc = 0;
> + int ret;
>
> /*
> * Measure llc occupancy from resctrl.
> */
> - if (!strncmp(param->resctrl_val, CMT_STR, sizeof(CMT_STR))) {
> - ret = get_llc_occu_resctrl(&llc_occu_resc);
> - if (ret < 0)
> - return ret;
> - llc_value = llc_occu_resc;
> - }
> - ret = print_results_cache(param->filename, bm_pid, llc_value);
> - if (ret)
> + ret = get_llc_occu_resctrl(&llc_occu_resc);
> + if (ret < 0)
> return ret;
>
> - return 0;
> + ret = print_results_cache(param->filename, bm_pid, llc_occu_resc);
> + return ret;
> }
>
> /*
> @@ -253,7 +254,7 @@ int cat_val(struct resctrl_val_param *param, size_t span)
> }
>
> sleep(1);
> - ret = measure_cache_vals(param, bm_pid);
> + ret = measure_llc_perf(param, bm_pid);
> if (ret)
> goto pe_close;
> }


Reinette

2023-11-02 17:49:04

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 08/24] selftests/resctrl: Split show_cache_info() to test specific and generic parts

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> show_cache_info() calculates results and provides generic cache
> information. This makes it hard to alter pass/fail conditions.
>
> Separate the test specific checks into CAT and CMT test files and
> leave only the generic information part into show_cache_info().
>
> Signed-off-by: Ilpo Järvinen <[email protected]>
> ---
> tools/testing/selftests/resctrl/cache.c | 40 ++++------------------
> tools/testing/selftests/resctrl/cat_test.c | 30 ++++++++++++++--
> tools/testing/selftests/resctrl/cmt_test.c | 32 +++++++++++++++--
> tools/testing/selftests/resctrl/resctrl.h | 6 ++--
> 4 files changed, 65 insertions(+), 43 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> index 299d9508221f..95489d4b42b7 100644
> --- a/tools/testing/selftests/resctrl/cache.c
> +++ b/tools/testing/selftests/resctrl/cache.c
> @@ -267,43 +267,17 @@ int cat_val(struct resctrl_val_param *param, size_t span)
> }
>
> /*
> - * show_cache_info: show cache test result information
> - * @sum_llc_val: sum of LLC cache result data
> + * show_cache_info: show generic cache test information
> * @no_of_bits: number of bits
> - * @cache_span: cache span in bytes for CMT or in lines for CAT
> - * @max_diff: max difference
> - * @max_diff_percent: max difference percentage
> - * @num_of_runs: number of runs
> - * @platform: show test information on this platform
> - * @cmt: CMT test or CAT test
> - *
> - * Return: 0 on success. non-zero on failure.
> + * @avg_llc_val: avg of LLC cache result data
> + * @cache_span: cache span
> + * @lines: cache span in lines or bytes

It may be helpful to directly connect it to the other parameter:
@cache_span in lines or bytes

> */
> -int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
> - size_t cache_span, unsigned long max_diff,
> - unsigned long max_diff_percent, unsigned long num_of_runs,
> - bool platform, bool cmt)
> +void show_cache_info(int no_of_bits, unsigned long avg_llc_val,
> + size_t cache_span, bool lines)
> {
> - unsigned long avg_llc_val = 0;
> - float diff_percent;
> - long avg_diff = 0;
> - int ret;
> -
> - avg_llc_val = sum_llc_val / num_of_runs;
> - avg_diff = (long)abs(cache_span - avg_llc_val);
> - diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
> -
> - ret = platform && abs((int)diff_percent) > max_diff_percent &&
> - (cmt ? (abs(avg_diff) > max_diff) : true);
> -
> - ksft_print_msg("%s Check cache miss rate within %lu%%\n",
> - ret ? "Fail:" : "Pass:", max_diff_percent);
> -
> - ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
> ksft_print_msg("Number of bits: %d\n", no_of_bits);
> ksft_print_msg("Average LLC val: %lu\n", avg_llc_val);
> - ksft_print_msg("Cache span (%s): %zu\n", cmt ? "bytes" : "lines",
> + ksft_print_msg("Cache span (%s): %zu\n", !lines ? "bytes" : "lines",
> cache_span);

I think it would be easier to read as: lines ? "lines" : "bytes"

Reinette

2023-11-02 17:49:37

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 09/24] selftests/resctrl: Remove unnecessary __u64 -> unsigned long conversion

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> Perf counters are __u64 but the code converts them to unsigned long
> before printing them out.
>
> Remove unnecessary type conversion and the potential loss of meaningful
> bits due to different sizes of types.

This motivation is not clear to me. Is this work done in
preparation for 32 bit testing? This raises a lot more topics if
this is the case.

Reinette

2023-11-02 17:50:18

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 10/24] selftests/resctrl: Remove nested calls in perf event handling

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> Perf event handling has functions that are the sole caller of another
> perf event handling related function:
> - reset_enable_llc_perf() calls perf_event_open_llc_miss()
> - reset_enable_llc_perf() calls ioctl_perf_event_ioc_reset_enable()
> - measure_llc_perf() calls get_llc_perf()
>
> Remove the extra layer of calls to make the code easier to follow by
> moving the code into the calling function.
>
> In addition, converts print_results_cache() unsigned long parameter to
> __u64 that matches the type coming from perf.

Is this referring to work from previous patch?

>
> Signed-off-by: Ilpo Järvinen <[email protected]>
> ---
> tools/testing/selftests/resctrl/cache.c | 86 +++++++------------------
> 1 file changed, 25 insertions(+), 61 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> index d39ef4eebc37..208af1ecae28 100644
> --- a/tools/testing/selftests/resctrl/cache.c
> +++ b/tools/testing/selftests/resctrl/cache.c
> @@ -29,25 +29,6 @@ static void initialize_perf_event_attr(void)
> pea_llc_miss.disabled = 1;
> }
>
> -static void ioctl_perf_event_ioc_reset_enable(void)
> -{
> - ioctl(fd_lm, PERF_EVENT_IOC_RESET, 0);
> - ioctl(fd_lm, PERF_EVENT_IOC_ENABLE, 0);
> -}
> -
> -static int perf_event_open_llc_miss(pid_t pid, int cpu_no)
> -{
> - fd_lm = perf_event_open(&pea_llc_miss, pid, cpu_no, -1,
> - PERF_FLAG_FD_CLOEXEC);
> - if (fd_lm == -1) {
> - perror("Error opening leader");
> - ctrlc_handler(0, NULL, NULL);
> - return -1;
> - }
> -
> - return 0;
> -}
> -
> static void initialize_llc_perf(void)
> {
> memset(&pea_llc_miss, 0, sizeof(struct perf_event_attr));
> @@ -63,42 +44,16 @@ static void initialize_llc_perf(void)
>
> static int reset_enable_llc_perf(pid_t pid, int cpu_no)
> {
> - int ret = 0;
> -
> - ret = perf_event_open_llc_miss(pid, cpu_no);
> - if (ret < 0)
> - return ret;
> -
> - /* Start counters to log values */
> - ioctl_perf_event_ioc_reset_enable();
> -
> - return 0;
> -}
> -
> -/*
> - * get_llc_perf: llc cache miss through perf events
> - * @llc_perf_miss: LLC miss counter that is filled on success
> - *
> - * Perf events like HW_CACHE_MISSES could be used to validate number of
> - * cache lines allocated.
> - *
> - * Return: =0 on success. <0 on failure.
> - */
> -static int get_llc_perf(__u64 *llc_perf_miss)
> -{
> - int ret;
> -
> - /* Stop counters after one span to get miss rate */
> -
> - ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
> -
> - ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> - if (ret == -1) {
> - perror("Could not get llc misses through perf");
> + fd_lm = perf_event_open(&pea_llc_miss, pid, cpu_no, -1, PERF_FLAG_FD_CLOEXEC);
> + if (fd_lm == -1) {
> + perror("Error opening leader");
> + ctrlc_handler(0, NULL, NULL);

I understand you just copied the code here ... but it is not clear to me
why this particular error handling deserves a ctrlc_handler().

> return -1;
> }
>
> - *llc_perf_miss = rf_cqm.values[0].value;
> + /* Start counters to log values */
> + ioctl(fd_lm, PERF_EVENT_IOC_RESET, 0);
> + ioctl(fd_lm, PERF_EVENT_IOC_ENABLE, 0);
>
> return 0;
> }
> @@ -166,20 +121,29 @@ static int print_results_cache(char *filename, int bm_pid, __u64 llc_value)
> return 0;
> }
>
> +/*
> + * measure_llc_perf: measure perf events
> + * @bm_pid: child pid that runs benchmark

I expected "bm_pid" to reflect a "benchmark pid" that
is not unique to the child. Are both parent and child
not running the benchmark?

Missing doc of a parameter here.

> + *
> + * Measure things like cache misses from perf events.

"things like cache misses" is vague. The function's name
still contains "llc" which makes me think it is not quite
generic yet.



> + *
> + * Return: =0 on success. <0 on failure.
> + */
> static int measure_llc_perf(struct resctrl_val_param *param, int bm_pid)
> {
> - __u64 llc_perf_miss = 0;
> int ret;
>
> - /*
> - * Measure cache miss from perf.
> - */
> - ret = get_llc_perf(&llc_perf_miss);
> - if (ret < 0)
> - return ret;
> + /* Stop counters after one span to get miss rate */
> + ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
>
> - ret = print_results_cache(param->filename, bm_pid, llc_perf_miss);
> - return ret;
> + ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> + close(fd_lm);

I am not able to tell where this close() moved from.

> + if (ret == -1) {
> + perror("Could not get perf value");
> + return -1;
> + }
> +
> + return print_results_cache(param->filename, bm_pid, rf_cqm.values[0].value);
> }
>
> int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid)

Reinette

2023-11-02 17:52:04

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> CAT test spawns two processes into two different control groups with
> exclusive schemata. Both the processes alloc a buffer from memory
> matching their allocated LLC block size and flush the entire buffer out
> of caches. Since the processes are reading through the buffer only once
> during the measurement and initially all the buffer was flushed, the
> test isn't testing CAT.
>
> Rewrite the CAT test to allocate a buffer sized to half of LLC. Then
> perform a sequence of tests with different LLC alloc sizes starting
> from half of the CBM bits down to 1-bit CBM. Flush the buffer before
> each test and read the buffer twice. Observe the LLC misses on the
> second read through the buffer. As the allocated LLC block gets smaller
> and smaller, the LLC misses will become larger and larger giving a
> strong signal on CAT working properly.
>
> The new CAT test is using only a single process because it relies on
> measured effect against another run of itself rather than another
> process adding noise. The rest of the system is allocated the CBM bits
> not used by the CAT test to keep the test isolated.
>
> Replace count_bits() with count_contiguous_bits() to get the first bit
> position in order to be able to calculate masks based on it.
>
> This change has been tested with a number of systems from different
> generations.

Thank you very much for doing this.

>
> Suggested-by: Reinette Chatre <[email protected]>
> Signed-off-by: Ilpo Järvinen <[email protected]>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 286 +++++++++-----------
> tools/testing/selftests/resctrl/fill_buf.c | 6 +-
> tools/testing/selftests/resctrl/resctrl.h | 5 +-
> tools/testing/selftests/resctrl/resctrlfs.c | 44 +--
> 4 files changed, 137 insertions(+), 204 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index e71690a9bbb3..7518c520c5cc 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -11,65 +11,68 @@
> #include "resctrl.h"
> #include <unistd.h>
>
> -#define RESULT_FILE_NAME1 "result_cat1"
> -#define RESULT_FILE_NAME2 "result_cat2"
> +#define RESULT_FILE_NAME "result_cat"
> #define NUM_OF_RUNS 5
> -#define MAX_DIFF_PERCENT 4
> -#define MAX_DIFF 1000000
>
> /*
> - * Change schemata. Write schemata to specified
> - * con_mon grp, mon_grp in resctrl FS.
> - * Run 5 times in order to get average values.
> + * Minimum difference in LLC misses between a test with n+1 bits CBM mask to
> + * the test with n bits. With e.g. 5 vs 4 bits in the CBM mask, the minimum
> + * difference must be at least MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.

This formula is not clear to me. In the code the formula is always:
MIN_DIFF_PERCENT_PER_BIT * (bits - 1) ... is the "-1" because it always
decrements the number of bits tested by one? So, for example, if testing
5 then 3 bits it would be MIN_DIFF_PERCENT_PER_BIT * (3 - 2)?
Would above example thus be:
MIN_DIFF_PERCENT_PER_BIT * (4 - (5 - 4)) = 3 ?



> + *
> + * The relationship between number of used CBM bits and difference in LLC
> + * misses is not expected to be linear. With a small number of bits, the
> + * margin is smaller than with larger number of bits. For selftest purposes,
> + * however, linear approach is enough because ultimately only pass/fail
> + * decision has to be made and distinction between strong and stronger
> + * signal is irrelevant.
> */

...

> /*
> * cat_test: execute CAT benchmark and measure LLC cache misses
> * @param: parameters passed to cat_test()
> * @span: buffer size for the benchmark
> + * @current_mask start mask for the first iteration
> + *
> + * Run CAT test, bits are removed one-by-one from the current_mask for each
> + * subsequent test.
> *

Could this please be expanded to provide more details about how test works,
measurements taken, and how pass/fail is determined?

> - * Return: 0 on success. non-zero on failure.
> + * Return: 0 on success. non-zero on failure.

Is non-zero specific enough? Does that mean that <0 and >0 are failure?

> */
> -static int cat_test(struct resctrl_val_param *param, size_t span)
> +static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long current_mask)
> {
> - int memflush = 1, operation = 0, ret = 0;
> char *resctrl_val = param->resctrl_val;
> static struct perf_event_read pe_read;
> struct perf_event_attr pea;
> + unsigned char *buf;
> + char schemata[64];
> + int ret, i, pe_fd;
> pid_t bm_pid;
> - int pe_fd;
>
> if (strcmp(param->filename, "") == 0)
> sprintf(param->filename, "stdio");
> @@ -143,54 +168,64 @@ static int cat_test(struct resctrl_val_param *param, size_t span)
> if (ret)
> return ret;
>
> + buf = alloc_buffer(span, 1);
> + if (buf == NULL)
> + return -1;
> +
> perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
> perf_event_initialize_read_format(&pe_read);
>
> - /* Test runs until the callback setup() tells the test to stop. */
> - while (1) {
> - ret = param->setup(param);
> - if (ret == END_OF_TESTS) {
> - ret = 0;
> - break;
> - }
> - if (ret < 0)
> - break;
> - pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
> - if (pe_fd < 0) {
> - ret = -1;
> - break;
> - }
> + while (current_mask) {
> + snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
> + ret = write_schemata("", schemata, param->cpu_no, param->resctrl_val);
> + if (ret)
> + goto free_buf;
> + snprintf(schemata, sizeof(schemata), "%lx", current_mask);
> + ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, param->resctrl_val);
> + if (ret)
> + goto free_buf;
> +
> + for (i = 0; i < NUM_OF_RUNS; i++) {
> + mem_flush(buf, span);
> + ret = fill_cache_read(buf, span, true);
> + if (ret)
> + goto free_buf;
> +
> + pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
> + if (pe_fd < 0) {
> + ret = -1;
> + goto free_buf;
> + }

It seems to me that the perf counters are reconfigured at every iteration.
Can it not just be configured once and then the counters just reset and
enabled at each iteration? I'd expect this additional work to impact
values measured.

>
> - if (run_fill_buf(span, memflush, operation, true)) {
> - fprintf(stderr, "Error-running fill buffer\n");
> - ret = -1;
> - goto pe_close;
> - }
> + fill_cache_read(buf, span, true);
>
> - sleep(1);
> - ret = perf_event_measure(pe_fd, &pe_read, param, bm_pid);
> - if (ret)
> - goto pe_close;
> + ret = perf_event_measure(pe_fd, &pe_read, param, bm_pid);
> + if (ret)
> + goto pe_close;
>
> - close(pe_fd);
> + close(pe_fd);
> + }
> + current_mask = next_mask(current_mask);
> }
>
> +free_buf:
> + free(buf);
> +
> return ret;
>
> pe_close:
> close(pe_fd);
> - return ret;
> + goto free_buf;
> }
>

Reinette

2023-11-02 17:53:21

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 17/24] selftests/resctrl: Create struct for input parameter

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:

...

> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> index d3bf4368341e..5157a3f74fee 100644
> --- a/tools/testing/selftests/resctrl/mba_test.c
> +++ b/tools/testing/selftests/resctrl/mba_test.c
> @@ -141,13 +141,13 @@ void mba_test_cleanup(void)
> remove(RESULT_FILE_NAME);
> }
>
> -int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
> +int mba_schemata_change(const struct user_params *uparams)
> {
> struct resctrl_val_param param = {
> .resctrl_val = MBA_STR,
> .ctrlgrp = "c1",
> .mongrp = "m1",
> - .cpu_no = cpu_no,
> + .cpu_no = uparams->cpu,
> .filename = RESULT_FILE_NAME,
> .bw_report = "reads",
> .setup = mba_setup
> @@ -156,7 +156,7 @@ int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
>
> remove(RESULT_FILE_NAME);
>
> - ret = resctrl_val(benchmark_cmd, &param);
> + ret = resctrl_val(uparams->benchmark_cmd, &param);
> if (ret)
> goto out;
>

How about a new member of struct resctrl_val_param that points to
uparams? That would remove cpu_no from resctrl_val_param
and have everything available when a test needs to run ... not copying
some user parameters into struct resctrl_val_param and passing
others as parameters.

Reinette

2023-11-02 17:53:28

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 18/24] selftests/resctrl: Introduce generalized test framework

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
...
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index ec6efd36f60a..e017adf1390d 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -37,6 +37,8 @@
>
> #define DEFAULT_SPAN (250 * MB)
>
> +#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
> +

Shuah worked hard to remove all these copies within kselftest. Please
use the one in kselftest.h instead.

> #define PARENT_EXIT(err_msg) \
> do { \
> perror(err_msg); \


...

> @@ -233,25 +183,26 @@ int main(int argc, char **argv)
> case 't':
> token = strtok(optarg, ",");
>
> - mbm_test = false;
> - mba_test = false;
> - cmt_test = false;
> - cat_test = false;
> + if (!test_param_seen) {
> + for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++)
> + resctrl_tests[i]->disabled = true;
> + tests = 0;
> + test_param_seen = true;
> + }
> while (token) {
> - if (!strncmp(token, MBM_STR, sizeof(MBM_STR))) {
> - mbm_test = true;
> - tests++;
> - } else if (!strncmp(token, MBA_STR, sizeof(MBA_STR))) {
> - mba_test = true;
> - tests++;
> - } else if (!strncmp(token, CMT_STR, sizeof(CMT_STR))) {
> - cmt_test = true;
> - tests++;
> - } else if (!strncmp(token, CAT_STR, sizeof(CAT_STR))) {
> - cat_test = true;
> - tests++;
> - } else {
> - printf("invalid argument\n");
> + bool found = false;
> +
> + for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) {
> + if (!strcasecmp(token, resctrl_tests[i]->name)) {
> + if (resctrl_tests[i]->disabled)
> + tests++;
> + resctrl_tests[i]->disabled = false;
> + found = true;
> + }
> + }

Could providing multiple "-t" result in the test count not
matching the number of tests run?

> +
> + if (!found) {
> + printf("invalid test: %s\n", token);
>
> return -1;
> }

A great improvement, thanks.

Reinette

2023-11-02 17:58:12

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 23/24] selftests/resctrl: Add L2 CAT test

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
> support.

No need to use "new" language. L2 CAT has been available for a long time
... since Apollo Lake. Which systems actually support it is a different
topic. This is an architectural feature that has been available for a
long time. Whether a system supports it will be detected and the test
run based on that.

>
> Add L2 CAT selftest. As measuring L2 misses is not easily available
> with perf, use L3 accesses as a proxy for L2 CAT working or not.

I understand the exact measurement is not available but I do notice some
L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
looks promising.

L3 cannot be relied on for those systems, like Apollo lake, that do
not have an L3.

>
> Signed-off-by: Ilpo Järvinen <[email protected]>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
> tools/testing/selftests/resctrl/resctrl.h | 1 +
> .../testing/selftests/resctrl/resctrl_tests.c | 1 +
> 3 files changed, 63 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 48a96acd9e31..a9c72022bb5a 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
> remove(RESULT_FILE_NAME);
> }
>
> +/*
> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> + * because perf cannot directly provide the number of L2 misses (there are
> + * only platform specific ways to get the number of L2 misses).
> + *
> + * This function sets up L3 CAT to reduce noise from other processes during
> + * L2 CAT test.

This motivation is not clear to me. Does the same isolation used during L3 CAT
testing not work? I expected it to follow the same idea with the L2 cache split
in two, the test using one part and the rest of the system using the other.
Is that not enough isolation?

Reinette

2023-11-02 17:59:36

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 24/24] selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> L2 CAT test with low number of bits tends to occasionally fail because
> of what seems random variation. The margin is quite small to begin with
> for <= 2 bits in CBM. At times, the result can even become negative.
> While it would be possible to allow negative values for those cases, it
> would be more confusing to user.
>
> Ignore failures from the tests where <= 2 were used to avoid false
> negative results.
>

I think the core message is that 2 or fewer bits should not be used. Instead
of running the test and ignoring the results the test should perhaps just not
be run.

Reinette

2023-11-03 08:54:22

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 05/24] selftests/resctrl: Create cache_size() helper

On Thu, 2 Nov 2023, Reinette Chatre wrote:

> Hi Ilpo,
>
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> > CAT and CMT tests calculate the span size from the n-bits cache
> > allocation on their own.
> >
> > Add cache_size() helper which calculates size of the cache portion for
> > the given number of bits and use it to replace the existing span
> > calculations. This also prepares for the new CAT test that will need to
> > determine the size of the cache portion also during results processing.
> >
> > cache_size local variables were renamed out of the way to
> > cache_total_size.
>
> Please do stick to imperative mood ... "Rename cache_size local
> variables ..."
>
>
> ...
>
> > diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> > index 2f3f0ee439d8..da06b2d492f9 100644
> > --- a/tools/testing/selftests/resctrl/resctrl.h
> > +++ b/tools/testing/selftests/resctrl/resctrl.h
> > @@ -117,4 +117,18 @@ int show_cache_info(unsigned long sum_llc_val, int no_of_bits,
> > unsigned long max_diff_percent, unsigned long num_of_runs,
> > bool platform, bool cmt);
> >
> > +/*
> > + * cache_size - Calculate the size of a cache portion
> > + * @cache_size: Cache size in bytes
> > + * @mask: Cache portion mask
> > + * @cache_mask: Full bitmask for the cache
> > + *
> > + * Return: The size of the cache portion in bytes.
> > + */
> > +static inline int cache_size(unsigned long cache_size, unsigned long mask,
> > + unsigned long cache_mask)
> > +{
> > + return cache_size * count_bits(mask) / count_bits(cache_mask);
> > +}
> > +
> > #endif /* RESCTRL_H */
>
>
> The get_cache_size() and cache_size() naming appears similar enough to me
> to cause confusion. Considering the "portion" term above, what do you think
> of "cache_portion_size()" or even "cache_portion_bytes()"?

Yes, I'm more than happy to rename them. This naming was what you
suggested earlier. (I used cache_alloc_size() or something like that
initially and you were against using "alloc" in the name.)

--
i.

2023-11-03 09:19:48

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 09/24] selftests/resctrl: Remove unnecessary __u64 -> unsigned long conversion

On Thu, 2 Nov 2023, Reinette Chatre wrote:
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> > Perf counters are __u64 but the code converts them to unsigned long
> > before printing them out.
> >
> > Remove unnecessary type conversion and the potential loss of meaningful
> > bits due to different sizes of types.
>
> This motivation is not clear to me. Is this work done in
> preparation for 32 bit testing? This raises a lot more topics if
> this is the case.

So you oppose stating that second part after "and"?

My main motivation was keeping the types consistent when I noted that the
code does unnecessary conversion to unsigned long (that's the first part
before "and").

Of course it has the added benefit of being 32-bit compatible but
it was just a thought I added "automatically" without paying much
attention on whether it's realistic to assume resctrl selftest as whole
would work with 32-bit or not (objectively, the after patch code is 32-bit
compatible but it of course gives no guarantees about the rest of the
resctrl selftest code base).

--
i.

2023-11-03 09:46:15

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 10/24] selftests/resctrl: Remove nested calls in perf event handling

On Thu, 2 Nov 2023, Reinette Chatre wrote:

> Hi Ilpo,
>
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> > Perf event handling has functions that are the sole caller of another
> > perf event handling related function:
> > - reset_enable_llc_perf() calls perf_event_open_llc_miss()
> > - reset_enable_llc_perf() calls ioctl_perf_event_ioc_reset_enable()
> > - measure_llc_perf() calls get_llc_perf()
> >
> > Remove the extra layer of calls to make the code easier to follow by
> > moving the code into the calling function.
> >
> > In addition, converts print_results_cache() unsigned long parameter to
> > __u64 that matches the type coming from perf.
>
> Is this referring to work from previous patch?

Yes, this got split into own patch and it unfortunately lingered on in the
changelog. I'll remove it.

> > Signed-off-by: Ilpo Järvinen <[email protected]> ---
> >
> > 1 file changed, 25 insertions(+), 61 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> > index d39ef4eebc37..208af1ecae28 100644
> > --- a/tools/testing/selftests/resctrl/cache.c
> > +++ b/tools/testing/selftests/resctrl/cache.c

> > -/*
> > - * get_llc_perf: llc cache miss through perf events
> > - * @llc_perf_miss: LLC miss counter that is filled on success
> > - *
> > - * Perf events like HW_CACHE_MISSES could be used to validate number of
> > - * cache lines allocated.
> > - *
> > - * Return: =0 on success. <0 on failure.
> > - */
> > -static int get_llc_perf(__u64 *llc_perf_miss)
> > -{
> > - int ret;
> > -
> > - /* Stop counters after one span to get miss rate */
> > -
> > - ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
> > -
> > - ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> > - if (ret == -1) {
> > - perror("Could not get llc misses through perf");
> > + fd_lm = perf_event_open(&pea_llc_miss, pid, cpu_no, -1, PERF_FLAG_FD_CLOEXEC);
> > + if (fd_lm == -1) {
> > + perror("Error opening leader");
> > + ctrlc_handler(0, NULL, NULL);
>
> I understand you just copied the code here ... but it is not clear to me
> why this particular error handling deserves a ctrlc_handler().

Good catch! It's first time I even notice this line.

It certainly looks a bit too big hammer for error handling. I'll try to
create a separate patch which properly removes it (I suspect all error
handling rollbacks are there already so it can just be removed but I'll
have to confirm that).

> > return -1;
> > }
> >
> > - *llc_perf_miss = rf_cqm.values[0].value;
> > + /* Start counters to log values */
> > + ioctl(fd_lm, PERF_EVENT_IOC_RESET, 0);
> > + ioctl(fd_lm, PERF_EVENT_IOC_ENABLE, 0);
> >
> > return 0;
> > }
> > @@ -166,20 +121,29 @@ static int print_results_cache(char *filename, int bm_pid, __u64 llc_value)
> > return 0;
> > }
> >
> > +/*
> > + * measure_llc_perf: measure perf events
> > + * @bm_pid: child pid that runs benchmark
>
> I expected "bm_pid" to reflect a "benchmark pid" that
> is not unique to the child. Are both parent and child
> not running the benchmark?
>
> Missing doc of a parameter here.
>
> > + *
> > + * Measure things like cache misses from perf events.
>
> "things like cache misses" is vague. The function's name
> still contains "llc" which makes me think it is not quite
> generic yet.

I suppose I tried to be intentionally vague because I knew I was going to
use it measure also LLC accesses in the case of L2 CAT test. I'll see if I
can improve this wording somehow.

> > + *
> > + * Return: =0 on success. <0 on failure.
> > + */
> > static int measure_llc_perf(struct resctrl_val_param *param, int bm_pid)
> > {
> > - __u64 llc_perf_miss = 0;
> > int ret;
> >
> > - /*
> > - * Measure cache miss from perf.
> > - */
> > - ret = get_llc_perf(&llc_perf_miss);
> > - if (ret < 0)
> > - return ret;
> > + /* Stop counters after one span to get miss rate */
> > + ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
> >
> > - ret = print_results_cache(param->filename, bm_pid, llc_perf_miss);
> > - return ret;
> > + ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> > + close(fd_lm);
>
> I am not able to tell where this close() moved from.

Another good catch.

It seems to be a conflict resolution fallout from the time when I changed
to close() logic in that fix patch related to this fd.

--
i.

2023-11-03 09:56:32

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 18/24] selftests/resctrl: Introduce generalized test framework

On Thu, 2 Nov 2023, Reinette Chatre wrote:
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> ...
> > diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> > index ec6efd36f60a..e017adf1390d 100644
>
> > @@ -233,25 +183,26 @@ int main(int argc, char **argv)
> > case 't':
> > token = strtok(optarg, ",");
> >
> > - mbm_test = false;
> > - mba_test = false;
> > - cmt_test = false;
> > - cat_test = false;
> > + if (!test_param_seen) {
> > + for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++)
> > + resctrl_tests[i]->disabled = true;
> > + tests = 0;
> > + test_param_seen = true;
> > + }
> > while (token) {
> > - if (!strncmp(token, MBM_STR, sizeof(MBM_STR))) {
> > - mbm_test = true;
> > - tests++;
> > - } else if (!strncmp(token, MBA_STR, sizeof(MBA_STR))) {
> > - mba_test = true;
> > - tests++;
> > - } else if (!strncmp(token, CMT_STR, sizeof(CMT_STR))) {
> > - cmt_test = true;
> > - tests++;
> > - } else if (!strncmp(token, CAT_STR, sizeof(CAT_STR))) {
> > - cat_test = true;
> > - tests++;
> > - } else {
> > - printf("invalid argument\n");
> > + bool found = false;
> > +
> > + for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) {
> > + if (!strcasecmp(token, resctrl_tests[i]->name)) {
> > + if (resctrl_tests[i]->disabled)
> > + tests++;
> > + resctrl_tests[i]->disabled = false;
> > + found = true;
> > + }
> > + }
>
> Could providing multiple "-t" result in the test count not
> matching the number of tests run?

bool test_param_seen covers the case with more than one -t parameter.
Because of it, the code above resets tests and ->disabled only when the
first -t is encountered. tests++ is only done when ->disabled is set from
true to false.

I don't see how they could get out of sync but if you had a more specific
case in mind, just let me know.

--
i.

2023-11-03 10:26:01

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 24/24] selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits

On Thu, 2 Nov 2023, Reinette Chatre wrote:
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> > L2 CAT test with low number of bits tends to occasionally fail because
> > of what seems random variation. The margin is quite small to begin with
> > for <= 2 bits in CBM. At times, the result can even become negative.
> > While it would be possible to allow negative values for those cases, it
> > would be more confusing to user.
> >
> > Ignore failures from the tests where <= 2 were used to avoid false
> > negative results.
> >
>
> I think the core message is that 2 or fewer bits should not be used. Instead
> of running the test and ignoring the results the test should perhaps just not
> be run.

I considered that but it often does work so it felt shame to now present
them when they're successful. Then I just had to decide how to deal with
the cases where they failed.

Also, if I make it to not run down to 1 bit, those numbers will never ever
be seen by anyone. It doesn't say 2 and 1 bit results don't contain any
information to a human reader who is able to do more informed decisions
whether something is truly working or not. We could, hypothetically, have
a HW issue one day which makes 1-bit L2 mask to misbehave and if the
number is never seen by anyone, it's extremely unlikely to be caught
easily.

They are just reliable enough for simple automated threshold currently.
Maybe something else than average value would be, it would need to be
explored but I suspect also the memory address of the buffer might affect
the value, with L3 it definitely should because of how the things work but
I don't know if that holds for L2 too. I have earlier tried playing with
the buffer addresses with L3 but as I didn't immediately yield positive
outcome to guard against outliers, I postponed that investigation (e.g.,
my alloc pattern might have been too straightforward and didn't provide
enough entropy into the buffer start address because I just alloc'ed n x
buf_size buffers back-to-back).

But I don't have very strong opinion on this so if you prefer I just stop
at 3 bits, I can change it?

--
i.

2023-11-03 10:41:02

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 23/24] selftests/resctrl: Add L2 CAT test

On Thu, 2 Nov 2023, Reinette Chatre wrote:
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:

> > Add L2 CAT selftest. As measuring L2 misses is not easily available
> > with perf, use L3 accesses as a proxy for L2 CAT working or not.
>
> I understand the exact measurement is not available but I do notice some
> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
> looks promising.

Okay, I was under impression that L2 misses are not available. Both based
on what you mentioned to me half an year ago and because of what flags I
found from the header. But I'll take another look into it.

> L3 cannot be relied on for those systems, like Apollo lake, that do
> not have an L3.

Do you happen know what perf will report for such CPUs, will it return
L2 as LLC?

> > Signed-off-by: Ilpo Järvinen <[email protected]>
> > ---
> > tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
> > tools/testing/selftests/resctrl/resctrl.h | 1 +
> > .../testing/selftests/resctrl/resctrl_tests.c | 1 +
> > 3 files changed, 63 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> > index 48a96acd9e31..a9c72022bb5a 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
> > remove(RESULT_FILE_NAME);
> > }
> >
> > +/*
> > + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> > + * because perf cannot directly provide the number of L2 misses (there are
> > + * only platform specific ways to get the number of L2 misses).
> > + *
> > + * This function sets up L3 CAT to reduce noise from other processes during
> > + * L2 CAT test.
>
> This motivation is not clear to me. Does the same isolation used during
> L3 CAT testing not work? I expected it to follow the same idea with the
> L2 cache split in two, the test using one part and the rest of the
> system using the other. Is that not enough isolation?

Isolation for L2 is done very same way as with L3 and I think it itself
works just fine.

However, because L2 CAT selftest as is measures L3 accesses that in ideal
world equals to L2 misses, isolating selftest related L3 accesses from the
rest of the system should reduce noise in the # of L3 accesses. It's not
mandatory though so if L3 CAT is not available the function just prints a
warning about the potential noise and does setup nothing for L3.

But I'll see if I can make it use L2 misses directly so this wouldn't
matter.

--
i.

2023-11-03 10:58:08

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

On Thu, 2 Nov 2023, Reinette Chatre wrote:
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> > CAT test spawns two processes into two different control groups with
> > exclusive schemata. Both the processes alloc a buffer from memory
> > matching their allocated LLC block size and flush the entire buffer out
> > of caches. Since the processes are reading through the buffer only once
> > during the measurement and initially all the buffer was flushed, the
> > test isn't testing CAT.
> >
> > Rewrite the CAT test to allocate a buffer sized to half of LLC. Then
> > perform a sequence of tests with different LLC alloc sizes starting
> > from half of the CBM bits down to 1-bit CBM. Flush the buffer before
> > each test and read the buffer twice. Observe the LLC misses on the
> > second read through the buffer. As the allocated LLC block gets smaller
> > and smaller, the LLC misses will become larger and larger giving a
> > strong signal on CAT working properly.
> >
> > The new CAT test is using only a single process because it relies on
> > measured effect against another run of itself rather than another
> > process adding noise. The rest of the system is allocated the CBM bits
> > not used by the CAT test to keep the test isolated.
> >
> > Replace count_bits() with count_contiguous_bits() to get the first bit
> > position in order to be able to calculate masks based on it.
> >
> > This change has been tested with a number of systems from different
> > generations.
>
> Thank you very much for doing this.
>
> >
> > Suggested-by: Reinette Chatre <[email protected]>
> > Signed-off-by: Ilpo Järvinen <[email protected]>
> > ---
> > tools/testing/selftests/resctrl/cat_test.c | 286 +++++++++-----------
> > tools/testing/selftests/resctrl/fill_buf.c | 6 +-
> > tools/testing/selftests/resctrl/resctrl.h | 5 +-
> > tools/testing/selftests/resctrl/resctrlfs.c | 44 +--
> > 4 files changed, 137 insertions(+), 204 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> > index e71690a9bbb3..7518c520c5cc 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -11,65 +11,68 @@
> > #include "resctrl.h"
> > #include <unistd.h>
> >
> > -#define RESULT_FILE_NAME1 "result_cat1"
> > -#define RESULT_FILE_NAME2 "result_cat2"
> > +#define RESULT_FILE_NAME "result_cat"
> > #define NUM_OF_RUNS 5
> > -#define MAX_DIFF_PERCENT 4
> > -#define MAX_DIFF 1000000
> >
> > /*
> > - * Change schemata. Write schemata to specified
> > - * con_mon grp, mon_grp in resctrl FS.
> > - * Run 5 times in order to get average values.
> > + * Minimum difference in LLC misses between a test with n+1 bits CBM mask to
> > + * the test with n bits. With e.g. 5 vs 4 bits in the CBM mask, the minimum
> > + * difference must be at least MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
>
> This formula is not clear to me. In the code the formula is always:
> MIN_DIFF_PERCENT_PER_BIT * (bits - 1) ... is the "-1" because it always
> decrements the number of bits tested by one?

No, -1 is not related to decrementing bits by one, but setting a boundary
which workds for 1 bit masks. In general, the smaller the number of bits
in the allocation mask is, less change there will be between n+1 -> n bits
results. When n is 1, the result with some platforms is close to zero so I
just had to make the min diff to allow it. Thus, n-1 to set the failure
threshold at 0%.

> So, for example, if testing
> 5 then 3 bits it would be MIN_DIFF_PERCENT_PER_BIT * (3 - 2)?
> Would above example thus be:
> MIN_DIFF_PERCENT_PER_BIT * (4 - (5 - 4)) = 3 ?

I suspect you're overthinking it here. The CAT selftest currently doesn't
jump from 5 to 3 bits so I don't know what you're trying to calculate
here.

> > - * Return: 0 on success. non-zero on failure.
> > + * Return: 0 on success. non-zero on failure.
>
> Is non-zero specific enough? Does that mean that <0 and >0 are failure?

I suspect it is, after all the cleanups and fixes that have been done.
The wording is from the original though.

> > */
> > -static int cat_test(struct resctrl_val_param *param, size_t span)
> > +static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long current_mask)
> > {
> > - int memflush = 1, operation = 0, ret = 0;
> > char *resctrl_val = param->resctrl_val;
> > static struct perf_event_read pe_read;
> > struct perf_event_attr pea;
> > + unsigned char *buf;
> > + char schemata[64];
> > + int ret, i, pe_fd;
> > pid_t bm_pid;
> > - int pe_fd;
> >
> > if (strcmp(param->filename, "") == 0)
> > sprintf(param->filename, "stdio");
> > @@ -143,54 +168,64 @@ static int cat_test(struct resctrl_val_param *param, size_t span)
> > if (ret)
> > return ret;
> >
> > + buf = alloc_buffer(span, 1);
> > + if (buf == NULL)
> > + return -1;
> > +
> > perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
> > perf_event_initialize_read_format(&pe_read);
> >
> > - /* Test runs until the callback setup() tells the test to stop. */
> > - while (1) {
> > - ret = param->setup(param);
> > - if (ret == END_OF_TESTS) {
> > - ret = 0;
> > - break;
> > - }
> > - if (ret < 0)
> > - break;
> > - pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
> > - if (pe_fd < 0) {
> > - ret = -1;
> > - break;
> > - }
> > + while (current_mask) {
> > + snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
> > + ret = write_schemata("", schemata, param->cpu_no, param->resctrl_val);
> > + if (ret)
> > + goto free_buf;
> > + snprintf(schemata, sizeof(schemata), "%lx", current_mask);
> > + ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, param->resctrl_val);
> > + if (ret)
> > + goto free_buf;
> > +
> > + for (i = 0; i < NUM_OF_RUNS; i++) {
> > + mem_flush(buf, span);
> > + ret = fill_cache_read(buf, span, true);
> > + if (ret)
> > + goto free_buf;
> > +
> > + pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
> > + if (pe_fd < 0) {
> > + ret = -1;
> > + goto free_buf;
> > + }
>
> It seems to me that the perf counters are reconfigured at every iteration.
> Can it not just be configured once and then the counters just reset and
> enabled at each iteration? I'd expect this additional work to impact
> values measured.

So you suggest I undo one of the changes made in 10/24 and just call the
function which does only the ioctl() calls? I don't know why it has been
done the way it has been, I can try to change it and see what happens.


--
i.

2023-11-03 11:25:25

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 17/24] selftests/resctrl: Create struct for input parameter

On Thu, 2 Nov 2023, Reinette Chatre wrote:
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>
> > diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> > index d3bf4368341e..5157a3f74fee 100644
> > --- a/tools/testing/selftests/resctrl/mba_test.c
> > +++ b/tools/testing/selftests/resctrl/mba_test.c
> > @@ -141,13 +141,13 @@ void mba_test_cleanup(void)
> > remove(RESULT_FILE_NAME);
> > }
> >
> > -int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
> > +int mba_schemata_change(const struct user_params *uparams)
> > {
> > struct resctrl_val_param param = {
> > .resctrl_val = MBA_STR,
> > .ctrlgrp = "c1",
> > .mongrp = "m1",
> > - .cpu_no = cpu_no,
> > + .cpu_no = uparams->cpu,
> > .filename = RESULT_FILE_NAME,
> > .bw_report = "reads",
> > .setup = mba_setup
> > @@ -156,7 +156,7 @@ int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
> >
> > remove(RESULT_FILE_NAME);
> >
> > - ret = resctrl_val(benchmark_cmd, &param);
> > + ret = resctrl_val(uparams->benchmark_cmd, &param);
> > if (ret)
> > goto out;
> >
>
> How about a new member of struct resctrl_val_param that points to
> uparams? That would remove cpu_no from resctrl_val_param
> and have everything available when a test needs to run ... not copying
> some user parameters into struct resctrl_val_param and passing
> others as parameters.

I'm a bit allergic to adding more stuff into resctrl_val_param. It seems
a structure where random stuff has been thrown at just because it exists.
In general, your point is very valid though because the members of
resctrl_val_param should be auditted through to see how many of them are
even useful after adding uparams and struct resctrl_test.

I could get rid of copying parameters from uparams to params and just
passing uparams instead of benchmark_cmd into resctrl_val(). Would you be
okay with that?

Oh, and I really should rename resctrl_val() one day to something more
meaningful too. :-) (but it won't be part of this series and will likely
be another conflicty nightmare because resctrl_val_param too needs to
be renamed...).

--
i.

2023-11-03 22:49:35

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 05/24] selftests/resctrl: Create cache_size() helper


Hi Ilpo,

On 11/3/2023 1:53 AM, Ilpo Järvinen wrote:
>
> Yes, I'm more than happy to rename them. This naming was what you
> suggested earlier. (I used cache_alloc_size() or something like that
> initially and you were against using "alloc" in the name.)

My apologies for giving poor guidance. I cannot recall the thread behind
that guidance but looking at these changes together the cache_size() and
get_cache_size() naming does seem close enough to cause confusion.

Reinette


2023-11-03 22:50:29

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

Hi Ilpo,

On 11/3/2023 3:57 AM, Ilpo Järvinen wrote:
> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:

...

>>> /*
>>> - * Change schemata. Write schemata to specified
>>> - * con_mon grp, mon_grp in resctrl FS.
>>> - * Run 5 times in order to get average values.
>>> + * Minimum difference in LLC misses between a test with n+1 bits CBM mask to
>>> + * the test with n bits. With e.g. 5 vs 4 bits in the CBM mask, the minimum
>>> + * difference must be at least MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
>>
>> This formula is not clear to me. In the code the formula is always:
>> MIN_DIFF_PERCENT_PER_BIT * (bits - 1) ... is the "-1" because it always
>> decrements the number of bits tested by one?
>
> No, -1 is not related to decrementing bits by one, but setting a boundary
> which workds for 1 bit masks. In general, the smaller the number of bits
> in the allocation mask is, less change there will be between n+1 -> n bits
> results. When n is 1, the result with some platforms is close to zero so I
> just had to make the min diff to allow it. Thus, n-1 to set the failure
> threshold at 0%.
>
>> So, for example, if testing
>> 5 then 3 bits it would be MIN_DIFF_PERCENT_PER_BIT * (3 - 2)?
>> Would above example thus be:
>> MIN_DIFF_PERCENT_PER_BIT * (4 - (5 - 4)) = 3 ?
>
> I suspect you're overthinking it here. The CAT selftest currently doesn't
> jump from 5 to 3 bits so I don't know what you're trying to calculate
> here.

I was trying to understand the formula. The example starts with "e.g 5 vs 4 bits
in the CBM mask ..." and then jumps to "MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3"
It is not obvious to me where the "5" and "4" from the example produces the
resulting formula.

Perhaps it would be simpler (to me) to say something like

Minimum difference in LLC misses between a test with n+1 bits CBM mask to
the test with n bits is MIN_DIFF_PERCENT_PER_BIT * (n - 1). With e.g. 5 vs 4
bits in the CBM mask, the minimum difference must be at least
MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.

>
>>> - * Return: 0 on success. non-zero on failure.
>>> + * Return: 0 on success. non-zero on failure.
>>
>> Is non-zero specific enough? Does that mean that <0 and >0 are failure?
>
> I suspect it is, after all the cleanups and fixes that have been done.
> The wording is from the original though.
>
>>> */
>>> -static int cat_test(struct resctrl_val_param *param, size_t span)
>>> +static int cat_test(struct resctrl_val_param *param, size_t span, unsigned long current_mask)
>>> {
>>> - int memflush = 1, operation = 0, ret = 0;
>>> char *resctrl_val = param->resctrl_val;
>>> static struct perf_event_read pe_read;
>>> struct perf_event_attr pea;
>>> + unsigned char *buf;
>>> + char schemata[64];
>>> + int ret, i, pe_fd;
>>> pid_t bm_pid;
>>> - int pe_fd;
>>>
>>> if (strcmp(param->filename, "") == 0)
>>> sprintf(param->filename, "stdio");
>>> @@ -143,54 +168,64 @@ static int cat_test(struct resctrl_val_param *param, size_t span)
>>> if (ret)
>>> return ret;
>>>
>>> + buf = alloc_buffer(span, 1);
>>> + if (buf == NULL)
>>> + return -1;
>>> +
>>> perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
>>> perf_event_initialize_read_format(&pe_read);
>>>
>>> - /* Test runs until the callback setup() tells the test to stop. */
>>> - while (1) {
>>> - ret = param->setup(param);
>>> - if (ret == END_OF_TESTS) {
>>> - ret = 0;
>>> - break;
>>> - }
>>> - if (ret < 0)
>>> - break;
>>> - pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
>>> - if (pe_fd < 0) {
>>> - ret = -1;
>>> - break;
>>> - }
>>> + while (current_mask) {
>>> + snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
>>> + ret = write_schemata("", schemata, param->cpu_no, param->resctrl_val);
>>> + if (ret)
>>> + goto free_buf;
>>> + snprintf(schemata, sizeof(schemata), "%lx", current_mask);
>>> + ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, param->resctrl_val);
>>> + if (ret)
>>> + goto free_buf;
>>> +
>>> + for (i = 0; i < NUM_OF_RUNS; i++) {
>>> + mem_flush(buf, span);
>>> + ret = fill_cache_read(buf, span, true);
>>> + if (ret)
>>> + goto free_buf;
>>> +
>>> + pe_fd = perf_event_reset_enable(&pea, bm_pid, param->cpu_no);
>>> + if (pe_fd < 0) {
>>> + ret = -1;
>>> + goto free_buf;
>>> + }
>>
>> It seems to me that the perf counters are reconfigured at every iteration.
>> Can it not just be configured once and then the counters just reset and
>> enabled at each iteration? I'd expect this additional work to impact
>> values measured.
>
> So you suggest I undo one of the changes made in 10/24 and just call the
> function which does only the ioctl() calls? I don't know why it has been
> done the way it has been, I can try to change it and see what happens.

The relationships between the functions are not as obvious to me but as
far as execution is concerned I am indeed suggesting that only the ioctl()s
to reset and enable counters are repeated in the loop. From what I understand
the counters need only be configured once.

Reinette

2023-11-03 22:51:25

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 18/24] selftests/resctrl: Introduce generalized test framework

Hi Ilpo,

On 11/3/2023 2:54 AM, Ilpo Järvinen wrote:
> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>> ...
>>> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
>>> index ec6efd36f60a..e017adf1390d 100644
>>
>>> @@ -233,25 +183,26 @@ int main(int argc, char **argv)
>>> case 't':
>>> token = strtok(optarg, ",");
>>>
>>> - mbm_test = false;
>>> - mba_test = false;
>>> - cmt_test = false;
>>> - cat_test = false;
>>> + if (!test_param_seen) {
>>> + for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++)
>>> + resctrl_tests[i]->disabled = true;
>>> + tests = 0;
>>> + test_param_seen = true;
>>> + }
>>> while (token) {
>>> - if (!strncmp(token, MBM_STR, sizeof(MBM_STR))) {
>>> - mbm_test = true;
>>> - tests++;
>>> - } else if (!strncmp(token, MBA_STR, sizeof(MBA_STR))) {
>>> - mba_test = true;
>>> - tests++;
>>> - } else if (!strncmp(token, CMT_STR, sizeof(CMT_STR))) {
>>> - cmt_test = true;
>>> - tests++;
>>> - } else if (!strncmp(token, CAT_STR, sizeof(CAT_STR))) {
>>> - cat_test = true;
>>> - tests++;
>>> - } else {
>>> - printf("invalid argument\n");
>>> + bool found = false;
>>> +
>>> + for (i = 0; i < ARRAY_SIZE(resctrl_tests); i++) {
>>> + if (!strcasecmp(token, resctrl_tests[i]->name)) {
>>> + if (resctrl_tests[i]->disabled)
>>> + tests++;
>>> + resctrl_tests[i]->disabled = false;
>>> + found = true;
>>> + }
>>> + }
>>
>> Could providing multiple "-t" result in the test count not
>> matching the number of tests run?
>
> bool test_param_seen covers the case with more than one -t parameter.
> Because of it, the code above resets tests and ->disabled only when the
> first -t is encountered. tests++ is only done when ->disabled is set from
> true to false.
>
> I don't see how they could get out of sync but if you had a more specific
> case in mind, just let me know.
>

Thank you for your detailed explanation. I can now see how this is safeguarded.

Reinette

2023-11-03 22:51:39

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 17/24] selftests/resctrl: Create struct for input parameter

Hi Ilpo,

On 11/3/2023 4:24 AM, Ilpo Järvinen wrote:
> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>>
>>> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
>>> index d3bf4368341e..5157a3f74fee 100644
>>> --- a/tools/testing/selftests/resctrl/mba_test.c
>>> +++ b/tools/testing/selftests/resctrl/mba_test.c
>>> @@ -141,13 +141,13 @@ void mba_test_cleanup(void)
>>> remove(RESULT_FILE_NAME);
>>> }
>>>
>>> -int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
>>> +int mba_schemata_change(const struct user_params *uparams)
>>> {
>>> struct resctrl_val_param param = {
>>> .resctrl_val = MBA_STR,
>>> .ctrlgrp = "c1",
>>> .mongrp = "m1",
>>> - .cpu_no = cpu_no,
>>> + .cpu_no = uparams->cpu,
>>> .filename = RESULT_FILE_NAME,
>>> .bw_report = "reads",
>>> .setup = mba_setup
>>> @@ -156,7 +156,7 @@ int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
>>>
>>> remove(RESULT_FILE_NAME);
>>>
>>> - ret = resctrl_val(benchmark_cmd, &param);
>>> + ret = resctrl_val(uparams->benchmark_cmd, &param);
>>> if (ret)
>>> goto out;
>>>
>>
>> How about a new member of struct resctrl_val_param that points to
>> uparams? That would remove cpu_no from resctrl_val_param
>> and have everything available when a test needs to run ... not copying
>> some user parameters into struct resctrl_val_param and passing
>> others as parameters.
>
> I'm a bit allergic to adding more stuff into resctrl_val_param. It seems
> a structure where random stuff has been thrown at just because it exists.
> In general, your point is very valid though because the members of
> resctrl_val_param should be auditted through to see how many of them are
> even useful after adding uparams and struct resctrl_test.
>
> I could get rid of copying parameters from uparams to params and just
> passing uparams instead of benchmark_cmd into resctrl_val(). Would you be
> okay with that?

I am ok with that. I assume this implies that cpu_no will be removed from
resctrl_val_param?

> Oh, and I really should rename resctrl_val() one day to something more
> meaningful too. :-) (but it won't be part of this series and will likely
> be another conflicty nightmare because resctrl_val_param too needs to
> be renamed...).

"Naming only" changes that are not part of something more substantive are not
very appealing though.

Reinette

2023-11-03 22:53:48

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 23/24] selftests/resctrl: Add L2 CAT test

Hi Ilpo,

On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>
>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>
>> I understand the exact measurement is not available but I do notice some
>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>> looks promising.
>
> Okay, I was under impression that L2 misses are not available. Both based
> on what you mentioned to me half an year ago and because of what flags I
> found from the header. But I'll take another look into it.

You are correct that when I did L2 testing a long time ago I used
the model specific L2 miss counts. I was hoping that things have improved
so that model specific counters are not needed, as you have tried here.
I found the l2_rqsts symbol while looking for alternatives but I am not
familiar enough with perf to know how these symbolic names are mapped.
I was hoping that they could be a simple drop-in replacement to
experiment with.

>
>> L3 cannot be relied on for those systems, like Apollo lake, that do
>> not have an L3.
>
> Do you happen know what perf will report for such CPUs, will it return
> L2 as LLC?

I don't know.

>
>>> Signed-off-by: Ilpo Järvinen <[email protected]>
>>> ---
>>> tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
>>> tools/testing/selftests/resctrl/resctrl.h | 1 +
>>> .../testing/selftests/resctrl/resctrl_tests.c | 1 +
>>> 3 files changed, 63 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>> index 48a96acd9e31..a9c72022bb5a 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
>>> remove(RESULT_FILE_NAME);
>>> }
>>>
>>> +/*
>>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
>>> + * because perf cannot directly provide the number of L2 misses (there are
>>> + * only platform specific ways to get the number of L2 misses).
>>> + *
>>> + * This function sets up L3 CAT to reduce noise from other processes during
>>> + * L2 CAT test.
>>
>> This motivation is not clear to me. Does the same isolation used during
>> L3 CAT testing not work? I expected it to follow the same idea with the
>> L2 cache split in two, the test using one part and the rest of the
>> system using the other. Is that not enough isolation?
>
> Isolation for L2 is done very same way as with L3 and I think it itself
> works just fine.
>
> However, because L2 CAT selftest as is measures L3 accesses that in ideal
> world equals to L2 misses, isolating selftest related L3 accesses from the
> rest of the system should reduce noise in the # of L3 accesses. It's not
> mandatory though so if L3 CAT is not available the function just prints a
> warning about the potential noise and does setup nothing for L3.

This is not clear to me. If the read misses L2 and then accesses L3 then
it should not matter which part of L3 cache the work is isolated to. What noise
do you have in mind?

Reinette

2023-11-03 22:54:08

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 24/24] selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits

Hi Ilpo,

On 11/3/2023 3:24 AM, Ilpo Järvinen wrote:
> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>>> L2 CAT test with low number of bits tends to occasionally fail because
>>> of what seems random variation. The margin is quite small to begin with
>>> for <= 2 bits in CBM. At times, the result can even become negative.
>>> While it would be possible to allow negative values for those cases, it
>>> would be more confusing to user.
>>>
>>> Ignore failures from the tests where <= 2 were used to avoid false
>>> negative results.
>>>
>>
>> I think the core message is that 2 or fewer bits should not be used. Instead
>> of running the test and ignoring the results the test should perhaps just not
>> be run.
>
> I considered that but it often does work so it felt shame to now present
> them when they're successful. Then I just had to decide how to deal with
> the cases where they failed.
>
> Also, if I make it to not run down to 1 bit, those numbers will never ever
> be seen by anyone. It doesn't say 2 and 1 bit results don't contain any
> information to a human reader who is able to do more informed decisions
> whether something is truly working or not. We could, hypothetically, have
> a HW issue one day which makes 1-bit L2 mask to misbehave and if the
> number is never seen by anyone, it's extremely unlikely to be caught
> easily.
>
> They are just reliable enough for simple automated threshold currently.
> Maybe something else than average value would be, it would need to be
> explored but I suspect also the memory address of the buffer might affect
> the value, with L3 it definitely should because of how the things work but
> I don't know if that holds for L2 too. I have earlier tried playing with
> the buffer addresses with L3 but as I didn't immediately yield positive
> outcome to guard against outliers, I postponed that investigation (e.g.,
> my alloc pattern might have been too straightforward and didn't provide
> enough entropy into the buffer start address because I just alloc'ed n x
> buf_size buffers back-to-back).
>
> But I don't have very strong opinion on this so if you prefer I just stop
> at 3 bits, I can change it?
>

We seem to have different users in mind when thinking about this. I was
considering the users that just run the selftest to get a pass/fail. You
seem to also consider folks using this for validation. I'm ok with keeping
this change to accommodate both.

Reinette

2023-11-06 09:07:10

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 17/24] selftests/resctrl: Create struct for input parameter

On Fri, 3 Nov 2023, Reinette Chatre wrote:

> Hi Ilpo,
>
> On 11/3/2023 4:24 AM, Ilpo Järvinen wrote:
> > On Thu, 2 Nov 2023, Reinette Chatre wrote:
> >> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> >>
> >>> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> >>> index d3bf4368341e..5157a3f74fee 100644
> >>> --- a/tools/testing/selftests/resctrl/mba_test.c
> >>> +++ b/tools/testing/selftests/resctrl/mba_test.c
> >>> @@ -141,13 +141,13 @@ void mba_test_cleanup(void)
> >>> remove(RESULT_FILE_NAME);
> >>> }
> >>>
> >>> -int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
> >>> +int mba_schemata_change(const struct user_params *uparams)
> >>> {
> >>> struct resctrl_val_param param = {
> >>> .resctrl_val = MBA_STR,
> >>> .ctrlgrp = "c1",
> >>> .mongrp = "m1",
> >>> - .cpu_no = cpu_no,
> >>> + .cpu_no = uparams->cpu,
> >>> .filename = RESULT_FILE_NAME,
> >>> .bw_report = "reads",
> >>> .setup = mba_setup
> >>> @@ -156,7 +156,7 @@ int mba_schemata_change(int cpu_no, const char * const *benchmark_cmd)
> >>>
> >>> remove(RESULT_FILE_NAME);
> >>>
> >>> - ret = resctrl_val(benchmark_cmd, &param);
> >>> + ret = resctrl_val(uparams->benchmark_cmd, &param);
> >>> if (ret)
> >>> goto out;
> >>>
> >>
> >> How about a new member of struct resctrl_val_param that points to
> >> uparams? That would remove cpu_no from resctrl_val_param
> >> and have everything available when a test needs to run ... not copying
> >> some user parameters into struct resctrl_val_param and passing
> >> others as parameters.
> >
> > I'm a bit allergic to adding more stuff into resctrl_val_param. It seems
> > a structure where random stuff has been thrown at just because it exists.
> > In general, your point is very valid though because the members of
> > resctrl_val_param should be auditted through to see how many of them are
> > even useful after adding uparams and struct resctrl_test.
> >
> > I could get rid of copying parameters from uparams to params and just
> > passing uparams instead of benchmark_cmd into resctrl_val(). Would you be
> > okay with that?
>
> I am ok with that. I assume this implies that cpu_no will be removed from
> resctrl_val_param?

Yes.

--
i.

> > Oh, and I really should rename resctrl_val() one day to something more
> > meaningful too. :-) (but it won't be part of this series and will likely
> > be another conflicty nightmare because resctrl_val_param too needs to
> > be renamed...).
>
> "Naming only" changes that are not part of something more substantive are not
> very appealing though.
>
> Reinette
>

2023-11-06 09:53:29

by Ilpo Järvinen

[permalink] [raw]
Subject: Re: [PATCH 23/24] selftests/resctrl: Add L2 CAT test

On Fri, 3 Nov 2023, Reinette Chatre wrote:
> On 11/3/2023 3:39 AM, Ilpo J?rvinen wrote:
> > On Thu, 2 Nov 2023, Reinette Chatre wrote:
> >> On 10/24/2023 2:26 AM, Ilpo J?rvinen wrote:
> >
> >>> Add L2 CAT selftest. As measuring L2 misses is not easily available
> >>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
> >>
> >> I understand the exact measurement is not available but I do notice some
> >> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
> >> looks promising.
> >
> > Okay, I was under impression that L2 misses are not available. Both based
> > on what you mentioned to me half an year ago and because of what flags I
> > found from the header. But I'll take another look into it.
>
> You are correct that when I did L2 testing a long time ago I used
> the model specific L2 miss counts. I was hoping that things have improved
> so that model specific counters are not needed, as you have tried here.
> I found the l2_rqsts symbol while looking for alternatives but I am not
> familiar enough with perf to know how these symbolic names are mapped.
> I was hoping that they could be a simple drop-in replacement to
> experiment with.

According to perf_event_open() manpage, mapping those symbolic names
requires libpfm so this would add a library dependency?

> >>> Signed-off-by: Ilpo J?rvinen <[email protected]>
> >>> ---
> >>> tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
> >>> tools/testing/selftests/resctrl/resctrl.h | 1 +
> >>> .../testing/selftests/resctrl/resctrl_tests.c | 1 +
> >>> 3 files changed, 63 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> >>> index 48a96acd9e31..a9c72022bb5a 100644
> >>> --- a/tools/testing/selftests/resctrl/cat_test.c
> >>> +++ b/tools/testing/selftests/resctrl/cat_test.c
> >>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
> >>> remove(RESULT_FILE_NAME);
> >>> }
> >>>
> >>> +/*
> >>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> >>> + * because perf cannot directly provide the number of L2 misses (there are
> >>> + * only platform specific ways to get the number of L2 misses).
> >>> + *
> >>> + * This function sets up L3 CAT to reduce noise from other processes during
> >>> + * L2 CAT test.
> >>
> >> This motivation is not clear to me. Does the same isolation used during
> >> L3 CAT testing not work? I expected it to follow the same idea with the
> >> L2 cache split in two, the test using one part and the rest of the
> >> system using the other. Is that not enough isolation?
> >
> > Isolation for L2 is done very same way as with L3 and I think it itself
> > works just fine.
> >
> > However, because L2 CAT selftest as is measures L3 accesses that in ideal
> > world equals to L2 misses, isolating selftest related L3 accesses from the
> > rest of the system should reduce noise in the # of L3 accesses. It's not
> > mandatory though so if L3 CAT is not available the function just prints a
> > warning about the potential noise and does setup nothing for L3.
>
> This is not clear to me. If the read misses L2 and then accesses L3 then
> it should not matter which part of L3 cache the work is isolated to.
> What noise do you have in mind?

The way it is currently done is to measure L3 accesses. If something else
runs at the same time as the CAT selftest, it can do mem accesses that
cause L3 accesses which is noise in the # of L3 accesses number since
those accesses were unrelated to the L2 CAT selftest.

--
i.

2023-11-06 17:04:45

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 23/24] selftests/resctrl: Add L2 CAT test

Hi Ilpo,

On 11/6/2023 1:53 AM, Ilpo Järvinen wrote:
> On Fri, 3 Nov 2023, Reinette Chatre wrote:
>> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
>>> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>>>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>>>
>>>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>>>
>>>> I understand the exact measurement is not available but I do notice some
>>>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>>>> looks promising.
>>>
>>> Okay, I was under impression that L2 misses are not available. Both based
>>> on what you mentioned to me half an year ago and because of what flags I
>>> found from the header. But I'll take another look into it.
>>
>> You are correct that when I did L2 testing a long time ago I used
>> the model specific L2 miss counts. I was hoping that things have improved
>> so that model specific counters are not needed, as you have tried here.
>> I found the l2_rqsts symbol while looking for alternatives but I am not
>> familiar enough with perf to know how these symbolic names are mapped.
>> I was hoping that they could be a simple drop-in replacement to
>> experiment with.
>
> According to perf_event_open() manpage, mapping those symbolic names
> requires libpfm so this would add a library dependency?

I do not see perf list using this library to determine the event and
umask but I am in unfamiliar territory. I'll have to spend some more
time here to determine options.

>
>>>>> Signed-off-by: Ilpo Järvinen <[email protected]>
>>>>> ---
>>>>> tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
>>>>> tools/testing/selftests/resctrl/resctrl.h | 1 +
>>>>> .../testing/selftests/resctrl/resctrl_tests.c | 1 +
>>>>> 3 files changed, 63 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>>>> index 48a96acd9e31..a9c72022bb5a 100644
>>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>>>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
>>>>> remove(RESULT_FILE_NAME);
>>>>> }
>>>>>
>>>>> +/*
>>>>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
>>>>> + * because perf cannot directly provide the number of L2 misses (there are
>>>>> + * only platform specific ways to get the number of L2 misses).
>>>>> + *
>>>>> + * This function sets up L3 CAT to reduce noise from other processes during
>>>>> + * L2 CAT test.
>>>>
>>>> This motivation is not clear to me. Does the same isolation used during
>>>> L3 CAT testing not work? I expected it to follow the same idea with the
>>>> L2 cache split in two, the test using one part and the rest of the
>>>> system using the other. Is that not enough isolation?
>>>
>>> Isolation for L2 is done very same way as with L3 and I think it itself
>>> works just fine.
>>>
>>> However, because L2 CAT selftest as is measures L3 accesses that in ideal
>>> world equals to L2 misses, isolating selftest related L3 accesses from the
>>> rest of the system should reduce noise in the # of L3 accesses. It's not
>>> mandatory though so if L3 CAT is not available the function just prints a
>>> warning about the potential noise and does setup nothing for L3.
>>
>> This is not clear to me. If the read misses L2 and then accesses L3 then
>> it should not matter which part of L3 cache the work is isolated to.
>> What noise do you have in mind?
>
> The way it is currently done is to measure L3 accesses. If something else
> runs at the same time as the CAT selftest, it can do mem accesses that
> cause L3 accesses which is noise in the # of L3 accesses number since
> those accesses were unrelated to the L2 CAT selftest.
>

Creating a CAT allocation sets aside a portion of cache where a task/cpu
can allocation into cache, it does not prevent one task from accessing
the cache concurrently with another.

Reinette

2023-11-06 21:23:06

by Reinette Chatre

[permalink] [raw]
Subject: Re: [PATCH 23/24] selftests/resctrl: Add L2 CAT test

Hi Ilpo,

On 11/6/2023 9:03 AM, Reinette Chatre wrote:
> On 11/6/2023 1:53 AM, Ilpo Järvinen wrote:
>> On Fri, 3 Nov 2023, Reinette Chatre wrote:
>>> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
>>>> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>>>>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>>>>
>>>>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>>>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>>>>
>>>>> I understand the exact measurement is not available but I do notice some
>>>>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>>>>> looks promising.
>>>>
>>>> Okay, I was under impression that L2 misses are not available. Both based
>>>> on what you mentioned to me half an year ago and because of what flags I
>>>> found from the header. But I'll take another look into it.
>>>
>>> You are correct that when I did L2 testing a long time ago I used
>>> the model specific L2 miss counts. I was hoping that things have improved
>>> so that model specific counters are not needed, as you have tried here.
>>> I found the l2_rqsts symbol while looking for alternatives but I am not
>>> familiar enough with perf to know how these symbolic names are mapped.
>>> I was hoping that they could be a simple drop-in replacement to
>>> experiment with.
>>
>> According to perf_event_open() manpage, mapping those symbolic names
>> requires libpfm so this would add a library dependency?
>
> I do not see perf list using this library to determine the event and
> umask but I am in unfamiliar territory. I'll have to spend some more
> time here to determine options.

tools/perf/pmu-events/README cleared it up for me. The architecture specific
tables are included in the perf binary. Potentially pmu-events.h could be
included or the test could just stick with the architectural events.
A quick look at the various cache.json files created the impression that
the events of interest may actually have the same event code and umask across
platforms.
I am not familiar with libpfm. This can surely be considered if it supports
this testing. Several selftests have library dependencies.

Reinette