2022-05-18 16:19:39

by Michal Koutný

[permalink] [raw]
Subject:

Subject: [PATCH v2 0/5] memcontrol selftests fixups

Hello.

I'm just flushing the patches to make memcontrol selftests check the
events behavior we had consensus about (test_memcg_low fails).

(test_memcg_reclaim, test_memcg_swap_max fail for me now but it's present
even before the refactoring.)

The two bigger changes are:
- adjustment of the protected values to make tests succeed with the given
tolerance,
- both test_memcg_low and test_memcg_min check protection of memory in
populated cgroups (actually as per Documentation/admin-guide/cgroup-v2.rst
memory.min should not apply to empty cgroups, which is not the case
currently. Therefore I unified tests with the populated case in order to to
bring more broken tests).


Thanks,
Michal

Changes from v1 (https://lore.kernel.org/r/[email protected]/)
- fixed mis-rebase in compilation fix patch,
- added review, ack tags from v1,
- applied feedback from v1 (Octave script in git tree),
- added one more patch extracting common parts,
- rebased on mm-stable bbe832b9db2e.

Michal Koutný (5):
selftests: memcg: Fix compilation
selftests: memcg: Expect no low events in unprotected sibling
selftests: memcg: Adjust expected reclaim values of protected cgroups
selftests: memcg: Remove protection from top level memcg
selftests: memcg: Factor out common parts of memory.{low,min} tests

MAINTAINERS | 1 +
.../selftests/cgroup/memcg_protection.m | 89 +++++++
.../selftests/cgroup/test_memcontrol.c | 247 +++++-------------
3 files changed, 152 insertions(+), 185 deletions(-)
create mode 100644 tools/testing/selftests/cgroup/memcg_protection.m

--
2.35.3



2022-05-18 16:19:42

by Michal Koutný

[permalink] [raw]
Subject: [PATCH v2 3/5] selftests: memcg: Adjust expected reclaim values of protected cgroups

The numbers are not easy to derive in a closed form (certainly mere
protections ratios do not apply), therefore use a simulation to obtain
expected numbers.

Signed-off-by: Michal Koutný <[email protected]>
---
MAINTAINERS | 1 +
.../selftests/cgroup/memcg_protection.m | 89 +++++++++++++++++++
.../selftests/cgroup/test_memcontrol.c | 29 +++---
3 files changed, 107 insertions(+), 12 deletions(-)
create mode 100644 tools/testing/selftests/cgroup/memcg_protection.m

diff --git a/MAINTAINERS b/MAINTAINERS
index 78c57046fa93..b28b6aeb8636 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5029,6 +5029,7 @@ L: [email protected]
S: Maintained
F: mm/memcontrol.c
F: mm/swap_cgroup.c
+F: tools/testing/selftests/cgroup/memcg_protection.m
F: tools/testing/selftests/cgroup/test_kmem.c
F: tools/testing/selftests/cgroup/test_memcontrol.c

diff --git a/tools/testing/selftests/cgroup/memcg_protection.m b/tools/testing/selftests/cgroup/memcg_protection.m
new file mode 100644
index 000000000000..051daa3477b6
--- /dev/null
+++ b/tools/testing/selftests/cgroup/memcg_protection.m
@@ -0,0 +1,89 @@
+% SPDX-License-Identifier: GPL-2.0
+%
+% run as: octave-cli memcg_protection.m
+%
+% This script simulates reclaim protection behavior on a single level of memcg
+% hierarchy to illustrate how overcommitted protection spreads among siblings
+% (as it depends also on their current consumption).
+%
+% Simulation assumes siblings consumed the initial amount of memory (w/out
+% reclaim) and then the reclaim starts, all memory is reclaimable, i.e. treated
+% same. It simulates only non-low reclaim and assumes all memory.min = 0.
+%
+% Input configurations
+% --------------------
+% E number parent effective protection
+% n vector nominal protection of siblings set at the given level (memory.low)
+% c vector current consumption -,,- (memory.current)
+
+% example from testcase (values in GB)
+E = 50 / 1024;
+n = [75 25 0 500 ] / 1024;
+c = [50 50 50 0] / 1024;
+
+% Reclaim parameters
+% ------------------
+
+% Minimal reclaim amount (GB)
+cluster = 32*4 / 2**20;
+
+% Reclaim coefficient (think as 0.5^sc->priority)
+alpha = .1
+
+% Simulation parameters
+% ---------------------
+epsilon = 1e-7;
+timeout = 1000;
+
+% Simulation loop
+% ---------------
+
+ch = [];
+eh = [];
+rh = [];
+
+for t = 1:timeout
+ % low_usage
+ u = min(c, n);
+ siblings = sum(u);
+
+ % effective_protection()
+ protected = min(n, c); % start with nominal
+ e = protected * min(1, E / siblings); % normalize overcommit
+
+ % recursive protection
+ unclaimed = max(0, E - siblings);
+ parent_overuse = sum(c) - siblings;
+ if (unclaimed > 0 && parent_overuse > 0)
+ overuse = max(0, c - protected);
+ e += unclaimed * (overuse / parent_overuse);
+ endif
+
+ % get_scan_count()
+ r = alpha * c; % assume all memory is in a single LRU list
+
+ % commit 1bc63fb1272b ("mm, memcg: make scan aggression always exclude protection")
+ sz = max(e, c);
+ r .*= (1 - (e+epsilon) ./ (sz+epsilon));
+
+ % uncomment to debug prints
+ % e, c, r
+
+ % nothing to reclaim, reached equilibrium
+ if max(r) < epsilon
+ break;
+ endif
+
+ % SWAP_CLUSTER_MAX roundup
+ r = max(r, (r > epsilon) .* cluster);
+ % XXX here I do parallel reclaim of all siblings
+ % in reality reclaim is serialized and each sibling recalculates own residual
+ c = max(c - r, 0);
+
+ ch = [ch ; c];
+ eh = [eh ; e];
+ rh = [rh ; r];
+endfor
+
+t
+c, e
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4924425639b0..dc2c7d6e3572 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -248,7 +248,7 @@ static int cg_test_proc_killed(const char *cgroup)
/*
* First, this test creates the following hierarchy:
* A memory.min = 50M, memory.max = 200M
- * A/B memory.min = 50M, memory.current = 50M
+ * A/B memory.min = 50M
* A/B/C memory.min = 75M, memory.current = 50M
* A/B/D memory.min = 25M, memory.current = 50M
* A/B/E memory.min = 0, memory.current = 50M
@@ -259,10 +259,13 @@ static int cg_test_proc_killed(const char *cgroup)
* Then it creates A/G and creates a significant
* memory pressure in it.
*
+ * Then it checks actual memory usages and expects that:
* A/B memory.current ~= 50M
- * A/B/C memory.current ~= 33M
- * A/B/D memory.current ~= 17M
- * A/B/F memory.current ~= 0
+ * A/B/C memory.current ~= 29M
+ * A/B/D memory.current ~= 21M
+ * A/B/E memory.current ~= 0
+ * A/B/F memory.current = 0
+ * (for origin of the numbers, see model in memcg_protection.m.)
*
* After that it tries to allocate more than there is
* unprotected memory in A available, and checks
@@ -365,10 +368,10 @@ static int test_memcg_min(const char *root)
for (i = 0; i < ARRAY_SIZE(children); i++)
c[i] = cg_read_long(children[i], "memory.current");

- if (!values_close(c[0], MB(33), 10))
+ if (!values_close(c[0], MB(29), 10))
goto cleanup;

- if (!values_close(c[1], MB(17), 10))
+ if (!values_close(c[1], MB(21), 10))
goto cleanup;

if (c[3] != 0)
@@ -405,7 +408,7 @@ static int test_memcg_min(const char *root)
/*
* First, this test creates the following hierarchy:
* A memory.low = 50M, memory.max = 200M
- * A/B memory.low = 50M, memory.current = 50M
+ * A/B memory.low = 50M
* A/B/C memory.low = 75M, memory.current = 50M
* A/B/D memory.low = 25M, memory.current = 50M
* A/B/E memory.low = 0, memory.current = 50M
@@ -417,9 +420,11 @@ static int test_memcg_min(const char *root)
*
* Then it checks actual memory usages and expects that:
* A/B memory.current ~= 50M
- * A/B/ memory.current ~= 33M
- * A/B/D memory.current ~= 17M
- * A/B/F memory.current ~= 0
+ * A/B/C memory.current ~= 29M
+ * A/B/D memory.current ~= 21M
+ * A/B/E memory.current ~= 0
+ * A/B/F memory.current = 0
+ * (for origin of the numbers, see model in memcg_protection.m.)
*
* After that it tries to allocate more than there is
* unprotected memory in A available,
@@ -512,10 +517,10 @@ static int test_memcg_low(const char *root)
for (i = 0; i < ARRAY_SIZE(children); i++)
c[i] = cg_read_long(children[i], "memory.current");

- if (!values_close(c[0], MB(33), 10))
+ if (!values_close(c[0], MB(29), 10))
goto cleanup;

- if (!values_close(c[1], MB(17), 10))
+ if (!values_close(c[1], MB(21), 10))
goto cleanup;

if (c[3] != 0)
--
2.35.3


2022-05-18 16:19:49

by Michal Koutný

[permalink] [raw]
Subject: [PATCH v2 5/5] selftests: memcg: Factor out common parts of memory.{low,min} tests

The memory protection test setup and runtime is almost equal for
memory.low and memory.min cases.
It makes modification of the common parts prone to mistakes, since the
protections are similar not only in setup but also in principle, factor
the common part out.

Past exceptions between the tests:
- missing memory.min is fine (kept),
- test_memcg_low protected orphaned pagecache (adapted like
test_memcg_min and we keep the processes of protected memory running).

The evaluation in two tests is different (OOM of allocator vs low events
of protégés), this is kept different.

Signed-off-by: Michal Koutný <[email protected]>
---
.../selftests/cgroup/test_memcontrol.c | 199 ++++--------------
1 file changed, 36 insertions(+), 163 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 63c6a683a8c1..c3d0d5f7b19c 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -190,13 +190,6 @@ static int test_memcg_current(const char *root)
return ret;
}

-static int alloc_pagecache_50M(const char *cgroup, void *arg)
-{
- int fd = (long)arg;
-
- return alloc_pagecache(fd, MB(50));
-}
-
static int alloc_pagecache_50M_noexit(const char *cgroup, void *arg)
{
int fd = (long)arg;
@@ -254,7 +247,9 @@ static int cg_test_proc_killed(const char *cgroup)
* A/B/E memory.min = 0, memory.current = 50M
* A/B/F memory.min = 500M, memory.current = 0
*
- * Usages are pagecache, but the test keeps a running
+ * (or memory.low if we test soft protection)
+ *
+ * Usages are pagecache and the test keeps a running
* process in every leaf cgroup.
* Then it creates A/G and creates a significant
* memory pressure in A.
@@ -268,15 +263,16 @@ static int cg_test_proc_killed(const char *cgroup)
* (for origin of the numbers, see model in memcg_protection.m.)
*
* After that it tries to allocate more than there is
- * unprotected memory in A available, and checks
- * checks that memory.min protects pagecache even
- * in this case.
+ * unprotected memory in A available, and checks that:
+ * a) memory.min protects pagecache even in this case,
+ * b) memory.low allows reclaiming page cache with low events.
*/
-static int test_memcg_min(const char *root)
+static int test_memcg_protection(const char *root, bool min)
{
- int ret = KSFT_FAIL;
+ int ret = KSFT_FAIL, rc;
char *parent[3] = {NULL};
char *children[4] = {NULL};
+ const char *attribute = min ? "memory.min" : "memory.low";
long c[4];
int i, attempts;
int fd;
@@ -300,8 +296,10 @@ static int test_memcg_min(const char *root)
if (cg_create(parent[0]))
goto cleanup;

- if (cg_read_long(parent[0], "memory.min")) {
- ret = KSFT_SKIP;
+ if (cg_read_long(parent[0], attribute)) {
+ /* No memory.min on older kernels is fine */
+ if (min)
+ ret = KSFT_SKIP;
goto cleanup;
}

@@ -338,15 +336,15 @@ static int test_memcg_min(const char *root)
(void *)(long)fd);
}

- if (cg_write(parent[1], "memory.min", "50M"))
+ if (cg_write(parent[1], attribute, "50M"))
goto cleanup;
- if (cg_write(children[0], "memory.min", "75M"))
+ if (cg_write(children[0], attribute, "75M"))
goto cleanup;
- if (cg_write(children[1], "memory.min", "25M"))
+ if (cg_write(children[1], attribute, "25M"))
goto cleanup;
- if (cg_write(children[2], "memory.min", "0"))
+ if (cg_write(children[2], attribute, "0"))
goto cleanup;
- if (cg_write(children[3], "memory.min", "500M"))
+ if (cg_write(children[3], attribute, "500M"))
goto cleanup;

attempts = 0;
@@ -375,161 +373,26 @@ static int test_memcg_min(const char *root)
if (c[3] != 0)
goto cleanup;

- if (!cg_run(parent[2], alloc_anon, (void *)MB(170)))
- goto cleanup;
-
- if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50), 3))
- goto cleanup;
-
- ret = KSFT_PASS;
-
-cleanup:
- for (i = ARRAY_SIZE(children) - 1; i >= 0; i--) {
- if (!children[i])
- continue;
-
- cg_destroy(children[i]);
- free(children[i]);
- }
-
- for (i = ARRAY_SIZE(parent) - 1; i >= 0; i--) {
- if (!parent[i])
- continue;
-
- cg_destroy(parent[i]);
- free(parent[i]);
- }
- close(fd);
- return ret;
-}
-
-/*
- * First, this test creates the following hierarchy:
- * A memory.low = 0, memory.max = 200M
- * A/B memory.low = 50M
- * A/B/C memory.low = 75M, memory.current = 50M
- * A/B/D memory.low = 25M, memory.current = 50M
- * A/B/E memory.low = 0, memory.current = 50M
- * A/B/F memory.low = 500M, memory.current = 0
- *
- * Usages are pagecache.
- * Then it creates A/G an creates a significant
- * memory pressure in it.
- *
- * Then it checks actual memory usages and expects that:
- * A/B memory.current ~= 50M
- * A/B/C memory.current ~= 29M
- * A/B/D memory.current ~= 21M
- * A/B/E memory.current ~= 0
- * A/B/F memory.current = 0
- * (for origin of the numbers, see model in memcg_protection.m.)
- *
- * After that it tries to allocate more than there is
- * unprotected memory in A available,
- * and checks low and oom events in memory.events.
- */
-static int test_memcg_low(const char *root)
-{
- int ret = KSFT_FAIL;
- char *parent[3] = {NULL};
- char *children[4] = {NULL};
- long low, oom;
- long c[4];
- int i;
- int fd;
-
- fd = get_temp_fd();
- if (fd < 0)
- goto cleanup;
-
- parent[0] = cg_name(root, "memcg_test_0");
- if (!parent[0])
- goto cleanup;
-
- parent[1] = cg_name(parent[0], "memcg_test_1");
- if (!parent[1])
- goto cleanup;
-
- parent[2] = cg_name(parent[0], "memcg_test_2");
- if (!parent[2])
- goto cleanup;
-
- if (cg_create(parent[0]))
- goto cleanup;
-
- if (cg_read_long(parent[0], "memory.low"))
- goto cleanup;
-
- if (cg_write(parent[0], "cgroup.subtree_control", "+memory"))
+ rc = cg_run(parent[2], alloc_anon, (void *)MB(170));
+ if (min && !rc)
goto cleanup;
-
- if (cg_write(parent[0], "memory.max", "200M"))
- goto cleanup;
-
- if (cg_write(parent[0], "memory.swap.max", "0"))
- goto cleanup;
-
- if (cg_create(parent[1]))
- goto cleanup;
-
- if (cg_write(parent[1], "cgroup.subtree_control", "+memory"))
- goto cleanup;
-
- if (cg_create(parent[2]))
+ else if (!min && rc) {
+ fprintf(stderr,
+ "memory.low prevents from allocating anon memory\n");
goto cleanup;
-
- for (i = 0; i < ARRAY_SIZE(children); i++) {
- children[i] = cg_name_indexed(parent[1], "child_memcg", i);
- if (!children[i])
- goto cleanup;
-
- if (cg_create(children[i]))
- goto cleanup;
-
- if (i > 2)
- continue;
-
- if (cg_run(children[i], alloc_pagecache_50M, (void *)(long)fd))
- goto cleanup;
}

- if (cg_write(parent[1], "memory.low", "50M"))
- goto cleanup;
- if (cg_write(children[0], "memory.low", "75M"))
- goto cleanup;
- if (cg_write(children[1], "memory.low", "25M"))
- goto cleanup;
- if (cg_write(children[2], "memory.low", "0"))
- goto cleanup;
- if (cg_write(children[3], "memory.low", "500M"))
- goto cleanup;
-
- if (cg_run(parent[2], alloc_anon, (void *)MB(148)))
- goto cleanup;
-
if (!values_close(cg_read_long(parent[1], "memory.current"), MB(50), 3))
goto cleanup;

- for (i = 0; i < ARRAY_SIZE(children); i++)
- c[i] = cg_read_long(children[i], "memory.current");
-
- if (!values_close(c[0], MB(29), 10))
- goto cleanup;
-
- if (!values_close(c[1], MB(21), 10))
- goto cleanup;
-
- if (c[3] != 0)
- goto cleanup;
-
- if (cg_run(parent[2], alloc_anon, (void *)MB(166))) {
- fprintf(stderr,
- "memory.low prevents from allocating anon memory\n");
+ if (min) {
+ ret = KSFT_PASS;
goto cleanup;
}

for (i = 0; i < ARRAY_SIZE(children); i++) {
int no_low_events_index = 1;
+ long low, oom;

oom = cg_read_key_long(children[i], "memory.events", "oom ");
low = cg_read_key_long(children[i], "memory.events", "low ");
@@ -565,6 +428,16 @@ static int test_memcg_low(const char *root)
return ret;
}

+static int test_memcg_min(const char *root)
+{
+ return test_memcg_protection(root, true);
+}
+
+static int test_memcg_low(const char *root)
+{
+ return test_memcg_protection(root, false);
+}
+
static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
{
size_t size = MB(50);
--
2.35.3


2022-05-18 16:19:59

by Michal Koutný

[permalink] [raw]
Subject: [PATCH v2 4/5] selftests: memcg: Remove protection from top level memcg

The reclaim is triggered by memory limit in a subtree, therefore the
testcase does not need configured protection against external reclaim.

Also, correct respective comments

Signed-off-by: Michal Koutný <[email protected]>
---
tools/testing/selftests/cgroup/test_memcontrol.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index dc2c7d6e3572..63c6a683a8c1 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)

/*
* First, this test creates the following hierarchy:
- * A memory.min = 50M, memory.max = 200M
+ * A memory.min = 0, memory.max = 200M
* A/B memory.min = 50M
* A/B/C memory.min = 75M, memory.current = 50M
* A/B/D memory.min = 25M, memory.current = 50M
@@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
* Usages are pagecache, but the test keeps a running
* process in every leaf cgroup.
* Then it creates A/G and creates a significant
- * memory pressure in it.
+ * memory pressure in A.
*
* Then it checks actual memory usages and expects that:
* A/B memory.current ~= 50M
@@ -338,8 +338,6 @@ static int test_memcg_min(const char *root)
(void *)(long)fd);
}

- if (cg_write(parent[0], "memory.min", "50M"))
- goto cleanup;
if (cg_write(parent[1], "memory.min", "50M"))
goto cleanup;
if (cg_write(children[0], "memory.min", "75M"))
@@ -407,7 +405,7 @@ static int test_memcg_min(const char *root)

/*
* First, this test creates the following hierarchy:
- * A memory.low = 50M, memory.max = 200M
+ * A memory.low = 0, memory.max = 200M
* A/B memory.low = 50M
* A/B/C memory.low = 75M, memory.current = 50M
* A/B/D memory.low = 25M, memory.current = 50M
@@ -495,8 +493,6 @@ static int test_memcg_low(const char *root)
goto cleanup;
}

- if (cg_write(parent[0], "memory.low", "50M"))
- goto cleanup;
if (cg_write(parent[1], "memory.low", "50M"))
goto cleanup;
if (cg_write(children[0], "memory.low", "75M"))
--
2.35.3


2022-05-18 19:11:00

by Roman Gushchin

[permalink] [raw]
Subject: Re: [PATCH v2 3/5] selftests: memcg: Adjust expected reclaim values of protected cgroups

On Wed, May 18, 2022 at 06:18:57PM +0200, Michal Koutny wrote:
> The numbers are not easy to derive in a closed form (certainly mere
> protections ratios do not apply), therefore use a simulation to obtain
> expected numbers.
>
> Signed-off-by: Michal Koutn? <[email protected]>

Acked-by: Roman Gushchin <[email protected]>

Thanks, Michal!

2022-05-18 19:14:56

by Roman Gushchin

[permalink] [raw]
Subject: Re: [PATCH v2 4/5] selftests: memcg: Remove protection from top level memcg

On Wed, May 18, 2022 at 06:18:58PM +0200, Michal Koutny wrote:
> The reclaim is triggered by memory limit in a subtree, therefore the
> testcase does not need configured protection against external reclaim.
>
> Also, correct respective comments
>
> Signed-off-by: Michal Koutn? <[email protected]>

Acked-by: Roman Gushchin <[email protected]>