Fix a broken zswap kselftest due to cgroup zswap writeback counter
renaming, and add a kselftest to cover the (z)swapin case.
Also, add the zswap kselftest file to zswap maintainer entry so that
get_maintainers script can find zswap maintainers.
Nhat Pham (3):
selftests: zswap: add zswap selftest file to zswap maintainer entry
selftests: fix the zswap invasive shrink test
selftests: add test for zswapin
MAINTAINERS | 1 +
tools/testing/selftests/cgroup/test_zswap.c | 69 ++++++++++++++++++++-
2 files changed, 67 insertions(+), 3 deletions(-)
base-commit: d162e170f1181b4305494843e1976584ddf2b72e
--
2.39.3
Make it easier for contributors to find the zswap maintainers when they
update the zswap tests.
Signed-off-by: Nhat Pham <[email protected]>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index fecebfc4c0dc..5f60faaefaf2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -24396,6 +24396,7 @@ F: include/linux/zpool.h
F: include/linux/zswap.h
F: mm/zpool.c
F: mm/zswap.c
+F: tools/testing/selftests/cgroup/test_zswap.c
THE REST
M: Linus Torvalds <[email protected]>
--
2.39.3
The zswap no invasive shrink selftest breaks because we rename the zswap
writeback counter (see [1]). Fix the test.
[1]: https://patchwork.kernel.org/project/linux-kselftest/patch/[email protected]/
Fixes: a697dc2be925 ("selftests: cgroup: update per-memcg zswap writeback selftest")
Signed-off-by: Nhat Pham <[email protected]>
---
tools/testing/selftests/cgroup/test_zswap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/test_zswap.c b/tools/testing/selftests/cgroup/test_zswap.c
index 47fdaa146443..32ce975b21d1 100644
--- a/tools/testing/selftests/cgroup/test_zswap.c
+++ b/tools/testing/selftests/cgroup/test_zswap.c
@@ -52,7 +52,7 @@ static int get_zswap_stored_pages(size_t *value)
static int get_cg_wb_count(const char *cg)
{
- return cg_read_key_long(cg, "memory.stat", "zswp_wb");
+ return cg_read_key_long(cg, "memory.stat", "zswpwb");
}
static long get_zswpout(const char *cgroup)
--
2.39.3
We recently encountered a kernel crash on the zswapin path in our
internal kernel, which went undetected because of a lack of test
coverage for this path. Add a selftest to cover this code path,
allocating more memories than the cgroup limit to trigger
swapout/zswapout, then reading the pages back in memories several times.
Also add a variant of this test that runs with zswap disabled, to verify
swapin correctness as well.
Suggested-by: Rik van Riel <[email protected]>
Signed-off-by: Nhat Pham <[email protected]>
---
tools/testing/selftests/cgroup/test_zswap.c | 67 ++++++++++++++++++++-
1 file changed, 65 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_zswap.c b/tools/testing/selftests/cgroup/test_zswap.c
index 32ce975b21d1..86231c86dc89 100644
--- a/tools/testing/selftests/cgroup/test_zswap.c
+++ b/tools/testing/selftests/cgroup/test_zswap.c
@@ -60,17 +60,39 @@ static long get_zswpout(const char *cgroup)
return cg_read_key_long(cgroup, "memory.stat", "zswpout ");
}
-static int allocate_bytes(const char *cgroup, void *arg)
+static int allocate_bytes_and_read(const char *cgroup, void *arg, bool read)
{
size_t size = (size_t)arg;
char *mem = (char *)malloc(size);
+ int ret = 0;
if (!mem)
return -1;
for (int i = 0; i < size; i += 4095)
mem[i] = 'a';
+
+ if (read) {
+ /* cycle through the allocated memory to (z)swap in and out pages */
+ for (int t = 0; t < 5; t++) {
+ for (int i = 0; i < size; i += 4095) {
+ if (mem[i] != 'a')
+ ret = -1;
+ }
+ }
+ }
+
free(mem);
- return 0;
+ return ret;
+}
+
+static int allocate_bytes(const char *cgroup, void *arg)
+{
+ return allocate_bytes_and_read(cgroup, arg, false);
+}
+
+static int read_bytes(const char *cgroup, void *arg)
+{
+ return allocate_bytes_and_read(cgroup, arg, true);
}
static char *setup_test_group_1M(const char *root, const char *name)
@@ -133,6 +155,45 @@ static int test_zswap_usage(const char *root)
return ret;
}
+/* Simple test to verify the (z)swapin code paths */
+static int test_zswapin_size(const char *root, char *zswap_size)
+{
+ int ret = KSFT_FAIL;
+ char *test_group;
+
+ /* Set up */
+ test_group = cg_name(root, "zswapin_test");
+ if (!test_group)
+ goto out;
+ if (cg_create(test_group))
+ goto out;
+ if (cg_write(test_group, "memory.max", "8M"))
+ goto out;
+ if (cg_write(test_group, "memory.zswap.max", zswap_size))
+ goto out;
+
+ /* Allocate and read more than memory.max to trigger (z)swap in */
+ if (cg_run(test_group, read_bytes, (void *)MB(32)))
+ goto out;
+
+ ret = KSFT_PASS;
+
+out:
+ cg_destroy(test_group);
+ free(test_group);
+ return ret;
+}
+
+static int test_swapin(const char *root)
+{
+ return test_zswapin_size(root, "0");
+}
+
+static int test_zswapin_no_limit(const char *root)
+{
+ return test_zswapin_size(root, "max");
+}
+
/*
* When trying to store a memcg page in zswap, if the memcg hits its memory
* limit in zswap, writeback should affect only the zswapped pages of that
@@ -309,6 +370,8 @@ struct zswap_test {
const char *name;
} tests[] = {
T(test_zswap_usage),
+ T(test_swapin),
+ T(test_zswapin_no_limit),
T(test_no_kmem_bypass),
T(test_no_invasive_cgroup_shrink),
};
--
2.39.3
On Mon, Jan 29, 2024 at 02:45:40PM -0800, Nhat Pham wrote:
> Make it easier for contributors to find the zswap maintainers when they
> update the zswap tests.
>
> Signed-off-by: Nhat Pham <[email protected]>
I guess I had to check the zswap tests at some point :)
Acked-by: Yosry Ahmed <[email protected]>
> ---
> MAINTAINERS | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index fecebfc4c0dc..5f60faaefaf2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -24396,6 +24396,7 @@ F: include/linux/zpool.h
> F: include/linux/zswap.h
> F: mm/zpool.c
> F: mm/zswap.c
> +F: tools/testing/selftests/cgroup/test_zswap.c
>
> THE REST
> M: Linus Torvalds <[email protected]>
> --
> 2.39.3
>
On Mon, Jan 29, 2024 at 02:45:41PM -0800, Nhat Pham wrote:
> The zswap no invasive shrink selftest breaks because we rename the zswap
> writeback counter (see [1]). Fix the test.
>
> [1]: https://patchwork.kernel.org/project/linux-kselftest/patch/[email protected]/
>
> Fixes: a697dc2be925 ("selftests: cgroup: update per-memcg zswap writeback selftest")
Looks like this should go into v6.8 too.
> Signed-off-by: Nhat Pham <[email protected]>
Acked-by: Yosry Ahmed <[email protected]>
On Mon, Jan 29, 2024 at 02:45:42PM -0800, Nhat Pham wrote:
> We recently encountered a kernel crash on the zswapin path in our
> internal kernel, which went undetected because of a lack of test
> coverage for this path. Add a selftest to cover this code path,
> allocating more memories than the cgroup limit to trigger
s/memories/memory
> swapout/zswapout, then reading the pages back in memories several times.
>
> Also add a variant of this test that runs with zswap disabled, to verify
> swapin correctness as well.
>
> Suggested-by: Rik van Riel <[email protected]>
> Signed-off-by: Nhat Pham <[email protected]>
> ---
> tools/testing/selftests/cgroup/test_zswap.c | 67 ++++++++++++++++++++-
> 1 file changed, 65 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/cgroup/test_zswap.c b/tools/testing/selftests/cgroup/test_zswap.c
> index 32ce975b21d1..86231c86dc89 100644
> --- a/tools/testing/selftests/cgroup/test_zswap.c
> +++ b/tools/testing/selftests/cgroup/test_zswap.c
> @@ -60,17 +60,39 @@ static long get_zswpout(const char *cgroup)
> return cg_read_key_long(cgroup, "memory.stat", "zswpout ");
> }
>
> -static int allocate_bytes(const char *cgroup, void *arg)
> +static int allocate_bytes_and_read(const char *cgroup, void *arg, bool read)
> {
> size_t size = (size_t)arg;
> char *mem = (char *)malloc(size);
> + int ret = 0;
>
> if (!mem)
> return -1;
> for (int i = 0; i < size; i += 4095)
> mem[i] = 'a';
> +
> + if (read) {
> + /* cycle through the allocated memory to (z)swap in and out pages */
> + for (int t = 0; t < 5; t++) {
What benefit does the iteration serve here? I would guess one iteration
is enough to swap everything in at least once, no?
> + for (int i = 0; i < size; i += 4095) {
> + if (mem[i] != 'a')
> + ret = -1;
> + }
> + }
> + }
> +
> free(mem);
> - return 0;
> + return ret;
> +}
> +
> +static int allocate_bytes(const char *cgroup, void *arg)
> +{
> + return allocate_bytes_and_read(cgroup, arg, false);
> +}
> +
> +static int read_bytes(const char *cgroup, void *arg)
> +{
> + return allocate_bytes_and_read(cgroup, arg, true);
> }
I don't like how we reuse allocate_bytes_and_read(), we are not saving
much. Let's keep allocate_bytes() as-is and add a separate helper. Also,
I think allocate_and_read_bytes() is easier to read.
>
> static char *setup_test_group_1M(const char *root, const char *name)
> @@ -133,6 +155,45 @@ static int test_zswap_usage(const char *root)
> return ret;
> }
>
> +/* Simple test to verify the (z)swapin code paths */
> +static int test_zswapin_size(const char *root, char *zswap_size)
> +{
> + int ret = KSFT_FAIL;
> + char *test_group;
> +
> + /* Set up */
> + test_group = cg_name(root, "zswapin_test");
> + if (!test_group)
> + goto out;
> + if (cg_create(test_group))
> + goto out;
> + if (cg_write(test_group, "memory.max", "8M"))
> + goto out;
> + if (cg_write(test_group, "memory.zswap.max", zswap_size))
> + goto out;
> +
> + /* Allocate and read more than memory.max to trigger (z)swap in */
> + if (cg_run(test_group, read_bytes, (void *)MB(32)))
> + goto out;
> +
> + ret = KSFT_PASS;
> +
> +out:
> + cg_destroy(test_group);
> + free(test_group);
> + return ret;
> +}
> +
> +static int test_swapin(const char *root)
> +{
> + return test_zswapin_size(root, "0");
> +}
Why are we testing the no zswap case? I am all for testing but it seems
out of scope here. It would have been understandable if we are testing
memory.zswap.max itself, but we are not doing that.
FWIW, I think the tests here should really be separated from cgroup
tests, but I understand why they were added here. There is a lot of
testing for memcg interface and control for zswap, and a lot of nice
helpers present.
On Mon, Jan 29, 2024 at 5:24 PM Yosry Ahmed <[email protected]> wrote:
>
> On Mon, Jan 29, 2024 at 02:45:42PM -0800, Nhat Pham wrote:
> > We recently encountered a kernel crash on the zswapin path in our
> > internal kernel, which went undetected because of a lack of test
> > coverage for this path. Add a selftest to cover this code path,
> > allocating more memories than the cgroup limit to trigger
>
> s/memories/memory
>
> > swapout/zswapout, then reading the pages back in memories several times.
> >
> > Also add a variant of this test that runs with zswap disabled, to verify
> > swapin correctness as well.
> >
> > Suggested-by: Rik van Riel <[email protected]>
> > Signed-off-by: Nhat Pham <[email protected]>
> > ---
> > tools/testing/selftests/cgroup/test_zswap.c | 67 ++++++++++++++++++++-
> > 1 file changed, 65 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/testing/selftests/cgroup/test_zswap.c b/tools/testing/selftests/cgroup/test_zswap.c
> > index 32ce975b21d1..86231c86dc89 100644
> > --- a/tools/testing/selftests/cgroup/test_zswap.c
> > +++ b/tools/testing/selftests/cgroup/test_zswap.c
> > @@ -60,17 +60,39 @@ static long get_zswpout(const char *cgroup)
> > return cg_read_key_long(cgroup, "memory.stat", "zswpout ");
> > }
> >
> > -static int allocate_bytes(const char *cgroup, void *arg)
> > +static int allocate_bytes_and_read(const char *cgroup, void *arg, bool read)
> > {
> > size_t size = (size_t)arg;
> > char *mem = (char *)malloc(size);
> > + int ret = 0;
> >
> > if (!mem)
> > return -1;
> > for (int i = 0; i < size; i += 4095)
> > mem[i] = 'a';
> > +
> > + if (read) {
> > + /* cycle through the allocated memory to (z)swap in and out pages */
> > + for (int t = 0; t < 5; t++) {
>
> What benefit does the iteration serve here? I would guess one iteration
> is enough to swap everything in at least once, no?
There might be data races etc. that might not appear in one iteration.
Running multiple iterations increases the probability of these bugs
cropping up.
Admittedly, the same effect could, perhaps, also be achieved by
running the same test multiple times, so this is not a hill I will die
on :) This is just a bit more convenient - CI infra often runs these
tests once every time a new kernel is built.
>
> > + for (int i = 0; i < size; i += 4095) {
> > + if (mem[i] != 'a')
> > + ret = -1;
> > + }
> > + }
> > + }
> > +
> > free(mem);
> > - return 0;
> > + return ret;
> > +}
> > +
> > +static int allocate_bytes(const char *cgroup, void *arg)
> > +{
> > + return allocate_bytes_and_read(cgroup, arg, false);
> > +}
> > +
> > +static int read_bytes(const char *cgroup, void *arg)
> > +{
> > + return allocate_bytes_and_read(cgroup, arg, true);
> > }
>
> I don't like how we reuse allocate_bytes_and_read(), we are not saving
> much. Let's keep allocate_bytes() as-is and add a separate helper. Also,
> I think allocate_and_read_bytes() is easier to read.
Fair!
>
> >
> > static char *setup_test_group_1M(const char *root, const char *name)
> > @@ -133,6 +155,45 @@ static int test_zswap_usage(const char *root)
> > return ret;
> > }
> >
> > +/* Simple test to verify the (z)swapin code paths */
> > +static int test_zswapin_size(const char *root, char *zswap_size)
> > +{
> > + int ret = KSFT_FAIL;
> > + char *test_group;
> > +
> > + /* Set up */
> > + test_group = cg_name(root, "zswapin_test");
> > + if (!test_group)
> > + goto out;
> > + if (cg_create(test_group))
> > + goto out;
> > + if (cg_write(test_group, "memory.max", "8M"))
> > + goto out;
> > + if (cg_write(test_group, "memory.zswap.max", zswap_size))
> > + goto out;
> > +
> > + /* Allocate and read more than memory.max to trigger (z)swap in */
> > + if (cg_run(test_group, read_bytes, (void *)MB(32)))
> > + goto out;
> > +
> > + ret = KSFT_PASS;
> > +
> > +out:
> > + cg_destroy(test_group);
> > + free(test_group);
> > + return ret;
> > +}
> > +
> > +static int test_swapin(const char *root)
> > +{
> > + return test_zswapin_size(root, "0");
> > +}
>
> Why are we testing the no zswap case? I am all for testing but it seems
> out of scope here. It would have been understandable if we are testing
> memory.zswap.max itself, but we are not doing that.
Eh it's just by convenience. We already have the workload - any test
for zswap can pretty much be turned into a test for swap by disabling
zswap (and enabling swap), so I was trying to kill two birds with one
stone and cover a bit more of the codebase.
>
> FWIW, I think the tests here should really be separated from cgroup
> tests, but I understand why they were added here. There is a lot of
> testing for memcg interface and control for zswap, and a lot of nice
> helpers present.
Yeah FWIW, I agree :) I wonder if there's an easy way to inherit
helpers from one test suite to another. Some sort of kselftest
dependency? Or maybe move these cgroup helpers up the hierarchy (so
that it can be shared by multiple selftest suites).
>
On Mon, Jan 29, 2024 at 5:02 PM Yosry Ahmed <[email protected]> wrote:
>
> On Mon, Jan 29, 2024 at 02:45:40PM -0800, Nhat Pham wrote:
> > Make it easier for contributors to find the zswap maintainers when they
> > update the zswap tests.
> >
> > Signed-off-by: Nhat Pham <[email protected]>
>
> I guess I had to check the zswap tests at some point :)
We sorely need more zswap tests :)
I'm one of the offenders of adding new features without including
tests, so no judging anyone of course, and admittedly zswap is quite
intertwined with other parts of MM, so it's kinda hard to write
unit-ish tests for zswap only. I often had to resort to scripting
stress tests to iron out bugs.
But there are still tests that we can write to verify public API
(cgroup's zswap options come to mind), simple tests that cover crucial
code paths, etc. that we should probably add in. At the very least
this can be a quick/sanity check for developing and backporting
patches into the production system.
>
> Acked-by: Yosry Ahmed <[email protected]>
>
> > ---
> > MAINTAINERS | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index fecebfc4c0dc..5f60faaefaf2 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -24396,6 +24396,7 @@ F: include/linux/zpool.h
> > F: include/linux/zswap.h
> > F: mm/zpool.c
> > F: mm/zswap.c
> > +F: tools/testing/selftests/cgroup/test_zswap.c
> >
> > THE REST
> > M: Linus Torvalds <[email protected]>
> > --
> > 2.39.3
> >
On Tue, Jan 30, 2024 at 10:31:24AM -0800, Nhat Pham wrote:
> On Mon, Jan 29, 2024 at 5:24 PM Yosry Ahmed <[email protected]> wrote:
> >
[..]
> > > -static int allocate_bytes(const char *cgroup, void *arg)
> > > +static int allocate_bytes_and_read(const char *cgroup, void *arg, bool read)
> > > {
> > > size_t size = (size_t)arg;
> > > char *mem = (char *)malloc(size);
> > > + int ret = 0;
> > >
> > > if (!mem)
> > > return -1;
> > > for (int i = 0; i < size; i += 4095)
> > > mem[i] = 'a';
> > > +
> > > + if (read) {
> > > + /* cycle through the allocated memory to (z)swap in and out pages */
> > > + for (int t = 0; t < 5; t++) {
> >
> > What benefit does the iteration serve here? I would guess one iteration
> > is enough to swap everything in at least once, no?
>
> There might be data races etc. that might not appear in one iteration.
> Running multiple iterations increases the probability of these bugs
> cropping up.
Hmm this is a test running in a single process, and I assume the rest of
the system would be idle (at least from a zswap perspective). Did the
iterations actually catch problems in this scenario (not specifically in
this test, but generally in similar testing)?
>
> Admittedly, the same effect could, perhaps, also be achieved by
> running the same test multiple times, so this is not a hill I will die
> on :) This is just a bit more convenient - CI infra often runs these
> tests once every time a new kernel is built.
>
[..]
> > > +
> > > +static int test_swapin(const char *root)
> > > +{
> > > + return test_zswapin_size(root, "0");
> > > +}
> >
> > Why are we testing the no zswap case? I am all for testing but it seems
> > out of scope here. It would have been understandable if we are testing
> > memory.zswap.max itself, but we are not doing that.
>
> Eh it's just by convenience. We already have the workload - any test
> for zswap can pretty much be turned into a test for swap by disabling
> zswap (and enabling swap), so I was trying to kill two birds with one
> stone and cover a bit more of the codebase.
We can check that no data is actually in zswap after
test_zswapin_size(root, "0"), in which case it becomes more of a zswap
test and we get a sanity check for memory.zswap.max == 0. WDYT?
Perhaps we can rename it to test_swpain_nozswap() or so.
>
> >
> > FWIW, I think the tests here should really be separated from cgroup
> > tests, but I understand why they were added here. There is a lot of
> > testing for memcg interface and control for zswap, and a lot of nice
> > helpers present.
>
> Yeah FWIW, I agree :) I wonder if there's an easy way to inherit
> helpers from one test suite to another. Some sort of kselftest
> dependency? Or maybe move these cgroup helpers up the hierarchy (so
> that it can be shared by multiple selftest suites).
I am not fluent in kselftest so I can't claim to know the answer here.
There are a lot of things to do testing-wise for zswap, but I am not
asking anyone to do it because I don't have the time to do it myself. It
would be nice though :)
On Tue, Jan 30, 2024 at 10:37:15AM -0800, Nhat Pham wrote:
> On Mon, Jan 29, 2024 at 5:02 PM Yosry Ahmed <[email protected]> wrote:
> >
> > On Mon, Jan 29, 2024 at 02:45:40PM -0800, Nhat Pham wrote:
> > > Make it easier for contributors to find the zswap maintainers when they
> > > update the zswap tests.
> > >
> > > Signed-off-by: Nhat Pham <[email protected]>
> >
> > I guess I had to check the zswap tests at some point :)
>
> We sorely need more zswap tests :)
>
> I'm one of the offenders of adding new features without including
> tests, so no judging anyone of course, and admittedly zswap is quite
> intertwined with other parts of MM, so it's kinda hard to write
> unit-ish tests for zswap only. I often had to resort to scripting
> stress tests to iron out bugs.
>
> But there are still tests that we can write to verify public API
> (cgroup's zswap options come to mind), simple tests that cover crucial
> code paths, etc. that we should probably add in. At the very least
> this can be a quick/sanity check for developing and backporting
> patches into the production system.
Agreed. I am yet to take a close look at the existing tests tbh, but I
took a quick look before and there is room for improvement (and little
bandwidth to act on it).