A process can enable KSM with the prctl system call. When the process is
forked, the KSM flag is inherited by the child process. However, if the
child calls exec directly after the fork, the KSM setting is cleared.
This patch series addresses this problem:
1) Change the mask in coredump.h for execing a new process
2) Add a new test case in ksm_functional_tests
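
For readers unfamiliar with the interface, the per-process KSM flag
discussed above can be set and queried roughly as follows. This is only
a minimal userspace sketch, not code from this series: the helper names
ksm_enable_self() and ksm_query_self() are hypothetical, and the
fallback #defines assume the values from the UAPI header linux/prctl.h
in case the installed headers predate the prctl option.

```c
#include <sys/prctl.h>

/* Fallbacks for older userspace headers (assumed UAPI values). */
#ifndef PR_SET_MEMORY_MERGE
#define PR_SET_MEMORY_MERGE 67
#endif
#ifndef PR_GET_MEMORY_MERGE
#define PR_GET_MEMORY_MERGE 68
#endif

/*
 * Hypothetical helper: enable KSM for the calling process.
 * Returns 0 on success, -1 on failure (errno is set by prctl()).
 */
int ksm_enable_self(void)
{
	return prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0) ? -1 : 0;
}

/*
 * Hypothetical helper: query the per-process KSM flag.
 * Returns 1 if enabled, 0 if disabled, -1 if the query failed
 * (for example EINVAL on a kernel without this prctl option).
 */
int ksm_query_self(void)
{
	return prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0);
}
```

A program that calls ksm_enable_self() and then forks and execs is the
case this series is about: the program that was exec'ed should see
ksm_query_self() return 1.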
Changes:
- V4:
  - Added motivation for the fix to the commit message of the
    first patch
- V3:
  - Combined two lines in function ksm_fork_exec_child()
- V2:
  - Removed the child program from the patch series
  - Child program is implemented by the program itself
  - Added a new command line parameter for the child program
  - Removed new section from Makefile
  - Removed duplicate ; characters
  - Added return in if clause
  - Used PR_GET_MEMORY_MERGE instead of magic numbers
  - Resetting PR_SET_MEMORY_MERGE at the end.
Stefan Roesch (2):
mm/ksm: support fork/exec for prctl
mm/ksm: Test case for prctl fork/exec workflow
include/linux/sched/coredump.h | 7 +-
.../selftests/mm/ksm_functional_tests.c | 66 ++++++++++++++++++-
2 files changed, 70 insertions(+), 3 deletions(-)
base-commit: 15bcc9730fcd7526a3b92eff105d6701767a53bb
--
2.39.3
This adds a new test case to the ksm functional tests to make sure that
the KSM setting is inherited by the child process when doing a
fork/exec.
Signed-off-by: Stefan Roesch <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
---
.../selftests/mm/ksm_functional_tests.c | 66 ++++++++++++++++++-
1 file changed, 65 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
index 901e950f9138..fbff0dd09191 100644
--- a/tools/testing/selftests/mm/ksm_functional_tests.c
+++ b/tools/testing/selftests/mm/ksm_functional_tests.c
@@ -26,6 +26,7 @@
#define KiB 1024u
#define MiB (1024 * KiB)
+#define FORK_EXEC_CHILD_PRG_NAME "ksm_fork_exec_child"
static int mem_fd;
static int ksm_fd;
@@ -479,6 +480,64 @@ static void test_prctl_fork(void)
ksft_test_result_pass("PR_SET_MEMORY_MERGE value is inherited\n");
}
+static int ksm_fork_exec_child(void)
+{
+	/* Test if KSM is enabled for the process. */
+	return prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0) == 1;
+}
+
+static void test_prctl_fork_exec(void)
+{
+	int ret, status;
+	pid_t child_pid;
+
+	ksft_print_msg("[RUN] %s\n", __func__);
+
+	ret = prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
+	if (ret < 0 && errno == EINVAL) {
+		ksft_test_result_skip("PR_SET_MEMORY_MERGE not supported\n");
+		return;
+	} else if (ret) {
+		ksft_test_result_fail("PR_SET_MEMORY_MERGE=1 failed\n");
+		return;
+	}
+
+	child_pid = fork();
+	if (child_pid == -1) {
+		ksft_test_result_skip("fork() failed\n");
+		return;
+	} else if (child_pid == 0) {
+		char *prg_name = "./ksm_functional_tests";
+		char *argv_for_program[] = { prg_name, FORK_EXEC_CHILD_PRG_NAME, NULL };
+
+		execv(prg_name, argv_for_program);
+		return;
+	}
+
+	if (waitpid(child_pid, &status, 0) > 0) {
+		if (WIFEXITED(status)) {
+			status = WEXITSTATUS(status);
+			if (status) {
+				ksft_test_result_fail("KSM not enabled\n");
+				return;
+			}
+		} else {
+			ksft_test_result_fail("program didn't terminate normally\n");
+			return;
+		}
+	} else {
+		ksft_test_result_fail("waitpid() failed\n");
+		return;
+	}
+
+	if (prctl(PR_SET_MEMORY_MERGE, 0, 0, 0, 0)) {
+		ksft_test_result_fail("PR_SET_MEMORY_MERGE=0 failed\n");
+		return;
+	}
+
+	ksft_test_result_pass("PR_SET_MEMORY_MERGE value is inherited\n");
+}
+
static void test_prctl_unmerge(void)
{
const unsigned int size = 2 * MiB;
@@ -536,9 +595,13 @@ static void test_prot_none(void)
int main(int argc, char **argv)
{
- unsigned int tests = 7;
+ unsigned int tests = 8;
int err;
+	if (argc > 1 && !strcmp(argv[1], FORK_EXEC_CHILD_PRG_NAME)) {
+		exit(ksm_fork_exec_child() == 1 ? 0 : 1);
+	}
+
#ifdef __NR_userfaultfd
tests++;
#endif
@@ -576,6 +639,7 @@ int main(int argc, char **argv)
test_prctl();
test_prctl_fork();
+ test_prctl_fork_exec();
test_prctl_unmerge();
err = ksft_get_fail_cnt();
--
2.39.3
Today we have two ways to enable KSM:

1) madvise system call
   This allows enabling KSM for a memory region for a long time.

2) prctl system call
   This is a recent addition to enable KSM for the complete process.
   In addition, when a process is forked, the KSM setting is inherited.
   This change only affects the second case.

One of the use cases for (2) was to support the ability to enable
KSM for cgroups. This allows systemd to enable KSM for the seed
process. By enabling it in the seed process all child processes inherit
the setting.
This works correctly when the process is forked. However, it does not
support the fork/exec workflow.
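
The effect of the mask change can be illustrated with a small
userspace model. This is a simplified sketch, not kernel code: it
assumes that the flags of a freshly created mm are the old mm's flags
AND-ed with MMF_INIT_MASK. Only the bit number 29 for MMF_VM_MERGE_ANY
is taken from this patch; OTHER_INIT_BITS, INIT_MASK_OLD, INIT_MASK_NEW
and inherited_flags() are hypothetical names used for illustration.

```c
/* Bit number from this patch. */
#define MMF_VM_MERGE_ANY	29
#define MMF_VM_MERGE_ANY_MASK	(1UL << MMF_VM_MERGE_ANY)

/* Hypothetical stand-in for the bits already part of MMF_INIT_MASK. */
#define OTHER_INIT_BITS		0x7UL

/* Model of the mask before the patch: MMF_VM_MERGE_ANY is not included. */
#define INIT_MASK_OLD		(OTHER_INIT_BITS)
/* Model of the mask after the patch: MMF_VM_MERGE_ANY is included. */
#define INIT_MASK_NEW		(OTHER_INIT_BITS | MMF_VM_MERGE_ANY_MASK)

/*
 * Simplified model (assumption): a new mm keeps only those flags of
 * the old mm that are covered by the init mask.
 */
unsigned long inherited_flags(unsigned long old_flags, unsigned long init_mask)
{
	return old_flags & init_mask;
}
```

In this model, an old mm with MMF_VM_MERGE_ANY set loses the bit under
INIT_MASK_OLD and keeps it under INIT_MASK_NEW, which matches the
behavior described above for exec before and after the fix.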
From the previous cover letter:
....
Use case 3:
With the madvise call, sharing opportunities are only enabled for the
current process: it is a workload-local decision. A considerable number
of sharing opportunities may exist across multiple workloads or jobs
(if they are part of the same security domain). Only a higher level
entity like a job scheduler or container can know for certain if it is
running one or more instances of a job. That job scheduler, however,
doesn't have the necessary internal workload knowledge to make targeted
madvise calls.
....
In addition, it can be surprising that fork keeps the KSM setting and
fork/exec does not.
Signed-off-by: Stefan Roesch <[email protected]>
Fixes: d7597f59d1d3 ("mm: add new api to enable ksm per process")
Reviewed-by: David Hildenbrand <[email protected]>
Reported-by: Carl Klemm <[email protected]>
Tested-by: Carl Klemm <[email protected]>
---
include/linux/sched/coredump.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index 0ee96ea7a0e9..205aa9917394 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -87,8 +87,11 @@ static inline int get_dumpable(struct mm_struct *mm)
#define MMF_DISABLE_THP_MASK (1 << MMF_DISABLE_THP)
+#define MMF_VM_MERGE_ANY 29
+#define MMF_VM_MERGE_ANY_MASK (1 << MMF_VM_MERGE_ANY)
+
#define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
- MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK)
+ MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\
+ MMF_VM_MERGE_ANY_MASK)
-#define MMF_VM_MERGE_ANY 29
#endif /* _LINUX_SCHED_COREDUMP_H */
--
2.39.3