2023-05-22 11:10:52

by Yang Yang

Subject: [PATCH v8 0/6] ksm: support tracking KSM-placed zero-pages

From: xu xin <[email protected]>

The core idea of this patch set is to let users see the number of all
pages merged by KSM, regardless of whether the use_zero_pages switch has
been turned on, so that they can know how much of the increase in free
memory is really due to their madvise(MADV_MERGEABLE) calls. The problem
is that, when use_zero_pages is enabled, all empty pages are merged with
the kernel zero page instead of with each other (as happens when
use_zero_pages is disabled), and these zero pages are then no longer
tracked by KSM.

The motivation for this is described at:
https://lore.kernel.org/lkml/[email protected]/

In short, we want to support tracking KSM-placed zero pages without
affecting the use_zero_pages feature, so that application developers can
also learn the actual KSM profit, including KSM-placed zero pages, and
use it to optimize their applications when
/sys/kernel/mm/ksm/use_zero_pages is enabled.

Change log
----------
v7->v8:
(1) Since [1] fixes the sparc64 bug where pte_mkdirty() makes the PTE
writable, we can remove the architecture restrictions of this
feature.
(2) Improve how ksm_zero_pages is updated: handle the case where
khugepaged replaces a shared zeropage with a THP.

[1] https://lore.kernel.org/all/[email protected]/

v6->v7:
This is an entirely new version. Unlike v6, which relies on KSM's
rmap_item, this series relies on pte_dirty instead, so the general
handling of tracking KSM-placed zero pages is simplified a lot.

For safety, we restrict this feature to the tested and known-working
architectures (ARM, ARM64, and X86) for now.

xu xin (6):
ksm: support unsharing KSM-placed zero pages
ksm: count all zero pages placed by KSM
ksm: add ksm zero pages for each process
ksm: add documentation for ksm zero pages
ksm: update the calculation of KSM profit
selftest: add a testcase of ksm zero pages

Documentation/admin-guide/mm/ksm.rst | 26 +++++---
fs/proc/base.c | 1 +
include/linux/ksm.h | 25 ++++++++
include/linux/mm_types.h | 9 ++-
mm/khugepaged.c | 3 +
mm/ksm.c | 19 +++++-
mm/memory.c | 7 ++-
tools/testing/selftests/mm/ksm_functional_tests.c | 75 +++++++++++++++++++++++
8 files changed, 152 insertions(+), 13 deletions(-)

--
2.15.2


2023-05-22 11:14:03

by Yang Yang

Subject: [PATCH v8 1/6] ksm: support unsharing KSM-placed zero pages

From: xu xin <[email protected]>

When use_zero_pages of KSM is enabled, madvise(addr, len, MADV_UNMERGEABLE)
and other ways of triggering unsharing (such as writing 2 to
/sys/kernel/mm/ksm/run) will *not* actually unshare the shared zeropage
placed by KSM (which contradicts the MADV_UNMERGEABLE documentation). As
these KSM-placed zero pages are outside the control of KSM, the related KSM
page counters don't expose how many zero pages were placed by KSM. (These
special zero pages differ from initially mapped zero pages, because pages
mapped into MADV_UNMERGEABLE areas are expected to be complete, unshared
pages.)

To avoid blindly unsharing all shared zero pages in applicable VMAs, the
patch uses pte_mkdirty (architecture-dependent) to mark KSM-placed zero
pages. Thus, MADV_UNMERGEABLE will unshare only those KSM-placed zero pages.

The patch does not degrade the performance of use_zero_pages, as it doesn't
change how empty pages are merged when use_zero_pages is enabled.
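
For illustration only (this is not part of the patch), the marking scheme
can be modelled in user space roughly as follows. struct toy_pte,
TOY_ZERO_PFN and the toy_* helpers are hypothetical stand-ins; the real
pte_t layout and pte_mkdirty()/pte_dirty() are architecture-specific.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a PTE; the real pte_t layout is architecture-specific. */
struct toy_pte {
	unsigned long pfn;
	bool special;
	bool dirty;
};

#define TOY_ZERO_PFN 0UL	/* hypothetical pfn of the shared zeropage */

/* Models set_pte_ksm_zero(): special + dirty marks "placed by KSM". */
static inline struct toy_pte toy_set_pte_ksm_zero(struct toy_pte pte)
{
	pte.special = true;
	pte.dirty = true;
	return pte;
}

/* Models is_ksm_zero_pte(): zeropage pfn and dirty bit are both required. */
static inline bool toy_is_ksm_zero_pte(struct toy_pte pte)
{
	return pte.pfn == TOY_ZERO_PFN && pte.dirty;
}

/* Usage: a zeropage mapped by a plain read fault stays clean and is skipped. */
static inline void toy_demo(void)
{
	struct toy_pte fault = { .pfn = TOY_ZERO_PFN, .special = true };
	struct toy_pte ksm = toy_set_pte_ksm_zero(fault);

	assert(!toy_is_ksm_zero_pte(fault));
	assert(toy_is_ksm_zero_pte(ksm));
}
```

The point of the sketch is that only the dirty bit distinguishes the two
kinds of zeropage mappings, so unsharing can skip zeropages that KSM did
not place.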

Signed-off-by: xu xin <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: Xuexin Jiang <[email protected]>
Reviewed-by: Xiaokai Ran <[email protected]>
Reviewed-by: Yang Yang <[email protected]>
---
include/linux/ksm.h | 6 ++++++
mm/ksm.c | 5 +++--
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 899a314bc487..7989200cdbb7 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -26,6 +26,9 @@ int ksm_disable(struct mm_struct *mm);

int __ksm_enter(struct mm_struct *mm);
void __ksm_exit(struct mm_struct *mm);
+/* use pte_mkdirty to track a KSM-placed zero page */
+#define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))
+#define is_ksm_zero_pte(pte) (is_zero_pfn(pte_pfn(pte)) && pte_dirty(pte))

static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
@@ -95,6 +98,9 @@ static inline void ksm_exit(struct mm_struct *mm)
{
}

+#define set_pte_ksm_zero(pte) pte_mkspecial(pte)
+#define is_ksm_zero_pte(pte) 0
+
#ifdef CONFIG_MEMORY_FAILURE
static inline void collect_procs_ksm(struct page *page,
struct list_head *to_kill, int force_early)
diff --git a/mm/ksm.c b/mm/ksm.c
index 0156bded3a66..9962f5962afd 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -447,7 +447,8 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
if (is_migration_entry(entry))
page = pfn_swap_entry_to_page(entry);
}
- ret = page && PageKsm(page);
+ /* return 1 if the page is an normal ksm page or KSM-placed zero page */
+ ret = (page && PageKsm(page)) || is_ksm_zero_pte(*pte);
pte_unmap_unlock(pte, ptl);
return ret;
}
@@ -1220,7 +1221,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
page_add_anon_rmap(kpage, vma, addr, RMAP_NONE);
newpte = mk_pte(kpage, vma->vm_page_prot);
} else {
- newpte = pte_mkspecial(pfn_pte(page_to_pfn(kpage),
+ newpte = set_pte_ksm_zero(pfn_pte(page_to_pfn(kpage),
vma->vm_page_prot));
/*
* We're replacing an anonymous page with a zero page, which is
--
2.15.2


2023-05-22 11:14:12

by Yang Yang

Subject: [PATCH v8 2/6] ksm: count all zero pages placed by KSM

From: xu xin <[email protected]>

As pages_sharing and pages_shared don't include the number of zero pages
merged by KSM, we cannot know how many zero pages KSM has placed when
use_zero_pages is enabled, so KSM is not transparent about all the pages it
has actually merged. In the early days of use_zero_pages, zero pages could
not be unshared by means such as MADV_UNMERGEABLE, so it was hard to count
how many times one of those zeropages was later unmerged.

Now that KSM-placed zero pages can be unshared accurately, we can easily
count both how many times a page full of zeroes was merged with the zero
page and how many times one of those pages was later unmerged. This helps
to estimate memory demands for the case where each and every shared page
gets unshared.

So we add ksm_zero_pages under /sys/kernel/mm/ksm/ to show the number
of all zero pages placed by KSM.
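
As a rough user-space sketch (not part of the patch), the new counter
could be read like any other single-value sysfs file. read_counter() is a
hypothetical helper name; the only path taken from this series is
/sys/kernel/mm/ksm/ksm_zero_pages.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Read a single decimal counter from a sysfs-style file.
 * Returns the value, or a negative errno-style code on failure.
 */
static inline long read_counter(const char *path)
{
	FILE *f;
	long value;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -errno;
	if (fscanf(f, "%ld", &value) != 1)
		value = -EINVAL;
	fclose(f);
	return value;
}
```

A caller would pass "/sys/kernel/mm/ksm/ksm_zero_pages" and treat a
negative result as "interface not available", for example on a kernel
without this series.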

v7->v8:
Handle the case when khugepaged replaces a shared zeropage by a THP.

Signed-off-by: xu xin <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: Xuexin Jiang <[email protected]>
Reviewed-by: Xiaokai Ran <[email protected]>
Reviewed-by: Yang Yang <[email protected]>
---
include/linux/ksm.h | 17 +++++++++++++++++
mm/khugepaged.c | 3 +++
mm/ksm.c | 12 ++++++++++++
mm/memory.c | 7 ++++++-
4 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 7989200cdbb7..1adcae0205e3 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -29,6 +29,16 @@ void __ksm_exit(struct mm_struct *mm);
/* use pte_mkdirty to track a KSM-placed zero page */
#define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))
#define is_ksm_zero_pte(pte) (is_zero_pfn(pte_pfn(pte)) && pte_dirty(pte))
+extern unsigned long ksm_zero_pages;
+static inline void inc_ksm_zero_pages(void)
+{
+ ksm_zero_pages++;
+}
+
+static inline void dec_ksm_zero_pages(void)
+{
+ ksm_zero_pages--;
+}

static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
{
@@ -100,6 +110,13 @@ static inline void ksm_exit(struct mm_struct *mm)

#define set_pte_ksm_zero(pte) pte_mkspecial(pte)
#define is_ksm_zero_pte(pte) 0
+static inline void inc_ksm_zero_pages(void)
+{
+}
+
+static inline void dec_ksm_zero_pages(void)
+{
+}

#ifdef CONFIG_MEMORY_FAILURE
static inline void collect_procs_ksm(struct page *page,
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6b9d39d65b73..ba0d077b6951 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -19,6 +19,7 @@
#include <linux/page_table_check.h>
#include <linux/swapops.h>
#include <linux/shmem_fs.h>
+#include <linux/ksm.h>

#include <asm/tlb.h>
#include <asm/pgalloc.h>
@@ -711,6 +712,8 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
spin_lock(ptl);
ptep_clear(vma->vm_mm, address, _pte);
spin_unlock(ptl);
+ if (is_ksm_zero_pte(pteval))
+ dec_ksm_zero_pages();
}
} else {
src_page = pte_page(pteval);
diff --git a/mm/ksm.c b/mm/ksm.c
index 9962f5962afd..2ca7e8860faa 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -278,6 +278,9 @@ static unsigned int zero_checksum __read_mostly;
/* Whether to merge empty (zeroed) pages with actual zero pages */
static bool ksm_use_zero_pages __read_mostly;

+/* The number of zero pages which is placed by KSM */
+unsigned long ksm_zero_pages;
+
#ifdef CONFIG_NUMA
/* Zeroed when merging across nodes is not allowed */
static unsigned int ksm_merge_across_nodes = 1;
@@ -1223,6 +1226,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
} else {
newpte = set_pte_ksm_zero(pfn_pte(page_to_pfn(kpage),
vma->vm_page_prot));
+ inc_ksm_zero_pages();
/*
* We're replacing an anonymous page with a zero page, which is
* not anonymous. We need to do proper accounting otherwise we
@@ -3350,6 +3354,13 @@ static ssize_t pages_volatile_show(struct kobject *kobj,
}
KSM_ATTR_RO(pages_volatile);

+static ssize_t ksm_zero_pages_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ return sysfs_emit(buf, "%ld\n", ksm_zero_pages);
+}
+KSM_ATTR_RO(ksm_zero_pages);
+
static ssize_t general_profit_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
{
@@ -3417,6 +3428,7 @@ static struct attribute *ksm_attrs[] = {
&pages_sharing_attr.attr,
&pages_unshared_attr.attr,
&pages_volatile_attr.attr,
+ &ksm_zero_pages_attr.attr,
&full_scans_attr.attr,
#ifdef CONFIG_NUMA
&merge_across_nodes_attr.attr,
diff --git a/mm/memory.c b/mm/memory.c
index 8358f3b853f2..058b416adf24 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1415,8 +1415,11 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
tlb_remove_tlb_entry(tlb, pte, addr);
zap_install_uffd_wp_if_needed(vma, addr, pte, details,
ptent);
- if (unlikely(!page))
+ if (unlikely(!page)) {
+ if (is_ksm_zero_pte(ptent))
+ dec_ksm_zero_pages();
continue;
+ }

delay_rmap = 0;
if (!PageAnon(page)) {
@@ -3120,6 +3123,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
inc_mm_counter(mm, MM_ANONPAGES);
}
} else {
+ if (is_ksm_zero_pte(vmf->orig_pte))
+ dec_ksm_zero_pages();
inc_mm_counter(mm, MM_ANONPAGES);
}
flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
--
2.15.2

2023-05-22 11:15:36

by Yang Yang

Subject: [PATCH v8 5/6] ksm: update the calculation of KSM profit

From: xu xin <[email protected]>

When use_zero_pages is enabled, the calculation of KSM profit is not
correct because KSM zero pages are not counted. So update the
calculation of KSM profit, including the documentation.
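
To make the arithmetic concrete, here is a small user-space sketch of the
updated system-wide formula (illustrative only, not part of the patch).
ksm_general_profit() is a hypothetical helper; the caller supplies the
page size and sizeof(rmap_item), since the latter depends on the kernel
build.

```c
#include <assert.h>

/*
 * Approximate system-wide KSM profit in bytes:
 *   (pages_sharing + ksm_zero_pages) * page_size
 *     - all_rmap_items * rmap_item_size
 */
static inline long ksm_general_profit(unsigned long pages_sharing,
				      unsigned long ksm_zero_pages,
				      unsigned long all_rmap_items,
				      unsigned long page_size,
				      unsigned long rmap_item_size)
{
	unsigned long saved_pages = pages_sharing + ksm_zero_pages;

	assert(page_size > 0);
	return (long)(saved_pages * page_size) -
	       (long)(all_rmap_items * rmap_item_size);
}
```

With 100 pages_sharing, 50 ksm_zero_pages, 200 rmap items, 4096-byte
pages and an assumed 64-byte rmap_item, this gives 150 * 4096 - 200 * 64 =
601600 bytes; without counting the zero pages the same inputs would give
396800 bytes.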

Signed-off-by: xu xin <[email protected]>
Cc: Xiaokai Ran <[email protected]>
Cc: Yang Yang <[email protected]>
Cc: Jiang Xuexin <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: David Hildenbrand <[email protected]>
---
Documentation/admin-guide/mm/ksm.rst | 18 +++++++++++-------
mm/ksm.c | 2 +-
2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
index 019dc40a0d3c..dde7c152f0ae 100644
--- a/Documentation/admin-guide/mm/ksm.rst
+++ b/Documentation/admin-guide/mm/ksm.rst
@@ -204,21 +204,25 @@ several times, which are unprofitable memory consumed.
1) How to determine whether KSM save memory or consume memory in system-wide
range? Here is a simple approximate calculation for reference::

- general_profit =~ pages_sharing * sizeof(page) - (all_rmap_items) *
+ general_profit =~ ksm_saved_pages * sizeof(page) - (all_rmap_items) *
sizeof(rmap_item);

- where all_rmap_items can be easily obtained by summing ``pages_sharing``,
- ``pages_shared``, ``pages_unshared`` and ``pages_volatile``.
+ where ksm_saved_pages equals to the sum of ``pages_sharing`` +
+ ``ksm_zero_pages`` of the system, and all_rmap_items can be easily
+ obtained by summing ``pages_sharing``, ``pages_shared``, ``pages_unshared``
+ and ``pages_volatile``.

2) The KSM profit inner a single process can be similarly obtained by the
following approximate calculation::

- process_profit =~ ksm_merging_pages * sizeof(page) -
+ process_profit =~ ksm_saved_pages * sizeof(page) -
ksm_rmap_items * sizeof(rmap_item).

- where ksm_merging_pages is shown under the directory ``/proc/<pid>/``,
- and ksm_rmap_items is shown in ``/proc/<pid>/ksm_stat``. The process profit
- is also shown in ``/proc/<pid>/ksm_stat`` as ksm_process_profit.
+ where ksm_saved_pages equals to the sum of ``ksm_merging_pages`` and
+ ``ksm_zero_pages``, both of which are shown under the directory
+ ``/proc/<pid>/ksm_stat``, and ksm_rmap_items is also shown in
+ ``/proc/<pid>/ksm_stat``. The process profit is also shown in
+ ``/proc/<pid>/ksm_stat`` as ksm_process_profit.

From the perspective of application, a high ratio of ``ksm_rmap_items`` to
``ksm_merging_pages`` means a bad madvise-applied policy, so developers or
diff --git a/mm/ksm.c b/mm/ksm.c
index 4e510f5c5938..d23a240c2519 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -3085,7 +3085,7 @@ static void wait_while_offlining(void)
#ifdef CONFIG_PROC_FS
long ksm_process_profit(struct mm_struct *mm)
{
- return mm->ksm_merging_pages * PAGE_SIZE -
+ return (long)(mm->ksm_merging_pages + mm->ksm_zero_pages) * PAGE_SIZE -
mm->ksm_rmap_items * sizeof(struct ksm_rmap_item);
}
#endif /* CONFIG_PROC_FS */
--
2.15.2

2023-05-22 11:19:35

by Yang Yang

Subject: [PATCH v8 6/6] selftest: add a testcase of ksm zero pages

From: xu xin <[email protected]>

Add a function test_unmerge_zero_pages() to test the unsharing and
counting of KSM-placed zero pages introduced by this patch series.

test_unmerge_zero_pages() actually contains three sub-tests:
(1) whether the count of ksm zero pages can update correctly after merging;
(2) whether the count of ksm zero pages can update correctly after
unmerging;
(3) whether ksm zero pages are really unmerged.
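
The counts checked in (1) and (2) follow from simple arithmetic, sketched
below for illustration. expected_zero_pages() is a hypothetical helper,
and the sketch assumes that no other process contributes KSM-placed zero
pages to the system-wide counter while the test runs.

```c
#include <assert.h>

/*
 * Expected ksm_zero_pages after merging a zero-filled mapping of
 * mapped_bytes and then unmerging its first unmerged_bytes.
 */
static inline unsigned long expected_zero_pages(unsigned long mapped_bytes,
						unsigned long unmerged_bytes,
						unsigned long page_size)
{
	assert(page_size > 0);
	assert(unmerged_bytes <= mapped_bytes);
	return (mapped_bytes - unmerged_bytes) / page_size;
}
```

For the 2 MiB mapping used by the test and 4096-byte pages, this gives
512 pages after merging and 256 pages after unmerging half of the region.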

Signed-off-by: xu xin <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Xuexin Jiang <[email protected]>
Reviewed-by: Xiaokai Ran <[email protected]>
Reviewed-by: Yang Yang <[email protected]>
---
tools/testing/selftests/mm/ksm_functional_tests.c | 75 +++++++++++++++++++++++
1 file changed, 75 insertions(+)

diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
index 26853badae70..9b7fb94ed64f 100644
--- a/tools/testing/selftests/mm/ksm_functional_tests.c
+++ b/tools/testing/selftests/mm/ksm_functional_tests.c
@@ -29,6 +29,8 @@

static int ksm_fd;
static int ksm_full_scans_fd;
+static int ksm_zero_pages_fd;
+static int ksm_use_zero_pages_fd;
static int pagemap_fd;
static size_t pagesize;

@@ -59,6 +61,21 @@ static bool range_maps_duplicates(char *addr, unsigned long size)
return false;
}

+static long get_ksm_zero_pages(void)
+{
+ char buf[20];
+ ssize_t read_size;
+ unsigned long ksm_zero_pages;
+
+ read_size = pread(ksm_zero_pages_fd, buf, sizeof(buf) - 1, 0);
+ if (read_size < 0)
+ return -errno;
+ buf[read_size] = 0;
+ ksm_zero_pages = strtol(buf, NULL, 10);
+
+ return ksm_zero_pages;
+}
+
static long ksm_get_full_scans(void)
{
char buf[10];
@@ -159,6 +176,61 @@ static void test_unmerge(void)
munmap(map, size);
}

+static inline unsigned long expected_ksm_pages(unsigned long mergeable_size)
+{
+ return mergeable_size / pagesize;
+}
+
+static void test_unmerge_zero_pages(void)
+{
+ const unsigned int size = 2 * MiB;
+ char *map;
+ unsigned long pages_expected;
+
+ ksft_print_msg("[RUN] %s\n", __func__);
+
+ /* Confirm the interfaces*/
+ if (ksm_zero_pages_fd < 0) {
+ ksft_test_result_skip("open(\"/sys/kernel/mm/ksm/ksm_zero_pages\") failed\n");
+ return;
+ }
+ if (ksm_use_zero_pages_fd < 0) {
+ ksft_test_result_skip("open \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
+ return;
+ }
+ if (write(ksm_use_zero_pages_fd, "1", 1) != 1) {
+ ksft_test_result_skip("write \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
+ return;
+ }
+
+ /* Mmap zero pages*/
+ map = mmap_and_merge_range(0x00, size, false);
+ if (map == MAP_FAILED)
+ return;
+
+ /* Check if ksm_zero_pages can be update correctly after merging */
+ pages_expected = expected_ksm_pages(size);
+ ksft_test_result(pages_expected == get_ksm_zero_pages(),
+ "The count zero_page_sharing was updated after merging\n");
+
+ /* try to unmerge half of the region */
+ if (madvise(map, size / 2, MADV_UNMERGEABLE)) {
+ ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
+ goto unmap;
+ }
+
+ /* Check if ksm_zero_pages can be update correctly after unmerging */
+ pages_expected = expected_ksm_pages(size / 2);
+ ksft_test_result(pages_expected == get_ksm_zero_pages(),
+ "The count zero_page_sharing was updated after unmerging\n");
+
+ /* Check if ksm zero pages are really unmerged */
+ ksft_test_result(!range_maps_duplicates(map, size / 2),
+ "KSM zero pages were unmerged\n");
+unmap:
+ munmap(map, size);
+}
+
static void test_unmerge_discarded(void)
{
const unsigned int size = 2 * MiB;
@@ -379,8 +451,11 @@ int main(int argc, char **argv)
pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
if (pagemap_fd < 0)
ksft_exit_skip("open(\"/proc/self/pagemap\") failed\n");
+ ksm_zero_pages_fd = open("/sys/kernel/mm/ksm/ksm_zero_pages", O_RDONLY);
+ ksm_use_zero_pages_fd = open("/sys/kernel/mm/ksm/use_zero_pages", O_RDWR);

test_unmerge();
+ test_unmerge_zero_pages();
test_unmerge_discarded();
#ifdef __NR_userfaultfd
test_unmerge_uffd_wp();
--
2.15.2

2023-05-22 11:25:42

by Yang Yang

Subject: [PATCH v8 4/6] ksm: add documentation for ksm zero pages

From: xu xin <[email protected]>

Add the description of ksm_zero_pages.

When use_zero_pages is enabled, pages_sharing alone cannot represent how
much memory is actually saved by KSM, but the sum of ksm_zero_pages and
pages_sharing does.

Signed-off-by: xu xin <[email protected]>
Cc: Xiaokai Ran <[email protected]>
Cc: Yang Yang <[email protected]>
Cc: Jiang Xuexin <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: David Hildenbrand <[email protected]>
---
Documentation/admin-guide/mm/ksm.rst | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
index 7626392fe82c..019dc40a0d3c 100644
--- a/Documentation/admin-guide/mm/ksm.rst
+++ b/Documentation/admin-guide/mm/ksm.rst
@@ -173,6 +173,14 @@ stable_node_chains
the number of KSM pages that hit the ``max_page_sharing`` limit
stable_node_dups
number of duplicated KSM pages
+ksm_zero_pages
+ how many empty pages are sharing the kernel zero page(s) instead
+ of other user pages as it would happen normally. Only meaningful
+ when ``use_zero_pages`` is/was enabled.
+
+When ``use_zero_pages`` is/was enabled, the sum of ``pages_sharing`` +
+``ksm_zero_pages`` represents the actual number of pages saved by KSM.
+If ``use_zero_pages`` has never been enabled, ``ksm_zero_pages`` is 0.

A high ratio of ``pages_sharing`` to ``pages_shared`` indicates good
sharing, but a high ratio of ``pages_unshared`` to ``pages_sharing``
--
2.15.2

2023-05-22 11:28:05

by Yang Yang

Subject: [PATCH v8 3/6] ksm: add ksm zero pages for each process

From: xu xin <[email protected]>

As the number of KSM zero pages is not included in the per-process
ksm_merging_pages when use_zero_pages is enabled, it is unclear how many
pages are actually merged by KSM. To let users accurately estimate their
memory demands when unsharing KSM zero pages, it is necessary to show KSM
zero pages per process. In addition, it helps users to know the actual KSM
profit, because KSM-placed zero pages also benefit from KSM.

Since KSM-placed zero pages can now be unshared accurately, tracking the
merging and unmerging of empty pages is no longer difficult.

Since we already have /proc/<pid>/ksm_stat, just add the 'ksm_zero_pages'
information to it.
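
As a hedged user-space sketch (not part of the patch), a tool could pick
the new field out of the "name value" lines that ksm_stat prints.
ksm_stat_lookup() is a hypothetical helper operating on text that has
already been read from /proc/<pid>/ksm_stat.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the value printed for `key` in "name value\n" text, or -1 if
 * the key is absent. Since -1 is the "absent" sentinel, this is meant
 * for counters that cannot be negative, such as ksm_zero_pages.
 */
static inline long ksm_stat_lookup(const char *text, const char *key)
{
	const char *line = text;

	assert(text != NULL && key != NULL);
	while (line && *line) {
		char name[64];
		long value;

		/* Each line is "<name> <value>"; compare the whole name. */
		if (sscanf(line, "%63s %ld", name, &value) == 2 &&
		    strcmp(name, key) == 0)
			return value;
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}
```

Comparing the whole field name avoids false matches on keys that merely
share a prefix, and a return value of -1 lets a tool fall back gracefully
on kernels that do not print ksm_zero_pages.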

Signed-off-by: xu xin <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Xuexin Jiang <[email protected]>
Cc: Xiaokai Ran <[email protected]>
Cc: Yang Yang <[email protected]>
---
fs/proc/base.c | 1 +
include/linux/ksm.h | 10 ++++++----
include/linux/mm_types.h | 9 +++++++--
mm/khugepaged.c | 2 +-
mm/ksm.c | 2 +-
mm/memory.c | 4 ++--
6 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 05452c3b9872..e407a34a46e8 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3209,6 +3209,7 @@ static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns,
seq_printf(m, "ksm_rmap_items %lu\n", mm->ksm_rmap_items);
seq_printf(m, "ksm_merging_pages %lu\n", mm->ksm_merging_pages);
seq_printf(m, "ksm_process_profit %ld\n", ksm_process_profit(mm));
+ seq_printf(m, "ksm_zero_pages %lu\n", mm->ksm_zero_pages);
mmput(mm);
}

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 1adcae0205e3..ca29e95481b0 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -30,14 +30,16 @@ void __ksm_exit(struct mm_struct *mm);
#define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))
#define is_ksm_zero_pte(pte) (is_zero_pfn(pte_pfn(pte)) && pte_dirty(pte))
extern unsigned long ksm_zero_pages;
-static inline void inc_ksm_zero_pages(void)
+static inline void inc_ksm_zero_pages(struct mm_struct *mm)
{
ksm_zero_pages++;
+ mm->ksm_zero_pages++;
}

-static inline void dec_ksm_zero_pages(void)
+static inline void dec_ksm_zero_pages(struct mm_struct *mm)
{
ksm_zero_pages--;
+ mm->ksm_zero_pages--;
}

static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
@@ -110,11 +112,11 @@ static inline void ksm_exit(struct mm_struct *mm)

#define set_pte_ksm_zero(pte) pte_mkspecial(pte)
#define is_ksm_zero_pte(pte) 0
-static inline void inc_ksm_zero_pages(void)
+static inline void inc_ksm_zero_pages(struct mm_struct *mm)
{
}

-static inline void dec_ksm_zero_pages(void)
+static inline void dec_ksm_zero_pages(struct mm_struct *mm)
{
}

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 306a3d1a0fa6..14f781509812 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -777,7 +777,7 @@ struct mm_struct {
#ifdef CONFIG_KSM
/*
* Represent how many pages of this process are involved in KSM
- * merging.
+ * merging (not including ksm_zero_pages).
*/
unsigned long ksm_merging_pages;
/*
@@ -785,7 +785,12 @@ struct mm_struct {
* including merged and not merged.
*/
unsigned long ksm_rmap_items;
-#endif
+ /*
+ * Represent how many empty pages are merged with kernel zero
+ * pages when enabling KSM use_zero_pages.
+ */
+ unsigned long ksm_zero_pages;
+#endif /* CONFIG_KSM */
#ifdef CONFIG_LRU_GEN
struct {
/* this mm_struct is on lru_gen_mm_list */
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index ba0d077b6951..5cd6ac70261e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -713,7 +713,7 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
ptep_clear(vma->vm_mm, address, _pte);
spin_unlock(ptl);
if (is_ksm_zero_pte(pteval))
- dec_ksm_zero_pages();
+ dec_ksm_zero_pages(vma->vm_mm);
}
} else {
src_page = pte_page(pteval);
diff --git a/mm/ksm.c b/mm/ksm.c
index 2ca7e8860faa..4e510f5c5938 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1226,7 +1226,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
} else {
newpte = set_pte_ksm_zero(pfn_pte(page_to_pfn(kpage),
vma->vm_page_prot));
- inc_ksm_zero_pages();
+ inc_ksm_zero_pages(mm);
/*
* We're replacing an anonymous page with a zero page, which is
* not anonymous. We need to do proper accounting otherwise we
diff --git a/mm/memory.c b/mm/memory.c
index 058b416adf24..2603dad833d0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1417,7 +1417,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
ptent);
if (unlikely(!page)) {
if (is_ksm_zero_pte(ptent))
- dec_ksm_zero_pages();
+ dec_ksm_zero_pages(mm);
continue;
}

@@ -3124,7 +3124,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
}
} else {
if (is_ksm_zero_pte(vmf->orig_pte))
- dec_ksm_zero_pages();
+ dec_ksm_zero_pages(mm);
inc_mm_counter(mm, MM_ANONPAGES);
}
flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
--
2.15.2

2023-05-23 09:56:39

by David Hildenbrand

Subject: Re: [PATCH v8 1/6] ksm: support unsharing KSM-placed zero pages

On 22.05.23 12:49, Yang Yang wrote:
> From: xu xin <[email protected]>
>
> When use_zero_pages of ksm is enabled, madvise(addr, len, MADV_UNMERGEABLE)
> and other ways (like write 2 to /sys/kernel/mm/ksm/run) to trigger
> unsharing will *not* actually unshare the shared zeropage as placed by KSM
> (which is against the MADV_UNMERGEABLE documentation). As these KSM-placed
> zero pages are out of the control of KSM, the related counts of ksm pages
> don't expose how many zero pages are placed by KSM (these special zero
> pages are different from those initially mapped zero pages, because the
> zero pages mapped to MADV_UNMERGEABLE areas are expected to be a complete
> and unshared page)
>
> To not blindly unshare all shared zero_pages in applicable VMAs, the patch
> use pte_mkdirty (related with architecture) to mark KSM-placed zero pages.
> Thus, MADV_UNMERGEABLE will only unshare those KSM-placed zero pages.
>
> The patch will not degrade the performance of use_zero_pages as it doesn't
> change the way of merging empty pages in use_zero_pages's feature.
>

Maybe add: "We'll reuse this mechanism to reliably identify KSM-placed
zeropages to properly account for them (e.g., calculating the KSM profit
that includes zeropages) next."

> Signed-off-by: xu xin <[email protected]>
> Suggested-by: David Hildenbrand <[email protected]>
> Cc: Claudio Imbrenda <[email protected]>
> Cc: Xuexin Jiang <[email protected]>
> Reviewed-by: Xiaokai Ran <[email protected]>
> Reviewed-by: Yang Yang <[email protected]>
> ---
> include/linux/ksm.h | 6 ++++++
> mm/ksm.c | 5 +++--
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
> index 899a314bc487..7989200cdbb7 100644
> --- a/include/linux/ksm.h
> +++ b/include/linux/ksm.h
> @@ -26,6 +26,9 @@ int ksm_disable(struct mm_struct *mm);
>
> int __ksm_enter(struct mm_struct *mm);
> void __ksm_exit(struct mm_struct *mm);
> +/* use pte_mkdirty to track a KSM-placed zero page */
> +#define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))

If there is only a single user (which I assume), please inline it instead.

Let's add some more documentation:

/*
* To identify zeropages that were mapped by KSM, we reuse the dirty bit
* in the PTE. If the PTE is dirty, the zeropage was mapped by KSM when
* deduplicating memory.
*/

> +#define is_ksm_zero_pte(pte) (is_zero_pfn(pte_pfn(pte)) && pte_dirty(pte))
>
> static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
> {
> @@ -95,6 +98,9 @@ static inline void ksm_exit(struct mm_struct *mm)
> {
> }
>
> +#define set_pte_ksm_zero(pte) pte_mkspecial(pte)
> +#define is_ksm_zero_pte(pte) 0
> +
> #ifdef CONFIG_MEMORY_FAILURE
> static inline void collect_procs_ksm(struct page *page,
> struct list_head *to_kill, int force_early)
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 0156bded3a66..9962f5962afd 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -447,7 +447,8 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
> if (is_migration_entry(entry))
> page = pfn_swap_entry_to_page(entry);
> }
> - ret = page && PageKsm(page);
> + /* return 1 if the page is an normal ksm page or KSM-placed zero page */
> + ret = (page && PageKsm(page)) || is_ksm_zero_pte(*pte);
> pte_unmap_unlock(pte, ptl);
> return ret;
> }
> @@ -1220,7 +1221,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
> page_add_anon_rmap(kpage, vma, addr, RMAP_NONE);
> newpte = mk_pte(kpage, vma->vm_page_prot);
> } else {
> - newpte = pte_mkspecial(pfn_pte(page_to_pfn(kpage),
> + newpte = set_pte_ksm_zero(pfn_pte(page_to_pfn(kpage),
> vma->vm_page_prot));
> /*
> * We're replacing an anonymous page with a zero page, which is

Apart from that LGTM.

--
Thanks,

David / dhildenb


2023-05-23 09:58:34

by David Hildenbrand

Subject: Re: [PATCH v8 2/6] ksm: count all zero pages placed by KSM

On 22.05.23 12:52, Yang Yang wrote:
> From: xu xin <[email protected]>
>
> As pages_sharing and pages_shared don't include the number of zero pages
> merged by KSM, we cannot know how many pages are zero pages placed by KSM
> when enabling use_zero_pages, which leads to KSM not being transparent with
> all actual merged pages by KSM. In the early days of use_zero_pages,
> zero-pages was unable to get unshared by the ways like MADV_UNMERGEABLE so
> it's hard to count how many times one of those zeropages was then unmerged.
>
> But now, unsharing KSM-placed zero page accurately has been achieved, so we
> can easily count both how many times a page full of zeroes was merged with
> zero-page and how many times one of those pages was then unmerged. and so,
> it helps to estimate memory demands when each and every shared page could
> get unshared.
>
> So we add ksm_zero_pages under /sys/kernel/mm/ksm/ to show the number
> of all zero pages placed by KSM.
>
> v7->v8:
> Handle the case when khugepaged replaces a shared zeropage by a THP.
>
> Signed-off-by: xu xin <[email protected]>
> Suggested-by: David Hildenbrand <[email protected]>
> Cc: Claudio Imbrenda <[email protected]>
> Cc: Xuexin Jiang <[email protected]>
> Reviewed-by: Xiaokai Ran <[email protected]>
> Reviewed-by: Yang Yang <[email protected]>
> ---
> include/linux/ksm.h | 17 +++++++++++++++++
> mm/khugepaged.c | 3 +++
> mm/ksm.c | 12 ++++++++++++
> mm/memory.c | 7 ++++++-
> 4 files changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
> index 7989200cdbb7..1adcae0205e3 100644
> --- a/include/linux/ksm.h
> +++ b/include/linux/ksm.h
> @@ -29,6 +29,16 @@ void __ksm_exit(struct mm_struct *mm);
> /* use pte_mkdirty to track a KSM-placed zero page */
> #define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))
> #define is_ksm_zero_pte(pte) (is_zero_pfn(pte_pfn(pte)) && pte_dirty(pte))
> +extern unsigned long ksm_zero_pages;
> +static inline void inc_ksm_zero_pages(void)
> +{
> + ksm_zero_pages++;
> +}
> +

No need to export the inc, just inline this.

> +static inline void dec_ksm_zero_pages(void)
> +{
> + ksm_zero_pages--;
> +}
>
> static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
> {
> @@ -100,6 +110,13 @@ static inline void ksm_exit(struct mm_struct *mm)
>
> #define set_pte_ksm_zero(pte) pte_mkspecial(pte)
> #define is_ksm_zero_pte(pte) 0
> +static inline void inc_ksm_zero_pages(void)
> +{
> +}
> +
> +static inline void dec_ksm_zero_pages(void)
> +{
> +}
>
> #ifdef CONFIG_MEMORY_FAILURE
> static inline void collect_procs_ksm(struct page *page,
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 6b9d39d65b73..ba0d077b6951 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -19,6 +19,7 @@
> #include <linux/page_table_check.h>
> #include <linux/swapops.h>
> #include <linux/shmem_fs.h>
> +#include <linux/ksm.h>
>
> #include <asm/tlb.h>
> #include <asm/pgalloc.h>
> @@ -711,6 +712,8 @@ static void __collapse_huge_page_copy_succeeded(pte_t *pte,
> spin_lock(ptl);
> ptep_clear(vma->vm_mm, address, _pte);
> spin_unlock(ptl);
> + if (is_ksm_zero_pte(pteval))
> + dec_ksm_zero_pages();
> }
> } else {
> src_page = pte_page(pteval);
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 9962f5962afd..2ca7e8860faa 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -278,6 +278,9 @@ static unsigned int zero_checksum __read_mostly;
> /* Whether to merge empty (zeroed) pages with actual zero pages */
> static bool ksm_use_zero_pages __read_mostly;
>
> +/* The number of zero pages which is placed by KSM */
> +unsigned long ksm_zero_pages;
> +
> #ifdef CONFIG_NUMA
> /* Zeroed when merging across nodes is not allowed */
> static unsigned int ksm_merge_across_nodes = 1;
> @@ -1223,6 +1226,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
> } else {
> newpte = set_pte_ksm_zero(pfn_pte(page_to_pfn(kpage),
> vma->vm_page_prot));
> + inc_ksm_zero_pages();
> /*
> * We're replacing an anonymous page with a zero page, which is
> * not anonymous. We need to do proper accounting otherwise we
> @@ -3350,6 +3354,13 @@ static ssize_t pages_volatile_show(struct kobject *kobj,
> }
> KSM_ATTR_RO(pages_volatile);
>
> +static ssize_t ksm_zero_pages_show(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + return sysfs_emit(buf, "%ld\n", ksm_zero_pages);
> +}
> +KSM_ATTR_RO(ksm_zero_pages);
> +
> static ssize_t general_profit_show(struct kobject *kobj,
> struct kobj_attribute *attr, char *buf)
> {
> @@ -3417,6 +3428,7 @@ static struct attribute *ksm_attrs[] = {
> &pages_sharing_attr.attr,
> &pages_unshared_attr.attr,
> &pages_volatile_attr.attr,
> + &ksm_zero_pages_attr.attr,
> &full_scans_attr.attr,
> #ifdef CONFIG_NUMA
> &merge_across_nodes_attr.attr,
> diff --git a/mm/memory.c b/mm/memory.c
> index 8358f3b853f2..058b416adf24 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1415,8 +1415,11 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
> tlb_remove_tlb_entry(tlb, pte, addr);
> zap_install_uffd_wp_if_needed(vma, addr, pte, details,
> ptent);
> - if (unlikely(!page))
> + if (unlikely(!page)) {
> + if (is_ksm_zero_pte(ptent))
> + dec_ksm_zero_pages();
> continue;
> + }
>
> delay_rmap = 0;
> if (!PageAnon(page)) {
> @@ -3120,6 +3123,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
> inc_mm_counter(mm, MM_ANONPAGES);
> }
> } else {
> + if (is_ksm_zero_pte(vmf->orig_pte))
> + dec_ksm_zero_pages();
> inc_mm_counter(mm, MM_ANONPAGES);
> }
> flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));

Can we maybe avoid exporting the dec semantics and rather add a callback
to KSM? Ideally, we'd even distill that down to a single call, and
handle the details in ksm.h. Maybe simply:

	ksm_notify_unmap_zero_page(vmf->orig_pte);

and then just have in ksm.h

	static inline void ksm_notify_unmap_zero_page(pte_t pte)
	{
		if (is_ksm_zero_pte(pte))
			ksm_zero_pages--;
	}
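
For illustration only, here is a user-space sketch of that shape. The
struct model_pte type is a made-up stand-in for pte_t (it only carries the
two properties the check looks at), and the assert() is an extra underflow
guard that is not part of the suggestion itself:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for pte_t: only what the helper inspects. */
struct model_pte {
	bool zero_pfn;	/* entry maps the shared zero page */
	bool dirty;	/* KSM marks the zero pages it places as dirty */
};

/* Models the global counter that mm/ksm.c would own. */
unsigned long ksm_zero_pages;

bool is_ksm_zero_pte(struct model_pte pte)
{
	return pte.zero_pfn && pte.dirty;
}

/* Single entry point for every caller that tears down a PTE. */
void ksm_notify_unmap_zero_page(struct model_pte pte)
{
	if (is_ksm_zero_pte(pte)) {
		assert(ksm_zero_pages > 0);	/* guard against underflow */
		ksm_zero_pages--;
	}
}
```

Callers in memory.c and khugepaged.c would then only pass the old PTE value
and never touch the counter directly.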

--
Thanks,

David / dhildenb


2023-05-23 10:03:07

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v8 2/6] ksm: count all zero pages placed by KSM

On 22.05.23 12:52, Yang Yang wrote:
> From: xu xin <[email protected]>
>
> As pages_sharing and pages_shared don't include the number of zero pages
> merged by KSM, we cannot know how many pages are zero pages placed by KSM
> when enabling use_zero_pages, which leads to KSM not being transparent with
> all actual merged pages by KSM. In the early days of use_zero_pages,
> zero-pages was unable to get unshared by the ways like MADV_UNMERGEABLE so
> it's hard to count how many times one of those zeropages was then unmerged.
>
> But now, unsharing KSM-placed zero page accurately has been achieved, so we
> can easily count both how many times a page full of zeroes was merged with
> zero-page and how many times one of those pages was then unmerged. and so,
> it helps to estimate memory demands when each and every shared page could
> get unshared.
>
> So we add ksm_zero_pages under /sys/kernel/mm/ksm/ to show the number
> of all zero pages placed by KSM.
>
> v7->v8:
> Handle the case when khugepaged replaces a shared zeropage by a THP.
>

Oh, and just a note, such version comments should go below the "---",
such that they will automatically get dropped when applying the patch.

(Usually, version information in the cover letter is sufficient :) )

--
Thanks,

David / dhildenb


2023-05-23 10:06:06

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v8 3/6] ksm: add ksm zero pages for each process

On 22.05.23 12:53, Yang Yang wrote:
> From: xu xin <[email protected]>
>
> As the number of ksm zero pages is not included in ksm_merging_pages per
> process when enabling use_zero_pages, it's unclear of how many actual
> pages are merged by KSM. To let users accurately estimate their memory
> demands when unsharing KSM zero-pages, it's necessary to show KSM zero-
> pages per process. In addition, it help users to know the actual KSM
> profit because KSM-placed zero pages are also benefit from KSM.
>
> since unsharing zero pages placed by KSM accurately is achieved, then
> tracking empty pages merging and unmerging is not a difficult thing any
> longer.
>
> Since we already have /proc/<pid>/ksm_stat, just add the information of
> 'ksm_zero_pages' in it.
>
> Signed-off-by: xu xin <[email protected]>
> Cc: Claudio Imbrenda <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Xuexin Jiang <[email protected]>
> Cc: Xiaokai Ran <[email protected]>
> Cc: Yang Yang <[email protected]>
> ---


LGTM. [inlining inc_ksm_zero_pages() and avoiding explicit
dec_ksm_zero_pages() as noted on patch #2 ]

--
Thanks,

David / dhildenb


2023-05-23 10:30:58

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v8 4/6] ksm: add documentation for ksm zero pages

On 22.05.23 12:53, Yang Yang wrote:
> From: xu xin <[email protected]>
>
> Add the description of ksm_zero_pages.
>
> When use_zero_pages is enabled, pages_sharing cannot represent how
> much memory saved actually by KSM, but the sum of ksm_zero_pages +
> pages_sharing does.
>
> Signed-off-by: xu xin <[email protected]>
> Cc: Xiaokai Ran <[email protected]>
> Cc: Yang Yang <[email protected]>
> Cc: Jiang Xuexin <[email protected]>
> Cc: Claudio Imbrenda <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> ---
> Documentation/admin-guide/mm/ksm.rst | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
> index 7626392fe82c..019dc40a0d3c 100644
> --- a/Documentation/admin-guide/mm/ksm.rst
> +++ b/Documentation/admin-guide/mm/ksm.rst
> @@ -173,6 +173,14 @@ stable_node_chains
> the number of KSM pages that hit the ``max_page_sharing`` limit
> stable_node_dups
> number of duplicated KSM pages
> +ksm_zero_pages
> + how many empty pages are sharing the kernel zero page(s) instead
> + of other user pages as it would happen normally. Only meaningful
> + when ``use_zero_pages`` is/was enabled.

"empty pages" is misleading. You can probably drop the last comment,
because you repeat that afterwards.

how many zero pages that are still mapped into processes were mapped by
KSM when deduplicating.


I suggest squashing this patch into #3.

--
Thanks,

David / dhildenb


2023-05-23 10:35:59

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v8 6/6] selftest: add a testcase of ksm zero pages

On 22.05.23 12:54, Yang Yang wrote:
> From: xu xin <[email protected]>
>
> Add a function test_unmerge_zero_page() to test the functionality on
> unsharing and counting ksm-placed zero pages and counting of this patch
> series.
>
> test_unmerge_zero_page() actually contains three subjct test objects:
> (1) whether the count of ksm zero pages can update correctly after merging;
> (2) whether the count of ksm zero pages can update correctly after
> unmerging;
> (3) whether ksm zero pages are really unmerged.
>
> Signed-off-by: xu xin <[email protected]>
> Cc: Claudio Imbrenda <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Xuexin Jiang <[email protected]>
> Reviewed-by: Xiaokai Ran <[email protected]>
> Reviewed-by: Yang Yang <[email protected]>
> ---
> tools/testing/selftests/mm/ksm_functional_tests.c | 75 +++++++++++++++++++++++
> 1 file changed, 75 insertions(+)
>
> diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
> index 26853badae70..9b7fb94ed64f 100644
> --- a/tools/testing/selftests/mm/ksm_functional_tests.c
> +++ b/tools/testing/selftests/mm/ksm_functional_tests.c
> @@ -29,6 +29,8 @@
>
> static int ksm_fd;
> static int ksm_full_scans_fd;
> +static int ksm_zero_pages_fd;
> +static int ksm_use_zero_pages_fd;
> static int pagemap_fd;
> static size_t pagesize;
>
> @@ -59,6 +61,21 @@ static bool range_maps_duplicates(char *addr, unsigned long size)
> return false;
> }
>
> +static long get_ksm_zero_pages(void)
> +{
> + char buf[20];
> + ssize_t read_size;
> + unsigned long ksm_zero_pages;
> +

I would add:

	if (ksm_zero_pages_fd < 0)
		return 0;

> + read_size = pread(ksm_zero_pages_fd, buf, sizeof(buf) - 1, 0);
> + if (read_size < 0)
> + return -errno;
> + buf[read_size] = 0;
> + ksm_zero_pages = strtol(buf, NULL, 10);
> +
> + return ksm_zero_pages;
> +}
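
A stand-alone sketch of such a reader, assuming the fd comes from an
earlier open() that may have failed. read_counter_fd() is a hypothetical
name, not something in the selftest:

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read a decimal counter from an already-open fd. */
long read_counter_fd(int fd)
{
	char buf[32];
	ssize_t n;

	/* open() failed earlier, so there is nothing to count */
	if (fd < 0)
		return 0;

	/* always read from offset 0 so the fd can be reused across calls */
	n = pread(fd, buf, sizeof(buf) - 1, 0);
	if (n < 0)
		return -errno;
	buf[n] = '\0';

	return strtol(buf, NULL, 10);
}
```
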
> +
> static long ksm_get_full_scans(void)
> {
> char buf[10];
> @@ -159,6 +176,61 @@ static void test_unmerge(void)
> munmap(map, size);
> }
>
> +static inline unsigned long expected_ksm_pages(unsigned long mergeable_size)
> +{
> + return mergeable_size / pagesize;
> +}

I suggest just inlining that.

> +
> +static void test_unmerge_zero_pages(void)
> +{
> + const unsigned int size = 2 * MiB;
> + char *map;
> + unsigned long pages_expected;
> +
> + ksft_print_msg("[RUN] %s\n", __func__);
> +
> + /* Confirm the interfaces*/

Missing space at the end of the comment. But I suggest to just drop this comment.

> + if (ksm_zero_pages_fd < 0) {
> + ksft_test_result_skip("open(\"/sys/kernel/mm/ksm/ksm_zero_pages\") failed\n");
> + return;
> + }
> + if (ksm_use_zero_pages_fd < 0) {
> + ksft_test_result_skip("open \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
> + return;
> + }
> + if (write(ksm_use_zero_pages_fd, "1", 1) != 1) {
> + ksft_test_result_skip("write \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
> + return;
> + }
> +
> + /* Mmap zero pages*/

Missing space at the end of the comment

> + map = mmap_and_merge_range(0x00, size, false);
> + if (map == MAP_FAILED)
> + return;
> +
> + /* Check if ksm_zero_pages can be update correctly after merging */
> + pages_expected = expected_ksm_pages(size);
> + ksft_test_result(pages_expected == get_ksm_zero_pages(),
> + "The count zero_page_sharing was updated after merging\n");
> +

Make sure that the number of results reported (ksft_test_result*()
invocations) is as expected on every return path (e.g., 1).

	if (pages_expected != get_ksm_zero_pages()) {
		ksft_test_result_fail("'zero_page_sharing' updated after merging\n");
		goto unmap;
	}

> + /* try to unmerge half of the region */
> + if (madvise(map, size / 2, MADV_UNMERGEABLE)) {
> + ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
> + goto unmap;
> + }
> +
> + /* Check if ksm_zero_pages can be update correctly after unmerging */
> + pages_expected = expected_ksm_pages(size / 2);

Just do

pages_expected /= 2;

> + ksft_test_result(pages_expected == get_ksm_zero_pages(),
> + "The count zero_page_sharing was updated after unmerging\n");
> +

	if (pages_expected != get_ksm_zero_pages()) {
		ksft_test_result_fail("'zero_page_sharing' updated after unmerging\n");
		goto unmap;
	}
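
The one-result-per-path idea can be sketched outside the selftest
framework roughly like this. report() and the two counters are hypothetical
stand-ins for the ksft_test_result*() helpers, and the three booleans stand
for the three checks in the test:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kselftest result counters. */
unsigned int results_passed;
unsigned int results_failed;

void report(bool ok)
{
	if (ok)
		results_passed++;
	else
		results_failed++;
}

/*
 * Three checks, but exactly one reported result on every path: an
 * early mismatch reports a failure and bails out, and only the last
 * check reports either pass or fail.
 */
void run_zero_page_checks(bool merged_ok, bool unmerged_ok, bool no_duplicates)
{
	if (!merged_ok) {
		report(false);
		goto out;
	}
	if (!unmerged_ok) {
		report(false);
		goto out;
	}
	report(no_duplicates);
out:
	return;	/* the real test would munmap() here */
}
```

With that structure the test contributes exactly one entry to the test
plan, no matter where it stops.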

> + /* Check if ksm zero pages are really unmerged */
> + ksft_test_result(!range_maps_duplicates(map, size / 2),
> + "KSM zero pages were unmerged\n");
> +unmap:
> + munmap(map, size);
> +}
> +
> static void test_unmerge_discarded(void)
> {
> const unsigned int size = 2 * MiB;
> @@ -379,8 +451,11 @@ int main(int argc, char **argv)
> pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
> if (pagemap_fd < 0)
> ksft_exit_skip("open(\"/proc/self/pagemap\") failed\n");
> + ksm_zero_pages_fd = open("/sys/kernel/mm/ksm/ksm_zero_pages", O_RDONLY);
> + ksm_use_zero_pages_fd = open("/sys/kernel/mm/ksm/use_zero_pages", O_RDWR);
>
> test_unmerge();
> + test_unmerge_zero_pages();
> test_unmerge_discarded();
> #ifdef __NR_userfaultfd
> test_unmerge_uffd_wp();

You will also need something like this:

diff --git a/tools/testing/selftests/mm/ksm_functional_tests.c b/tools/testing/selftests/mm/ksm_functional_tests.c
index 26853badae70..00df05bfc3a3 100644
--- a/tools/testing/selftests/mm/ksm_functional_tests.c
+++ b/tools/testing/selftests/mm/ksm_functional_tests.c
@@ -358,7 +358,7 @@ static void test_prctl_unmerge(void)

int main(int argc, char **argv)
{
- unsigned int tests = 5;
+ unsigned int tests = 6;
int err;

#ifdef __NR_userfaultfd

--
Thanks,

David / dhildenb


2023-05-23 10:40:52

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v8 5/6] ksm: update the calculation of KSM profit

On 22.05.23 12:54, Yang Yang wrote:
> From: xu xin <[email protected]>
>

I suggest changing the subject to

"ksm: consider KSM-placed zeropages when calculating KSM profit"

> When use_zero_pages is enabled, the calculation of ksm profit is not
> correct because ksm zero pages is not counted in. So update the
> calculation of KSM profit including the documentation.
>
> Signed-off-by: xu xin <[email protected]>
> Cc: Xiaokai Ran <[email protected]>
> Cc: Yang Yang <[email protected]>
> Cc: Jiang Xuexin <[email protected]>
> Cc: Claudio Imbrenda <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> ---
> Documentation/admin-guide/mm/ksm.rst | 18 +++++++++++-------
> mm/ksm.c | 2 +-
> 2 files changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
> index 019dc40a0d3c..dde7c152f0ae 100644
> --- a/Documentation/admin-guide/mm/ksm.rst
> +++ b/Documentation/admin-guide/mm/ksm.rst
> @@ -204,21 +204,25 @@ several times, which are unprofitable memory consumed.
> 1) How to determine whether KSM save memory or consume memory in system-wide
> range? Here is a simple approximate calculation for reference::
>
> - general_profit =~ pages_sharing * sizeof(page) - (all_rmap_items) *
> + general_profit =~ ksm_saved_pages * sizeof(page) - (all_rmap_items) *
> sizeof(rmap_item);
>
> - where all_rmap_items can be easily obtained by summing ``pages_sharing``,
> - ``pages_shared``, ``pages_unshared`` and ``pages_volatile``.
> + where ksm_saved_pages equals to the sum of ``pages_sharing`` +
> + ``ksm_zero_pages`` of the system, and all_rmap_items can be easily
> + obtained by summing ``pages_sharing``, ``pages_shared``, ``pages_unshared``
> + and ``pages_volatile``.
>
> 2) The KSM profit inner a single process can be similarly obtained by the
> following approximate calculation::
>
> - process_profit =~ ksm_merging_pages * sizeof(page) -
> + process_profit =~ ksm_saved_pages * sizeof(page) -
> ksm_rmap_items * sizeof(rmap_item).
>
> - where ksm_merging_pages is shown under the directory ``/proc/<pid>/``,
> - and ksm_rmap_items is shown in ``/proc/<pid>/ksm_stat``. The process profit
> - is also shown in ``/proc/<pid>/ksm_stat`` as ksm_process_profit.
> + where ksm_saved_pages equals to the sum of ``ksm_merging_pages`` and
> + ``ksm_zero_pages``, both of which are shown under the directory
> + ``/proc/<pid>/ksm_stat``, and ksm_rmap_items is alos shown in

s/alos/also/

> + ``/proc/<pid>/ksm_stat``. The process profit is also shown in
> + ``/proc/<pid>/ksm_stat`` as ksm_process_profit.
>
> From the perspective of application, a high ratio of ``ksm_rmap_items`` to
> ``ksm_merging_pages`` means a bad madvise-applied policy, so developers or
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 4e510f5c5938..d23a240c2519 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -3085,7 +3085,7 @@ static void wait_while_offlining(void)
> #ifdef CONFIG_PROC_FS
> long ksm_process_profit(struct mm_struct *mm)
> {
> - return mm->ksm_merging_pages * PAGE_SIZE -
> + return (long)(mm->ksm_merging_pages + mm->ksm_zero_pages) * PAGE_SIZE -
> mm->ksm_rmap_items * sizeof(struct ksm_rmap_item);
> }
> #endif /* CONFIG_PROC_FS */

Apart from that LGTM. CCing Stefan R.
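
As a rough worked example of the updated per-process formula, with the
page size and the rmap_item size passed in as parameters (the 64 bytes used
below is only an assumed value, the real sizeof(struct ksm_rmap_item)
depends on the kernel build):

```c
/*
 * Sketch of the updated calculation: pages saved by KSM now include
 * the KSM-placed zero pages. All sizes are in bytes.
 */
long ksm_profit(unsigned long merging_pages, unsigned long zero_pages,
		unsigned long rmap_items, unsigned long page_size,
		unsigned long rmap_item_size)
{
	unsigned long saved_pages = merging_pages + zero_pages;

	return (long)(saved_pages * page_size) -
	       (long)(rmap_items * rmap_item_size);
}
```

With 100 merged pages, 50 KSM-placed zero pages, 200 rmap_items and 4096
byte pages, this gives 150 * 4096 - 200 * 64 = 601600 bytes, compared to
396800 bytes when the zero pages are left out.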

--
Thanks,

David / dhildenb


2023-05-23 13:58:45

by xu xin

[permalink] [raw]
Subject: Re: [PATCH v8 1/6] ksm: support unsharing KSM-placed zero pages

Excuse me, I'm wondering why using inline here instead of macro is better.
Thanks! :)

Thanks for reviews.

2023-05-23 14:07:28

by xu xin

[permalink] [raw]
Subject: Re: [PATCH v8 1/6] ksm: support unsharing KSM-placed zero pages

>> ---
>> include/linux/ksm.h | 6 ++++++
>> mm/ksm.c | 5 +++--
>> 2 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
>> index 899a314bc487..7989200cdbb7 100644
>> --- a/include/linux/ksm.h
>> +++ b/include/linux/ksm.h
>> @@ -26,6 +26,9 @@ int ksm_disable(struct mm_struct *mm);
>>
>> int __ksm_enter(struct mm_struct *mm);
>> void __ksm_exit(struct mm_struct *mm);
>> +/* use pte_mkdirty to track a KSM-placed zero page */
>> +#define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))
>
>If there is only a single user (which I assume), please inline it instead.

Excuse me, I'm wondering why using inline here instead of macro is better.
Thanks! :)

Thanks for reviews.

2023-05-23 14:13:07

by xu xin

[permalink] [raw]
Subject: Re: [PATCH v8 1/6] ksm: support unsharing KSM-placed zero pages

>>>> ---
>>>> include/linux/ksm.h | 6 ++++++
>>>> mm/ksm.c | 5 +++--
>>>> 2 files changed, 9 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
>>>> index 899a314bc487..7989200cdbb7 100644
>>>> --- a/include/linux/ksm.h
>>>> +++ b/include/linux/ksm.h
>>>> @@ -26,6 +26,9 @@ int ksm_disable(struct mm_struct *mm);
>>>>
>>>> int __ksm_enter(struct mm_struct *mm);
>>>> void __ksm_exit(struct mm_struct *mm);
>>>> +/* use pte_mkdirty to track a KSM-placed zero page */
>>>> +#define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))
>>>
>>> If there is only a single user (which I assume), please inline it instead.
>>
>> Excuse me, I'm wondering why using inline here instead of macro is better.
>> Thanks! :)
>
>Just to clarify: not an inline function but removing the macro
>completely and just place that code directly into the single caller.
>
>Single user, no need to put that into ksm.h -- and I'm not super happy
>about the set_pte_ksm_zero() name ;) because we get the zero-pte already
>passed in from the caller ...

Oh, I see. Thanks

2023-05-23 14:20:07

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v8 1/6] ksm: support unsharing KSM-placed zero pages

On 23.05.23 15:57, xu xin wrote:
>>> ---
>>> include/linux/ksm.h | 6 ++++++
>>> mm/ksm.c | 5 +++--
>>> 2 files changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
>>> index 899a314bc487..7989200cdbb7 100644
>>> --- a/include/linux/ksm.h
>>> +++ b/include/linux/ksm.h
>>> @@ -26,6 +26,9 @@ int ksm_disable(struct mm_struct *mm);
>>>
>>> int __ksm_enter(struct mm_struct *mm);
>>> void __ksm_exit(struct mm_struct *mm);
>>> +/* use pte_mkdirty to track a KSM-placed zero page */
>>> +#define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))
>>
>> If there is only a single user (which I assume), please inline it instead.
>
> Excuse me, I'm wondering why using inline here instead of macro is better.
> Thanks! :)

Just to clarify: not an inline function but removing the macro
completely and just place that code directly into the single caller.

Single user, no need to put that into ksm.h -- and I'm not super happy
about the set_pte_ksm_zero() name ;) because we get the zero-pte already
passed in from the caller ...
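
To make that concrete, a small user-space model of what I mean. The
model_* names and bit values are invented for the sketch and are not the
kernel's pte helpers:

```c
#include <stdint.h>

/* Invented bit layout, only for this sketch. */
#define MODEL_PTE_SPECIAL	(1u << 0)
#define MODEL_PTE_DIRTY		(1u << 1)

typedef uint32_t model_pte_t;

model_pte_t model_pte_mkspecial(model_pte_t pte)
{
	return pte | MODEL_PTE_SPECIAL;
}

model_pte_t model_pte_mkdirty(model_pte_t pte)
{
	return pte | MODEL_PTE_DIRTY;
}

/*
 * Models the single caller: the marking is open-coded at the place
 * where the zero page gets installed, with no helper macro in a header.
 */
model_pte_t model_replace_page(model_pte_t base, int kpage_is_zero)
{
	if (!kpage_is_zero)
		return base;

	/* special + dirty marks the entry as a KSM-placed zero page */
	return model_pte_mkdirty(model_pte_mkspecial(base));
}
```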

--
Thanks,

David / dhildenb