2024-06-05 09:48:36

by alexs

[permalink] [raw]
Subject: [RFC 1/3] mm/ksm: add anonymous check in find_mergeable_vma

From: "Alex Shi (tencent)" <[email protected]>

We do vma_set_anonyous in do_mmap(), and then vma_is_anonymous()
checking workable, use it as a extra check since ksm only care anonymous
pages.

Signed-off-by: Alex Shi (tencent) <[email protected]>
---
mm/ksm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index f5138f43f0d2..088bce39cd33 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -742,7 +742,8 @@ static struct vm_area_struct *find_mergeable_vma(struct mm_struct *mm,
if (ksm_test_exit(mm))
return NULL;
vma = vma_lookup(mm, addr);
- if (!vma || !(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
+ if (!vma || !(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma ||
+ !vma_is_anonymous(vma))
return NULL;
return vma;
}
--
2.43.0



2024-06-05 09:48:43

by alexs

[permalink] [raw]
Subject: [RFC 2/3] mm/ksm: jump out early if vma out of date in cmp_and_merge_page

From: "Alex Shi (tencent)" <[email protected]>

If we get a page which in a disappearing vma, the page should be useless
soon, so don't bother to add it into ksm.

Signed-off-by: Alex Shi (tencent) <[email protected]>
---
mm/ksm.c | 36 +++++++++++++++++-------------------
1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 088bce39cd33..ef335ee508d3 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2315,6 +2315,7 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
unsigned int checksum;
int err;
bool max_page_sharing_bypass = false;
+ struct vm_area_struct *vma;

stable_node = page_stable_node(page);
if (stable_node) {
@@ -2370,9 +2371,17 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
* don't want to insert it in the unstable tree, and we don't want
* to waste our time searching for something identical to it there.
*/
+ mmap_read_lock(mm);
+ vma = find_mergeable_vma(mm, rmap_item->address);
+ if (!vma) {
+ /* If the vma is out of date, we do not need to continue.*/
+ mmap_read_unlock(mm);
+ return;
+ }
checksum = calc_checksum(page);
if (rmap_item->oldchecksum != checksum) {
rmap_item->oldchecksum = checksum;
+ mmap_read_unlock(mm);
return;
}

@@ -2381,31 +2390,20 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
* appropriate zero page if the user enabled this via sysfs.
*/
if (ksm_use_zero_pages && (checksum == zero_checksum)) {
- struct vm_area_struct *vma;
-
- mmap_read_lock(mm);
- vma = find_mergeable_vma(mm, rmap_item->address);
- if (vma) {
- err = try_to_merge_one_page(vma, page,
- ZERO_PAGE(rmap_item->address));
- trace_ksm_merge_one_page(
- page_to_pfn(ZERO_PAGE(rmap_item->address)),
- rmap_item, mm, err);
- } else {
- /*
- * If the vma is out of date, we do not need to
- * continue.
- */
- err = 0;
- }
- mmap_read_unlock(mm);
+ err = try_to_merge_one_page(vma, page, ZERO_PAGE(rmap_item->address));
+ trace_ksm_merge_one_page(page_to_pfn(ZERO_PAGE(rmap_item->address)),
+ rmap_item, mm, err);
/*
* In case of failure, the page was not really empty, so we
* need to continue. Otherwise we're done.
*/
- if (!err)
+ if (!err) {
+ mmap_read_unlock(mm);
return;
+ }
}
+ mmap_read_unlock(mm);
+
tree_rmap_item =
unstable_tree_search_insert(rmap_item, page, &tree_page);
if (tree_rmap_item) {
--
2.43.0


2024-06-05 09:48:55

by alexs

[permalink] [raw]
Subject: [RFC 3/3] mm/ksm: move flush_anon_page before checksum calculation

From: "Alex Shi (tencent)" <[email protected]>

commit 6020dff09252 ("[ARM] Resolve fuse and direct-IO failures due to missing cache flushes")
explain that the aim of flush_anon_page() is to keep the cache and memory
content synced. Also as David Hildenbrand pointed, flush page without
the page contents reading here is meaningless, so let's move the flush action
just before page contents reading, like calc_checksum(), not
just find a page, flush it, w/o clear purpose. This should save some flush
actions why keep page content safely synced.

BTW, write_protect_page() do another type flush actions before pages_identical().

Signed-off-by: Alex Shi (tencent) <[email protected]>
---
mm/ksm.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index ef335ee508d3..77e8c1ded9bb 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -784,10 +784,7 @@ static struct page *get_mergeable_page(struct ksm_rmap_item *rmap_item)
goto out;
if (is_zone_device_page(page))
goto out_putpage;
- if (PageAnon(page)) {
- flush_anon_page(vma, page, addr);
- flush_dcache_page(page);
- } else {
+ if (!PageAnon(page)) {
out_putpage:
put_page(page);
out:
@@ -2378,7 +2375,12 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
mmap_read_unlock(mm);
return;
}
+
+ /* flush page contents before calculate checksum */
+ flush_anon_page(vma, page, rmap_item->address);
+ flush_dcache_page(page);
checksum = calc_checksum(page);
+
if (rmap_item->oldchecksum != checksum) {
rmap_item->oldchecksum = checksum;
mmap_read_unlock(mm);
@@ -2662,8 +2664,6 @@ static struct ksm_rmap_item *scan_get_next_rmap_item(struct page **page)
if (is_zone_device_page(*page))
goto next_page;
if (PageAnon(*page)) {
- flush_anon_page(vma, *page, ksm_scan.address);
- flush_dcache_page(*page);
rmap_item = get_next_rmap_item(mm_slot,
ksm_scan.rmap_list, ksm_scan.address);
if (rmap_item) {
--
2.43.0


2024-06-05 09:56:34

by David Hildenbrand

[permalink] [raw]
Subject: Re: [RFC 1/3] mm/ksm: add anonymous check in find_mergeable_vma

On 05.06.24 11:53, [email protected] wrote:
> From: "Alex Shi (tencent)" <[email protected]>
>
> We do vma_set_anonyous in do_mmap(), and then vma_is_anonymous()
> checking workable, use it as a extra check since ksm only care anonymous
> pages.
>
> Signed-off-by: Alex Shi (tencent) <[email protected]>
> ---
> mm/ksm.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index f5138f43f0d2..088bce39cd33 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -742,7 +742,8 @@ static struct vm_area_struct *find_mergeable_vma(struct mm_struct *mm,
> if (ksm_test_exit(mm))
> return NULL;
> vma = vma_lookup(mm, addr);
> - if (!vma || !(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
> + if (!vma || !(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma ||
> + !vma_is_anonymous(vma))

Doesn't KSM also apply to COW'ed pages in !anon mappings? At least
that's what I recall.

--
Cheers,

David / dhildenb


2024-06-05 10:29:30

by Alex Shi

[permalink] [raw]
Subject: Re: [RFC 3/3] mm/ksm: move flush_anon_page before checksum calculation

Let me withdraw this patch, the flush_anon_page come with the first patchset of ksm, and no explanation for them.
Though, no any guarantee for concurrent page write for the flush, but anyway I give up on the flush optimize.

Sorry for disturber.

Alex

On 6/5/24 5:53 PM, [email protected] wrote:
> From: "Alex Shi (tencent)" <[email protected]>
>
> commit 6020dff09252 ("[ARM] Resolve fuse and direct-IO failures due to missing cache flushes")
> explain that the aim of flush_anon_page() is to keep the cache and memory
> content synced. Also as David Hildenbrand pointed, flush page without
> the page contents reading here is meaningless, so let's move the flush action
> just before page contents reading, like calc_checksum(), not
> just find a page, flush it, w/o clear purpose. This should save some flush
> actions why keep page content safely synced.
>
> BTW, write_protect_page() do another type flush actions before pages_identical().
>
> Signed-off-by: Alex Shi (tencent) <[email protected]>
> ---
> mm/ksm.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index ef335ee508d3..77e8c1ded9bb 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -784,10 +784,7 @@ static struct page *get_mergeable_page(struct ksm_rmap_item *rmap_item)
> goto out;
> if (is_zone_device_page(page))
> goto out_putpage;
> - if (PageAnon(page)) {
> - flush_anon_page(vma, page, addr);
> - flush_dcache_page(page);
> - } else {
> + if (!PageAnon(page)) {
> out_putpage:
> put_page(page);
> out:
> @@ -2378,7 +2375,12 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
> mmap_read_unlock(mm);
> return;
> }
> +
> + /* flush page contents before calculate checksum */
> + flush_anon_page(vma, page, rmap_item->address);
> + flush_dcache_page(page);
> checksum = calc_checksum(page);
> +
> if (rmap_item->oldchecksum != checksum) {
> rmap_item->oldchecksum = checksum;
> mmap_read_unlock(mm);
> @@ -2662,8 +2664,6 @@ static struct ksm_rmap_item *scan_get_next_rmap_item(struct page **page)
> if (is_zone_device_page(*page))
> goto next_page;
> if (PageAnon(*page)) {
> - flush_anon_page(vma, *page, ksm_scan.address);
> - flush_dcache_page(*page);
> rmap_item = get_next_rmap_item(mm_slot,
> ksm_scan.rmap_list, ksm_scan.address);
> if (rmap_item) {

2024-06-07 09:33:57

by Alex Shi

[permalink] [raw]
Subject: Re: [RFC 1/3] mm/ksm: add anonymous check in find_mergeable_vma



On 6/5/24 5:56 PM, David Hildenbrand wrote:
> On 05.06.24 11:53, [email protected] wrote:
>> From: "Alex Shi (tencent)" <[email protected]>
>>
>> We do vma_set_anonyous in do_mmap(), and then vma_is_anonymous()
>> checking workable, use it as a extra check since ksm only care anonymous
>> pages.
>>
>> Signed-off-by: Alex Shi (tencent) <[email protected]>
>> ---
>>   mm/ksm.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/ksm.c b/mm/ksm.c
>> index f5138f43f0d2..088bce39cd33 100644
>> --- a/mm/ksm.c
>> +++ b/mm/ksm.c
>> @@ -742,7 +742,8 @@ static struct vm_area_struct *find_mergeable_vma(struct mm_struct *mm,
>>       if (ksm_test_exit(mm))
>>           return NULL;
>>       vma = vma_lookup(mm, addr);
>> -    if (!vma || !(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
>> +    if (!vma || !(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma ||
>> +            !vma_is_anonymous(vma))
>
> Doesn't KSM also apply to COW'ed pages in !anon mappings? At least that's what I recall.
I didn't a evidence for this. :(

In write_protect_page(), "PageAnonExclusive(&folio->page);" has a "VM_BUG_ON_PGFLAGS(!PageAnon(page), page);"
So is this hints the vma also need to be anonymous one?


Thanks a lot!
Alex
>

2024-06-07 10:13:33

by David Hildenbrand

[permalink] [raw]
Subject: Re: [RFC 1/3] mm/ksm: add anonymous check in find_mergeable_vma

On 07.06.24 11:33, Alex Shi wrote:
>
>
> On 6/5/24 5:56 PM, David Hildenbrand wrote:
>> On 05.06.24 11:53, [email protected] wrote:
>>> From: "Alex Shi (tencent)" <[email protected]>
>>>
>>> We do vma_set_anonyous in do_mmap(), and then vma_is_anonymous()
>>> checking workable, use it as a extra check since ksm only care anonymous
>>> pages.
>>>
>>> Signed-off-by: Alex Shi (tencent) <[email protected]>
>>> ---
>>>   mm/ksm.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/ksm.c b/mm/ksm.c
>>> index f5138f43f0d2..088bce39cd33 100644
>>> --- a/mm/ksm.c
>>> +++ b/mm/ksm.c
>>> @@ -742,7 +742,8 @@ static struct vm_area_struct *find_mergeable_vma(struct mm_struct *mm,
>>>       if (ksm_test_exit(mm))
>>>           return NULL;
>>>       vma = vma_lookup(mm, addr);
>>> -    if (!vma || !(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma)
>>> +    if (!vma || !(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma ||
>>> +            !vma_is_anonymous(vma))
>>
>> Doesn't KSM also apply to COW'ed pages in !anon mappings? At least that's what I recall.
> I didn't a evidence for this. :(
>
> In write_protect_page(), "PageAnonExclusive(&folio->page);" has a "VM_BUG_ON_PGFLAGS(!PageAnon(page), page);"
> So is this hints the vma also need to be anonymous one?


vma_is_anonymous(vma) is restricted to anonymous folios and the shared
zeropage, but other VMAs (MAP_PRIVATE file-backed VMAs) can contain
anonymous folios as well.

--
Cheers,

David / dhildenb