2022-03-11 23:25:57

by Jakub Matěna

Subject: [RFC PATCH v2 0/4] Removing limitations of merging anonymous VMAs

Motivation
In the current kernel it is impossible to merge two anonymous VMAs
if one of them was moved. That is because a VMA's page offset is
set according to the virtual address where it was created, and in
order to merge two VMAs their page offsets must be contiguous.
Another obstacle to merging two VMAs is their anon_vma. In the
current kernel both VMAs must use one and the same anon_vma;
otherwise the merge is again not allowed.
There are several places from which vma_merge() is called and therefore
several use cases that might benefit from this change. These include
mmap (that fills a hole between two VMAs), mremap (that moves a VMA next
to another one or again perfectly fills a hole), mprotect (that modifies
protection and allows merging with a neighbor) and brk (that expands a VMA
so that it is adjacent to a neighbor).
Missed merge opportunities increase the number of VMAs of a process
and in some cases can cause problems when the maximum count is reached.
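
As a rough illustration of the two merge conditions described above, here
is a simplified userspace model in C. It is not kernel code: struct
model_vma, the model_* helpers and the 4 KiB page size are assumptions
made for this sketch, and the anon_vma check is reduced to the "one and
the same" rule stated above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Simplified stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct model_vma {
	unsigned long start;  /* first byte of the mapping */
	unsigned long end;    /* one past the last byte */
	unsigned long pgoff;  /* page offset of the first page */
	const void *anon_vma; /* stand-in for the anon_vma pointer */
};

/* An anonymous VMA gets its page offset from its creation address. */
struct model_vma model_create(unsigned long start, unsigned long len,
			      const void *anon_vma)
{
	struct model_vma v = { start, start + len,
			       start >> MODEL_PAGE_SHIFT, anon_vma };
	return v;
}

/* Move a VMA the way the text describes today: pgoff stays unchanged. */
struct model_vma model_move(struct model_vma v, unsigned long new_start)
{
	unsigned long len = v.end - v.start;

	v.start = new_start;
	v.end = new_start + len;
	return v;
}

/* May next be merged onto the end of prev under the current rules? */
bool model_can_merge(const struct model_vma *prev,
		     const struct model_vma *next)
{
	unsigned long prev_pages =
		(prev->end - prev->start) >> MODEL_PAGE_SHIFT;

	if (prev->end != next->start)
		return false; /* not adjacent */
	if (prev->pgoff + prev_pages != next->pgoff)
		return false; /* page offsets are not contiguous */
	return prev->anon_vma == next->anon_vma;
}

/* Small self-check: a VMA created in place merges, a moved one does not. */
int model_demo(void)
{
	int tag; /* its address serves as a shared anon_vma identity */
	struct model_vma a = model_create(0x10000, 0x2000, &tag);
	struct model_vma b = model_create(0x12000, 0x1000, &tag);
	struct model_vma c = model_move(model_create(0x40000, 0x1000, &tag),
					0x12000);

	assert(model_can_merge(&a, &b));
	assert(!model_can_merge(&a, &c));
	return 0;
}
```

In this model the moved VMA is adjacent to its neighbor and uses the same
anon_vma, yet the merge is refused only because its page offset still
reflects the address where it was created.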

Solution
This patch series solves the first problem with page offsets by
updating them when the VMA is moved to a different virtual
address (patch 2). As for the second problem, merging of VMAs
with different anon_vmas is allowed (patch 3). Patch 1 refactors
vma_merge() to make it easier to understand; it also allows
relatively seamless tracing of successful merges, which is
introduced by patch 4.
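
The intended behavior can be sketched with a small userspace model in C.
This is only an illustration under stated assumptions, not the code from
the patches: struct sketch_vma, the sketch_* helpers, the integer
anon_vma identity and the "shared" flag are all hypothetical, and the
model leaves out the per-page bookkeeping that a real implementation
has to update.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical, simplified VMA used only for this sketch. */
struct sketch_vma {
	unsigned long start;
	unsigned long end;
	unsigned long pgoff;
	int anon_vma_id; /* stand-in for anon_vma identity */
	bool shared;     /* assumed flag: pages shared with another process */
};

/* Move a VMA; the page offset follows the new address if not shared. */
void sketch_move(struct sketch_vma *v, unsigned long new_start)
{
	unsigned long len = v->end - v->start;

	v->start = new_start;
	v->end = new_start + len;
	if (!v->shared)
		v->pgoff = new_start >> SKETCH_PAGE_SHIFT;
}

/* Merge check with the relaxed anon_vma rule for non-shared VMAs. */
bool sketch_can_merge(const struct sketch_vma *prev,
		      const struct sketch_vma *next)
{
	unsigned long prev_pages =
		(prev->end - prev->start) >> SKETCH_PAGE_SHIFT;

	if (prev->end != next->start)
		return false; /* not adjacent */
	if (prev->pgoff + prev_pages != next->pgoff)
		return false; /* page offsets are not contiguous */
	if (prev->anon_vma_id == next->anon_vma_id)
		return true;
	/* Assumption: differing anon_vmas are fine if neither is shared. */
	return !prev->shared && !next->shared;
}
```

With these rules a non-shared VMA moved next to a neighbor becomes
mergeable even if its anon_vma differs, while a shared VMA keeps its old
page offset and is still refused.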

Limitations
For both problems the solution works only for VMAs that do not share
physical pages with other processes (usually child or parent
processes). This is checked by looking at the anon_vma of the
respective VMA. The reason why the shared case is not possible, or at
least not easy, to handle is that each physical page has a pointer to
an anon_vma and a page offset. When such a physical page is shared, we
cannot simply change these parameters without affecting all of the
VMAs mapping it. The good news is that this case accounts for only
about 1-3% of all merges (measured on jemalloc (0%), redis (2.7%) and
kcbench (1.2%) tests) that fail to merge in the current kernel.
Measurements also show a slight increase in running time: jemalloc
(0.3%), redis (1%), kcbench (1%). More extensive data can be viewed at
https://home.alabanda.cz/share/results.png
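
To see why a shared page blocks the page offset update, consider how the
reverse mapping locates a page inside a VMA: the address is the VMA start
plus the distance between the page's stored offset and the VMA's page
offset. The following userspace sketch in C models only that arithmetic;
struct rmap_vma, rmap_page_address() and the 4 KiB page size are
assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define RMAP_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical, simplified VMA used only for this sketch. */
struct rmap_vma {
	unsigned long start;
	unsigned long end;
	unsigned long pgoff;
};

/*
 * Find where a page with the given stored offset is mapped in v.
 * Returns false if the offset does not fall inside the VMA.
 */
bool rmap_page_address(const struct rmap_vma *v, unsigned long page_index,
		       unsigned long *addr)
{
	unsigned long pages = (v->end - v->start) >> RMAP_PAGE_SHIFT;

	if (page_index < v->pgoff || page_index >= v->pgoff + pages)
		return false;
	*addr = v->start + ((page_index - v->pgoff) << RMAP_PAGE_SHIFT);
	return true;
}
```

Take a parent VMA at 0x10000 and a child VMA that was moved to 0x50000,
both with page offset 0x10 and both mapping a page whose stored offset is
0x11. The lookup succeeds in both. If the child's page offset were
changed to 0x50, the page would also need offset 0x51 to stay reachable
from the child, and then the parent's lookup would fail. A single stored
offset cannot satisfy both VMAs.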

This series of patches and documentation of the related code will
be part of my master's thesis.
This patch series is based on tag v5.17-rc4. This is the second version,
including minor changes that arose from the first RFC, such as formatting.
Speed and failed merge percentage data are also included.

Jakub Matěna (4):
  mm: refactor of vma_merge()
  mm: adjust page offset in mremap
  mm: enable merging of VMAs with different anon_vmas
  mm: add tracing for VMA merges

 include/linux/rmap.h        |  17 ++-
 include/trace/events/mmap.h |  83 +++++++++++++++
 mm/internal.h               |  12 +++
 mm/mmap.c                   | 206 ++++++++++++++++++++++++------------
 mm/rmap.c                   |  77 ++++++++++++++
 5 files changed, 325 insertions(+), 70 deletions(-)

--
2.34.1