Following on from Vlastimil Babka's patch series "cleanup vma_merge() and
improve mergeability tests" which was in turn based on Liam's prior
cleanups, this patch series introduces changes discussed in review of
Vlastimil's series and goes further in attempting to make the logic as
clear as possible.
Nearly all of this should have absolutely no functional impact; however, it
does add a single VM_WARN_ON() case.
With many thanks to Vernon for helping kick-start the discussion around
simplification - abstract use of vma did indeed turn out not to be
necessary - and to Liam for his excellent suggestions which greatly
simplified things.
v3:
- Combine vma_lookup() cases and reinsert accidentally excluded next = NULL
assignment.
- Reword commit messages to more correctly reflect the current changes.
- Avoid multiple assignment to prev, take vma_start, vma_pgoff assignment
out of the local variable declarations and revert to setting in if (prev)
block.
v2:
- Put the patch series on a serious diet, cut comments down to avoid
bloat.
- Added clever use of find_vma_intersection() and vma_lookup() as suggested
by Liam which improved clarity + brevity significantly.
- Eliminated the use of a temporary vma local as suggested by Vernon, it
does seem this was ultimately adding confusion and Liam's suggestions
eliminated the need for this.
- Moved around initial variables to be more sensible and to initialise each
variable in one place where possible.
https://lore.kernel.org/all/[email protected]
v1:
https://lore.kernel.org/all/[email protected]
Lorenzo Stoakes (4):
mm/mmap/vma_merge: further improve prev/next VMA naming
mm/mmap/vma_merge: fold curr, next assignment logic
mm/mmap/vma_merge: explicitly assign res, vma, extend invariants
mm/mmap/vma_merge: init cleanup, be explicit about the non-mergeable
case
mm/mmap.c | 144 ++++++++++++++++++++++++++++--------------------------
1 file changed, 76 insertions(+), 68 deletions(-)
--
2.39.2
Previously the ASCII diagram above vma_merge() and the accompanying
variable naming were rather confusing; however, recent efforts by Liam
Howlett and Vlastimil Babka have significantly improved matters.
This patch goes a little further - replacing 'X' with 'N', which feels more
natural as this represents the _next_ VMA, and replacing what was 'N' with
'C', which represents the current VMA.
No word quite describes a VMA whose start is coincident with that of the
input span; however, 'concurrent' (or more simply 'current'), abbreviated to
'curr', fits intuition well alongside prev and next.
This has no functional impact.
Signed-off-by: Lorenzo Stoakes <[email protected]>
---
mm/mmap.c | 86 +++++++++++++++++++++++++++----------------------------
1 file changed, 43 insertions(+), 43 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 042d22e63528..c9834364ac98 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -861,44 +861,44 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
* this area are about to be changed to vm_flags - and the no-change
* case has already been eliminated.
*
- * The following mprotect cases have to be considered, where AAAA is
+ * The following mprotect cases have to be considered, where **** is
* the area passed down from mprotect_fixup, never extending beyond one
- * vma, PPPP is the previous vma, NNNN is a vma that starts at the same
- * address as AAAA and is of the same or larger span, and XXXX the next
- * vma after AAAA:
+ * vma, PPPP is the previous vma, CCCC is a concurrent vma that starts
+ * at the same address as **** and is of the same or larger span, and
+ * NNNN the next vma after ****:
*
- * AAAA AAAA AAAA
- * PPPPPPXXXXXX PPPPPPXXXXXX PPPPPPNNNNNN
+ * **** **** ****
+ * PPPPPPNNNNNN PPPPPPNNNNNN PPPPPPCCCCCC
* cannot merge might become might become
- * PPXXXXXXXXXX PPPPPPPPPPNN
+ * PPNNNNNNNNNN PPPPPPPPPPCC
* mmap, brk or case 4 below case 5 below
* mremap move:
- * AAAA AAAA
- * PPPP XXXX PPPPNNNNXXXX
+ * **** ****
+ * PPPP NNNN PPPPCCCCNNNN
* might become might become
* PPPPPPPPPPPP 1 or PPPPPPPPPPPP 6 or
- * PPPPPPPPXXXX 2 or PPPPPPPPXXXX 7 or
- * PPPPXXXXXXXX 3 PPPPXXXXXXXX 8
+ * PPPPPPPPNNNN 2 or PPPPPPPPNNNN 7 or
+ * PPPPNNNNNNNN 3 PPPPNNNNNNNN 8
*
- * It is important for case 8 that the vma NNNN overlapping the
- * region AAAA is never going to extended over XXXX. Instead XXXX must
- * be extended in region AAAA and NNNN must be removed. This way in
+ * It is important for case 8 that the vma CCCC overlapping the
+ * region **** is never going to be extended over NNNN. Instead NNNN must
+ * be extended in region **** and CCCC must be removed. This way in
* all cases where vma_merge succeeds, the moment vma_merge drops the
* rmap_locks, the properties of the merged vma will be already
* correct for the whole merged range. Some of those properties like
* vm_page_prot/vm_flags may be accessed by rmap_walks and they must
* be correct for the whole merged range immediately after the
- * rmap_locks are released. Otherwise if XXXX would be removed and
- * NNNN would be extended over the XXXX range, remove_migration_ptes
+ * rmap_locks are released. Otherwise if NNNN would be removed and
+ * CCCC would be extended over the NNNN range, remove_migration_ptes
* or other rmap walkers (if working on addresses beyond the "end"
- * parameter) may establish ptes with the wrong permissions of NNNN
- * instead of the right permissions of XXXX.
+ * parameter) may establish ptes with the wrong permissions of CCCC
+ * instead of the right permissions of NNNN.
*
* In the code below:
* PPPP is represented by *prev
- * NNNN is represented by *mid or not represented at all (NULL)
- * XXXX is represented by *next or not represented at all (NULL)
- * AAAA is not represented - it will be merged and the vma containing the
+ * CCCC is represented by *curr or not represented at all (NULL)
+ * NNNN is represented by *next or not represented at all (NULL)
+ * **** is not represented - it will be merged and the vma containing the
* area is returned, or the function will return NULL
*/
struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
@@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
{
pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
pgoff_t vma_pgoff;
- struct vm_area_struct *mid, *next, *res = NULL;
+ struct vm_area_struct *curr, *next, *res = NULL;
struct vm_area_struct *vma, *adjust, *remove, *remove2;
int err = -1;
bool merge_prev = false;
@@ -930,19 +930,19 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
if (vm_flags & VM_SPECIAL)
return NULL;
- mid = find_vma(mm, prev ? prev->vm_end : 0);
- if (mid && mid->vm_end == end) /* cases 6, 7, 8 */
- next = find_vma(mm, mid->vm_end);
+ curr = find_vma(mm, prev ? prev->vm_end : 0);
+ if (curr && curr->vm_end == end) /* cases 6, 7, 8 */
+ next = find_vma(mm, curr->vm_end);
else
- next = mid;
+ next = curr;
- /* In cases 1 - 4 there's no NNNN vma */
- if (mid && end <= mid->vm_start)
- mid = NULL;
+ /* In cases 1 - 4 there's no CCCC vma */
+ if (curr && end <= curr->vm_start)
+ curr = NULL;
/* verify some invariant that must be enforced by the caller */
VM_WARN_ON(prev && addr <= prev->vm_start);
- VM_WARN_ON(mid && end > mid->vm_end);
+ VM_WARN_ON(curr && end > curr->vm_end);
VM_WARN_ON(addr >= end);
if (prev) {
@@ -974,21 +974,21 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
remove = next; /* case 1 */
vma_end = next->vm_end;
err = dup_anon_vma(prev, next);
- if (mid) { /* case 6 */
- remove = mid;
+ if (curr) { /* case 6 */
+ remove = curr;
remove2 = next;
if (!next->anon_vma)
- err = dup_anon_vma(prev, mid);
+ err = dup_anon_vma(prev, curr);
}
} else if (merge_prev) {
err = 0; /* case 2 */
- if (mid) {
- err = dup_anon_vma(prev, mid);
- if (end == mid->vm_end) { /* case 7 */
- remove = mid;
+ if (curr) {
+ err = dup_anon_vma(prev, curr);
+ if (end == curr->vm_end) { /* case 7 */
+ remove = curr;
} else { /* case 5 */
- adjust = mid;
- adj_start = (end - mid->vm_start);
+ adjust = curr;
+ adj_start = (end - curr->vm_start);
}
}
} else if (merge_next) {
@@ -1004,10 +1004,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
vma_end = next->vm_end;
vma_pgoff = next->vm_pgoff;
err = 0;
- if (mid) { /* case 8 */
- vma_pgoff = mid->vm_pgoff;
- remove = mid;
- err = dup_anon_vma(next, mid);
+ if (curr) { /* case 8 */
+ vma_pgoff = curr->vm_pgoff;
+ remove = curr;
+ err = dup_anon_vma(next, curr);
}
}
}
--
2.39.2
Previously, vma was an uninitialised variable which was only definitely
assigned as a result of the logic covering all possible input cases - for
it to have remained uninitialised, prev would have to be NULL, and next
would _have_ to be mergeable.
The value of res defaults to NULL, so we can neatly eliminate the
assignment to res and vma in the if (prev) block and ensure that both are
explicitly assigned, by simply setting both to prev.
In addition we add an explanation as to under what circumstances both might
change, and since we absolutely do rely on addr == curr->vm_start should
curr exist, assert that this is the case.
Signed-off-by: Lorenzo Stoakes <[email protected]>
---
mm/mmap.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index dbdbb92493b2..2a4f63716231 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
{
pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
pgoff_t vma_pgoff;
- struct vm_area_struct *curr, *next, *res = NULL;
+ struct vm_area_struct *curr, *next, *res;
struct vm_area_struct *vma, *adjust, *remove, *remove2;
int err = -1;
bool merge_prev = false;
@@ -939,14 +939,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
else
next = NULL; /* case 5 */
- /* verify some invariant that must be enforced by the caller */
+ /*
+ * By default, we return prev. Cases 3, 4, 8 will instead return next
+ * and cases 3, 8 will also update vma to point at next.
+ */
+ res = vma = prev;
+
+ /* Verify some invariant that must be enforced by the caller. */
VM_WARN_ON(prev && addr <= prev->vm_start);
- VM_WARN_ON(curr && end > curr->vm_end);
+ VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
VM_WARN_ON(addr >= end);
if (prev) {
- res = prev;
- vma = prev;
vma_start = prev->vm_start;
vma_pgoff = prev->vm_pgoff;
/* Can we merge the predecessor? */
@@ -957,6 +961,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
vma_prev(vmi);
}
}
+
/* Can we merge the successor? */
if (next && mpol_equal(policy, vma_policy(next)) &&
can_vma_merge_before(next, vm_flags,
@@ -997,6 +1002,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
adj_start = -(prev->vm_end - addr);
err = dup_anon_vma(next, prev);
} else {
+ /*
+ * Note that cases 3 and 8 are the ONLY ones where prev
+ * is permitted to be (but is not necessarily) NULL.
+ */
vma = next; /* case 3 */
vma_start = addr;
vma_end = next->vm_end;
--
2.39.2
Use find_vma_intersection() and vma_lookup() to both simplify the logic and
to fold the end == next->vm_start condition into one block.
This groups all of the simple range checks together and establishes the
invariant that, if prev, curr or next are non-NULL then their positions are
as expected.
This has no functional impact.
Signed-off-by: Lorenzo Stoakes <[email protected]>
---
mm/mmap.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index c9834364ac98..dbdbb92493b2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -930,15 +930,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
if (vm_flags & VM_SPECIAL)
return NULL;
- curr = find_vma(mm, prev ? prev->vm_end : 0);
- if (curr && curr->vm_end == end) /* cases 6, 7, 8 */
- next = find_vma(mm, curr->vm_end);
- else
- next = curr;
+ /* Does the input range span an existing VMA? (cases 5 - 8) */
+ curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
- /* In cases 1 - 4 there's no CCCC vma */
- if (curr && end <= curr->vm_start)
- curr = NULL;
+ if (!curr || /* cases 1 - 4 */
+ end == curr->vm_end) /* cases 6 - 8, adjacent VMA */
+ next = vma_lookup(mm, end);
+ else
+ next = NULL; /* case 5 */
/* verify some invariant that must be enforced by the caller */
VM_WARN_ON(prev && addr <= prev->vm_start);
@@ -959,11 +958,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
}
}
/* Can we merge the successor? */
- if (next && end == next->vm_start &&
- mpol_equal(policy, vma_policy(next)) &&
- can_vma_merge_before(next, vm_flags,
- anon_vma, file, pgoff+pglen,
- vm_userfaultfd_ctx, anon_name)) {
+ if (next && mpol_equal(policy, vma_policy(next)) &&
+ can_vma_merge_before(next, vm_flags,
+ anon_vma, file, pgoff+pglen,
+ vm_userfaultfd_ctx, anon_name)) {
merge_next = true;
}
--
2.39.2
Rather than setting err = -1 and only resetting if we hit merge cases,
explicitly check the non-mergeable case to make it abundantly clear that we
only proceed with the rest if something is mergeable, default err to 0 and
only update if an error might occur.
Move the merge_prev, merge_next cases closer to the logic determining curr,
next and reorder initial variables so they are more logically grouped.
This has no functional impact.
Signed-off-by: Lorenzo Stoakes <[email protected]>
---
mm/mmap.c | 57 ++++++++++++++++++++++++++++---------------------------
1 file changed, 29 insertions(+), 28 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 2a4f63716231..642f3d063be1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -909,18 +909,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
struct anon_vma_name *anon_name)
{
- pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
- pgoff_t vma_pgoff;
struct vm_area_struct *curr, *next, *res;
struct vm_area_struct *vma, *adjust, *remove, *remove2;
- int err = -1;
+ struct vma_prepare vp;
+ pgoff_t vma_pgoff;
+ int err = 0;
bool merge_prev = false;
bool merge_next = false;
bool vma_expanded = false;
- struct vma_prepare vp;
+ unsigned long vma_start = addr;
unsigned long vma_end = end;
+ pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
long adj_start = 0;
- unsigned long vma_start = addr;
validate_mm(mm);
/*
@@ -939,36 +939,38 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
else
next = NULL; /* case 5 */
- /*
- * By default, we return prev. Cases 3, 4, 8 will instead return next
- * and cases 3, 8 will also update vma to point at next.
- */
- res = vma = prev;
-
- /* Verify some invariant that must be enforced by the caller. */
- VM_WARN_ON(prev && addr <= prev->vm_start);
- VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
- VM_WARN_ON(addr >= end);
-
if (prev) {
vma_start = prev->vm_start;
vma_pgoff = prev->vm_pgoff;
+
/* Can we merge the predecessor? */
- if (prev->vm_end == addr && mpol_equal(vma_policy(prev), policy)
+ if (addr == prev->vm_end && mpol_equal(vma_policy(prev), policy)
&& can_vma_merge_after(prev, vm_flags, anon_vma, file,
- pgoff, vm_userfaultfd_ctx, anon_name)) {
+ pgoff, vm_userfaultfd_ctx, anon_name)) {
merge_prev = true;
vma_prev(vmi);
}
}
/* Can we merge the successor? */
- if (next && mpol_equal(policy, vma_policy(next)) &&
- can_vma_merge_before(next, vm_flags,
- anon_vma, file, pgoff+pglen,
- vm_userfaultfd_ctx, anon_name)) {
- merge_next = true;
- }
+ merge_next = next && mpol_equal(policy, vma_policy(next)) &&
+ can_vma_merge_before(next, vm_flags,
+ anon_vma, file, pgoff+pglen,
+ vm_userfaultfd_ctx, anon_name);
+
+ if (!merge_prev && !merge_next)
+ return NULL; /* Not mergeable. */
+
+ /*
+ * By default, we return prev. Cases 3, 4, 8 will instead return next
+ * and cases 3, 8 will also update vma to point at next.
+ */
+ res = vma = prev;
+
+ /* Verify some invariant that must be enforced by the caller. */
+ VM_WARN_ON(prev && addr <= prev->vm_start);
+ VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
+ VM_WARN_ON(addr >= end);
remove = remove2 = adjust = NULL;
/* Can we merge both the predecessor and the successor? */
@@ -984,7 +986,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
err = dup_anon_vma(prev, curr);
}
} else if (merge_prev) {
- err = 0; /* case 2 */
+ /* case 2 */
if (curr) {
err = dup_anon_vma(prev, curr);
if (end == curr->vm_end) { /* case 7 */
@@ -994,7 +996,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
adj_start = (end - curr->vm_start);
}
}
- } else if (merge_next) {
+ } else { /* merge_next */
res = next;
if (prev && addr < prev->vm_end) { /* case 4 */
vma_end = addr;
@@ -1010,7 +1012,6 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
vma_start = addr;
vma_end = next->vm_end;
vma_pgoff = next->vm_pgoff;
- err = 0;
if (curr) { /* case 8 */
vma_pgoff = curr->vm_pgoff;
remove = curr;
@@ -1019,7 +1020,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
}
}
- /* Cannot merge or error in anon_vma clone */
+ /* Error in anon_vma clone. */
if (err)
return NULL;
--
2.39.2
On 3/22/23 08:13, Lorenzo Stoakes wrote:
> Previously the ASCII diagram above vma_merge() and the accompanying
> variable naming was rather confusing, however recent efforts by Liam
> Howlett and Vlastimil Babka have significantly improved matters.
>
> This patch goes a little further - replacing 'X' with 'N', which feels more
> natural as this represents the _next_ VMA and replacing what was 'N' with
> 'C' which represents the current VMA.
Might have wanted to mention the 'A' to '*' change as well?
> No word quite describes a VMA that has coincident start as the input span,
> however 'concurrent' (or more simply 'current') abbreviated to 'curr' fits
> intuitions well alongside prev and next.
'curr' sounds good to me, I concur
> This has no functional impact.
>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
> ---
> mm/mmap.c | 86 +++++++++++++++++++++++++++----------------------------
> 1 file changed, 43 insertions(+), 43 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 042d22e63528..c9834364ac98 100644
On 3/22/23 08:13, Lorenzo Stoakes wrote:
> Use find_vma_intersection() and vma_lookup() to both simplify the logic and
> to fold the end == next->vm_start condition into one block.
>
> This groups all of the simple range checks together and establishes the
> invariant that, if prev, curr or next are non-NULL then their positions are
> as expected.
>
> This has no functional impact.
I'm not so sure...
> Signed-off-by: Lorenzo Stoakes <[email protected]>
> ---
> mm/mmap.c | 24 +++++++++++-------------
> 1 file changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index c9834364ac98..dbdbb92493b2 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -930,15 +930,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> if (vm_flags & VM_SPECIAL)
> return NULL;
>
> - curr = find_vma(mm, prev ? prev->vm_end : 0);
> - if (curr && curr->vm_end == end) /* cases 6, 7, 8 */
> - next = find_vma(mm, curr->vm_end);
> - else
> - next = curr;
> + /* Does the input range span an existing VMA? (cases 5 - 8) */
> + curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
>
> - /* In cases 1 - 4 there's no CCCC vma */
> - if (curr && end <= curr->vm_start)
> - curr = NULL;
> + if (!curr || /* cases 1 - 4 */
> + end == curr->vm_end) /* cases 6 - 8, adjacent VMA */
> + next = vma_lookup(mm, end);
AFAICS if the next vma is not adjacent to CCCC or ****, but there's a gap,
this will still give you a non-NULL result?
> + else
> + next = NULL; /* case 5 */
>
> /* verify some invariant that must be enforced by the caller */
> VM_WARN_ON(prev && addr <= prev->vm_start);
> @@ -959,11 +958,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> }
> }
> /* Can we merge the successor? */
> - if (next && end == next->vm_start &&
And then without this end == next->vm_start check, merge will be done
despite the gap.
> - mpol_equal(policy, vma_policy(next)) &&
> - can_vma_merge_before(next, vm_flags,
> - anon_vma, file, pgoff+pglen,
> - vm_userfaultfd_ctx, anon_name)) {
> + if (next && mpol_equal(policy, vma_policy(next)) &&
> + can_vma_merge_before(next, vm_flags,
> + anon_vma, file, pgoff+pglen,
> + vm_userfaultfd_ctx, anon_name)) {
> merge_next = true;
> }
>
> --
> 2.39.2
On 3/22/23 10:06, Vlastimil Babka wrote:
> On 3/22/23 08:13, Lorenzo Stoakes wrote:
>> Use find_vma_intersection() and vma_lookup() to both simplify the logic and
>> to fold the end == next->vm_start condition into one block.
>>
>> This groups all of the simple range checks together and establishes the
>> invariant that, if prev, curr or next are non-NULL then their positions are
>> as expected.
>>
>> This has no functional impact.
>
> I'm not so sure...
Disregard that, vma_lookup() is not find_vma(). Next cleanup challenge: more
obvious naming of those two.
>> Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
>> ---
>> mm/mmap.c | 24 +++++++++++-------------
>> 1 file changed, 11 insertions(+), 13 deletions(-)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index c9834364ac98..dbdbb92493b2 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -930,15 +930,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>> if (vm_flags & VM_SPECIAL)
>> return NULL;
>>
>> - curr = find_vma(mm, prev ? prev->vm_end : 0);
>> - if (curr && curr->vm_end == end) /* cases 6, 7, 8 */
>> - next = find_vma(mm, curr->vm_end);
>> - else
>> - next = curr;
>> + /* Does the input range span an existing VMA? (cases 5 - 8) */
>> + curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
>>
>> - /* In cases 1 - 4 there's no CCCC vma */
>> - if (curr && end <= curr->vm_start)
>> - curr = NULL;
>> + if (!curr || /* cases 1 - 4 */
>> + end == curr->vm_end) /* cases 6 - 8, adjacent VMA */
>> + next = vma_lookup(mm, end);
>
> AFAICS if the next vma is not adjacent to CCCC or ****, but there's a gap,
> this will still give you a non-NULL result?
>
>> + else
>> + next = NULL; /* case 5 */
>>
>> /* verify some invariant that must be enforced by the caller */
>> VM_WARN_ON(prev && addr <= prev->vm_start);
>> @@ -959,11 +958,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
>> }
>> }
>> /* Can we merge the successor? */
>> - if (next && end == next->vm_start &&
>
> And then without this end == next->vm_start check, merge will be done
> despite the gap.
>
>> - mpol_equal(policy, vma_policy(next)) &&
>> - can_vma_merge_before(next, vm_flags,
>> - anon_vma, file, pgoff+pglen,
>> - vm_userfaultfd_ctx, anon_name)) {
>> + if (next && mpol_equal(policy, vma_policy(next)) &&
>> + can_vma_merge_before(next, vm_flags,
>> + anon_vma, file, pgoff+pglen,
>> + vm_userfaultfd_ctx, anon_name)) {
>> merge_next = true;
>> }
>>
>> --
>> 2.39.2
>
On 3/22/23 08:13, Lorenzo Stoakes wrote:
> Previously, vma was an uninitialised variable which was only definitely
> assigned as a result of the logic covering all possible input cases - for
> it to have remained uninitialised, prev would have to be NULL, and next
> would _have_ to be mergeable.
>
> The value of res defaults to NULL, so we can neatly eliminate the
> assignment to res and vma in the if (prev) block and ensure that both res
> and vma are both explicitly assigned, by just setting both to prev.
>
> In addition we add an explanation as to under what circumstances both might
> change, and since we absolutely do rely on addr == curr->vm_start should
> curr exist, assert that this is the case.
Hm replied to v2 of this, sorry, so let me repeat that here:
> Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Nit suggestion below.
> ---
> mm/mmap.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index dbdbb92493b2..2a4f63716231 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> {
> pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> pgoff_t vma_pgoff;
> - struct vm_area_struct *curr, *next, *res = NULL;
> + struct vm_area_struct *curr, *next, *res;
> struct vm_area_struct *vma, *adjust, *remove, *remove2;
> int err = -1;
> bool merge_prev = false;
> @@ -939,14 +939,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> else
> next = NULL; /* case 5 */
>
> - /* verify some invariant that must be enforced by the caller */
> + /*
> + * By default, we return prev. Cases 3, 4, 8 will instead return next
> + * and cases 3, 8 will also update vma to point at next.
> + */
> + res = vma = prev;
Later in the function there's a line:
remove = remove2 = adjust = NULL;
Now it would make sense to move it up here?
> +
> + /* Verify some invariant that must be enforced by the caller. */
> VM_WARN_ON(prev && addr <= prev->vm_start);
> - VM_WARN_ON(curr && end > curr->vm_end);
> + VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
> VM_WARN_ON(addr >= end);
>
> if (prev) {
> - res = prev;
> - vma = prev;
> vma_start = prev->vm_start;
> vma_pgoff = prev->vm_pgoff;
> /* Can we merge the predecessor? */
> @@ -957,6 +961,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> vma_prev(vmi);
> }
> }
> +
> /* Can we merge the successor? */
> if (next && mpol_equal(policy, vma_policy(next)) &&
> can_vma_merge_before(next, vm_flags,
> @@ -997,6 +1002,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> adj_start = -(prev->vm_end - addr);
> err = dup_anon_vma(next, prev);
> } else {
> + /*
> + * Note that cases 3 and 8 are the ONLY ones where prev
> + * is permitted to be (but is not necessarily) NULL.
> + */
> vma = next; /* case 3 */
> vma_start = addr;
> vma_end = next->vm_end;
On 3/22/23 08:13, Lorenzo Stoakes wrote:
> Rather than setting err = -1 and only resetting if we hit merge cases,
> explicitly check the non-mergeable case to make it abundantly clear that we
> only proceed with the rest if something is mergeable, default err to 0 and
> only update if an error might occur.
>
> Move the merge_prev, merge_next cases closer to the logic determining curr,
> next and reorder initial variables so they are more logically grouped.
>
> This has no functional impact.
>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Some nits:
> ---
> mm/mmap.c | 57 ++++++++++++++++++++++++++++---------------------------
> 1 file changed, 29 insertions(+), 28 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 2a4f63716231..642f3d063be1 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -909,18 +909,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
> struct anon_vma_name *anon_name)
> {
> - pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> - pgoff_t vma_pgoff;
> struct vm_area_struct *curr, *next, *res;
> struct vm_area_struct *vma, *adjust, *remove, *remove2;
> - int err = -1;
> + struct vma_prepare vp;
> + pgoff_t vma_pgoff;
> + int err = 0;
> bool merge_prev = false;
> bool merge_next = false;
> bool vma_expanded = false;
> - struct vma_prepare vp;
> + unsigned long vma_start = addr;
> unsigned long vma_end = end;
> + pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> long adj_start = 0;
> - unsigned long vma_start = addr;
>
> validate_mm(mm);
> /*
> @@ -939,36 +939,38 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> else
> next = NULL; /* case 5 */
>
> - /*
> - * By default, we return prev. Cases 3, 4, 8 will instead return next
> - * and cases 3, 8 will also update vma to point at next.
> - */
> - res = vma = prev;
> -
> - /* Verify some invariant that must be enforced by the caller. */
> - VM_WARN_ON(prev && addr <= prev->vm_start);
> - VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
> - VM_WARN_ON(addr >= end);
> -
> if (prev) {
> vma_start = prev->vm_start;
> vma_pgoff = prev->vm_pgoff;
> +
> /* Can we merge the predecessor? */
> - if (prev->vm_end == addr && mpol_equal(vma_policy(prev), policy)
> + if (addr == prev->vm_end && mpol_equal(vma_policy(prev), policy)
> && can_vma_merge_after(prev, vm_flags, anon_vma, file,
> - pgoff, vm_userfaultfd_ctx, anon_name)) {
> + pgoff, vm_userfaultfd_ctx, anon_name)) {
> merge_prev = true;
> vma_prev(vmi);
> }
> }
>
> /* Can we merge the successor? */
> - if (next && mpol_equal(policy, vma_policy(next)) &&
> - can_vma_merge_before(next, vm_flags,
> - anon_vma, file, pgoff+pglen,
> - vm_userfaultfd_ctx, anon_name)) {
> - merge_next = true;
> - }
> + merge_next = next && mpol_equal(policy, vma_policy(next)) &&
> + can_vma_merge_before(next, vm_flags,
> + anon_vma, file, pgoff+pglen,
> + vm_userfaultfd_ctx, anon_name);
Not a great fan of this, I think the if() is more readable, but if you and
Liam agree, I won't mind much. Either way the parameters could maybe be
consolidated on fewer lines.
> +
> + if (!merge_prev && !merge_next)
> + return NULL; /* Not mergeable. */
> +
> + /*
> + * By default, we return prev. Cases 3, 4, 8 will instead return next
> + * and cases 3, 8 will also update vma to point at next.
> + */
> + res = vma = prev;
> +
> + /* Verify some invariant that must be enforced by the caller. */
> + VM_WARN_ON(prev && addr <= prev->vm_start);
> + VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
> + VM_WARN_ON(addr >= end);
>
> remove = remove2 = adjust = NULL;
> /* Can we merge both the predecessor and the successor? */
> @@ -984,7 +986,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> err = dup_anon_vma(prev, curr);
> }
> } else if (merge_prev) {
> - err = 0; /* case 2 */
> + /* case 2 */
Move the comment from this now weirdly empty line to the "else if" one above?
> if (curr) {
> err = dup_anon_vma(prev, curr);
> if (end == curr->vm_end) { /* case 7 */
> @@ -994,7 +996,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> adj_start = (end - curr->vm_start);
> }
> }
> - } else if (merge_next) {
> + } else { /* merge_next */
> res = next;
> if (prev && addr < prev->vm_end) { /* case 4 */
> vma_end = addr;
> @@ -1010,7 +1012,6 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> vma_start = addr;
> vma_end = next->vm_end;
> vma_pgoff = next->vm_pgoff;
> - err = 0;
> if (curr) { /* case 8 */
> vma_pgoff = curr->vm_pgoff;
> remove = curr;
> @@ -1019,7 +1020,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> }
> }
>
> - /* Cannot merge or error in anon_vma clone */
> + /* Error in anon_vma clone. */
> if (err)
> return NULL;
>
* Vlastimil Babka <[email protected]> [230322 05:28]:
> On 3/22/23 08:13, Lorenzo Stoakes wrote:
> > Rather than setting err = -1 and only resetting if we hit merge cases,
> > explicitly check the non-mergeable case to make it abundantly clear that we
> > only proceed with the rest if something is mergeable, default err to 0 and
> > only update if an error might occur.
> >
> > Move the merge_prev, merge_next cases closer to the logic determining curr,
> > next and reorder initial variables so they are more logically grouped.
> >
> > This has no functional impact.
> >
> > Signed-off-by: Lorenzo Stoakes <[email protected]>
>
> Reviewed-by: Vlastimil Babka <[email protected]>
>
> Some nits:
>
> > ---
> > mm/mmap.c | 57 ++++++++++++++++++++++++++++---------------------------
> > 1 file changed, 29 insertions(+), 28 deletions(-)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 2a4f63716231..642f3d063be1 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -909,18 +909,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> > struct vm_userfaultfd_ctx vm_userfaultfd_ctx,
> > struct anon_vma_name *anon_name)
> > {
> > - pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> > - pgoff_t vma_pgoff;
> > struct vm_area_struct *curr, *next, *res;
> > struct vm_area_struct *vma, *adjust, *remove, *remove2;
> > - int err = -1;
> > + struct vma_prepare vp;
> > + pgoff_t vma_pgoff;
> > + int err = 0;
> > bool merge_prev = false;
> > bool merge_next = false;
> > bool vma_expanded = false;
> > - struct vma_prepare vp;
> > + unsigned long vma_start = addr;
> > unsigned long vma_end = end;
> > + pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> > long adj_start = 0;
> > - unsigned long vma_start = addr;
> >
> > validate_mm(mm);
> > /*
> > @@ -939,36 +939,38 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> > else
> > next = NULL; /* case 5 */
> >
> > - /*
> > - * By default, we return prev. Cases 3, 4, 8 will instead return next
> > - * and cases 3, 8 will also update vma to point at next.
> > - */
> > - res = vma = prev;
> > -
> > - /* Verify some invariant that must be enforced by the caller. */
> > - VM_WARN_ON(prev && addr <= prev->vm_start);
> > - VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
> > - VM_WARN_ON(addr >= end);
> > -
> > if (prev) {
> > vma_start = prev->vm_start;
> > vma_pgoff = prev->vm_pgoff;
> > +
> > /* Can we merge the predecessor? */
> > - if (prev->vm_end == addr && mpol_equal(vma_policy(prev), policy)
> > + if (addr == prev->vm_end && mpol_equal(vma_policy(prev), policy)
> > && can_vma_merge_after(prev, vm_flags, anon_vma, file,
> > - pgoff, vm_userfaultfd_ctx, anon_name)) {
> > + pgoff, vm_userfaultfd_ctx, anon_name)) {
> > merge_prev = true;
> > vma_prev(vmi);
> > }
> > }
> >
> > /* Can we merge the successor? */
> > - if (next && mpol_equal(policy, vma_policy(next)) &&
> > - can_vma_merge_before(next, vm_flags,
> > - anon_vma, file, pgoff+pglen,
> > - vm_userfaultfd_ctx, anon_name)) {
> > - merge_next = true;
> > - }
> > + merge_next = next && mpol_equal(policy, vma_policy(next)) &&
> > + can_vma_merge_before(next, vm_flags,
> > + anon_vma, file, pgoff+pglen,
> > + vm_userfaultfd_ctx, anon_name);
>
> Not a great fan of this, I think the if() is more readable, but if you and
> Liam agree, I won't mind much. Either way, the parameters could maybe be
> consolidated onto fewer lines.
I think it's more readable with the if() as well, fwiw. If you revert
this to an if(), then you should keep the braces as it looks very
awkward without them.
>
> > +
> > + if (!merge_prev && !merge_next)
> > + return NULL; /* Not mergeable. */
> > +
> > + /*
> > + * By default, we return prev. Cases 3, 4, 8 will instead return next
> > + * and cases 3, 8 will also update vma to point at next.
> > + */
> > + res = vma = prev;
> > +
> > + /* Verify some invariant that must be enforced by the caller. */
> > + VM_WARN_ON(prev && addr <= prev->vm_start);
> > + VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
> > + VM_WARN_ON(addr >= end);
> >
> > remove = remove2 = adjust = NULL;
> > /* Can we merge both the predecessor and the successor? */
> > @@ -984,7 +986,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> > err = dup_anon_vma(prev, curr);
> > }
> > } else if (merge_prev) {
> > - err = 0; /* case 2 */
> > + /* case 2 */
>
> Move the comment from this now weirdly empty line to the "else if" one above?
Yeah, this also makes sense to me, and brings it in line with case 8/4,
etc.
>
> > if (curr) {
> > err = dup_anon_vma(prev, curr);
> > if (end == curr->vm_end) { /* case 7 */
> > @@ -994,7 +996,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> > adj_start = (end - curr->vm_start);
> > }
> > }
> > - } else if (merge_next) {
> > + } else { /* merge_next */
> > res = next;
> > if (prev && addr < prev->vm_end) { /* case 4 */
> > vma_end = addr;
> > @@ -1010,7 +1012,6 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> > vma_start = addr;
> > vma_end = next->vm_end;
> > vma_pgoff = next->vm_pgoff;
> > - err = 0;
> > if (curr) { /* case 8 */
> > vma_pgoff = curr->vm_pgoff;
> > remove = curr;
> > @@ -1019,7 +1020,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> > }
> > }
> >
> > - /* Cannot merge or error in anon_vma clone */
> > + /* Error in anon_vma clone. */
> > if (err)
> > return NULL;
> >
>
>
* Lorenzo Stoakes <[email protected]> [230322 03:13]:
> Use find_vma_intersection() and vma_lookup() to both simplify the logic and
> to fold the end == next->vm_start condition into one block.
>
> This groups all of the simple range checks together and establishes the
> invariant that, if prev, curr or next are non-NULL then their positions are
> as expected.
>
> This has no functional impact.
>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Nit below.
> ---
> mm/mmap.c | 24 +++++++++++-------------
> 1 file changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index c9834364ac98..dbdbb92493b2 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -930,15 +930,14 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> if (vm_flags & VM_SPECIAL)
> return NULL;
>
> - curr = find_vma(mm, prev ? prev->vm_end : 0);
> - if (curr && curr->vm_end == end) /* cases 6, 7, 8 */
> - next = find_vma(mm, curr->vm_end);
> - else
> - next = curr;
> + /* Does the input range span an existing VMA? (cases 5 - 8) */
> + curr = find_vma_intersection(mm, prev ? prev->vm_end : 0, end);
>
> - /* In cases 1 - 4 there's no CCCC vma */
> - if (curr && end <= curr->vm_start)
> - curr = NULL;
> + if (!curr || /* cases 1 - 4 */
> + end == curr->vm_end) /* cases 6 - 8, adjacent VMA */
> + next = vma_lookup(mm, end);
> + else
> + next = NULL; /* case 5 */
>
> /* verify some invariant that must be enforced by the caller */
> VM_WARN_ON(prev && addr <= prev->vm_start);
> @@ -959,11 +958,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> }
> }
> /* Can we merge the successor? */
> - if (next && end == next->vm_start &&
> - mpol_equal(policy, vma_policy(next)) &&
> - can_vma_merge_before(next, vm_flags,
> - anon_vma, file, pgoff+pglen,
> - vm_userfaultfd_ctx, anon_name)) {
> + if (next && mpol_equal(policy, vma_policy(next)) &&
> + can_vma_merge_before(next, vm_flags,
> + anon_vma, file, pgoff+pglen,
> + vm_userfaultfd_ctx, anon_name)) {
Can this be compacted onto fewer lines? Pretty much the same thing
Vlastimil said in patch 4.
> merge_next = true;
> }
>
> --
> 2.39.2
* Lorenzo Stoakes <[email protected]> [230322 03:13]:
> Previously the ASCII diagram above vma_merge() and the accompanying
> variable naming was rather confusing, however recent efforts by Liam
> Howlett and Vlastimil Babka have significantly improved matters.
>
> This patch goes a little further - replacing 'X' with 'N', which feels more
> natural as this represents the _next_ VMA and replacing what was 'N' with
> 'C' which represents the current VMA.
>
> No word quite describes a VMA whose start coincides with that of the input
> span; however, 'concurrent' (or more simply 'current'), abbreviated to
> 'curr', fits intuition well alongside prev and next.
>
> This has no functional impact.
>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
> ---
> mm/mmap.c | 86 +++++++++++++++++++++++++++----------------------------
> 1 file changed, 43 insertions(+), 43 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 042d22e63528..c9834364ac98 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -861,44 +861,44 @@ can_vma_merge_after(struct vm_area_struct *vma, unsigned long vm_flags,
> * this area are about to be changed to vm_flags - and the no-change
> * case has already been eliminated.
> *
> - * The following mprotect cases have to be considered, where AAAA is
> + * The following mprotect cases have to be considered, where **** is
> * the area passed down from mprotect_fixup, never extending beyond one
> - * vma, PPPP is the previous vma, NNNN is a vma that starts at the same
> - * address as AAAA and is of the same or larger span, and XXXX the next
> - * vma after AAAA:
> + * vma, PPPP is the previous vma, CCCC is a concurrent vma that starts
> + * at the same address as **** and is of the same or larger span, and
> + * NNNN the next vma after ****:
> *
> - * AAAA AAAA AAAA
> - * PPPPPPXXXXXX PPPPPPXXXXXX PPPPPPNNNNNN
> + * **** **** ****
> + * PPPPPPNNNNNN PPPPPPNNNNNN PPPPPPCCCCCC
> * cannot merge might become might become
> - * PPXXXXXXXXXX PPPPPPPPPPNN
> + * PPNNNNNNNNNN PPPPPPPPPPCC
> * mmap, brk or case 4 below case 5 below
> * mremap move:
> - * AAAA AAAA
> - * PPPP XXXX PPPPNNNNXXXX
> + * **** ****
> + * PPPP NNNN PPPPCCCCNNNN
> * might become might become
> * PPPPPPPPPPPP 1 or PPPPPPPPPPPP 6 or
> - * PPPPPPPPXXXX 2 or PPPPPPPPXXXX 7 or
> - * PPPPXXXXXXXX 3 PPPPXXXXXXXX 8
> + * PPPPPPPPNNNN 2 or PPPPPPPPNNNN 7 or
> + * PPPPNNNNNNNN 3 PPPPNNNNNNNN 8
> *
> - * It is important for case 8 that the vma NNNN overlapping the
> - * region AAAA is never going to extended over XXXX. Instead XXXX must
> - * be extended in region AAAA and NNNN must be removed. This way in
> + * It is important for case 8 that the vma CCCC overlapping the
> + * region **** is never going to extended over NNNN. Instead NNNN must
> + * be extended in region **** and CCCC must be removed. This way in
> * all cases where vma_merge succeeds, the moment vma_merge drops the
> * rmap_locks, the properties of the merged vma will be already
> * correct for the whole merged range. Some of those properties like
> * vm_page_prot/vm_flags may be accessed by rmap_walks and they must
> * be correct for the whole merged range immediately after the
> - * rmap_locks are released. Otherwise if XXXX would be removed and
> - * NNNN would be extended over the XXXX range, remove_migration_ptes
> + * rmap_locks are released. Otherwise if NNNN would be removed and
> + * CCCC would be extended over the NNNN range, remove_migration_ptes
> * or other rmap walkers (if working on addresses beyond the "end"
> - * parameter) may establish ptes with the wrong permissions of NNNN
> - * instead of the right permissions of XXXX.
> + * parameter) may establish ptes with the wrong permissions of CCCC
> + * instead of the right permissions of NNNN.
> *
> * In the code below:
> * PPPP is represented by *prev
> - * NNNN is represented by *mid or not represented at all (NULL)
> - * XXXX is represented by *next or not represented at all (NULL)
> - * AAAA is not represented - it will be merged and the vma containing the
> + * CCCC is represented by *curr or not represented at all (NULL)
> + * NNNN is represented by *next or not represented at all (NULL)
> + * **** is not represented - it will be merged and the vma containing the
> * area is returned, or the function will return NULL
> */
> struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> @@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> {
> pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> pgoff_t vma_pgoff;
> - struct vm_area_struct *mid, *next, *res = NULL;
> + struct vm_area_struct *curr, *next, *res = NULL;
> struct vm_area_struct *vma, *adjust, *remove, *remove2;
> int err = -1;
> bool merge_prev = false;
> @@ -930,19 +930,19 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> if (vm_flags & VM_SPECIAL)
> return NULL;
>
> - mid = find_vma(mm, prev ? prev->vm_end : 0);
> - if (mid && mid->vm_end == end) /* cases 6, 7, 8 */
> - next = find_vma(mm, mid->vm_end);
> + curr = find_vma(mm, prev ? prev->vm_end : 0);
> + if (curr && curr->vm_end == end) /* cases 6, 7, 8 */
> + next = find_vma(mm, curr->vm_end);
> else
> - next = mid;
> + next = curr;
>
> - /* In cases 1 - 4 there's no NNNN vma */
> - if (mid && end <= mid->vm_start)
> - mid = NULL;
> + /* In cases 1 - 4 there's no CCCC vma */
> + if (curr && end <= curr->vm_start)
> + curr = NULL;
>
> /* verify some invariant that must be enforced by the caller */
> VM_WARN_ON(prev && addr <= prev->vm_start);
> - VM_WARN_ON(mid && end > mid->vm_end);
> + VM_WARN_ON(curr && end > curr->vm_end);
> VM_WARN_ON(addr >= end);
>
> if (prev) {
> @@ -974,21 +974,21 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> remove = next; /* case 1 */
> vma_end = next->vm_end;
> err = dup_anon_vma(prev, next);
> - if (mid) { /* case 6 */
> - remove = mid;
> + if (curr) { /* case 6 */
> + remove = curr;
> remove2 = next;
> if (!next->anon_vma)
> - err = dup_anon_vma(prev, mid);
> + err = dup_anon_vma(prev, curr);
> }
> } else if (merge_prev) {
> err = 0; /* case 2 */
> - if (mid) {
> - err = dup_anon_vma(prev, mid);
> - if (end == mid->vm_end) { /* case 7 */
> - remove = mid;
> + if (curr) {
> + err = dup_anon_vma(prev, curr);
> + if (end == curr->vm_end) { /* case 7 */
> + remove = curr;
> } else { /* case 5 */
> - adjust = mid;
> - adj_start = (end - mid->vm_start);
> + adjust = curr;
> + adj_start = (end - curr->vm_start);
> }
> }
> } else if (merge_next) {
> @@ -1004,10 +1004,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> vma_end = next->vm_end;
> vma_pgoff = next->vm_pgoff;
> err = 0;
> - if (mid) { /* case 8 */
> - vma_pgoff = mid->vm_pgoff;
> - remove = mid;
> - err = dup_anon_vma(next, mid);
> + if (curr) { /* case 8 */
> + vma_pgoff = curr->vm_pgoff;
> + remove = curr;
> + err = dup_anon_vma(next, curr);
> }
> }
> }
> --
> 2.39.2
>
* Lorenzo Stoakes <[email protected]> [230322 03:13]:
> Previously, vma was an uninitialised variable which was only definitely
> assigned as a result of the logic covering all possible input cases - for
> it to have remained uninitialised, prev would have to be NULL, and next
> would _have_ to be mergeable.
>
> The value of res defaults to NULL, so we can neatly eliminate the
> assignment to res and vma in the if (prev) block and ensure that both res
> and vma are both explicitly assigned, by just setting both to prev.
>
> In addition we add an explanation as to under what circumstances both might
> change, and since we absolutely do rely on addr == curr->vm_start should
> curr exist, assert that this is the case.
>
> Signed-off-by: Lorenzo Stoakes <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
> ---
> mm/mmap.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index dbdbb92493b2..2a4f63716231 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -911,7 +911,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> {
> pgoff_t pglen = (end - addr) >> PAGE_SHIFT;
> pgoff_t vma_pgoff;
> - struct vm_area_struct *curr, *next, *res = NULL;
> + struct vm_area_struct *curr, *next, *res;
> struct vm_area_struct *vma, *adjust, *remove, *remove2;
> int err = -1;
> bool merge_prev = false;
> @@ -939,14 +939,18 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> else
> next = NULL; /* case 5 */
>
> - /* verify some invariant that must be enforced by the caller */
> + /*
> + * By default, we return prev. Cases 3, 4, 8 will instead return next
> + * and cases 3, 8 will also update vma to point at next.
> + */
> + res = vma = prev;
> +
> + /* Verify some invariant that must be enforced by the caller. */
> VM_WARN_ON(prev && addr <= prev->vm_start);
> - VM_WARN_ON(curr && end > curr->vm_end);
> + VM_WARN_ON(curr && (addr != curr->vm_start || end > curr->vm_end));
> VM_WARN_ON(addr >= end);
>
> if (prev) {
> - res = prev;
> - vma = prev;
> vma_start = prev->vm_start;
> vma_pgoff = prev->vm_pgoff;
> /* Can we merge the predecessor? */
> @@ -957,6 +961,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> vma_prev(vmi);
> }
> }
> +
> /* Can we merge the successor? */
> if (next && mpol_equal(policy, vma_policy(next)) &&
> can_vma_merge_before(next, vm_flags,
> @@ -997,6 +1002,10 @@ struct vm_area_struct *vma_merge(struct vma_iterator *vmi, struct mm_struct *mm,
> adj_start = -(prev->vm_end - addr);
> err = dup_anon_vma(next, prev);
> } else {
> + /*
> + * Note that cases 3 and 8 are the ONLY ones where prev
> + * is permitted to be (but is not necessarily) NULL.
> + */
> vma = next; /* case 3 */
> vma_start = addr;
> vma_end = next->vm_end;
> --
> 2.39.2
>