2012-11-12 11:51:51

by Michel Lespinasse

[permalink] [raw]
Subject: [PATCH 0/3] fix missing rb_subtree_gap updates on vma insert/erase

Using the trinity fuzzer, Sasha Levin uncovered a case where
rb_subtree_gap wasn't correctly updated.

Digging into this, the root cause was that vma insertions and removals
require both an rbtree insert or erase operation (which may trigger
tree rotations), and an update of the next vma's gap (which does not
change the tree topology, but may require iterating on the node's
ancestors to propagate the update). The rbtree rotations caused the
rb_subtree_gap values to be updated in some of the internal nodes, but
without upstream propagation. Then the subsequent update on the next
vma didn't iterate as high up the tree as it should have, as it
stopped as soon as it hit one of the internal nodes that had been
updated as part of a tree rotation.

The fix is to impose that all rb_subtree_gap values must be up to date
before any rbtree insertion or erase, with the possible exception that
the node being erased doesn't need to have an up to date rb_subtree_gap.

These 3 patches apply on top of the stack I previously sent (or equally,
on top of the last published mmotm).

Michel Lespinasse (3):
mm: ensure safe rb_subtree_gap update when inserting new VMA
mm: ensure safe rb_subtree_gap update when removing VMA
mm: debug code to verify rb_subtree_gap updates are safe

mm/mmap.c | 121 +++++++++++++++++++++++++++++++++++++------------------------
1 files changed, 73 insertions(+), 48 deletions(-)

--
1.7.7.3


2012-11-12 11:51:53

by Michel Lespinasse

[permalink] [raw]
Subject: [PATCH 1/3] mm: ensure safe rb_subtree_gap update when inserting new VMA

Using the trinity fuzzer, Sasha Levin uncovered a case where
rb_subtree_gap wasn't correctly updated.

Digging into this, the root cause was that vma insertions and removals
require both an rbtree insert or erase operation (which may trigger
tree rotations), and an update of the next vma's gap (which does not
change the tree topology, but may require iterating on the node's
ancestors to propagate the update). The rbtree rotations caused the
rb_subtree_gap values to be updated in some of the internal nodes, but
without upstream propagation. Then the subsequent update on the next
vma didn't iterate as high up the tree as it should have, as it
stopped as soon as it hit one of the internal nodes that had been
updated as part of a tree rotation.

The fix is to impose that all rb_subtree_gap values must be up to date
before any rbtree insertion or erase, with the possible exception that
the node being erased doesn't need to have an up to date rb_subtree_gap.

This change: during vma insertion, make sure to update the rb_subtree_gap
values for both the current and next vmas prior to rebalancing the rbtree
to account for the just-inserted vma.

(Thanks to Sasha Levin for uncovering the problem and to Hugh Dickins
for coming up with a simpler test case)

Reported-by: Sasha Levin <[email protected]>
Signed-off-by: Michel Lespinasse <[email protected]>

---
mm/mmap.c | 27 +++++++++++++++------------
1 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 619b280505fe..14859b999a9f 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -339,17 +339,7 @@ static void vma_gap_update(struct vm_area_struct *vma)
static inline void vma_rb_insert(struct vm_area_struct *vma,
struct rb_root *root)
{
- /*
- * vma->vm_prev wasn't known when we followed the rbtree to find the
- * correct insertion point for that vma. As a result, we could not
- * update the vma vm_rb parents rb_subtree_gap values on the way down.
- * So, we first insert the vma with a zero rb_subtree_gap value
- * (to be consistent with what we did on the way down), and then
- * immediately update the gap to the correct value.
- */
- vma->rb_subtree_gap = 0;
rb_insert_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
- vma_gap_update(vma);
}

static void vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
@@ -494,12 +484,25 @@ static int find_vma_links(struct mm_struct *mm, unsigned long addr,
void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
struct rb_node **rb_link, struct rb_node *rb_parent)
{
- rb_link_node(&vma->vm_rb, rb_parent, rb_link);
- vma_rb_insert(vma, &mm->mm_rb);
+ /* Update tracking information for the gap following the new vma. */
if (vma->vm_next)
vma_gap_update(vma->vm_next);
else
mm->highest_vm_end = vma->vm_end;
+
+ /*
+ * vma->vm_prev wasn't known when we followed the rbtree to find the
+ * correct insertion point for that vma. As a result, we could not
+ * update the vma vm_rb parents rb_subtree_gap values on the way down.
+ * So, we first insert the vma with a zero rb_subtree_gap value
+ * (to be consistent with what we did on the way down), and then
+ * immediately update the gap to the correct value. Finally we
+ * rebalance the rbtree after all augmented values have been set.
+ */
+ rb_link_node(&vma->vm_rb, rb_parent, rb_link);
+ vma->rb_subtree_gap = 0;
+ vma_gap_update(vma);
+ vma_rb_insert(vma, &mm->mm_rb);
}

static void __vma_link_file(struct vm_area_struct *vma)
--
1.7.7.3

2012-11-12 11:54:44

by Michel Lespinasse

[permalink] [raw]
Subject: [PATCH 3/3] mm: debug code to verify rb_subtree_gap updates are safe

Using the trinity fuzzer, Sasha Levin uncovered a case where
rb_subtree_gap wasn't correctly updated.

Digging into this, the root cause was that vma insertions and removals
require both an rbtree insert or erase operation (which may trigger
tree rotations), and an update of the next vma's gap (which does not
change the tree topology, but may require iterating on the node's
ancestors to propagate the update). The rbtree rotations caused the
rb_subtree_gap values to be updated in some of the internal nodes, but
without upstream propagation. Then the subsequent update on the next
vma didn't iterate as high up the tree as it should have, as it
stopped as soon as it hit one of the internal nodes that had been
updated as part of a tree rotation.

The fix is to impose that all rb_subtree_gap values must be up to date
before any rbtree insertion or erase, with the possible exception that
the node being erased doesn't need to have an up to date rb_subtree_gap.

This change: introduce validate_mm_rb() to verify that the rbtree does
not include any stale rb_subtree_gap values before node insertion or
erase, so as to avoid the issue where a subsequent vma_gap_update() would
fail to propagate the rb_subtree_gap updates as high up as necessary.

Signed-off-by: Michel Lespinasse <[email protected]>

---
mm/mmap.c | 88 ++++++++++++++++++++++++++++++++++++++-----------------------
1 files changed, 55 insertions(+), 33 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index c60ac9fe2d7e..408d330aca6c 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -319,39 +319,6 @@ static long vma_compute_subtree_gap(struct vm_area_struct *vma)
return max;
}

-RB_DECLARE_CALLBACKS(static, vma_gap_callbacks, struct vm_area_struct, vm_rb,
- unsigned long, rb_subtree_gap, vma_compute_subtree_gap)
-
-/*
- * Update augmented rbtree rb_subtree_gap values after vma->vm_start or
- * vma->vm_prev->vm_end values changed, without modifying the vma's position
- * in the rbtree.
- */
-static void vma_gap_update(struct vm_area_struct *vma)
-{
- /*
- * As it turns out, RB_DECLARE_CALLBACKS() already created a callback
- * function that does exacltly what we want.
- */
- vma_gap_callbacks_propagate(&vma->vm_rb, NULL);
-}
-
-static inline void vma_rb_insert(struct vm_area_struct *vma,
- struct rb_root *root)
-{
- rb_insert_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
-}
-
-static void vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
-{
- /*
- * Note rb_erase_augmented is a fairly large inline function,
- * so make sure we instantiate it only once with our desired
- * augmented rbtree callbacks.
- */
- rb_erase_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
-}
-
#ifdef CONFIG_DEBUG_VM_RB
static int browse_rb(struct rb_root *root)
{
@@ -385,6 +352,18 @@ static int browse_rb(struct rb_root *root)
return bug ? -1 : i;
}

+static void validate_mm_rb(struct rb_root *root, struct vm_area_struct *ignore)
+{
+ struct rb_node *nd;
+
+ for (nd = rb_first(root); nd; nd = rb_next(nd)) {
+ struct vm_area_struct *vma;
+ vma = rb_entry(nd, struct vm_area_struct, vm_rb);
+ BUG_ON(vma != ignore &&
+ vma->rb_subtree_gap != vma_compute_subtree_gap(vma));
+ }
+}
+
void validate_mm(struct mm_struct *mm)
{
int bug = 0;
@@ -412,9 +391,52 @@ void validate_mm(struct mm_struct *mm)
BUG_ON(bug);
}
#else
+#define validate_mm_rb(root, ignore) do { } while (0)
#define validate_mm(mm) do { } while (0)
#endif

+RB_DECLARE_CALLBACKS(static, vma_gap_callbacks, struct vm_area_struct, vm_rb,
+ unsigned long, rb_subtree_gap, vma_compute_subtree_gap)
+
+/*
+ * Update augmented rbtree rb_subtree_gap values after vma->vm_start or
+ * vma->vm_prev->vm_end values changed, without modifying the vma's position
+ * in the rbtree.
+ */
+static void vma_gap_update(struct vm_area_struct *vma)
+{
+ /*
+ * As it turns out, RB_DECLARE_CALLBACKS() already created a callback
+ * function that does exacltly what we want.
+ */
+ vma_gap_callbacks_propagate(&vma->vm_rb, NULL);
+}
+
+static inline void vma_rb_insert(struct vm_area_struct *vma,
+ struct rb_root *root)
+{
+ /* All rb_subtree_gap values must be consistent prior to insertion */
+ validate_mm_rb(root, NULL);
+
+ rb_insert_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
+}
+
+static void vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
+{
+ /*
+ * All rb_subtree_gap values must be consistent prior to erase,
+ * with the possible exception of the vma being erased.
+ */
+ validate_mm_rb(root, vma);
+
+ /*
+ * Note rb_erase_augmented is a fairly large inline function,
+ * so make sure we instantiate it only once with our desired
+ * augmented rbtree callbacks.
+ */
+ rb_erase_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
+}
+
/*
* vma has some anon_vma assigned, and is already inserted on that
* anon_vma's interval trees.
--
1.7.7.3

2012-11-12 11:55:08

by Michel Lespinasse

[permalink] [raw]
Subject: [PATCH 2/3] mm: ensure safe rb_subtree_gap update when removing VMA

Using the trinity fuzzer, Sasha Levin uncovered a case where
rb_subtree_gap wasn't correctly updated.

Digging into this, the root cause was that vma insertions and removals
require both an rbtree insert or erase operation (which may trigger
tree rotations), and an update of the next vma's gap (which does not
change the tree topology, but may require iterating on the node's
ancestors to propagate the update). The rbtree rotations caused the
rb_subtree_gap values to be updated in some of the internal nodes, but
without upstream propagation. Then the subsequent update on the next
vma didn't iterate as high up the tree as it should have, as it
stopped as soon as it hit one of the internal nodes that had been
updated as part of a tree rotation.

The fix is to impose that all rb_subtree_gap values must be up to date
before any rbtree insertion or erase, with the possible exception that
the node being erased doesn't need to have an up to date rb_subtree_gap.

This change: during VMA removal, remove VMA from the rbtree before we
remove it from the linked list. The implication is the next vma's
rb_subtree_gap value becomes stale when next->vm_prev is updated,
and we want to make sure vma_rb_erase() runs before there are any
such stale rb_subtree_gap values in the rbtree.

(I don't know of a reproduceable test case for this particular issue)

Signed-off-by: Michel Lespinasse <[email protected]>

---
mm/mmap.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 14859b999a9f..c60ac9fe2d7e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -578,12 +578,12 @@ static inline void
__vma_unlink(struct mm_struct *mm, struct vm_area_struct *vma,
struct vm_area_struct *prev)
{
- struct vm_area_struct *next = vma->vm_next;
+ struct vm_area_struct *next;

- prev->vm_next = next;
+ vma_rb_erase(vma, &mm->mm_rb);
+ prev->vm_next = next = vma->vm_next;
if (next)
next->vm_prev = prev;
- vma_rb_erase(vma, &mm->mm_rb);
if (mm->mmap_cache == vma)
mm->mmap_cache = prev;
}
--
1.7.7.3

2012-11-12 13:40:39

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH 3/3] mm: debug code to verify rb_subtree_gap updates are safe

On 11/12/2012 06:51 AM, Michel Lespinasse wrote:
> Using the trinity fuzzer, Sasha Levin uncovered a case where
> rb_subtree_gap wasn't correctly updated.
>
> Digging into this, the root cause was that vma insertions and removals
> require both an rbtree insert or erase operation (which may trigger
> tree rotations), and an update of the next vma's gap (which does not
> change the tree topology, but may require iterating on the node's
> ancestors to propagate the update). The rbtree rotations caused the
> rb_subtree_gap values to be updated in some of the internal nodes, but
> without upstream propagation. Then the subsequent update on the next
> vma didn't iterate as high up the tree as it should have, as it
> stopped as soon as it hit one of the internal nodes that had been
> updated as part of a tree rotation.
>
> The fix is to impose that all rb_subtree_gap values must be up to date
> before any rbtree insertion or erase, with the possible exception that
> the node being erased doesn't need to have an up to date rb_subtree_gap.
>
> This change: introduce validate_mm_rb() to verify that the rbtree does
> not include any stale rb_subtree_gap values before node insertion or
> erase, so as to avoid the issue where a subsequent vma_gap_update() would
> fail to propagate the rb_subtree_gap updates as high up as necessary.
>
> Signed-off-by: Michel Lespinasse <[email protected]>

Reviewed-by: Rik van Riel <[email protected]>


--
All rights reversed

2012-11-12 14:04:20

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH 2/3] mm: ensure safe rb_subtree_gap update when removing VMA

On 11/12/2012 06:51 AM, Michel Lespinasse wrote:
> Using the trinity fuzzer, Sasha Levin uncovered a case where
> rb_subtree_gap wasn't correctly updated.
>
> Digging into this, the root cause was that vma insertions and removals
> require both an rbtree insert or erase operation (which may trigger
> tree rotations), and an update of the next vma's gap (which does not
> change the tree topology, but may require iterating on the node's
> ancestors to propagate the update). The rbtree rotations caused the
> rb_subtree_gap values to be updated in some of the internal nodes, but
> without upstream propagation. Then the subsequent update on the next
> vma didn't iterate as high up the tree as it should have, as it
> stopped as soon as it hit one of the internal nodes that had been
> updated as part of a tree rotation.
>
> The fix is to impose that all rb_subtree_gap values must be up to date
> before any rbtree insertion or erase, with the possible exception that
> the node being erased doesn't need to have an up to date rb_subtree_gap.
>
> This change: during VMA removal, remove VMA from the rbtree before we
> remove it from the linked list. The implication is the next vma's
> rb_subtree_gap value becomes stale when next->vm_prev is updated,
> and we want to make sure vma_rb_erase() runs before there are any
> such stale rb_subtree_gap values in the rbtree.
>
> (I don't know of a reproduceable test case for this particular issue)
>
> Signed-off-by: Michel Lespinasse <[email protected]>

Reviewed-by: Rik van Riel <[email protected]>


--
All rights reversed

2012-11-12 14:04:19

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH 1/3] mm: ensure safe rb_subtree_gap update when inserting new VMA

On 11/12/2012 06:51 AM, Michel Lespinasse wrote:
> Using the trinity fuzzer, Sasha Levin uncovered a case where
> rb_subtree_gap wasn't correctly updated.
>
> Digging into this, the root cause was that vma insertions and removals
> require both an rbtree insert or erase operation (which may trigger
> tree rotations), and an update of the next vma's gap (which does not
> change the tree topology, but may require iterating on the node's
> ancestors to propagate the update). The rbtree rotations caused the
> rb_subtree_gap values to be updated in some of the internal nodes, but
> without upstream propagation. Then the subsequent update on the next
> vma didn't iterate as high up the tree as it should have, as it
> stopped as soon as it hit one of the internal nodes that had been
> updated as part of a tree rotation.
>
> The fix is to impose that all rb_subtree_gap values must be up to date
> before any rbtree insertion or erase, with the possible exception that
> the node being erased doesn't need to have an up to date rb_subtree_gap.
>
> This change: during vma insertion, make sure to update the rb_subtree_gap
> values for both the current and next vmas prior to rebalancing the rbtree
> to account for the just-inserted vma.
>
> (Thanks to Sasha Levin for uncovering the problem and to Hugh Dickins
> for coming up with a simpler test case)
>
> Reported-by: Sasha Levin <[email protected]>
> Signed-off-by: Michel Lespinasse <[email protected]>

Reviewed-by: Rik van Riel <[email protected]>


--
All rights reversed

2012-11-12 20:55:06

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH 0/3] fix missing rb_subtree_gap updates on vma insert/erase

On 11/12/2012 06:51 AM, Michel Lespinasse wrote:
> Using the trinity fuzzer, Sasha Levin uncovered a case where
> rb_subtree_gap wasn't correctly updated.
>
> Digging into this, the root cause was that vma insertions and removals
> require both an rbtree insert or erase operation (which may trigger
> tree rotations), and an update of the next vma's gap (which does not
> change the tree topology, but may require iterating on the node's
> ancestors to propagate the update). The rbtree rotations caused the
> rb_subtree_gap values to be updated in some of the internal nodes, but
> without upstream propagation. Then the subsequent update on the next
> vma didn't iterate as high up the tree as it should have, as it
> stopped as soon as it hit one of the internal nodes that had been
> updated as part of a tree rotation.
>
> The fix is to impose that all rb_subtree_gap values must be up to date
> before any rbtree insertion or erase, with the possible exception that
> the node being erased doesn't need to have an up to date rb_subtree_gap.
>
> These 3 patches apply on top of the stack I previously sent (or equally,
> on top of the last published mmotm).
>
> Michel Lespinasse (3):
> mm: ensure safe rb_subtree_gap update when inserting new VMA
> mm: ensure safe rb_subtree_gap update when removing VMA
> mm: debug code to verify rb_subtree_gap updates are safe
>
> mm/mmap.c | 121 +++++++++++++++++++++++++++++++++++++------------------------
> 1 files changed, 73 insertions(+), 48 deletions(-)
>

Looking good: old warnings gone, no new warnings.


Thanks,
Sasha

2012-11-27 01:16:50

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH 0/3] fix missing rb_subtree_gap updates on vma insert/erase

On 11/12/2012 03:54 PM, Sasha Levin wrote:
> On 11/12/2012 06:51 AM, Michel Lespinasse wrote:
>> Using the trinity fuzzer, Sasha Levin uncovered a case where
>> rb_subtree_gap wasn't correctly updated.
>>
>> Digging into this, the root cause was that vma insertions and removals
>> require both an rbtree insert or erase operation (which may trigger
>> tree rotations), and an update of the next vma's gap (which does not
>> change the tree topology, but may require iterating on the node's
>> ancestors to propagate the update). The rbtree rotations caused the
>> rb_subtree_gap values to be updated in some of the internal nodes, but
>> without upstream propagation. Then the subsequent update on the next
>> vma didn't iterate as high up the tree as it should have, as it
>> stopped as soon as it hit one of the internal nodes that had been
>> updated as part of a tree rotation.
>>
>> The fix is to impose that all rb_subtree_gap values must be up to date
>> before any rbtree insertion or erase, with the possible exception that
>> the node being erased doesn't need to have an up to date rb_subtree_gap.
>>
>> These 3 patches apply on top of the stack I previously sent (or equally,
>> on top of the last published mmotm).
>>
>> Michel Lespinasse (3):
>> mm: ensure safe rb_subtree_gap update when inserting new VMA
>> mm: ensure safe rb_subtree_gap update when removing VMA
>> mm: debug code to verify rb_subtree_gap updates are safe
>>
>> mm/mmap.c | 121 +++++++++++++++++++++++++++++++++++++------------------------
>> 1 files changed, 73 insertions(+), 48 deletions(-)
>>
>
> Looking good: old warnings gone, no new warnings.


I've built today's -next, and got the following BUG pretty quickly (2-3 hours):

[ 1556.479284] BUG: unable to handle kernel paging request at 0000000000412000
[ 1556.480036] IP: [<ffffffff81238184>] validate_mm+0x34/0x130
[ 1556.480036] PGD 31739067 PUD 4fbc4067 PMD 1c936067 PTE 0
[ 1556.480036] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 1556.480036] Dumping ftrace buffer:
[ 1556.480036] (ftrace buffer empty)
[ 1556.480036] CPU 0
[ 1556.480036] Pid: 10274, comm: trinity-child29 Tainted: G W 3.7.0-rc6-next-20121126-sasha-00015-gb04382b-dirty #201
[ 1556.480036] RIP: 0010:[<ffffffff81238184>] [<ffffffff81238184>] validate_mm+0x34/0x130
[ 1556.480036] RSP: 0018:ffff88004fbc7d08 EFLAGS: 00010206
[ 1556.480036] RAX: 0000000000412000 RBX: 0000000000000000 RCX: 0000000000000000
[ 1556.512120] RDX: 0000000000000000 RSI: ffff88001c1a6008 RDI: ffff88001c1a6000
[ 1556.512120] RBP: ffff88004fbc7d38 R08: ffff8800371e7808 R09: ffff88004fb56cf0
[ 1556.512120] R10: 0000000000000001 R11: 0000000000001000 R12: ffff88001c1a6000
[ 1556.512120] R13: ffff8800371e7b00 R14: 0000000000000000 R15: ffff88001c1a6000
[ 1556.512120] FS: 00007f4e0f8e3700(0000) GS:ffff8800bfc00000(0000) knlGS:0000000000000000
[ 1556.512120] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1556.512120] CR2: 0000000000412000 CR3: 000000002faec000 CR4: 00000000000406f0
[ 1556.512120] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1556.512120] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1556.512120] Process trinity-child29 (pid: 10274, threadinfo ffff88004fbc6000, task ffff88004fbb0000)
[ 1556.512120] Stack:
[ 1556.512120] ffff8800bf80aa80 ffff88001c1a6000 ffff88004fb56cf0 ffff8800371e7818
[ 1556.512120] ffff8800371e7808 ffff88001c1a6000 ffff88004fbc7d88 ffffffff8123843c
[ 1556.512120] 0000000000000001 ffff88004fb56da8 ffff880000000000 ffff8800371e7818
[ 1556.512120] Call Trace:
[ 1556.512120] [<ffffffff8123843c>] vma_link+0xcc/0xf0
[ 1556.512120] [<ffffffff8123a8ac>] mmap_region+0x40c/0x5a0
[ 1556.512120] [<ffffffff8123aceb>] do_mmap_pgoff+0x2ab/0x310
[ 1556.512120] [<ffffffff8122477c>] ? vm_mmap_pgoff+0x6c/0xb0
[ 1556.512120] [<ffffffff81224794>] vm_mmap_pgoff+0x84/0xb0
[ 1556.512120] [<ffffffff81239483>] sys_mmap_pgoff+0x193/0x1a0
[ 1556.512120] [<ffffffff81182b08>] ? trace_hardirqs_on_caller+0x118/0x140
[ 1556.512120] [<ffffffff810729ad>] sys_mmap+0x1d/0x20
[ 1556.512120] [<ffffffff83c88418>] tracesys+0xe1/0xe6
[ 1556.512120] Code: 31 f6 41 55 41 54 49 89 fc 53 31 db 48 83 ec 08 4c 8b 2f 4d 85 ed 74 75 0f 1f 80 00 00 00 00 49 8b 85 88 00 00
00 48 85 c0 74 0e <48> 8b 38 31 f6 48 83 c7 08 e8 0e bc a4 02 49 8b 45 78 4d 8d 7d
[ 1556.512120] RIP [<ffffffff81238184>] validate_mm+0x34/0x130
[ 1556.512120] RSP <ffff88004fbc7d08>
[ 1556.512120] CR2: 0000000000412000
[ 1557.729958] ---[ end trace d2a29e98cc9e2568 ]---

The bit that's failing is:

struct vm_area_struct *vma = mm->mmap; // mm->mmap = 0x412000
while (vma) {
struct anon_vma_chain *avc;
vma_lock_anon_vma(vma); // BOOM!


Thanks,
Sasha

2012-11-27 04:55:24

by Michel Lespinasse

[permalink] [raw]
Subject: Re: [PATCH 0/3] fix missing rb_subtree_gap updates on vma insert/erase

On Mon, Nov 26, 2012 at 5:16 PM, Sasha Levin <[email protected]> wrote:
> I've built today's -next, and got the following BUG pretty quickly (2-3 hours):
>
> [ 1556.479284] BUG: unable to handle kernel paging request at 0000000000412000
> [ 1556.480036] IP: [<ffffffff81238184>] validate_mm+0x34/0x130
> [ 1556.480036] PGD 31739067 PUD 4fbc4067 PMD 1c936067 PTE 0
> [ 1556.480036] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [ 1556.480036] Dumping ftrace buffer:
> [ 1556.480036] (ftrace buffer empty)
> [ 1556.480036] CPU 0
> [ 1556.480036] Pid: 10274, comm: trinity-child29 Tainted: G W 3.7.0-rc6-next-20121126-sasha-00015-gb04382b-dirty #201
> [ 1556.480036] RIP: 0010:[<ffffffff81238184>] [<ffffffff81238184>] validate_mm+0x34/0x130
> [ 1556.480036] RSP: 0018:ffff88004fbc7d08 EFLAGS: 00010206
> [ 1556.480036] RAX: 0000000000412000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 1556.512120] RDX: 0000000000000000 RSI: ffff88001c1a6008 RDI: ffff88001c1a6000
> [ 1556.512120] RBP: ffff88004fbc7d38 R08: ffff8800371e7808 R09: ffff88004fb56cf0
> [ 1556.512120] R10: 0000000000000001 R11: 0000000000001000 R12: ffff88001c1a6000
> [ 1556.512120] R13: ffff8800371e7b00 R14: 0000000000000000 R15: ffff88001c1a6000
> [ 1556.512120] FS: 00007f4e0f8e3700(0000) GS:ffff8800bfc00000(0000) knlGS:0000000000000000
> [ 1556.512120] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1556.512120] CR2: 0000000000412000 CR3: 000000002faec000 CR4: 00000000000406f0
> [ 1556.512120] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1556.512120] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 1556.512120] Process trinity-child29 (pid: 10274, threadinfo ffff88004fbc6000, task ffff88004fbb0000)
> [ 1556.512120] Stack:
> [ 1556.512120] ffff8800bf80aa80 ffff88001c1a6000 ffff88004fb56cf0 ffff8800371e7818
> [ 1556.512120] ffff8800371e7808 ffff88001c1a6000 ffff88004fbc7d88 ffffffff8123843c
> [ 1556.512120] 0000000000000001 ffff88004fb56da8 ffff880000000000 ffff8800371e7818
> [ 1556.512120] Call Trace:
> [ 1556.512120] [<ffffffff8123843c>] vma_link+0xcc/0xf0
> [ 1556.512120] [<ffffffff8123a8ac>] mmap_region+0x40c/0x5a0
> [ 1556.512120] [<ffffffff8123aceb>] do_mmap_pgoff+0x2ab/0x310
> [ 1556.512120] [<ffffffff8122477c>] ? vm_mmap_pgoff+0x6c/0xb0
> [ 1556.512120] [<ffffffff81224794>] vm_mmap_pgoff+0x84/0xb0
> [ 1556.512120] [<ffffffff81239483>] sys_mmap_pgoff+0x193/0x1a0
> [ 1556.512120] [<ffffffff81182b08>] ? trace_hardirqs_on_caller+0x118/0x140
> [ 1556.512120] [<ffffffff810729ad>] sys_mmap+0x1d/0x20
> [ 1556.512120] [<ffffffff83c88418>] tracesys+0xe1/0xe6
> [ 1556.512120] Code: 31 f6 41 55 41 54 49 89 fc 53 31 db 48 83 ec 08 4c 8b 2f 4d 85 ed 74 75 0f 1f 80 00 00 00 00 49 8b 85 88 00 00
> 00 48 85 c0 74 0e <48> 8b 38 31 f6 48 83 c7 08 e8 0e bc a4 02 49 8b 45 78 4d 8d 7d
> [ 1556.512120] RIP [<ffffffff81238184>] validate_mm+0x34/0x130
> [ 1556.512120] RSP <ffff88004fbc7d08>
> [ 1556.512120] CR2: 0000000000412000
> [ 1557.729958] ---[ end trace d2a29e98cc9e2568 ]---
>
> The bit that's failing is:
>
> struct vm_area_struct *vma = mm->mmap; // mm->mmap = 0x412000
> while (vma) {
> struct anon_vma_chain *avc;
> vma_lock_anon_vma(vma); // BOOM!
>
>
> Thanks,
> Sasha

Thanks for the report.

I believe we actually have mm->mmap = ffff8800371e7b00 (r13); That
first vma has its anon_vma field pointing to 0000000000412000 (rax)
and we fault while trying to read anon_vma->root.

I don't see what could cause vma->anon_vma to be an invalid non-null
value. This looks very much like there might be some kind of memory
corruption occuring, but I can't tell where it would come from.

Going back into your previous reports, we also never really identified
the root cause of your two reports at the start of the thread with
subject: "mm: NULL ptr deref in anon_vma_interval_tree_verify" (on Oct
18th and Oct 25th). At some point we thought that taking the anon_vma
lock in validate_mm would prevent a race and fix the issue, but
further inspection convinced us that this shouldn't be necessary - so,
in the end, we still don't know what caused us to crash in these two
cases either (and, I was tempted to suggest memory corruption at the
time too).

So, I'm not sure what to do.

One thing to keep in mind is that CONFIG_DEBUG_VM_RB was only recently
introduced (between v3.6 and v3.7-rc1); before that the code existed
only as an #ifdef in mm/mmap.c. So, it might help if you could run
your trinity test on these few kernel versions:

- first, on v3.6, after editing mm/mmap.c to replace #undef
DEBUG_MM_RB with #define DEBUG_MM_RB;
(if this fails here, then what we have is a latent bug that
CONFIG_DEBUG_VM_RB just happened to reveal)

- then, on v3.7-rc1 with no further changes
(if this fails here, then the issue is likely with my rbtree intervals
stuff that made it into v3.7-rc1)

- finally, on the akpm-base branch in linux-next tree
(if this fails here, then the issue may be some corruption caused by
one of the trees other than -mm)

Sorry to request this; I'm really not sure what else to try at this point :/

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.