2010-06-01 17:00:31

by Venkatesh Pallipadi

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Sat, May 29, 2010 at 6:31 AM, Peter Zijlstra <[email protected]> wrote:
> A better changelog, and I guess we can keep the explanation of
> the augmented rb-tree since it doesn't actually mentions the
> interface, merely the idea of adding a heap to the binary tree.
>
> The patch simply reverts the rbtree.c hunk, and keeps the non-conforming
> style it has..
>
> ---
> Subject: rbtree: Undo augmented damage
> From: Peter Zijlstra <[email protected]>
> Date: Sat May 29 15:14:02 CEST 2010
>
> Reimplement augmented RB-trees without sprinkling extra branches
> all over the RB-tree code (which lives in the scheduler hot path).
>
> This approach is 'borrowed' from Fabio's BFQ implementation and
> relies on traversing the rebalance path after the RB-tree-op to
> correct the heap property for insertion/removal and make up for
> the damage done by the tree rotations.
>
> For insertion the rebalance path is trivially that from the new
> node upwards to the root, for removal it is that from the deepest
> node in the path from the to be removed node that will still
> be around after the removal.

Yes. I like avoiding the sprinkled if's. But, I don't think
rb_augment_erase_begin() and rb_augment_insert() is covering all the
callbacks needed to maintain the augmented tree correctly. More
comments below.

>
> Cc: Fabio Checconi <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> ?arch/x86/mm/pat_rbtree.c | ? ?9 ++--
> ?include/linux/rbtree.h ? | ? 13 ++++--
> ?lib/rbtree.c ? ? ? ? ? ? | ? 97 +++++++++++++++++++++++++----------------------
> ?3 files changed, 68 insertions(+), 51 deletions(-)
>
> Index: linux-2.6/arch/x86/mm/pat_rbtree.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/mm/pat_rbtree.c
> +++ linux-2.6/arch/x86/mm/pat_rbtree.c
> @@ -34,8 +34,7 @@
> ?* memtype_lock protects the rbtree.
> ?*/
>
> -static void memtype_rb_augment_cb(struct rb_node *node);
> -static struct rb_root memtype_rbroot = RB_AUGMENT_ROOT(&memtype_rb_augment_cb);
> +static struct rb_root memtype_rbroot = RB_ROOT;
>
> ?static int is_node_overlap(struct memtype *node, u64 start, u64 end)
> ?{
> @@ -190,7 +189,7 @@ failure:
> ? ? ? ?return -EBUSY;
> ?}
>
> -static void memtype_rb_augment_cb(struct rb_node *node)
> +static void memtype_rb_augment_cb(struct rb_node *node, void *data)
> ?{
> ? ? ? ?if (node)
> ? ? ? ? ? ? ? ?update_path_max_end(node);

There is a optimization in memtype_rb_augment_cb, where in it does not
do the fixup all the way to the root, when a nodes max_end does not
change. That has to be removed for this change to work.


> @@ -213,6 +212,7 @@ static void memtype_rb_insert(struct rb_
>
> ? ? ? ?rb_link_node(&newdata->rb, parent, node);
> ? ? ? ?rb_insert_color(&newdata->rb, root);
> + ? ? ? rb_augment_insert(&newdata->rb, memtype_rb_augment_cb, NULL);
> ?}
>

<snip>

> @@ -324,6 +284,55 @@ void rb_erase(struct rb_node *node, stru
> ?EXPORT_SYMBOL(rb_erase);
>
> ?/*
> + * after inserting @node into the tree, update the tree to account for
> + * both the new entry and any damage done by rebalance
> + */
> +void rb_augment_insert(struct rb_node *node, rb_augment_f func, void *data)
> +{
> + ? ? ? if (node->rb_left)
> + ? ? ? ? ? ? ? node = node->rb_left;
> + ? ? ? else if (node->rb_right)
> + ? ? ? ? ? ? ? node = node->rb_right;
> +
> + ? ? ? func(node, data);
> +}

I am not seeing how this can cover all the callbacks we will need to
maintain the augmented tree property. May be I am missing something.

For example, during insertion there can be a rotate at the grand
parent. So, (bad ascii art alert)

G
/ \
P U
/
N

P
/ \
N G
\
U

Where N is the new node. Here G has new children and has to
recalculate the nodes augmented data. I don't see that happening with
rb_augment_insert()


> +
> +/*
> + * before removing the node, find the deepest node on the rebalance path
> + * that will still be there after @node gets removed
> + */
> +struct rb_node *rb_augment_erase_begin(struct rb_node *node)
> +{
> + ? ? ? struct rb_node *deepest;
> +
> + ? ? ? if (!node->rb_right && !node->rb_left)
> + ? ? ? ? ? ? ? deepest = rb_parent(node);
> + ? ? ? else if (!node->rb_right)
> + ? ? ? ? ? ? ? deepest = node->rb_left;
> + ? ? ? else if (!node->rb_left)
> + ? ? ? ? ? ? ? deepest = node->rb_right;
> + ? ? ? else {
> + ? ? ? ? ? ? ? deepest = rb_next(node);
> + ? ? ? ? ? ? ? if (deepest->rb_right)
> + ? ? ? ? ? ? ? ? ? ? ? deepest = deepest->rb_right;
> + ? ? ? ? ? ? ? else if (rb_parent(deepest) != node)
> + ? ? ? ? ? ? ? ? ? ? ? deepest = rb_parent(deepest);
> + ? ? ? }
> +
> + ? ? ? return deepest;
> +}
> +

I have similar concern with rb_augment_erase_begin() returning the
deepest node. There is a case in __rb_erase_color where we do a
rotate_left of 'other' (sibling) node. That changes the children of
some nodes that are not in deepest to root path. No?

Thanks,
Venki


2010-06-01 17:19:03

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Tue, 2010-06-01 at 10:00 -0700, Venkatesh Pallipadi wrote:
> I am not seeing how this can cover all the callbacks we will need to
> maintain the augmented tree property. May be I am missing something.

Yes, indeed, how about the below delta, which my eevdf code did do but I
overlooked on the conversion to the PAT code.

That removes the break as you said, but also adds code to update the
child nodes when walking up the path.

So in your rotation case:

G P
/ \ --> / \
P U N G
/ <-- \
N U

Say we take the right rotation, then the traversal up from N to P will
find that since N was the left child of P and it has a right child (G)
it will also update G.

Signed-off-by: Peter Zijlstra <[email protected]>
---
arch/x86/mm/pat_rbtree.c | 23 ++++++++++++-----------
1 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index f537087..ca19aae 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -81,20 +81,21 @@ static void update_node_max_end(struct rb_node *node)
/* Update 'subtree_max_end' for a node and all its ancestors */
static void update_path_max_end(struct rb_node *node)
{
- u64 old_max_end, new_max_end;
+ struct rb_node *parent;

- while (node) {
- struct memtype *data = container_of(node, struct memtype, rb);
-
- old_max_end = data->subtree_max_end;
- update_node_max_end(node);
- new_max_end = data->subtree_max_end;
+up:
+ update_node_max_end(node);
+ parent = rb_parent(node);
+ if (!parent)
+ return;

- if (new_max_end == old_max_end)
- break;
+ if (node == parent->rb_left && parent->rb_right)
+ update_node_max_end(parent->rb_right);
+ else if (parent->rb_left)
+ update_node_max_end(parent->rb_left);

- node = rb_parent(node);
- }
+ node = parent;
+ goto up;
}

/* Find the first (lowest start addr) overlapping range from rb tree */

2010-06-01 17:34:07

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Tue, 2010-06-01 at 19:19 +0200, Peter Zijlstra wrote:
> On Tue, 2010-06-01 at 10:00 -0700, Venkatesh Pallipadi wrote:
> > I am not seeing how this can cover all the callbacks we will need to
> > maintain the augmented tree property. May be I am missing something.
>
> Yes, indeed, how about the below delta, which my eevdf code did do but I
> overlooked on the conversion to the PAT code.
>
> That removes the break as you said, but also adds code to update the
> child nodes when walking up the path.
>
> So in your rotation case:
>
> G P
> / \ --> / \
> P U N G
> / <-- \
> N U
>
> Say we take the right rotation, then the traversal up from N to P will
> find that since N was the left child of P and it has a right child (G)
> it will also update G.

Hmm, that looks like it could well be folded in to the generic code..
let me spin a new patch.

2010-06-01 17:34:39

by Venkatesh Pallipadi

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Tue, Jun 1, 2010 at 10:19 AM, Peter Zijlstra <[email protected]> wrote:
> On Tue, 2010-06-01 at 10:00 -0700, Venkatesh Pallipadi wrote:
>> I am not seeing how this can cover all the callbacks we will need to
>> maintain the augmented tree property. May be I am missing something.
>
> Yes, indeed, how about the below delta, which my eevdf code did do but I
> overlooked on the conversion to the PAT code.
>
> That removes the break as you said, but also adds code to update the
> child nodes when walking up the path.
>
> So in your rotation case:
>
> ? ?G ? ? ? ? ? ? P
> ? / \ ? ?--> ? ?/ \
> ?P ? U ? ? ? ? N ? G
> ?/ ? ? ? ?<-- ? ? ? ?\
> N ? ? ? ? ? ? ? ? ? ? U
>
> Say we take the right rotation, then the traversal up from N to P will
> find that since N was the left child of P and it has a right child (G)
> it will also update G.

Yes. This will cover all the cases on insert. But on erase, there is
still a case where a rotate of sibling node is done during the
re-coloration process. There we have a child change on sibling's
child. I am not able to think of any easy way to handle that case.

Thanks,
Venki

>
> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> ?arch/x86/mm/pat_rbtree.c | ? 23 ++++++++++++-----------
> ?1 files changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
> index f537087..ca19aae 100644
> --- a/arch/x86/mm/pat_rbtree.c
> +++ b/arch/x86/mm/pat_rbtree.c
> @@ -81,20 +81,21 @@ static void update_node_max_end(struct rb_node *node)
> ?/* Update 'subtree_max_end' for a node and all its ancestors */
> ?static void update_path_max_end(struct rb_node *node)
> ?{
> - ? ? ? u64 old_max_end, new_max_end;
> + ? ? ? struct rb_node *parent;
>
> - ? ? ? while (node) {
> - ? ? ? ? ? ? ? struct memtype *data = container_of(node, struct memtype, rb);
> -
> - ? ? ? ? ? ? ? old_max_end = data->subtree_max_end;
> - ? ? ? ? ? ? ? update_node_max_end(node);
> - ? ? ? ? ? ? ? new_max_end = data->subtree_max_end;
> +up:
> + ? ? ? update_node_max_end(node);
> + ? ? ? parent = rb_parent(node);
> + ? ? ? if (!parent)
> + ? ? ? ? ? ? ? return;
>
> - ? ? ? ? ? ? ? if (new_max_end == old_max_end)
> - ? ? ? ? ? ? ? ? ? ? ? break;
> + ? ? ? if (node == parent->rb_left && parent->rb_right)
> + ? ? ? ? ? ? ? update_node_max_end(parent->rb_right);
> + ? ? ? else if (parent->rb_left)
> + ? ? ? ? ? ? ? update_node_max_end(parent->rb_left);
>
> - ? ? ? ? ? ? ? node = rb_parent(node);
> - ? ? ? }
> + ? ? ? node = parent;
> + ? ? ? goto up;
> ?}
>
> ?/* Find the first (lowest start addr) overlapping range from rb tree */
>
>

2010-06-01 17:42:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Tue, 2010-06-01 at 10:34 -0700, Venkatesh Pallipadi wrote:
>
> Yes. This will cover all the cases on insert. But on erase, there is
> still a case where a rotate of sibling node is done during the
> re-coloration process. There we have a child change on sibling's
> child. I am not able to think of any easy way to handle that case.

Let me go draw some figures with pen and paper to match up the erase
path with the rb_augment_erase_begin() code, because I can't quite spot
the case we're missing.

If you have it handy, ascii art might help..

Below a new patch with the missing bit folded into the generic code.


---
Subject: rbtree: Undo augmented damage
From: Peter Zijlstra <[email protected]>
Date: Sat, 29 May 2010 15:31:43 +0200

Reimplement augmented RB-trees without sprinkling extra branches
all over the RB-tree code (which lives in the scheduler hot path).

This approach is 'borrowed' from Fabio's BFQ implementation and
relies on traversing the rebalance path after the RB-tree-op to
correct the heap property for insertion/removal and make up for
the damage done by the tree rotations.

For insertion the rebalance path is trivially that from the new
node upwards to the root, for removal it is that from the deepest
node in the path from the to be removed node that will still
be around after the removal.

Cc: Fabio Checconi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Peter Zijlstra <[email protected]>
---
arch/x86/mm/pat_rbtree.c | 34 ++-----------
include/linux/rbtree.h | 13 +++--
lib/rbtree.c | 116 +++++++++++++++++++++++++++++------------------
3 files changed, 87 insertions(+), 76 deletions(-)

Index: linux-2.6/arch/x86/mm/pat_rbtree.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pat_rbtree.c
+++ linux-2.6/arch/x86/mm/pat_rbtree.c
@@ -34,8 +34,7 @@
* memtype_lock protects the rbtree.
*/

-static void memtype_rb_augment_cb(struct rb_node *node);
-static struct rb_root memtype_rbroot = RB_AUGMENT_ROOT(&memtype_rb_augment_cb);
+static struct rb_root memtype_rbroot = RB_ROOT;

static int is_node_overlap(struct memtype *node, u64 start, u64 end)
{
@@ -56,7 +55,7 @@ static u64 get_subtree_max_end(struct rb
}

/* Update 'subtree_max_end' for a node, based on node and its children */
-static void update_node_max_end(struct rb_node *node)
+static void memtype_rb_augment_cb(struct rb_node *node, void *__unused)
{
struct memtype *data;
u64 max_end, child_max_end;
@@ -78,25 +77,6 @@ static void update_node_max_end(struct r
data->subtree_max_end = max_end;
}

-/* Update 'subtree_max_end' for a node and all its ancestors */
-static void update_path_max_end(struct rb_node *node)
-{
- u64 old_max_end, new_max_end;
-
- while (node) {
- struct memtype *data = container_of(node, struct memtype, rb);
-
- old_max_end = data->subtree_max_end;
- update_node_max_end(node);
- new_max_end = data->subtree_max_end;
-
- if (new_max_end == old_max_end)
- break;
-
- node = rb_parent(node);
- }
-}
-
/* Find the first (lowest start addr) overlapping range from rb tree */
static struct memtype *memtype_rb_lowest_match(struct rb_root *root,
u64 start, u64 end)
@@ -190,12 +170,6 @@ failure:
return -EBUSY;
}

-static void memtype_rb_augment_cb(struct rb_node *node)
-{
- if (node)
- update_path_max_end(node);
-}
-
static void memtype_rb_insert(struct rb_root *root, struct memtype *newdata)
{
struct rb_node **node = &(root->rb_node);
@@ -213,6 +187,7 @@ static void memtype_rb_insert(struct rb_

rb_link_node(&newdata->rb, parent, node);
rb_insert_color(&newdata->rb, root);
+ rb_augment_insert(&newdata->rb, memtype_rb_augment_cb, NULL);
}

int rbt_memtype_check_insert(struct memtype *new, unsigned long *ret_type)
@@ -233,13 +208,16 @@ int rbt_memtype_check_insert(struct memt

struct memtype *rbt_memtype_erase(u64 start, u64 end)
{
+ struct rb_node *deepest;
struct memtype *data;

data = memtype_rb_exact_match(&memtype_rbroot, start, end);
if (!data)
goto out;

+ deepest = rb_augment_erase_begin(&data->rb);
rb_erase(&data->rb, &memtype_rbroot);
+ rb_augment_erase_end(deepest, memtype_rb_augment_cb, NULL);
out:
return data;
}
Index: linux-2.6/include/linux/rbtree.h
===================================================================
--- linux-2.6.orig/include/linux/rbtree.h
+++ linux-2.6/include/linux/rbtree.h
@@ -110,7 +110,6 @@ struct rb_node
struct rb_root
{
struct rb_node *rb_node;
- void (*augment_cb)(struct rb_node *node);
};


@@ -130,9 +129,7 @@ static inline void rb_set_color(struct r
rb->rb_parent_color = (rb->rb_parent_color & ~1) | color;
}

-#define RB_ROOT (struct rb_root) { NULL, NULL, }
-#define RB_AUGMENT_ROOT(x) (struct rb_root) { NULL, x}
-
+#define RB_ROOT (struct rb_root) { NULL, }
#define rb_entry(ptr, type, member) container_of(ptr, type, member)

#define RB_EMPTY_ROOT(root) ((root)->rb_node == NULL)
@@ -142,6 +139,14 @@ static inline void rb_set_color(struct r
extern void rb_insert_color(struct rb_node *, struct rb_root *);
extern void rb_erase(struct rb_node *, struct rb_root *);

+typedef void (*rb_augment_f)(struct rb_node *node, void *data);
+
+extern void rb_augment_insert(struct rb_node *node,
+ rb_augment_f func, void *data);
+extern struct rb_node *rb_augment_erase_begin(struct rb_node *node);
+extern void rb_augment_erase_end(struct rb_node *node,
+ rb_augment_f func, void *data);
+
/* Find logical next and previous nodes in a tree */
extern struct rb_node *rb_next(const struct rb_node *);
extern struct rb_node *rb_prev(const struct rb_node *);
Index: linux-2.6/lib/rbtree.c
===================================================================
--- linux-2.6.orig/lib/rbtree.c
+++ linux-2.6/lib/rbtree.c
@@ -44,11 +44,6 @@ static void __rb_rotate_left(struct rb_n
else
root->rb_node = right;
rb_set_parent(node, right);
-
- if (root->augment_cb) {
- root->augment_cb(node);
- root->augment_cb(right);
- }
}

static void __rb_rotate_right(struct rb_node *node, struct rb_root *root)
@@ -72,20 +67,12 @@ static void __rb_rotate_right(struct rb_
else
root->rb_node = left;
rb_set_parent(node, left);
-
- if (root->augment_cb) {
- root->augment_cb(node);
- root->augment_cb(left);
- }
}

void rb_insert_color(struct rb_node *node, struct rb_root *root)
{
struct rb_node *parent, *gparent;

- if (root->augment_cb)
- root->augment_cb(node);
-
while ((parent = rb_parent(node)) && rb_is_red(parent))
{
gparent = rb_parent(parent);
@@ -240,15 +227,12 @@ void rb_erase(struct rb_node *node, stru
else
{
struct rb_node *old = node, *left;
- int old_parent_cb = 0;
- int successor_parent_cb = 0;

node = node->rb_right;
while ((left = node->rb_left) != NULL)
node = left;

if (rb_parent(old)) {
- old_parent_cb = 1;
if (rb_parent(old)->rb_left == old)
rb_parent(old)->rb_left = node;
else
@@ -263,10 +247,8 @@ void rb_erase(struct rb_node *node, stru
if (parent == old) {
parent = node;
} else {
- successor_parent_cb = 1;
if (child)
rb_set_parent(child, parent);
-
parent->rb_left = child;

node->rb_right = old->rb_right;
@@ -277,24 +259,6 @@ void rb_erase(struct rb_node *node, stru
node->rb_left = old->rb_left;
rb_set_parent(old->rb_left, node);

- if (root->augment_cb) {
- /*
- * Here, three different nodes can have new children.
- * The parent of the successor node that was selected
- * to replace the node to be erased.
- * The node that is getting erased and is now replaced
- * by its successor.
- * The parent of the node getting erased-replaced.
- */
- if (successor_parent_cb)
- root->augment_cb(parent);
-
- root->augment_cb(node);
-
- if (old_parent_cb)
- root->augment_cb(rb_parent(old));
- }
-
goto color;
}

@@ -303,19 +267,15 @@ void rb_erase(struct rb_node *node, stru

if (child)
rb_set_parent(child, parent);
-
- if (parent) {
+ if (parent)
+ {
if (parent->rb_left == node)
parent->rb_left = child;
else
parent->rb_right = child;
-
- if (root->augment_cb)
- root->augment_cb(parent);
-
- } else {
- root->rb_node = child;
}
+ else
+ root->rb_node = child;

color:
if (color == RB_BLACK)
@@ -323,6 +283,74 @@ void rb_erase(struct rb_node *node, stru
}
EXPORT_SYMBOL(rb_erase);

+static void rb_augment_path(struct rb_node *node, rb_augment_f func, void *data)
+{
+ struct rb_node *parent;
+
+up:
+ func(node, data);
+ parent = rb_parent(node);
+ if (!parent)
+ return;
+
+ if (node == parent->rb_left && parent->rb_right)
+ func(parent->rb_right, data);
+ else if (parent->rb_left)
+ func(parent->rb_left, data);
+
+ node = parent;
+ goto up;
+}
+
+/*
+ * after inserting @node into the tree, update the tree to account for
+ * both the new entry and any damage done by rebalance
+ */
+void rb_augment_insert(struct rb_node *node, rb_augment_f func, void *data)
+{
+ if (node->rb_left)
+ node = node->rb_left;
+ else if (node->rb_right)
+ node = node->rb_right;
+
+ rb_augment_path(node, func, data);
+}
+
+/*
+ * before removing the node, find the deepest node on the rebalance path
+ * that will still be there after @node gets removed
+ */
+struct rb_node *rb_augment_erase_begin(struct rb_node *node)
+{
+ struct rb_node *deepest;
+
+ if (!node->rb_right && !node->rb_left)
+ deepest = rb_parent(node);
+ else if (!node->rb_right)
+ deepest = node->rb_left;
+ else if (!node->rb_left)
+ deepest = node->rb_right;
+ else {
+ deepest = rb_next(node);
+ if (deepest->rb_right)
+ deepest = deepest->rb_right;
+ else if (rb_parent(deepest) != node)
+ deepest = rb_parent(deepest);
+ }
+
+ return deepest;
+}
+
+/*
+ * after removal, update the tree to account for the removed entry
+ * and any rebalance damage.
+ */
+void rb_augment_erase_end(struct rb_node *node, rb_augment_f func, void *data)
+{
+ if (node)
+ rb_augment_path(node, func, data);
+}
+
/*
* This function returns the first node (in sort order) of the tree.
*/

2010-06-01 18:09:34

by Venkatesh Pallipadi

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Tue, Jun 1, 2010 at 10:42 AM, Peter Zijlstra <[email protected]> wrote:
> On Tue, 2010-06-01 at 10:34 -0700, Venkatesh Pallipadi wrote:
>>
>> Yes. This will cover all the cases on insert. But on erase, there is
>> still a case where a rotate of sibling node is done during the
>> re-coloration process. There we have a child change on sibling's
>> child. I am not able to think of any easy way to handle that case.
>
> Let me go draw some figures with pen and paper to match up the erase
> path with the rb_augment_erase_begin() code, because I can't quite spot
> the case we're missing.
>
> If you have it handy, ascii art might help..

It is this case

P
/ \
N S
/ \
SL SR

changing to

P
/ \
N SL
\
S
\
SR

This can happen in __rb_erase_color, where 'other' points to sibling node.

Thanks,
Venki

2010-06-01 18:42:06

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Tue, 2010-06-01 at 11:09 -0700, Venkatesh Pallipadi wrote:
> On Tue, Jun 1, 2010 at 10:42 AM, Peter Zijlstra <[email protected]> wrote:
> > On Tue, 2010-06-01 at 10:34 -0700, Venkatesh Pallipadi wrote:
> >>
> >> Yes. This will cover all the cases on insert. But on erase, there is
> >> still a case where a rotate of sibling node is done during the
> >> re-coloration process. There we have a child change on sibling's
> >> child. I am not able to think of any easy way to handle that case.
> >
> > Let me go draw some figures with pen and paper to match up the erase
> > path with the rb_augment_erase_begin() code, because I can't quite spot
> > the case we're missing.
> >
> > If you have it handy, ascii art might help..
>
> It is this case
>
> P
> / \
> N S
> / \
> SL SR
>
> changing to
>
> P
> / \
> N SL
> \
> S
> \
> SR

Right, but see: http://en.wikipedia.org/wiki/Red-black_tree
That is delete_case5, however then we fall into delete_case6 and perform
a left rotation.

So suppose we start with the tree:

P P P SL
/ \ / \ / \ / \
D S --> N S --> N SL --> P S
\ / \ / \ \ / \
N SL SR SL* SR S* N SR
\
SR

and then remove D, delete case 5 and finally delete case 6, * marks red.

rb_augment_erase_begin(D) will return N, and then rb_augment_path(N)
will re-augment: N, P, SL and S.

2010-06-01 20:31:40

by Venkatesh Pallipadi

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Tue, Jun 1, 2010 at 11:42 AM, Peter Zijlstra <[email protected]> wrote:
> On Tue, 2010-06-01 at 11:09 -0700, Venkatesh Pallipadi wrote:
>> On Tue, Jun 1, 2010 at 10:42 AM, Peter Zijlstra <[email protected]> wrote:
>> > On Tue, 2010-06-01 at 10:34 -0700, Venkatesh Pallipadi wrote:
>> >>
>> >> Yes. This will cover all the cases on insert. But on erase, there is
>> >> still a case where a rotate of sibling node is done during the
>> >> re-coloration process. There we have a child change on sibling's
>> >> child. I am not able to think of any easy way to handle that case.
>> >
>> > Let me go draw some figures with pen and paper to match up the erase
>> > path with the rb_augment_erase_begin() code, because I can't quite spot
>> > the case we're missing.
>> >
>> > If you have it handy, ascii art might help..
>>
>> It is this case
>>
>> ? ? P
>> ? ?/ \
>> ? N ? S
>> ? ? ?/ \
>> ? ? SL SR
>>
>> changing to
>>
>> ? ? P
>> ? ?/ \
>> ? N ?SL
>> ? ? ? \
>> ? ? ? ?S
>> ? ? ? ? \
>> ? ? ? ? SR
>
> Right, but see: http://en.wikipedia.org/wiki/Red-black_tree
> That is delete_case5, however then we fall into delete_case6 and perform
> a left rotation.
>
> So suppose we start with the tree:
>
> ? ?P ? ? ? ? ? ? ? ? ?P ? ? ? ? ? ? ? P ? ? ? ? ? ? ? ?SL
> ?/ ? \ ? ? ? ? ? ? ? / \ ? ? ? ? ? ? / \ ? ? ? ? ? ? ? / \
> ?D ? ? S ? ? ?--> ? ?N ?S ? ? ?--> ? N ? SL ? ? --> ? ?P ? S
> ?\ ? / \ ? ? ? ? ? ? ?/ \ ? ? ? ? ? ? ? ?\ ? ? ? ? ? / ? ? \
> ? N SL ?SR ? ? ? ? ?SL* SR ? ? ? ? ? ? ? ?S* ? ? ? ?N ? ? ?SR
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?\
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? SR
>
> and then remove D, delete case 5 and finally delete case 6, * marks red.
>
> rb_augment_erase_begin(D) will return N, and then rb_augment_path(N)
> will re-augment: N, P, SL and S.
>

Yes. I had missed that rotate_right of parent following the
rotate_left of sibling (or vice-versa). Thanks for fixing this. The
latest patch looks good.

Acked-by: Venkatesh Pallipadi <[email protected]>

2010-06-02 21:12:21

by Suresh Siddha

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Tue, 2010-06-01 at 11:42 -0700, Peter Zijlstra wrote:
> On Tue, 2010-06-01 at 11:09 -0700, Venkatesh Pallipadi wrote:
> > On Tue, Jun 1, 2010 at 10:42 AM, Peter Zijlstra <[email protected]> wrote:
> > > On Tue, 2010-06-01 at 10:34 -0700, Venkatesh Pallipadi wrote:
> > >>
> > >> Yes. This will cover all the cases on insert. But on erase, there is
> > >> still a case where a rotate of sibling node is done during the
> > >> re-coloration process. There we have a child change on sibling's
> > >> child. I am not able to think of any easy way to handle that case.
> > >
> > > Let me go draw some figures with pen and paper to match up the erase
> > > path with the rb_augment_erase_begin() code, because I can't quite spot
> > > the case we're missing.
> > >
> > > If you have it handy, ascii art might help..
> >
> > It is this case
> >
> > P
> > / \
> > N S
> > / \
> > SL SR
> >
> > changing to
> >
> > P
> > / \
> > N SL
> > \
> > S
> > \
> > SR
>
> Right, but see: http://en.wikipedia.org/wiki/Red-black_tree
> That is delete_case5, however then we fall into delete_case6 and perform
> a left rotation.
>
> So suppose we start with the tree:
>
> P P P SL
> / \ / \ / \ / \
> D S --> N S --> N SL --> P S
> \ / \ / \ \ / \
> N SL SR SL* SR S* N SR
> \
> SR
>
> and then remove D, delete case 5 and finally delete case 6, * marks red.
>
> rb_augment_erase_begin(D) will return N, and then rb_augment_path(N)
> will re-augment: N, P, SL and S.


P SL
/ \ / \
N S ---> N S
/ / \ / \
C SL SR C SR

If P needs to be removed, we need to re-augment S also in this case,
right? It looks like we are not handling this case.

thanks,
suresh


2010-06-03 01:16:09

by Venkatesh Pallipadi

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Wed, Jun 2, 2010 at 2:11 PM, Suresh Siddha <[email protected]> wrote:
> On Tue, 2010-06-01 at 11:42 -0700, Peter Zijlstra wrote:
>> On Tue, 2010-06-01 at 11:09 -0700, Venkatesh Pallipadi wrote:
>> > On Tue, Jun 1, 2010 at 10:42 AM, Peter Zijlstra <[email protected]> wrote:
>> > > On Tue, 2010-06-01 at 10:34 -0700, Venkatesh Pallipadi wrote:
>> > >>
>> > >> Yes. This will cover all the cases on insert. But on erase, there is
>> > >> still a case where a rotate of sibling node is done during the
>> > >> re-coloration process. There we have a child change on sibling's
>> > >> child. I am not able to think of any easy way to handle that case.
>> > >
>> > > Let me go draw some figures with pen and paper to match up the erase
>> > > path with the rb_augment_erase_begin() code, because I can't quite spot
>> > > the case we're missing.
>> > >
>> > > If you have it handy, ascii art might help..
>> >
>> > It is this case
>> >
>> > ? ? P
>> > ? ?/ \
>> > ? N ? S
>> > ? ? ?/ \
>> > ? ? SL SR
>> >
>> > changing to
>> >
>> > ? ? P
>> > ? ?/ \
>> > ? N ?SL
>> > ? ? ? \
>> > ? ? ? ?S
>> > ? ? ? ? \
>> > ? ? ? ? SR
>>
>> Right, but see: http://en.wikipedia.org/wiki/Red-black_tree
>> That is delete_case5, however then we fall into delete_case6 and perform
>> a left rotation.
>>
>> So suppose we start with the tree:
>>
>> ? ? P ? ? ? ? ? ? ? ? ?P ? ? ? ? ? ? ? P ? ? ? ? ? ? ? ?SL
>> ? / ? \ ? ? ? ? ? ? ? / \ ? ? ? ? ? ? / \ ? ? ? ? ? ? ? / \
>> ?D ? ? S ? ? ?--> ? ?N ?S ? ? ?--> ? N ? SL ? ? --> ? ?P ? S
>> ? \ ? / \ ? ? ? ? ? ? ?/ \ ? ? ? ? ? ? ? ?\ ? ? ? ? ? / ? ? \
>> ? ?N SL ?SR ? ? ? ? ?SL* SR ? ? ? ? ? ? ? ?S* ? ? ? ?N ? ? ?SR
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? \
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?SR
>>
>> and then remove D, delete case 5 and finally delete case 6, * marks red.
>>
>> rb_augment_erase_begin(D) will return N, and then rb_augment_path(N)
>> will re-augment: N, P, SL and S.
>
>
> ? ? ?P ? ? ? ? ? ? ? ? ? ? ? SL
> ? ?/ ?\ ? ? ? ? ? ? ? ? ? ? / ?\
> ? N ? ?S ? ? ? ---> ? ? ? ?N ? ?S
> ?/ ? / ?\ ? ? ? ? ? ? ? ? / ? ? ?\
> ?C ? SL ?SR ? ? ? ? ? ? ? C ? ? ? ?SR
>
> If P needs to be removed, we need to re-augment S also in this case,
> right? It looks like we are not handling this case.
>

rb_augment_erase_begin() should take care of that. In this case, it
will return S as the deepest node and we start the walk-back-to-root
from there.

Thanks,
Venki

2010-06-03 07:13:38

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Wed, 2010-06-02 at 14:11 -0700, Suresh Siddha wrote:
> P SL
> / \ / \
> N S ---> N S
> / / \ / \
> C SL SR C SR
>
> If P needs to be removed, we need to re-augment S also in this case,
> right? It looks like we are not handling this case.


rb_augment_erase_begin(P) will return S if I read the code right, so
rb_augment_path(S) will the re-augment S, SL, N.

2010-06-03 07:48:25

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Thu, 2010-06-03 at 09:13 +0200, Peter Zijlstra wrote:
> On Wed, 2010-06-02 at 14:11 -0700, Suresh Siddha wrote:
> > P SL
> > / \ / \
> > N S ---> N S
> > / / \ / \
> > C SL SR C SR
> >
> > If P needs to be removed, we need to re-augment S also in this case,
> > right? It looks like we are not handling this case.
>
>
> rb_augment_erase_begin(P) will return S if I read the code right, so
> rb_augment_path(S) will the re-augment S, SL, N.

To be more explicit: P has two children, so we select the next entry,
SL, as its substitute. Since SL doesn't have a right child we select its
parent, S.

If SL were to represent a subtree with a right child, then we'd select
that, lets call it SLR. SLR would then end up being a direct descendant
of S, and hence rb_augment_path(SLR) would still pass S on its way up.

2010-06-03 16:05:11

by Suresh Siddha

[permalink] [raw]
Subject: Re: [PATCH] rbtree: undo augmented damage -v2

On Thu, 2010-06-03 at 00:48 -0700, Peter Zijlstra wrote:
> On Thu, 2010-06-03 at 09:13 +0200, Peter Zijlstra wrote:
> > On Wed, 2010-06-02 at 14:11 -0700, Suresh Siddha wrote:
> > > P SL
> > > / \ / \
> > > N S ---> N S
> > > / / \ / \
> > > C SL SR C SR
> > >
> > > If P needs to be removed, we need to re-augment S also in this case,
> > > right? It looks like we are not handling this case.
> >
> >
> > rb_augment_erase_begin(P) will return S if I read the code right, so
> > rb_augment_path(S) will the re-augment S, SL, N.
>
> To be more explicit: P has two children, so we select the next entry,
> SL, as its substitute. Since SL doesn't have a right child we select its
> parent, S.
>
> If SL were to represent a subtree with a right child, then we'd select
> that, lets call it SLR. SLR would then end up being a direct descendant
> of S, and hence rb_augment_path(SLR) would still pass S on its way up.

Yep. I missed the rb_parent() part in the rb_augment_erase_begin().

Thanks.

Acked-by: Suresh Siddha <[email protected]>

2010-06-09 10:14:23

by Peter Zijlstra

[permalink] [raw]
Subject: [tip:x86/pat] rbtree: Undo augmented trees performance damage

Commit-ID: 2463eb8b3093995e09a0d41b3d78ee0cf5fb4249
Gitweb: http://git.kernel.org/tip/2463eb8b3093995e09a0d41b3d78ee0cf5fb4249
Author: Peter Zijlstra <[email protected]>
AuthorDate: Sat, 29 May 2010 15:31:43 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 9 Jun 2010 10:32:49 +0200

rbtree: Undo augmented trees performance damage

Reimplement augmented RB-trees without sprinkling extra branches
all over the RB-tree code (which lives in the scheduler hot path).

This approach is 'borrowed' from Fabio's BFQ implementation and
relies on traversing the rebalance path after the RB-tree-op to
correct the heap property for insertion/removal and make up for
the damage done by the tree rotations.

For insertion the rebalance path is trivially that from the new
node upwards to the root, for removal it is that from the deepest
node in the path from the to be removed node that will still
be around after the removal.

Acked-by: Suresh Siddha <[email protected]>
Acked-by: Venkatesh Pallipadi <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: Fabio Checconi <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
LKML-Reference: <1275414172.27810.27961.camel@twins>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/mm/pat_rbtree.c | 34 ++-----------
include/linux/rbtree.h | 13 ++++--
lib/rbtree.c | 116 ++++++++++++++++++++++++++++-----------------
3 files changed, 87 insertions(+), 76 deletions(-)

diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index f537087..a2903e6 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -34,8 +34,7 @@
* memtype_lock protects the rbtree.
*/

-static void memtype_rb_augment_cb(struct rb_node *node);
-static struct rb_root memtype_rbroot = RB_AUGMENT_ROOT(&memtype_rb_augment_cb);
+static struct rb_root memtype_rbroot = RB_ROOT;

static int is_node_overlap(struct memtype *node, u64 start, u64 end)
{
@@ -56,7 +55,7 @@ static u64 get_subtree_max_end(struct rb_node *node)
}

/* Update 'subtree_max_end' for a node, based on node and its children */
-static void update_node_max_end(struct rb_node *node)
+static void memtype_rb_augment_cb(struct rb_node *node, void *__unused)
{
struct memtype *data;
u64 max_end, child_max_end;
@@ -78,25 +77,6 @@ static void update_node_max_end(struct rb_node *node)
data->subtree_max_end = max_end;
}

-/* Update 'subtree_max_end' for a node and all its ancestors */
-static void update_path_max_end(struct rb_node *node)
-{
- u64 old_max_end, new_max_end;
-
- while (node) {
- struct memtype *data = container_of(node, struct memtype, rb);
-
- old_max_end = data->subtree_max_end;
- update_node_max_end(node);
- new_max_end = data->subtree_max_end;
-
- if (new_max_end == old_max_end)
- break;
-
- node = rb_parent(node);
- }
-}
-
/* Find the first (lowest start addr) overlapping range from rb tree */
static struct memtype *memtype_rb_lowest_match(struct rb_root *root,
u64 start, u64 end)
@@ -190,12 +170,6 @@ failure:
return -EBUSY;
}

-static void memtype_rb_augment_cb(struct rb_node *node)
-{
- if (node)
- update_path_max_end(node);
-}
-
static void memtype_rb_insert(struct rb_root *root, struct memtype *newdata)
{
struct rb_node **node = &(root->rb_node);
@@ -213,6 +187,7 @@ static void memtype_rb_insert(struct rb_root *root, struct memtype *newdata)

rb_link_node(&newdata->rb, parent, node);
rb_insert_color(&newdata->rb, root);
+ rb_augment_insert(&newdata->rb, memtype_rb_augment_cb, NULL);
}

int rbt_memtype_check_insert(struct memtype *new, unsigned long *ret_type)
@@ -233,13 +208,16 @@ int rbt_memtype_check_insert(struct memtype *new, unsigned long *ret_type)

struct memtype *rbt_memtype_erase(u64 start, u64 end)
{
+ struct rb_node *deepest;
struct memtype *data;

data = memtype_rb_exact_match(&memtype_rbroot, start, end);
if (!data)
goto out;

+ deepest = rb_augment_erase_begin(&data->rb);
rb_erase(&data->rb, &memtype_rbroot);
+ rb_augment_erase_end(deepest, memtype_rb_augment_cb, NULL);
out:
return data;
}
diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index fe1872e..7066acb 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -110,7 +110,6 @@ struct rb_node
struct rb_root
{
struct rb_node *rb_node;
- void (*augment_cb)(struct rb_node *node);
};


@@ -130,9 +129,7 @@ static inline void rb_set_color(struct rb_node *rb, int color)
rb->rb_parent_color = (rb->rb_parent_color & ~1) | color;
}

-#define RB_ROOT (struct rb_root) { NULL, NULL, }
-#define RB_AUGMENT_ROOT(x) (struct rb_root) { NULL, x}
-
+#define RB_ROOT (struct rb_root) { NULL, }
#define rb_entry(ptr, type, member) container_of(ptr, type, member)

#define RB_EMPTY_ROOT(root) ((root)->rb_node == NULL)
@@ -142,6 +139,14 @@ static inline void rb_set_color(struct rb_node *rb, int color)
extern void rb_insert_color(struct rb_node *, struct rb_root *);
extern void rb_erase(struct rb_node *, struct rb_root *);

+typedef void (*rb_augment_f)(struct rb_node *node, void *data);
+
+extern void rb_augment_insert(struct rb_node *node,
+ rb_augment_f func, void *data);
+extern struct rb_node *rb_augment_erase_begin(struct rb_node *node);
+extern void rb_augment_erase_end(struct rb_node *node,
+ rb_augment_f func, void *data);
+
/* Find logical next and previous nodes in a tree */
extern struct rb_node *rb_next(const struct rb_node *);
extern struct rb_node *rb_prev(const struct rb_node *);
diff --git a/lib/rbtree.c b/lib/rbtree.c
index 15e10b1..4693f79 100644
--- a/lib/rbtree.c
+++ b/lib/rbtree.c
@@ -44,11 +44,6 @@ static void __rb_rotate_left(struct rb_node *node, struct rb_root *root)
else
root->rb_node = right;
rb_set_parent(node, right);
-
- if (root->augment_cb) {
- root->augment_cb(node);
- root->augment_cb(right);
- }
}

static void __rb_rotate_right(struct rb_node *node, struct rb_root *root)
@@ -72,20 +67,12 @@ static void __rb_rotate_right(struct rb_node *node, struct rb_root *root)
else
root->rb_node = left;
rb_set_parent(node, left);
-
- if (root->augment_cb) {
- root->augment_cb(node);
- root->augment_cb(left);
- }
}

void rb_insert_color(struct rb_node *node, struct rb_root *root)
{
struct rb_node *parent, *gparent;

- if (root->augment_cb)
- root->augment_cb(node);
-
while ((parent = rb_parent(node)) && rb_is_red(parent))
{
gparent = rb_parent(parent);
@@ -240,15 +227,12 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
else
{
struct rb_node *old = node, *left;
- int old_parent_cb = 0;
- int successor_parent_cb = 0;

node = node->rb_right;
while ((left = node->rb_left) != NULL)
node = left;

if (rb_parent(old)) {
- old_parent_cb = 1;
if (rb_parent(old)->rb_left == old)
rb_parent(old)->rb_left = node;
else
@@ -263,10 +247,8 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
if (parent == old) {
parent = node;
} else {
- successor_parent_cb = 1;
if (child)
rb_set_parent(child, parent);
-
parent->rb_left = child;

node->rb_right = old->rb_right;
@@ -277,24 +259,6 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
node->rb_left = old->rb_left;
rb_set_parent(old->rb_left, node);

- if (root->augment_cb) {
- /*
- * Here, three different nodes can have new children.
- * The parent of the successor node that was selected
- * to replace the node to be erased.
- * The node that is getting erased and is now replaced
- * by its successor.
- * The parent of the node getting erased-replaced.
- */
- if (successor_parent_cb)
- root->augment_cb(parent);
-
- root->augment_cb(node);
-
- if (old_parent_cb)
- root->augment_cb(rb_parent(old));
- }
-
goto color;
}

@@ -303,19 +267,15 @@ void rb_erase(struct rb_node *node, struct rb_root *root)

if (child)
rb_set_parent(child, parent);
-
- if (parent) {
+ if (parent)
+ {
if (parent->rb_left == node)
parent->rb_left = child;
else
parent->rb_right = child;
-
- if (root->augment_cb)
- root->augment_cb(parent);
-
- } else {
- root->rb_node = child;
}
+ else
+ root->rb_node = child;

color:
if (color == RB_BLACK)
@@ -323,6 +283,74 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
}
EXPORT_SYMBOL(rb_erase);

+static void rb_augment_path(struct rb_node *node, rb_augment_f func, void *data)
+{
+ struct rb_node *parent;
+
+up:
+ func(node, data);
+ parent = rb_parent(node);
+ if (!parent)
+ return;
+
+ if (node == parent->rb_left && parent->rb_right)
+ func(parent->rb_right, data);
+ else if (parent->rb_left)
+ func(parent->rb_left, data);
+
+ node = parent;
+ goto up;
+}
+
+/*
+ * after inserting @node into the tree, update the tree to account for
+ * both the new entry and any damage done by rebalance
+ */
+void rb_augment_insert(struct rb_node *node, rb_augment_f func, void *data)
+{
+ if (node->rb_left)
+ node = node->rb_left;
+ else if (node->rb_right)
+ node = node->rb_right;
+
+ rb_augment_path(node, func, data);
+}
+
+/*
+ * before removing the node, find the deepest node on the rebalance path
+ * that will still be there after @node gets removed
+ */
+struct rb_node *rb_augment_erase_begin(struct rb_node *node)
+{
+ struct rb_node *deepest;
+
+ if (!node->rb_right && !node->rb_left)
+ deepest = rb_parent(node);
+ else if (!node->rb_right)
+ deepest = node->rb_left;
+ else if (!node->rb_left)
+ deepest = node->rb_right;
+ else {
+ deepest = rb_next(node);
+ if (deepest->rb_right)
+ deepest = deepest->rb_right;
+ else if (rb_parent(deepest) != node)
+ deepest = rb_parent(deepest);
+ }
+
+ return deepest;
+}
+
+/*
+ * after removal, update the tree to account for the removed entry
+ * and any rebalance damage.
+ */
+void rb_augment_erase_end(struct rb_node *node, rb_augment_f func, void *data)
+{
+ if (node)
+ rb_augment_path(node, func, data);
+}
+
/*
* This function returns the first node (in sort order) of the tree.
*/

2010-07-05 12:53:21

by Peter Zijlstra

[permalink] [raw]
Subject: [tip:x86/urgent] rbtree: Undo augmented trees performance damage and regression

Commit-ID: b945d6b2554d550fe95caadc61e521c0ad71fb9c
Gitweb: http://git.kernel.org/tip/b945d6b2554d550fe95caadc61e521c0ad71fb9c
Author: Peter Zijlstra <[email protected]>
AuthorDate: Sat, 29 May 2010 15:31:43 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 5 Jul 2010 14:43:50 +0200

rbtree: Undo augmented trees performance damage and regression

Reimplement augmented RB-trees without sprinkling extra branches
all over the RB-tree code (which lives in the scheduler hot path).

This approach is 'borrowed' from Fabio's BFQ implementation and
relies on traversing the rebalance path after the RB-tree-op to
correct the heap property for insertion/removal and make up for
the damage done by the tree rotations.

For insertion the rebalance path is trivially that from the new
node upwards to the root, for removal it is that from the deepest
node in the path from the to be removed node that will still
be around after the removal.

[ This patch also fixes a video driver regression reported by
Ali Gholami Rudi - the memtype->subtree_max_end was updated
incorrectly. ]

Acked-by: Suresh Siddha <[email protected]>
Acked-by: Venkatesh Pallipadi <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Tested-by: Ali Gholami Rudi <[email protected]>
Cc: Fabio Checconi <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
LKML-Reference: <1275414172.27810.27961.camel@twins>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/mm/pat_rbtree.c | 34 ++-----------
include/linux/rbtree.h | 13 ++++--
lib/rbtree.c | 116 ++++++++++++++++++++++++++++-----------------
3 files changed, 87 insertions(+), 76 deletions(-)

diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index f20eeec..8acaddd 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -34,8 +34,7 @@
* memtype_lock protects the rbtree.
*/

-static void memtype_rb_augment_cb(struct rb_node *node);
-static struct rb_root memtype_rbroot = RB_AUGMENT_ROOT(&memtype_rb_augment_cb);
+static struct rb_root memtype_rbroot = RB_ROOT;

static int is_node_overlap(struct memtype *node, u64 start, u64 end)
{
@@ -56,7 +55,7 @@ static u64 get_subtree_max_end(struct rb_node *node)
}

/* Update 'subtree_max_end' for a node, based on node and its children */
-static void update_node_max_end(struct rb_node *node)
+static void memtype_rb_augment_cb(struct rb_node *node, void *__unused)
{
struct memtype *data;
u64 max_end, child_max_end;
@@ -78,25 +77,6 @@ static void update_node_max_end(struct rb_node *node)
data->subtree_max_end = max_end;
}

-/* Update 'subtree_max_end' for a node and all its ancestors */
-static void update_path_max_end(struct rb_node *node)
-{
- u64 old_max_end, new_max_end;
-
- while (node) {
- struct memtype *data = container_of(node, struct memtype, rb);
-
- old_max_end = data->subtree_max_end;
- update_node_max_end(node);
- new_max_end = data->subtree_max_end;
-
- if (new_max_end == old_max_end)
- break;
-
- node = rb_parent(node);
- }
-}
-
/* Find the first (lowest start addr) overlapping range from rb tree */
static struct memtype *memtype_rb_lowest_match(struct rb_root *root,
u64 start, u64 end)
@@ -190,12 +170,6 @@ failure:
return -EBUSY;
}

-static void memtype_rb_augment_cb(struct rb_node *node)
-{
- if (node)
- update_path_max_end(node);
-}
-
static void memtype_rb_insert(struct rb_root *root, struct memtype *newdata)
{
struct rb_node **node = &(root->rb_node);
@@ -213,6 +187,7 @@ static void memtype_rb_insert(struct rb_root *root, struct memtype *newdata)

rb_link_node(&newdata->rb, parent, node);
rb_insert_color(&newdata->rb, root);
+ rb_augment_insert(&newdata->rb, memtype_rb_augment_cb, NULL);
}

int rbt_memtype_check_insert(struct memtype *new, unsigned long *ret_type)
@@ -234,13 +209,16 @@ int rbt_memtype_check_insert(struct memtype *new, unsigned long *ret_type)

struct memtype *rbt_memtype_erase(u64 start, u64 end)
{
+ struct rb_node *deepest;
struct memtype *data;

data = memtype_rb_exact_match(&memtype_rbroot, start, end);
if (!data)
goto out;

+ deepest = rb_augment_erase_begin(&data->rb);
rb_erase(&data->rb, &memtype_rbroot);
+ rb_augment_erase_end(deepest, memtype_rb_augment_cb, NULL);
out:
return data;
}
diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index fe1872e..7066acb 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -110,7 +110,6 @@ struct rb_node
struct rb_root
{
struct rb_node *rb_node;
- void (*augment_cb)(struct rb_node *node);
};


@@ -130,9 +129,7 @@ static inline void rb_set_color(struct rb_node *rb, int color)
rb->rb_parent_color = (rb->rb_parent_color & ~1) | color;
}

-#define RB_ROOT (struct rb_root) { NULL, NULL, }
-#define RB_AUGMENT_ROOT(x) (struct rb_root) { NULL, x}
-
+#define RB_ROOT (struct rb_root) { NULL, }
#define rb_entry(ptr, type, member) container_of(ptr, type, member)

#define RB_EMPTY_ROOT(root) ((root)->rb_node == NULL)
@@ -142,6 +139,14 @@ static inline void rb_set_color(struct rb_node *rb, int color)
extern void rb_insert_color(struct rb_node *, struct rb_root *);
extern void rb_erase(struct rb_node *, struct rb_root *);

+typedef void (*rb_augment_f)(struct rb_node *node, void *data);
+
+extern void rb_augment_insert(struct rb_node *node,
+ rb_augment_f func, void *data);
+extern struct rb_node *rb_augment_erase_begin(struct rb_node *node);
+extern void rb_augment_erase_end(struct rb_node *node,
+ rb_augment_f func, void *data);
+
/* Find logical next and previous nodes in a tree */
extern struct rb_node *rb_next(const struct rb_node *);
extern struct rb_node *rb_prev(const struct rb_node *);
diff --git a/lib/rbtree.c b/lib/rbtree.c
index 15e10b1..4693f79 100644
--- a/lib/rbtree.c
+++ b/lib/rbtree.c
@@ -44,11 +44,6 @@ static void __rb_rotate_left(struct rb_node *node, struct rb_root *root)
else
root->rb_node = right;
rb_set_parent(node, right);
-
- if (root->augment_cb) {
- root->augment_cb(node);
- root->augment_cb(right);
- }
}

static void __rb_rotate_right(struct rb_node *node, struct rb_root *root)
@@ -72,20 +67,12 @@ static void __rb_rotate_right(struct rb_node *node, struct rb_root *root)
else
root->rb_node = left;
rb_set_parent(node, left);
-
- if (root->augment_cb) {
- root->augment_cb(node);
- root->augment_cb(left);
- }
}

void rb_insert_color(struct rb_node *node, struct rb_root *root)
{
struct rb_node *parent, *gparent;

- if (root->augment_cb)
- root->augment_cb(node);
-
while ((parent = rb_parent(node)) && rb_is_red(parent))
{
gparent = rb_parent(parent);
@@ -240,15 +227,12 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
else
{
struct rb_node *old = node, *left;
- int old_parent_cb = 0;
- int successor_parent_cb = 0;

node = node->rb_right;
while ((left = node->rb_left) != NULL)
node = left;

if (rb_parent(old)) {
- old_parent_cb = 1;
if (rb_parent(old)->rb_left == old)
rb_parent(old)->rb_left = node;
else
@@ -263,10 +247,8 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
if (parent == old) {
parent = node;
} else {
- successor_parent_cb = 1;
if (child)
rb_set_parent(child, parent);
-
parent->rb_left = child;

node->rb_right = old->rb_right;
@@ -277,24 +259,6 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
node->rb_left = old->rb_left;
rb_set_parent(old->rb_left, node);

- if (root->augment_cb) {
- /*
- * Here, three different nodes can have new children.
- * The parent of the successor node that was selected
- * to replace the node to be erased.
- * The node that is getting erased and is now replaced
- * by its successor.
- * The parent of the node getting erased-replaced.
- */
- if (successor_parent_cb)
- root->augment_cb(parent);
-
- root->augment_cb(node);
-
- if (old_parent_cb)
- root->augment_cb(rb_parent(old));
- }
-
goto color;
}

@@ -303,19 +267,15 @@ void rb_erase(struct rb_node *node, struct rb_root *root)

if (child)
rb_set_parent(child, parent);
-
- if (parent) {
+ if (parent)
+ {
if (parent->rb_left == node)
parent->rb_left = child;
else
parent->rb_right = child;
-
- if (root->augment_cb)
- root->augment_cb(parent);
-
- } else {
- root->rb_node = child;
}
+ else
+ root->rb_node = child;

color:
if (color == RB_BLACK)
@@ -323,6 +283,74 @@ void rb_erase(struct rb_node *node, struct rb_root *root)
}
EXPORT_SYMBOL(rb_erase);

+static void rb_augment_path(struct rb_node *node, rb_augment_f func, void *data)
+{
+ struct rb_node *parent;
+
+up:
+ func(node, data);
+ parent = rb_parent(node);
+ if (!parent)
+ return;
+
+ if (node == parent->rb_left && parent->rb_right)
+ func(parent->rb_right, data);
+ else if (parent->rb_left)
+ func(parent->rb_left, data);
+
+ node = parent;
+ goto up;
+}
+
+/*
+ * after inserting @node into the tree, update the tree to account for
+ * both the new entry and any damage done by rebalance
+ */
+void rb_augment_insert(struct rb_node *node, rb_augment_f func, void *data)
+{
+ if (node->rb_left)
+ node = node->rb_left;
+ else if (node->rb_right)
+ node = node->rb_right;
+
+ rb_augment_path(node, func, data);
+}
+
+/*
+ * before removing the node, find the deepest node on the rebalance path
+ * that will still be there after @node gets removed
+ */
+struct rb_node *rb_augment_erase_begin(struct rb_node *node)
+{
+ struct rb_node *deepest;
+
+ if (!node->rb_right && !node->rb_left)
+ deepest = rb_parent(node);
+ else if (!node->rb_right)
+ deepest = node->rb_left;
+ else if (!node->rb_left)
+ deepest = node->rb_right;
+ else {
+ deepest = rb_next(node);
+ if (deepest->rb_right)
+ deepest = deepest->rb_right;
+ else if (rb_parent(deepest) != node)
+ deepest = rb_parent(deepest);
+ }
+
+ return deepest;
+}
+
+/*
+ * after removal, update the tree to account for the removed entry
+ * and any rebalance damage.
+ */
+void rb_augment_erase_end(struct rb_node *node, rb_augment_f func, void *data)
+{
+ if (node)
+ rb_augment_path(node, func, data);
+}
+
/*
* This function returns the first node (in sort order) of the tree.
*/