Some fixes and clean up for maple tree.
The bug fixed by [PATCH v2 1/3] does not seem to be triggered due to some
coincidences, because now the implementation of mtree_lookup_walk() scans
pivots one by one and exits the loop early. The test cases for the bugs fixed by
[PATCH v2 3/3] are difficult to write. If I think of how to write them later,
I will send them out. So I send out the second edition first.
Changes since v1:
- drop [PATCH 4/4]
- update the commit message of [PATCH 2/4]
- collect Reviewed-bys
- add fixes tags
Peng Zhang (3):
maple_tree: Fix get wrong data_end in mtree_lookup_walk()
maple_tree: Simplify mas_wr_node_walk()
maple_tree: Fix a potential concurrency bug in RCU mode
lib/maple_tree.c | 52 ++++++++++--------------------------------------
1 file changed, 11 insertions(+), 41 deletions(-)
--
2.20.1
There is a concurrency bug that may cause the wrong value to be loaded
when a CPU is modifying the maple tree.
CPU1:
mtree_insert_range()
mas_insert()
mas_store_root()
...
mas_root_expand()
...
rcu_assign_pointer(mas->tree->ma_root, mte_mk_root(mas->node));
ma_set_meta(node, maple_leaf_64, 0, slot); <---IP
CPU2:
mtree_load()
mtree_lookup_walk()
ma_data_end();
When CPU1 is about to execute the instruction pointed to by IP,
the ma_data_end() executed by CPU2 may return the wrong end position,
which will cause the value loaded by mtree_load() to be wrong.
An example of triggering the bug:
Add mdelay(100) between rcu_assign_pointer() and ma_set_meta() in
mas_root_expand().
static DEFINE_MTREE(tree);
int work(void *p) {
unsigned long val;
for (int i = 0 ; i< 30; ++i) {
val = (unsigned long)mtree_load(&tree, 8);
mdelay(5);
pr_info("%lu",val);
}
return 0;
}
mt_init_flags(&tree, MT_FLAGS_USE_RCU);
mtree_insert(&tree, 0, (void*)12345, GFP_KERNEL);
run_thread(work)
mtree_insert(&tree, 1, (void*)56789, GFP_KERNEL);
In RCU mode, mtree_load() should always return the value before or after
the data structure is modified, and in this example mtree_load(&tree, 8)
may return 56789 which is not expected, it should always return NULL.
Fix it by put ma_set_meta() before rcu_assign_pointer().
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
---
lib/maple_tree.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 4d15202a0692..de43ff19da72 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3635,10 +3635,9 @@ static inline int mas_root_expand(struct ma_state *mas, void *entry)
slot++;
mas->depth = 1;
mas_set_height(mas);
-
+ ma_set_meta(node, maple_leaf_64, 0, slot);
/* swap the new root into the tree */
rcu_assign_pointer(mas->tree->ma_root, mte_mk_root(mas->node));
- ma_set_meta(node, maple_leaf_64, 0, slot);
return slot;
}
--
2.20.1
if (likely(offset > end))
max = pivots[offset];
The above code should be changed to if (likely(offset < end)), which is
correct. This affects the correctness of ma_data_end(). Now it seems
that the final result will not be wrong, but it is best to change it.
This patch does not change the code as above, because it simplifies the
code by the way.
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
---
lib/maple_tree.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 646297cae5d1..b3164266cfde 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3875,18 +3875,13 @@ static inline void *mtree_lookup_walk(struct ma_state *mas)
end = ma_data_end(node, type, pivots, max);
if (unlikely(ma_dead_node(node)))
goto dead_node;
-
- if (pivots[offset] >= mas->index)
- goto next;
-
do {
- offset++;
- } while ((offset < end) && (pivots[offset] < mas->index));
-
- if (likely(offset > end))
- max = pivots[offset];
+ if (pivots[offset] >= mas->index) {
+ max = pivots[offset];
+ break;
+ }
+ } while (++offset < end);
-next:
slots = ma_slots(node, type);
next = mt_slot(mas->tree, slots, offset);
if (unlikely(ma_dead_node(node)))
--
2.20.1
Simplify code of mas_wr_node_walk() without changing functionality,
and improve readability. Remove some special judgments. Instead of
dynamically recording the min and max in the loop, get the final min
and max directly at the end.
Signed-off-by: Peng Zhang <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
---
lib/maple_tree.c | 34 +++++-----------------------------
1 file changed, 5 insertions(+), 29 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index b3164266cfde..4d15202a0692 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2251,9 +2251,7 @@ static inline struct maple_enode *mte_node_or_none(struct maple_enode *enode)
static inline void mas_wr_node_walk(struct ma_wr_state *wr_mas)
{
struct ma_state *mas = wr_mas->mas;
- unsigned char count;
- unsigned char offset;
- unsigned long index, min, max;
+ unsigned char count, offset;
if (unlikely(ma_is_dense(wr_mas->type))) {
wr_mas->r_max = wr_mas->r_min = mas->index;
@@ -2266,34 +2264,12 @@ static inline void mas_wr_node_walk(struct ma_wr_state *wr_mas)
count = wr_mas->node_end = ma_data_end(wr_mas->node, wr_mas->type,
wr_mas->pivots, mas->max);
offset = mas->offset;
- min = mas_safe_min(mas, wr_mas->pivots, offset);
- if (unlikely(offset == count))
- goto max;
-
- max = wr_mas->pivots[offset];
- index = mas->index;
- if (unlikely(index <= max))
- goto done;
-
- if (unlikely(!max && offset))
- goto max;
- min = max + 1;
- while (++offset < count) {
- max = wr_mas->pivots[offset];
- if (index <= max)
- goto done;
- else if (unlikely(!max))
- break;
-
- min = max + 1;
- }
+ while (offset < count && mas->index > wr_mas->pivots[offset])
+ offset++;
-max:
- max = mas->max;
-done:
- wr_mas->r_max = max;
- wr_mas->r_min = min;
+ wr_mas->r_max = offset < count ? wr_mas->pivots[offset] : mas->max;
+ wr_mas->r_min = mas_safe_min(mas, wr_mas->pivots, offset);
wr_mas->offset_end = mas->offset = offset;
}
--
2.20.1
Kindly ping.
在 2023/3/14 20:42, Peng Zhang 写道:
> Some fixes and clean up for maple tree.
>
> The bug fixed by [PATCH v2 1/3] does not seem to be triggered due to some
> coincidences, because now the implementation of mtree_lookup_walk() scans
> pivots one by one and exits the loop early. The test cases for the bugs fixed by
> [PATCH v2 3/3] are difficult to write. If I think of how to write them later,
> I will send them out. So I send out the second edition first.
>
> Changes since v1:
> - drop [PATCH 4/4]
> - update the commit message of [PATCH 2/4]
> - collect Reviewed-bys
> - add fixes tags
>
> Peng Zhang (3):
> maple_tree: Fix get wrong data_end in mtree_lookup_walk()
> maple_tree: Simplify mas_wr_node_walk()
> maple_tree: Fix a potential concurrency bug in RCU mode
>
> lib/maple_tree.c | 52 ++++++++++--------------------------------------
> 1 file changed, 11 insertions(+), 41 deletions(-)
>
On Tue, 14 Mar 2023 20:42:00 +0800 Peng Zhang <[email protected]> wrote:
> Some fixes and clean up for maple tree.
>
> The bug fixed by [PATCH v2 1/3] does not seem to be triggered due to some
> coincidences, because now the implementation of mtree_lookup_walk() scans
> pivots one by one and exits the loop early. The test cases for the bugs fixed by
> [PATCH v2 3/3] are difficult to write. If I think of how to write them later,
> I will send them out. So I send out the second edition first.
Do we feel that any of these should be backported into -stable kernels?
[3/3] looks to be a candidate for this?
* Andrew Morton <[email protected]> [230404 16:27]:
> On Tue, 14 Mar 2023 20:42:00 +0800 Peng Zhang <[email protected]> wrote:
>
> > Some fixes and clean up for maple tree.
> >
> > The bug fixed by [PATCH v2 1/3] does not seem to be triggered due to some
> > coincidences, because now the implementation of mtree_lookup_walk() scans
> > pivots one by one and exits the loop early. The test cases for the bugs fixed by
> > [PATCH v2 3/3] are difficult to write. If I think of how to write them later,
> > I will send them out. So I send out the second edition first.
>
> Do we feel that any of these should be backported into -stable kernels?
> [3/3] looks to be a candidate for this?
1 and 3 are fixes and should be sent to stable.
2 doesn't fix anything so I'm not sure how we could get it to be taken.
在 2023/4/5 04:27, Andrew Morton 写道:
> On Tue, 14 Mar 2023 20:42:00 +0800 Peng Zhang <[email protected]> wrote:
>
>> Some fixes and clean up for maple tree.
>>
>> The bug fixed by [PATCH v2 1/3] does not seem to be triggered due to some
>> coincidences, because now the implementation of mtree_lookup_walk() scans
>> pivots one by one and exits the loop early. The test cases for the bugs fixed by
>> [PATCH v2 3/3] are difficult to write. If I think of how to write them later,
>> I will send them out. So I send out the second edition first.
> Do we feel that any of these should be backported into -stable kernels?
> [3/3] looks to be a candidate for this?
Both [1/3] and [3/3] can be sent to the stable kernel, do I need to do anything?
* Peng Zhang <[email protected]> [230405 05:53]:
>
> 在 2023/4/5 04:27, Andrew Morton 写道:
>
> > On Tue, 14 Mar 2023 20:42:00 +0800 Peng Zhang <[email protected]> wrote:
> >
> > > Some fixes and clean up for maple tree.
> > >
> > > The bug fixed by [PATCH v2 1/3] does not seem to be triggered due to some
> > > coincidences, because now the implementation of mtree_lookup_walk() scans
> > > pivots one by one and exits the loop early. The test cases for the bugs fixed by
> > > [PATCH v2 3/3] are difficult to write. If I think of how to write them later,
> > > I will send them out. So I send out the second edition first.
> > Do we feel that any of these should be backported into -stable kernels?
> > [3/3] looks to be a candidate for this?
>
> Both [1/3] and [3/3] can be sent to the stable kernel, do I need to do anything?
>
Usually any bug fixes should have a commit message with:
Link: <any discussion>
Reported-by: <if necessary>
Fixes: <upstream commit>
Cc: <[email protected]>
and the usual S-O-B, reviewed-by