2019-02-19 05:27:59

by Hugh Dickins

[permalink] [raw]
Subject: [PATCH mmotm] mm: ksm: do not block on page lock when searching stable tree fix

I hit the kernel BUG at mm/ksm.c:809! quite easily under KSM swapping
load. That's the BUG_ON(age > 1) in remove_rmap_item_from_tree().

There is a comment above it, but explaining in more detail: KSM saves
effort by not fully maintaining the unstable tree like a proper RB
tree throughout, but at the start of each pass forgetting the old tree
and rebuilding anew from scratch. But that means that whenever it looks
like we need to remove an item from the unstable tree, we have to check
whether it has already been linked into the new tree this time around
(hence rb_erase needed), or it's just a free-floating leftover from the
previous tree.

"age" 0 or 1 says which: but if it's more than 1, then something has
gone wrong: cmp_and_merge_page() was forgetting to remove the item
in the new EBUSY case.

Signed-off-by: Hugh Dickins <[email protected]>
---
Fix to fold into
mm-ksm-do-not-block-on-page-lock-when-searching-stable-tree.patch

I like that patch better now it has the mods suggested by John Hubbard;
but what I'd still really prefer to do is to make the patch unnecessary,
by reworking that window of KSM page migration so that there's just no
need for stable_tree_search() to take page lock. We would all prefer
that. However, each time I've gone to do so, it's turned out to need
more care than I expected, and I run out of time. So, let's go with
what we have, and one day I might perhaps get back to it.

mm/ksm.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

--- mmotm/mm/ksm.c 2019-02-14 15:16:13.000000000 -0800
+++ linux/mm/ksm.c 2019-02-18 20:36:44.707310427 -0800
@@ -2082,10 +2082,6 @@ static void cmp_and_merge_page(struct pa

/* We first start with searching the page inside the stable tree */
kpage = stable_tree_search(page);
-
- if (PTR_ERR(kpage) == -EBUSY)
- return;
-
if (kpage == page && rmap_item->head == stable_node) {
put_page(kpage);
return;
@@ -2094,6 +2090,9 @@ static void cmp_and_merge_page(struct pa
remove_rmap_item_from_tree(rmap_item);

if (kpage) {
+ if (PTR_ERR(kpage) == -EBUSY)
+ return;
+
err = try_to_merge_with_ksm_page(rmap_item, page, kpage);
if (!err) {
/*


2019-02-19 18:13:49

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH mmotm] mm: ksm: do not block on page lock when searching stable tree fix



On 2/18/19 9:26 PM, Hugh Dickins wrote:
> I hit the kernel BUG at mm/ksm.c:809! quite easily under KSM swapping
> load. That's the BUG_ON(age > 1) in remove_rmap_item_from_tree().
>
> There is a comment above it, but explaining in more detail: KSM saves
> effort by not fully maintaining the unstable tree like a proper RB
> tree throughout, but at the start of each pass forgetting the old tree
> and rebuilding anew from scratch. But that means that whenever it looks
> like we need to remove an item from the unstable tree, we have to check
> whether it has already been linked into the new tree this time around
> (hence rb_erase needed), or it's just a free-floating leftover from the
> previous tree.
>
> "age" 0 or 1 says which: but if it's more than 1, then something has
> gone wrong: cmp_and_merge_page() was forgetting to remove the item
> in the new EBUSY case.
>
> Signed-off-by: Hugh Dickins <[email protected]>
> ---
> Fix to fold into
> mm-ksm-do-not-block-on-page-lock-when-searching-stable-tree.patch

Thanks for catching this. The fix looks good to me.

>
> I like that patch better now it has the mods suggested by John Hubbard;
> but what I'd still really prefer to do is to make the patch unnecessary,
> by reworking that window of KSM page migration so that there's just no
> need for stable_tree_search() to take page lock. We would all prefer
> that. However, each time I've gone to do so, it's turned out to need
> more care than I expected, and I run out of time. So, let's go with
> what we have, and one day I might perhaps get back to it.

I agree it needs extra scrutiny to make the code lockless.

Regards,
Yang

>
> mm/ksm.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> --- mmotm/mm/ksm.c 2019-02-14 15:16:13.000000000 -0800
> +++ linux/mm/ksm.c 2019-02-18 20:36:44.707310427 -0800
> @@ -2082,10 +2082,6 @@ static void cmp_and_merge_page(struct pa
>
> /* We first start with searching the page inside the stable tree */
> kpage = stable_tree_search(page);
> -
> - if (PTR_ERR(kpage) == -EBUSY)
> - return;
> -
> if (kpage == page && rmap_item->head == stable_node) {
> put_page(kpage);
> return;
> @@ -2094,6 +2090,9 @@ static void cmp_and_merge_page(struct pa
> remove_rmap_item_from_tree(rmap_item);
>
> if (kpage) {
> + if (PTR_ERR(kpage) == -EBUSY)
> + return;
> +
> err = try_to_merge_with_ksm_page(rmap_item, page, kpage);
> if (!err) {
> /*