2009-06-09 09:16:45

by Kamezawa Hiroyuki

Subject: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim


From: KAMEZAWA Hiroyuki <[email protected]>

In lumpty reclaim, "cursor_page" is found just by pfn, so we don't know
which LRU the "cursor" page came from. Putting it back on the "src" list is
therefore a BUG. Just leave it where it is.
(And I think rotating here is overkill even if "src" is correct.)

Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
mm/vmscan.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

Index: mmotm-2.6.30-Jun4/mm/vmscan.c
===================================================================
--- mmotm-2.6.30-Jun4.orig/mm/vmscan.c
+++ mmotm-2.6.30-Jun4/mm/vmscan.c
@@ -940,10 +940,9 @@ static unsigned long isolate_lru_pages(u
nr_taken++;
scan++;
break;
-
case -EBUSY:
- /* else it is being freed elsewhere */
- list_move(&cursor_page->lru, src);
+ /* Do nothing because we don't know where
+ cursor_page comes from */
default:
break; /* ! on LRU or wrong list */
}


2009-06-09 09:24:31

by Kamezawa Hiroyuki

Subject: [PATCH] memcg: fix mem_cgroup_isolate_lru_page to use the same rotate logic at busy path

From: KAMEZAWA Hiroyuki <[email protected]>

This patch tries to fix memcg's lru rotation sanity...make memcg use
the same logic as global LRU does.

Now, when __isolate_lru_page() returns -EBUSY, the page is rotated to
the tail of the LRU in the global LRU's isolate_lru_pages(). But in memcg,
this is not handled. This patch makes memcg behave the same as the global
LRU and rotate the page in its LRU if the page is busy.

Note: __isolate_lru_page() is not isolate_lru_page() and it's just used
in sc->isolate_pages() logic.

Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>

---
mm/memcontrol.c | 13 ++++++++++++-
mm/vmscan.c | 4 +++-
2 files changed, 15 insertions(+), 2 deletions(-)

Index: mmotm-2.6.30-Jun4/mm/vmscan.c
===================================================================
--- mmotm-2.6.30-Jun4.orig/mm/vmscan.c
+++ mmotm-2.6.30-Jun4/mm/vmscan.c
@@ -842,7 +842,6 @@ int __isolate_lru_page(struct page *page
*/
ClearPageLRU(page);
ret = 0;
- mem_cgroup_del_lru(page);
}

return ret;
@@ -890,12 +889,14 @@ static unsigned long isolate_lru_pages(u
switch (__isolate_lru_page(page, mode, file)) {
case 0:
list_move(&page->lru, dst);
+ mem_cgroup_del_lru(page);
nr_taken++;
break;

case -EBUSY:
/* else it is being freed elsewhere */
list_move(&page->lru, src);
+ mem_cgroup_rotate_lru_list(page, page_lru(page));
continue;

default:
@@ -937,6 +938,7 @@ static unsigned long isolate_lru_pages(u
switch (__isolate_lru_page(cursor_page, mode, file)) {
case 0:
list_move(&cursor_page->lru, dst);
+ mem_cgroup_del_lru(page);
nr_taken++;
scan++;
break;
Index: mmotm-2.6.30-Jun4/mm/memcontrol.c
===================================================================
--- mmotm-2.6.30-Jun4.orig/mm/memcontrol.c
+++ mmotm-2.6.30-Jun4/mm/memcontrol.c
@@ -649,6 +649,7 @@ unsigned long mem_cgroup_isolate_pages(u
int zid = zone_idx(z);
struct mem_cgroup_per_zone *mz;
int lru = LRU_FILE * !!file + !!active;
+ int ret;

BUG_ON(!mem_cont);
mz = mem_cgroup_zoneinfo(mem_cont, nid, zid);
@@ -666,9 +667,19 @@ unsigned long mem_cgroup_isolate_pages(u
continue;

scan++;
- if (__isolate_lru_page(page, mode, file) == 0) {
+ ret = __isolate_lru_page(page, mode, file);
+ switch (ret) {
+ case 0:
list_move(&page->lru, dst);
+ mem_cgroup_del_lru(page);
nr_taken++;
+ break;
+ case -EBUSY:
+ /* we don't affect global LRU but rotate in our LRU */
+ mem_cgroup_rotate_lru_list(page, page_lru(page));
+ break;
+ default:
+ break;
}
}
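
The busy-path handling described in this patch can be sketched outside the kernel. The following is a minimal toy model, not the memcg code: struct toy_lru, toy_page_busy() and toy_isolate() are hypothetical names, an array stands in for the kernel's linked list, and "odd page ids are busy" is an assumption made only so the example has something to rotate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 8

/* Hypothetical toy LRU: pages[n - 1] is the end that the scan takes from. */
struct toy_lru {
    int pages[TOY_MAX];
    size_t n;
};

/* Assumption for this sketch only: odd page ids stand for busy pages. */
static int toy_page_busy(int page)
{
    return page % 2 != 0;
}

/*
 * Scan up to nr_to_scan pages from the scan end of src.
 * Isolated pages move to dst; busy pages are rotated to the far end
 * of src (index 0) so the next scan does not hit them again at once.
 * Returns the number of pages moved to dst.
 */
static size_t toy_isolate(struct toy_lru *src, struct toy_lru *dst,
                          size_t nr_to_scan)
{
    size_t scan, taken = 0;

    for (scan = 0; scan < nr_to_scan && src->n > 0; scan++) {
        int page = src->pages[--src->n];

        if (toy_page_busy(page)) {
            /* Busy: shift the rest up by one and reinsert at index 0. */
            memmove(&src->pages[1], &src->pages[0],
                    src->n * sizeof(src->pages[0]));
            src->pages[0] = page;
            src->n++;
            continue;
        }

        /* Isolated: move to the destination list. */
        assert(dst->n < TOY_MAX);
        dst->pages[dst->n++] = page;
        taken++;
    }
    return taken;
}
```

Calling toy_isolate() on a list holding 1, 2, 3, 4 with nr_to_scan = 3 takes the two even pages and leaves the busy page 3 at the far end of the source list.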

2009-06-09 09:29:38

by KOSAKI Motohiro

Subject: Re: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim

>
> From: KAMEZAWA Hiroyuki <[email protected]>
>
> In lumpty reclaim, "cursor_page" is found just by pfn. Then, we don't know
^^^^^^
lumpy?

> from which LRU "cursor" page came from. Then, putback it to "src" list is BUG.
> Just leave it as it is.
> (And I think rotate here is overkilling even if "src" is correct.)
>
> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>

Yes, thanks, great catch!

Neighbor pages taken by lumpy reclaim don't need to be rotated, because
neighbor pages don't stay at the head of the LRU list.


Reviewed-by: KOSAKI Motohiro <[email protected]>



> ---
> mm/vmscan.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> Index: mmotm-2.6.30-Jun4/mm/vmscan.c
> ===================================================================
> --- mmotm-2.6.30-Jun4.orig/mm/vmscan.c
> +++ mmotm-2.6.30-Jun4/mm/vmscan.c
> @@ -940,10 +940,9 @@ static unsigned long isolate_lru_pages(u
> nr_taken++;
> scan++;
> break;
> -
> case -EBUSY:
> - /* else it is being freed elsewhere */
> - list_move(&cursor_page->lru, src);
> + /* Do nothing because we don't know where
> + cusrsor_page comes from */
> default:
> break; /* ! on LRU or wrong list */
> }
>


2009-06-09 10:00:27

by Minchan Kim

Subject: Re: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim

On Tue, Jun 9, 2009 at 6:15 PM, KAMEZAWA Hiroyuki <[email protected]> wrote:
>
> From: KAMEZAWA Hiroyuki <[email protected]>
>
> In lumpty reclaim, "cursor_page" is found just by pfn. Then, we don't know
> from which LRU "cursor" page came from. Then, putback it to "src" list is BUG.
> Just leave it as it is.
> (And I think rotate here is overkilling even if "src" is correct.)
>
> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
> ---
> mm/vmscan.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> Index: mmotm-2.6.30-Jun4/mm/vmscan.c
> ===================================================================
> --- mmotm-2.6.30-Jun4.orig/mm/vmscan.c
> +++ mmotm-2.6.30-Jun4/mm/vmscan.c
> @@ -940,10 +940,9 @@ static unsigned long isolate_lru_pages(u
> nr_taken++;
> scan++;
> break;
> -
> case -EBUSY:

We can remove case -EBUSY itself, too.
It is meaningless.

> - /* else it is being freed elsewhere */
> - list_move(&cursor_page->lru, src);
> + /* Do nothing because we don't know where
> + cusrsor_page comes from */
> default:
> break; /* ! on LRU or wrong list */

Hmm.. what meaning of this break ?
We are in switch case.
This "break" can't go out of loop.
But comment said "abort this block scan".

If I understand it properly , don't we add goto phrase ?

> }
>

--
Kinds regards,
Minchan Kim

2009-06-09 11:20:39

by Kamezawa Hiroyuki

Subject: Re: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim


Minchan Kim wrote:
> On Tue, Jun 9, 2009 at 6:15 PM, KAMEZAWA
> Hiroyuki<[email protected]> wrote:
>>
>> From: KAMEZAWA Hiroyuki <[email protected]>
>>
>> In lumpty reclaim, "cursor_page" is found just by pfn. Then, we don't
>> know
>> from which LRU "cursor" page came from. Then, putback it to "src" list
>> is BUG.
>> Just leave it as it is.
>> (And I think rotate here is overkilling even if "src" is correct.)
>>
>> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
>> ---
>> mm/vmscan.c | 5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> Index: mmotm-2.6.30-Jun4/mm/vmscan.c
>> ===================================================================
>> --- mmotm-2.6.30-Jun4.orig/mm/vmscan.c
>> +++ mmotm-2.6.30-Jun4/mm/vmscan.c
>> @@ -940,10 +940,9 @@ static unsigned long isolate_lru_pages(u
>> nr_taken++;
>> scan++;
>> break;
>> -
>> case -EBUSY:
>
> We can remove case -EBUSY itself, too.
> It is meaningless.
>
Sure, will post v2 and remove EBUSY case.
(I'm sorry my webmail system converts a space to a multibyte char...
then I cut some.)

>> - /* else it is being freed elsewhere */
>> - list_move(&cursor_page->lru, src);
>> + /* Do nothing because we don't know where
>> + cusrsor_page comes from */
>> default:
>> break; /* ! on LRU or wrong list */
>
> Hmm.. what meaning of this break ?
> We are in switch case.
> This "break" can't go out of loop.
yes.

> But comment said "abort this block scan".
>
Where? The comment says the cursor_page is not on the LRU (PG_lru is unset).

> If I understand it properly , don't we add goto phrase ?
>
No.

Just try next page on list.

Thank you for review, I'll post updated one tomorrow.
-Kame

2009-06-09 11:30:09

by Minchan Kim

Subject: Re: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim

2009/6/9 KAMEZAWA Hiroyuki <[email protected]>:
>
> Minchan Kim wrote:
>> On Tue, Jun 9, 2009 at 6:15 PM, KAMEZAWA
>> Hiroyuki<[email protected]> wrote:
>>>
>>> From: KAMEZAWA Hiroyuki <[email protected]>
>>>
>>> In lumpty reclaim, "cursor_page" is found just by pfn. Then, we don't
>>> know
>>> from which LRU "cursor" page came from. Then, putback it to "src" list
>>> is BUG.
>>> Just leave it as it is.
>>> (And I think rotate here is overkilling even if "src" is correct.)
>>>
>>> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
>>> ---
>>> mm/vmscan.c | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> Index: mmotm-2.6.30-Jun4/mm/vmscan.c
>>> ===================================================================
>>> --- mmotm-2.6.30-Jun4.orig/mm/vmscan.c
>>> +++ mmotm-2.6.30-Jun4/mm/vmscan.c
>>> @@ -940,10 +940,9 @@ static unsigned long isolate_lru_pages(u
>>> nr_taken++;
>>> scan++;
>>> break;
>>> -
>>> case -EBUSY:
>>
>> We can remove case -EBUSY itself, too.
>> It is meaningless.
>>
> Sure, will post v2 and remove EBUSY case.
> (I'm sorry my webmail system converts a space to a multibyte char...
>  then I cut some.)
>
>>> - /* else it is being freed elsewhere */
>>> - list_move(&cursor_page->lru, src);
>>> + /* Do nothing because we don't know where
>>> + cusrsor_page comes from */
>>> default:
>>> break; /* ! on LRU or wrong list */
>>
>> Hmm.. what meaning of this break ?
>> We are in switch case.
>> This "break" can't go out of loop.
> yes.
>
>> But comment said "abort this block scan".
>>
> Where ? the comment says the cursor_page is not on lru (PG_lru is unset)

I mean the following comment:
908 /*
909 * Attempt to take all pages in the order aligned region
910 * surrounding the tag page. Only take those pages of
911 * the same active state as that tag page. We may safely
912 * round the target page pfn down to the requested order
913 * as the mem_map is guarenteed valid out to MAX_ORDER,
914 * where that page is in a different zone we will detect
915 * it from its zone id and abort this block scan.
916 */
917 zone_id = page_zone_id(page);


>> If I understand it properly , don't we add goto phrase ?
>>
> No.

If so, the break is meaningless, too.

> Just try next page on list.
>
> Thank you for review, I'll post updated one tomorrow.
> -Kame
>
>



--
Kinds regards,
Minchan Kim

2009-06-09 11:47:57

by Kamezawa Hiroyuki

Subject: Re: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim

Minchan Kim wrote:

> I mean follow as
> 908 /*
> 909 * Attempt to take all pages in the order aligned region
> 910 * surrounding the tag page. Only take those pages of
> 911 * the same active state as that tag page. We may safely
> 912 * round the target page pfn down to the requested order
> 913 * as the mem_map is guarenteed valid out to MAX_ORDER,
> 914 * where that page is in a different zone we will detect
> 915 * it from its zone id and abort this block scan.
> 916 */
> 917 zone_id = page_zone_id(page);
>
But what this code really does is:
==
931 /* Check that we have not crossed a zone boundary. */
932 if (unlikely(page_zone_id(cursor_page) != zone_id))
933 continue;
==
It does "continue". I think this should be "break".
I wonder whether "this block scan" means "scanning this aligned block".

But I think the whole code is not written as commented.

>
>>> If I understand it properly , don't we add goto phrase ?
>>>
>> No.
>
> If it is so, the break also is meaningless.
>
Yes, I'll remove it. But then I need to add the "exit from the for loop" logic again.

I'm sorry that I overlooked the wrong logic in this loop.
I'll review and rewrite all of this part tomorrow.

Thanks,
-Kame

2009-06-09 12:05:06

by Balbir Singh

Subject: Re: [PATCH] memcg: fix mem_cgroup_isolate_lru_page to use the same rotate logic at busy path

* KAMEZAWA Hiroyuki <[email protected]> [2009-06-09 18:22:53]:

> From: KAMEZAWA Hiroyuki <[email protected]>
>
> This patch tries to fix memcg's lru rotation sanity...make memcg use
> the same logic as global LRU does.
>
> Now, at __isolate_lru_page() retruns -EBUSY, the page is rotated to
> the tail of LRU in global LRU's isolate LRU pages. But in memcg,
> it's not handled. This makes memcg do the same behavior as global LRU
> and rotate LRU in the page is busy.
>
> Note: __isolate_lru_page() is not isolate_lru_page() and it's just used
> in sc->isolate_pages() logic.
>
> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>


Acked-by: Balbir Singh <[email protected]>


--
Balbir

2009-06-09 12:07:48

by Minchan Kim

Subject: Re: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim

2009/6/9 KAMEZAWA Hiroyuki <[email protected]>:
> Minchan Kim wrote:
>
>> I mean follow as
>>  908         /*
>>  909          * Attempt to take all pages in the order aligned region
>>  910          * surrounding the tag page.  Only take those pages of
>>  911          * the same active state as that tag page.  We may safely
>>  912          * round the target page pfn down to the requested order
>>  913          * as the mem_map is guarenteed valid out to MAX_ORDER,
>>  914          * where that page is in a different zone we will detect
>>  915          * it from its zone id and abort this block scan.
>>  916          */
>>  917         zone_id = page_zone_id(page);
>>
> But what this code really do is.
> ==
> 931                         /* Check that we have not crossed a zone
> boundary. */
>  932                         if (unlikely(page_zone_id(cursor_page) !=
> zone_id))
>  933                                 continue;
> ==
> continue. I think this should be "break"
> I wonder what "This block scan" means is "scanning this aligned block".

The check is there to find the first page in the same zone as the target
page when we have crossed a zone boundary, so the scan shouldn't stop
because of it.

I think 'abort' means skipping only that page.
If that is right, it would be better to change the comment to:
"and continue scanning next page"

Let's Cc Andy Whitcroft.

> But I think the whoe code is not written as commented.
>
>>
>>>> If I understand it properly , don't we add goto phrase ?
>>>>
>>> No.
>>
>> If it is so, the break also is meaningless.
>>
> yes. I'll remove it. But need to add "exit from for loop" logic again.
>
> I'm sorry that the wrong logic of this loop was out of my sight.
> I'll review and rewrite this part all, tomorrow.

Yes. I will review tomorrow, too. :)

> Thanks,
> -Kame
>
>



--
Kinds regards,
Minchan Kim

2009-06-09 13:01:43

by Andy Whitcroft

Subject: Re: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim

On Tue, Jun 09, 2009 at 09:07:16PM +0900, Minchan Kim wrote:
> 2009/6/9 KAMEZAWA Hiroyuki <[email protected]>:
> > Minchan Kim wrote:
> >
> >> I mean follow as
> >>  908         /*
> >>  909          * Attempt to take all pages in the order aligned region
> >>  910          * surrounding the tag page.  Only take those pages of
> >>  911          * the same active state as that tag page.  We may safely
> >>  912          * round the target page pfn down to the requested order
> >>  913          * as the mem_map is guarenteed valid out to MAX_ORDER,
> >>  914          * where that page is in a different zone we will detect
> >>  915          * it from its zone id and abort this block scan.
> >>  916          */
> >>  917         zone_id = page_zone_id(page);
> >>
> > But what this code really do is.
> > ==
> > 931                         /* Check that we have not crossed a zone
> > boundary. */
> >  932                         if (unlikely(page_zone_id(cursor_page) !=
> > zone_id))
> >  933                                 continue;
> > ==
> > continue. I think this should be "break"
> > I wonder what "This block scan" means is "scanning this aligned block".
>
> It is to find first page in same zone with target page when we have
> crossed a zone.
> so it shouldn't stop due to that.
>
> I think 'abort' means stopping only the page.
> If it is right, it would be better to change follow as.
> "and continue scanning next page"
>
> Let's Cced Andy Whitcroft.
>
> > But I think the whoe code is not written as commented.
> >
> >>
> >>>> If I understand it properly , don't we add goto phrase ?
> >>>>
> >>> No.
> >>
> >> If it is so, the break also is meaningless.
> >>
> > yes. I'll remove it. But need to add "exit from for loop" logic again.
> >
> > I'm sorry that the wrong logic of this loop was out of my sight.
> > I'll review and rewrite this part all, tomorrow.
>
> Yes. I will review tomorrow, too. :)

The comment is not the best wording. The point here is that we need to
round down in order to safely scan the free blocks as they are only
marked at the start. In rounding down however we may move back into the
previous zone as zones are not necessarily MAX_ORDER aligned. We want
to ignore the bit before our zone starts and that check moves us on to
the next page. It should be noted that this occurs rarely, ie. only
when we touch the start of a zone and only then where the zone
boundaries are not MAX_ORDER aligned.

-apw
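
The rounding-down and zone check described above can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel code: struct toy_page and toy_scan_block() are hypothetical names, zone membership is reduced to an integer id, and unlike the real loop the sketch also counts the target page itself. The abort_on_mismatch flag exists only to contrast "continue" with "break".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the zone id matters here. */
struct toy_page {
    int zone_id;
};

/*
 * Count the pages in the order-aligned block around target_pfn that
 * belong to the same zone as the target page.
 *
 * abort_on_mismatch == 0: skip pages from another zone ("continue").
 * abort_on_mismatch != 0: stop at the first such page ("break").
 */
static size_t toy_scan_block(const struct toy_page *mem_map, size_t nr_pages,
                             size_t target_pfn, unsigned int order,
                             int abort_on_mismatch)
{
    size_t block = (size_t)1 << order;
    size_t start = target_pfn & ~(block - 1);   /* round down to alignment */
    size_t end = start + block;
    size_t pfn, taken = 0;
    int zone_id;

    assert(target_pfn < nr_pages && end <= nr_pages);
    zone_id = mem_map[target_pfn].zone_id;

    for (pfn = start; pfn < end; pfn++) {
        if (mem_map[pfn].zone_id != zone_id) {
            if (abort_on_mismatch)
                break;      /* would give up before reaching our zone */
            continue;       /* ignore the part before our zone starts */
        }
        taken++;
    }
    return taken;
}
```

With a zone boundary at pfn 3 inside an order-3 block and the target at pfn 5, skipping finds the five pages of the target's zone, while aborting at the first mismatch finds none, because the block starts in the previous zone.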

2009-06-09 14:01:34

by Minchan Kim

Subject: Re: [BUGFIX][PATCH] fix wrong lru rotate back at lumpty reclaim

Hi, Andy.

On Tue, Jun 9, 2009 at 10:00 PM, Andy Whitcroft<[email protected]> wrote:
> On Tue, Jun 09, 2009 at 09:07:16PM +0900, Minchan Kim wrote:
>> 2009/6/9 KAMEZAWA Hiroyuki <[email protected]>:
>> > Minchan Kim wrote:
>> >
>> >> I mean follow as
>> >>  908         /*
>> >>  909          * Attempt to take all pages in the order aligned region
>> >>  910          * surrounding the tag page.  Only take those pages of
>> >>  911          * the same active state as that tag page.  We may safely
>> >>  912          * round the target page pfn down to the requested order
>> >>  913          * as the mem_map is guarenteed valid out to MAX_ORDER,
>> >>  914          * where that page is in a different zone we will detect
>> >>  915          * it from its zone id and abort this block scan.
>> >>  916          */
>> >>  917         zone_id = page_zone_id(page);
>> >>
>> > But what this code really do is.
>> > ==
>> > 931                         /* Check that we have not crossed a zone
>> > boundary. */
>> >  932                         if (unlikely(page_zone_id(cursor_page) !=
>> > zone_id))
>> >  933                                 continue;
>> > ==
>> > continue. I think this should be "break"
>> > I wonder what "This block scan" means is "scanning this aligned block".
>>
>> It is to find first page in same zone with target page when we have
>> crossed a zone.
>> so it shouldn't stop due to that.
>>
>> I think 'abort' means stopping only the page.
>> If it is right, it would be better to change follow as.
>> "and continue scanning next page"
>>
>> Let's Cced Andy Whitcroft.
>>
>> > But I think the whoe code is not written as commented.
>> >
>> >>
>> >>>> If I understand it properly , don't we add goto phrase ?
>> >>>>
>> >>> No.
>> >>
>> >> If it is so, the break also is meaningless.
>> >>
>> > yes. I'll remove it. But need to add "exit from for loop" logic again.
>> >
>> > I'm sorry that the wrong logic of this loop was out of my sight.
>> > I'll review and rewrite this part all, tomorrow.
>>
>> Yes. I will review tomorrow, too. :)
>
> The comment is not the best wording.  The point here is that we need to
> round down in order to safely scan the free blocks as they are only
> marked at the start.  In rounding down however we may move back into the
> previous zone as zones are not necessarily MAX_ORDER aligned.  We want
> to ignore the bit before our zone starts and that check moves us on to
> the next page.  It should be noted that this occurs rarely, ie. only
> when we touch the start of a zone and only then where the zone
> boundaries are not MAX_ORDER aligned.

Thanks for kind explanation.

I think the issue in this thread is the 'break' marked below.

...
        cursor_page = pfn_to_page(pfn);

        /* Check that we have not crossed a zone boundary. */
        if (unlikely(page_zone_id(cursor_page) != zone_id))
                continue;
        switch (__isolate_lru_page(cursor_page, mode, file)) {
        case 0:
                list_move(&cursor_page->lru, dst);
                nr_taken++;
                scan++;
                break;

        case -EBUSY:
                /* else it is being freed elsewhere */
                list_move(&cursor_page->lru, src);
        default:
                break; /* ! on LRU or wrong list */
                <====== HERE
        }
}
}
...

I think you meant that if we meet a non-LRU page, the scan should stop.
That's because we are having trouble with a high-order page allocation.
So, if we fail to get contiguous page frames, there is no point in
scanning any more.

But that break can't leave the loop, because it is inside the switch
statement. So if we really want to break out of the loop, we have to use
a goto.
What do you think?

> -apw
>



--
Kinds regards,
Minchan Kim
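
The C rule at the heart of this question can be shown with a small stand-alone sketch. The names toy_visit_break(), toy_visit_goto() and the TOY_* result codes are hypothetical and only mimic the shape of the loop above; the goto variant illustrates the language mechanics and is not a statement about what the kernel loop should do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes standing in for 0, -EBUSY and "not on LRU". */
enum { TOY_OK = 0, TOY_BUSY = 1, TOY_NOT_LRU = 2 };

/* "break" inside the switch leaves only the switch; the loop goes on. */
static size_t toy_visit_break(const int *results, size_t n, size_t *taken)
{
    size_t i, visited = 0;

    assert(results != NULL && taken != NULL);
    *taken = 0;
    for (i = 0; i < n; i++) {
        visited++;
        switch (results[i]) {
        case TOY_OK:
            (*taken)++;
            break;
        case TOY_BUSY:
        default:
            break;      /* exits the switch, not the for loop */
        }
    }
    return visited;
}

/* To leave the loop from inside the switch, a goto (or a flag) is needed. */
static size_t toy_visit_goto(const int *results, size_t n, size_t *taken)
{
    size_t i, visited = 0;

    assert(results != NULL && taken != NULL);
    *taken = 0;
    for (i = 0; i < n; i++) {
        visited++;
        switch (results[i]) {
        case TOY_OK:
            (*taken)++;
            break;
        case TOY_BUSY:
            break;
        default:
            goto out;   /* really aborts the scan */
        }
    }
out:
    return visited;
}
```

Given the results OK, NOT_LRU, OK, OK, the break variant visits all four entries and takes three, while the goto variant stops at the second entry having taken one.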

2009-06-10 00:12:39

by Daisuke Nishimura

Subject: Re: [PATCH] memcg: fix mem_cgroup_isolate_lru_page to use the same rotate logic at busy path

On Tue, 9 Jun 2009 18:22:53 +0900, KAMEZAWA Hiroyuki <[email protected]> wrote:
> From: KAMEZAWA Hiroyuki <[email protected]>
>
> This patch tries to fix memcg's lru rotation sanity...make memcg use
> the same logic as global LRU does.
>
> Now, at __isolate_lru_page() retruns -EBUSY, the page is rotated to
> the tail of LRU in global LRU's isolate LRU pages. But in memcg,
> it's not handled. This makes memcg do the same behavior as global LRU
> and rotate LRU in the page is busy.
>
> Note: __isolate_lru_page() is not isolate_lru_page() and it's just used
> in sc->isolate_pages() logic.
>
> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
>
Looks good to me.

Reviewed-by: Daisuke Nishimura <[email protected]>

> ---
> mm/memcontrol.c | 13 ++++++++++++-
> mm/vmscan.c | 4 +++-
> 2 files changed, 15 insertions(+), 2 deletions(-)
>
> Index: mmotm-2.6.30-Jun4/mm/vmscan.c
> ===================================================================
> --- mmotm-2.6.30-Jun4.orig/mm/vmscan.c
> +++ mmotm-2.6.30-Jun4/mm/vmscan.c
> @@ -842,7 +842,6 @@ int __isolate_lru_page(struct page *page
> */
> ClearPageLRU(page);
> ret = 0;
> - mem_cgroup_del_lru(page);
> }
>
> return ret;
> @@ -890,12 +889,14 @@ static unsigned long isolate_lru_pages(u
> switch (__isolate_lru_page(page, mode, file)) {
> case 0:
> list_move(&page->lru, dst);
> + mem_cgroup_del_lru(page);
> nr_taken++;
> break;
>
> case -EBUSY:
> /* else it is being freed elsewhere */
> list_move(&page->lru, src);
> + mem_cgroup_rotate_lru_list(page, page_lru(page));
> continue;
>
> default:
> @@ -937,6 +938,7 @@ static unsigned long isolate_lru_pages(u
> switch (__isolate_lru_page(cursor_page, mode, file)) {
> case 0:
> list_move(&cursor_page->lru, dst);
> + mem_cgroup_del_lru(page);
> nr_taken++;
> scan++;
> break;
> Index: mmotm-2.6.30-Jun4/mm/memcontrol.c
> ===================================================================
> --- mmotm-2.6.30-Jun4.orig/mm/memcontrol.c
> +++ mmotm-2.6.30-Jun4/mm/memcontrol.c
> @@ -649,6 +649,7 @@ unsigned long mem_cgroup_isolate_pages(u
> int zid = zone_idx(z);
> struct mem_cgroup_per_zone *mz;
> int lru = LRU_FILE * !!file + !!active;
> + int ret;
>
> BUG_ON(!mem_cont);
> mz = mem_cgroup_zoneinfo(mem_cont, nid, zid);
> @@ -666,9 +667,19 @@ unsigned long mem_cgroup_isolate_pages(u
> continue;
>
> scan++;
> - if (__isolate_lru_page(page, mode, file) == 0) {
> + ret = __isolate_lru_page(page, mode, file);
> + switch (ret) {
> + case 0:
> list_move(&page->lru, dst);
> + mem_cgroup_del_lru(page);
> nr_taken++;
> + break;
> + case -EBUSY:
> + /* we don't affect global LRU but rotate in our LRU */
> + mem_cgroup_rotate_lru_list(page, page_lru(page));
> + break;
> + default:
> + break;
> }
> }
>
>