2023-03-16 09:23:19

by Yang Yang

Subject:  [PATCH linux-next] mm: workingset: simplify the calculation of workingset size

From: Yang Yang <[email protected]>

After we implemented workingset detection for the anonymous LRU [1],
the calculation of the workingset size became a little complex. Actually there is
no need to call mem_cgroup_get_nr_swap_pages() if the refaulting page is
an anonymous page: since we are swapping, the refault should always
put pressure on NR_ACTIVE_ANON.
So avoid calling mem_cgroup_get_nr_swap_pages() when handling
swapin in workingset_refault(). This also gives us a chance to refactor
the code to make it simpler and easier to understand.

[1] commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")

Signed-off-by: Yang Yang <[email protected]>
Reviewed-by: Wang Yong <[email protected]>
Reviewed-by: Xiaokai Ran <[email protected]>
---
mm/workingset.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index 00c6f4d9d9be..a304e8571d54 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -466,22 +466,23 @@ void workingset_refault(struct folio *folio, void *shadow)
 	/*
 	 * Compare the distance to the existing workingset size. We
 	 * don't activate pages that couldn't stay resident even if
-	 * all the memory was available to the workingset. Whether
-	 * workingset competition needs to consider anon or not depends
-	 * on having swap.
+	 * all the memory was available to the workingset. For page
+	 * cache whether workingset competition needs to consider
+	 * anon or not depends on having swap.
 	 */
 	workingset_size = lruvec_page_state(eviction_lruvec, NR_ACTIVE_FILE);
+	/* For anonymous page */
 	if (!file) {
+		workingset_size += lruvec_page_state(eviction_lruvec,
+						     NR_ACTIVE_ANON);
 		workingset_size += lruvec_page_state(eviction_lruvec,
 						     NR_INACTIVE_FILE);
-	}
-	if (mem_cgroup_get_nr_swap_pages(eviction_memcg) > 0) {
+	/* For page cache */
+	} else if (mem_cgroup_get_nr_swap_pages(eviction_memcg) > 0) {
 		workingset_size += lruvec_page_state(eviction_lruvec,
 						     NR_ACTIVE_ANON);
-		if (file) {
-			workingset_size += lruvec_page_state(eviction_lruvec,
+		workingset_size += lruvec_page_state(eviction_lruvec,
 						     NR_INACTIVE_ANON);
-		}
 	}
 	if (refault_distance > workingset_size)
 		goto out;
--
2.25.1


2023-03-16 13:14:28

by Matthew Wilcox

Subject: Re:  [PATCH linux-next] mm: workingset: simplify the calculation of workingset size

On Thu, Mar 16, 2023 at 05:23:05PM +0800, [email protected] wrote:
> * Compare the distance to the existing workingset size. We
> * don't activate pages that couldn't stay resident even if
> - * all the memory was available to the workingset. Whether
> - * workingset competition needs to consider anon or not depends
> - * on having swap.
> + * all the memory was available to the workingset. For page
> + * cache whether workingset competition needs to consider
> + * anon or not depends on having swap.

I don't mind this change

> */
> workingset_size = lruvec_page_state(eviction_lruvec, NR_ACTIVE_FILE);
> + /* For anonymous page */

This comment adds no value

> if (!file) {
> + workingset_size += lruvec_page_state(eviction_lruvec,
> + NR_ACTIVE_ANON);
> workingset_size += lruvec_page_state(eviction_lruvec,
> NR_INACTIVE_FILE);
> - }
> - if (mem_cgroup_get_nr_swap_pages(eviction_memcg) > 0) {
> + /* For page cache */

Nor this one

> + } else if (mem_cgroup_get_nr_swap_pages(eviction_memcg) > 0) {
> workingset_size += lruvec_page_state(eviction_lruvec,
> NR_ACTIVE_ANON);
> - if (file) {
> - workingset_size += lruvec_page_state(eviction_lruvec,
> + workingset_size += lruvec_page_state(eviction_lruvec,
> NR_INACTIVE_ANON);
> - }
> }

I don't have an opinion on the actual code changes.

2023-03-16 14:30:16

by Johannes Weiner

Subject: Re:  [PATCH linux-next] mm: workingset: simplify the calculation of workingset size

On Thu, Mar 16, 2023 at 05:23:05PM +0800, [email protected] wrote:
> From: Yang Yang <[email protected]>
>
> After we implemented workingset detection for the anonymous LRU [1],
> the calculation of the workingset size became a little complex. Actually there is
> no need to call mem_cgroup_get_nr_swap_pages() if the refaulting page is
> an anonymous page: since we are swapping, the refault should always
> put pressure on NR_ACTIVE_ANON.

This is false.

(mem_cgroup_)get_nr_swap_pages() returns the *free swap slots*. There
might be swap, but if it's full, reclaim stops scanning anonymous
pages altogether. That means that refaults of either type can no
longer displace existing anonymous pages, only cache.

So yes, all refaults need to check free swap to determine how to act
on the reuse frequency.
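
A minimal userspace sketch of the check being discussed, assuming the
pre-patch logic shown in the diff. struct lru_counts and the
free_swap_pages parameter are hypothetical stand-ins for
lruvec_page_state() and mem_cgroup_get_nr_swap_pages(); this is not the
kernel code itself:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-lruvec LRU counters. */
struct lru_counts {
	unsigned long active_file;
	unsigned long inactive_file;
	unsigned long active_anon;
	unsigned long inactive_anon;
};

/*
 * Mirrors the pre-patch calculation: free_swap_pages plays the role of
 * mem_cgroup_get_nr_swap_pages(eviction_memcg), i.e. *free* swap slots.
 */
static unsigned long workingset_size(const struct lru_counts *c, bool file,
				     long free_swap_pages)
{
	unsigned long size = c->active_file;

	/* An anon refault also competes with the inactive file list. */
	if (!file)
		size += c->inactive_file;

	/* Anon pages only count as displaceable while swap has free slots. */
	if (free_swap_pages > 0) {
		size += c->active_anon;
		if (file)
			size += c->inactive_anon;
	}
	return size;
}
```

With counts of 100 active file, 50 inactive file, 200 active anon and 80
inactive anon pages, an anon refault is compared against 350 pages while
swap has free slots, but only against 150 pages once swap is full.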

> @@ -466,22 +466,23 @@ void workingset_refault(struct folio *folio, void *shadow)
> /*
> * Compare the distance to the existing workingset size. We
> * don't activate pages that couldn't stay resident even if
> - * all the memory was available to the workingset. Whether
> - * workingset competition needs to consider anon or not depends
> - * on having swap.
> + * all the memory was available to the workingset. For page
> + * cache whether workingset competition needs to consider
> + * anon or not depends on having swap.

No, it applies to all refaults, not just cache.

What could help is changing the comment to "having free swap space".

2023-03-17 02:06:32

by Yang Yang

Subject: Re: [PATCH linux-next] mm: workingset: simplify the calculation of workingset size

>On Thu, Mar 16, 2023 at 05:23:05PM +0800, [email protected] wrote:
>> From: Yang Yang <[email protected]>
>>
>> After we implemented workingset detection for the anonymous LRU [1],
>> the calculation of the workingset size became a little complex. Actually there is
>> no need to call mem_cgroup_get_nr_swap_pages() if the refaulting page is
>> an anonymous page: since we are swapping, the refault should always
>> put pressure on NR_ACTIVE_ANON.
>
> This is false.
>
> (mem_cgroup_)get_nr_swap_pages() returns the *free swap slots*. There
> might be swap, but if it's full, reclaim stops scanning anonymous
> pages altogether. That means that refaults of either type can no
> longer displace existing anonymous pages, only cache.

I see that with the patch "mm: vmscan: enforce inactive:active ratio at the
reclaim root", reclaim is done on the combined workingset of
different workloads in different cgroups.

So if the current cgroup reaches its swap limit (mem_cgroup_get_nr_swap_pages(memcg) == 0)
but another cgroup still has free swap slots, should we allow the refaulting page
to be activated and put pressure on the other cgroup?

2023-03-17 14:10:05

by Johannes Weiner

Subject: Re: [PATCH linux-next] mm: workingset: simplify the calculation of workingset size

On Fri, Mar 17, 2023 at 01:59:03AM +0000, Yang Yang wrote:
> >On Thu, Mar 16, 2023 at 05:23:05PM +0800, [email protected] wrote:
> >> From: Yang Yang <[email protected]>
> >>
> >> After we implemented workingset detection for the anonymous LRU [1],
> >> the calculation of the workingset size became a little complex. Actually there is
> >> no need to call mem_cgroup_get_nr_swap_pages() if the refaulting page is
> >> an anonymous page: since we are swapping, the refault should always
> >> put pressure on NR_ACTIVE_ANON.
> >
> > This is false.
> >
> > (mem_cgroup_)get_nr_swap_pages() returns the *free swap slots*. There
> > might be swap, but if it's full, reclaim stops scanning anonymous
> > pages altogether. That means that refaults of either type can no
> > longer displace existing anonymous pages, only cache.
>
> I see that with the patch "mm: vmscan: enforce inactive:active ratio at the
> reclaim root", reclaim is done on the combined workingset of
> different workloads in different cgroups.
>
> So if the current cgroup reaches its swap limit (mem_cgroup_get_nr_swap_pages(memcg) == 0)
> but another cgroup still has free swap slots, should we allow the refaulting page
> to be activated and put pressure on the other cgroup?

That's what we do today.

The shadow entry remembers the reclaim root, so that refaults can
later be evaluated at the same level. So, say you have:

    root - A - A1
            `- A2

and A1 and A2 are reclaimed due to a limit in A. The shadow entries of
evictions from A1 and A2 will actually refer to A.

When they refault later on, the distance is interpreted based on
whether A has swap (eviction_lruvec).
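
A toy model of that behaviour, as a sketch only. The types and helpers
below (struct cgroup_node, struct shadow_entry, record_eviction(),
refault_sees_free_swap()) are hypothetical and much simpler than the
kernel's packed shadow entries; the per-node free_swap_pages value is an
assumption standing in for mem_cgroup_get_nr_swap_pages():

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cgroup node; free_swap_pages is a simplified stand-in. */
struct cgroup_node {
	const char *name;
	const struct cgroup_node *parent;
	long free_swap_pages;
};

/* Hypothetical shadow entry: it records the reclaim root, not the owner. */
struct shadow_entry {
	const struct cgroup_node *reclaim_root;
	unsigned long eviction_clock;
};

/*
 * Record an eviction. The page belongs to 'owner' (e.g. A1), but reclaim
 * was triggered at 'reclaim_root' (e.g. A), so that is what gets stored.
 */
static struct shadow_entry record_eviction(const struct cgroup_node *owner,
					   const struct cgroup_node *reclaim_root,
					   unsigned long clock)
{
	struct shadow_entry shadow = {
		.reclaim_root = reclaim_root,
		.eviction_clock = clock,
	};

	(void)owner; /* deliberately not stored in this model */
	return shadow;
}

/* On refault, swap availability is judged at the recorded reclaim root. */
static bool refault_sees_free_swap(const struct shadow_entry *shadow)
{
	return shadow->reclaim_root->free_swap_pages > 0;
}
```

In this model, a page evicted from A1 because of a limit in A is later
evaluated against A's free swap, even if A1 itself has none left.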

2023-03-18 03:10:14

by Yang Yang

Subject: Re: [PATCH linux-next] mm: workingset: simplify the calculation of workingset size

> On Fri, Mar 17, 2023 at 01:59:03AM +0000, Yang Yang wrote:
>> I see that with the patch "mm: vmscan: enforce inactive:active ratio at the
>> reclaim root", reclaim is done on the combined workingset of
>> different workloads in different cgroups.
>>
>> So if the current cgroup reaches its swap limit (mem_cgroup_get_nr_swap_pages(memcg) == 0)
>> but another cgroup still has free swap slots, should we allow the refaulting page
>> to be activated and put pressure on the other cgroup?
>
> That's what we do today.
>
> The shadow entry remembers the reclaim root, so that refaults can
> later be evaluated at the same level. So, say you have:
>
>     root - A - A1
>             `- A2
>
> and A1 and A2 are reclaimed due to a limit in A. The shadow entries of
> evictions from A1 and A2 will actually refer to A.
>
> When they refault later on, the distance is interpreted based on
> whether A has swap (eviction_lruvec).

Thank you very much for the patient explanation.

I still have one question:
In the example above, if (NR_ACTIVE_FILE + NR_INACTIVE_FILE) <
refault_distance < (NR_ACTIVE_FILE + NR_INACTIVE_FILE + NR_ACTIVE_ANON)
and swap is full, the refaulting page is not set active. If some
swap slots are then freed, the newly refaulted page might be reclaimed early
because it is inactive.
If we instead set the refaulting page active even when swap is full,
the page would be protected from early reclaim once swap slots are
freed.
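
The scenario can be made concrete with a small sketch for an anonymous
refault, assuming the pre-patch comparison from the diff. The struct and
function names are hypothetical and the numbers are made up for
illustration:

```c
#include <stdbool.h>

/* Hypothetical snapshot of the counters that matter for an anon refault. */
struct anon_refault_case {
	unsigned long active_file;
	unsigned long inactive_file;
	unsigned long active_anon;
	unsigned long refault_distance;
};

/*
 * Returns true if the refaulting anon page would be activated, i.e. the
 * refault distance does not exceed the workingset size.
 */
static bool anon_refault_activates(const struct anon_refault_case *c,
				   bool swap_has_free_slots)
{
	unsigned long size = c->active_file + c->inactive_file;

	/* Active anon pages only count while swap has free slots. */
	if (swap_has_free_slots)
		size += c->active_anon;

	return c->refault_distance <= size;
}
```

For example, with 100 active file, 50 inactive file and 200 active anon
pages, a refault distance of 250 lies between 150 and 350: the page is
not activated while swap is full, but the same distance would lead to
activation once swap has free slots.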