2021-01-13 03:42:37

by Huang, Ying

Subject: [PATCH] mm: Free unused swap cache page in write protection fault handler

Commit 09854ba94c6a ("mm: do_wp_page() simplification") introduces an
issue as follows.

On a system whose memory usage before the test is as follows,

              total        used        free      shared  buff/cache   available
Mem:        1697300      160156     1459220        8648       77924     1419724
Swap:       1048572           0           0

The AnonPages field of /proc/meminfo is 11712 kB. After running a
memory eater that triggers many swapins and write protection
faults, the free memory becomes,

              total        used        free      shared  buff/cache   available
Mem:        1697300      352620     1309004         624       35676     1252380
Swap:       1048572      216924      831648

while /proc/meminfo shows,

SwapCached: 198908 kB
AnonPages: 1956 kB

Then, with `swapoff -a`, the free memory becomes,

              total        used        free      shared  buff/cache   available
Mem:        1697300      161972     1488184        8648       47144     1433172
Swap:             0           0           0

That is, after swapins and write protection faults, many unused swap
cache pages are left unfreed in the system. Although subsequent
page reclaim or swapoff will free these pages, it is still better to
free them in the first place.
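
[Editor's sketch, not part of the original mail] The SwapCached and AnonPages figures quoted in this report come from /proc/meminfo. A minimal way to pull out such a field, assuming the usual `Key:   value kB` line layout, might look like the following. The helper names `meminfo_field_kb()` and `meminfo_read_kb()` are hypothetical and are not taken from the patch or from any kernel tool.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the value in kB of `key` (for example
 * "SwapCached") from a /proc/meminfo-style text buffer, or -1 if the
 * key does not appear at the start of any line.
 */
long meminfo_field_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* A match needs the key at line start, directly followed by ':' */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':')
			return strtol(line + klen + 1, NULL, 10);
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}
	return -1;
}

/*
 * Hypothetical helper: read /proc/meminfo into a buffer and look up one
 * field. Assumes the file fits into 8 KiB; returns -1 on any failure.
 */
long meminfo_read_kb(const char *key)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return meminfo_field_kb(buf, key);
}
```

Calling `meminfo_read_kb("SwapCached")` before and after the memory eater run would give the kind of comparison shown in this report; a large SwapCached value next to a small AnonPages value is the symptom being described.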

So this patch tries to free the old unused swap cache page at the
end of wp_page_copy(). With that, after running the memory eater
that triggers many swapins and write protection faults, the free
memory is,

              total        used        free      shared  buff/cache   available
Mem:        1697300      154020     1509400        1212       33880     1451524
Swap:       1048572       18432     1030140

while /proc/meminfo shows,

SwapCached: 1240 kB
AnonPages: 1904 kB

BTW: I think this should be in stable after v5.9.

Fixes: 09854ba94c6a ("mm: do_wp_page() simplification")
Signed-off-by: "Huang, Ying" <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Tim Chen <[email protected]>
---
 mm/memory.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index feff48e1465a..2abaff1befcb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2963,6 +2963,11 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 				munlock_vma_page(old_page);
 			unlock_page(old_page);
 		}
+		if (page_copied && PageSwapCache(old_page) &&
+		    !page_mapped(old_page) && trylock_page(old_page)) {
+			try_to_free_swap(old_page);
+			unlock_page(old_page);
+		}
 		put_page(old_page);
 	}
 	return page_copied ? VM_FAULT_WRITE : 0;
--
2.29.2
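
[Editor's sketch, not part of the original mail] The condition added by the hunk above can be modelled in plain userspace C to make its ordering explicit: the cheap checks run first, the page lock is only tried (never waited on), and the free attempt happens under that lock. Everything below is a toy; `struct toy_page`, its fields, and the `toy_*` functions are assumptions made for illustration and do not reproduce the kernel's `struct page` or the real `try_to_free_swap()`, which performs its own additional checks.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct page; every field here is an assumption. */
struct toy_page {
	atomic_flag locked;	/* models the page lock */
	bool in_swap_cache;	/* models PageSwapCache() */
	int mapcount;		/* > 0 models page_mapped() */
	int swap_refs;		/* models remaining users of the swap entry */
};

/* Models trylock_page(): take the lock if it is free, never block. */
bool toy_trylock(struct toy_page *p)
{
	/* test_and_set returns the previous state; false means we got it */
	return !atomic_flag_test_and_set(&p->locked);
}

/* Models unlock_page(). */
void toy_unlock(struct toy_page *p)
{
	atomic_flag_clear(&p->locked);
}

/* Loosely models try_to_free_swap(): drop the page from the swap cache
 * only if the swap entry has no other users. */
bool toy_try_to_free_swap(struct toy_page *p)
{
	if (!p->in_swap_cache || p->swap_refs > 0)
		return false;
	p->in_swap_cache = false;
	return true;
}

/* Mirrors the shape of the condition in the hunk; returns true if the
 * old page left the (toy) swap cache. */
bool toy_cleanup_old_page(struct toy_page *old, bool page_copied)
{
	bool freed = false;

	if (page_copied && old->in_swap_cache &&
	    old->mapcount == 0 && toy_trylock(old)) {
		freed = toy_try_to_free_swap(old);
		toy_unlock(old);
	}
	return freed;
}
```

In this model a page that is still mapped, or whose lock is already held by someone else, is simply skipped, which is the behaviour a trylock-based cleanup is meant to have: the fault path never waits for the lock.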


2021-01-13 03:42:56

by Linus Torvalds

Subject: Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

On Tue, Jan 12, 2021 at 6:43 PM Huang Ying <[email protected]> wrote:
>
> So in this patch, at the end of wp_page_copy(), the old unused swap
> cache page will be tried to be freed.

I'd much rather free it later when needed, rather than when you're in
a COW section.

Linus

2021-01-13 03:44:17

by Matthew Wilcox

Subject: Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

On Wed, Jan 13, 2021 at 11:08:56AM +0800, huang ying wrote:
> On Wed, Jan 13, 2021 at 10:47 AM Linus Torvalds
> <[email protected]> wrote:
> >
> > On Tue, Jan 12, 2021 at 6:43 PM Huang Ying <[email protected]> wrote:
> > >
> > > So in this patch, at the end of wp_page_copy(), the old unused swap
> > > cache page will be tried to be freed.
> >
> > I'd much rather free it later when needed, rather than when you're in
> > a COW section.
>
> Unused swap cache isn't unused file cache. Nobody can reuse them
> directly before freeing them firstly. It will make COW a little
> faster via keeping them. But I think the overhead to free them isn't
> high. While keeping them in system will confuse users (users will
> expect file cache to use free memory, but not expect unused swap cache
> to use much free memory), make the swap space more fragmented, and add
> system overall overhead (scanning LRU list, etc.).

Couldn't we just move it to the tail of the LRU list so it's reclaimed
first? Or is locking going to be a problem here?

2021-01-13 03:44:32

by huang ying

Subject: Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

On Wed, Jan 13, 2021 at 10:47 AM Linus Torvalds
<[email protected]> wrote:
>
> On Tue, Jan 12, 2021 at 6:43 PM Huang Ying <[email protected]> wrote:
> >
> > So in this patch, at the end of wp_page_copy(), the old unused swap
> > cache page will be tried to be freed.
>
> I'd much rather free it later when needed, rather than when you're in
> a COW section.

Unused swap cache isn't like unused file cache. Nobody can reuse those
pages directly without freeing them first. Keeping them makes COW a
little faster, but I think the overhead of freeing them isn't
high. Meanwhile, keeping them in the system will confuse users (users
expect file cache to use free memory, but do not expect unused swap cache
to use much free memory), make the swap space more fragmented, and add
overall system overhead (scanning the LRU list, etc.).

Best Regards,
Huang, Ying

2021-01-13 05:26:24

by huang ying

Subject: Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

On Wed, Jan 13, 2021 at 11:12 AM Matthew Wilcox <[email protected]> wrote:
>
> On Wed, Jan 13, 2021 at 11:08:56AM +0800, huang ying wrote:
> > On Wed, Jan 13, 2021 at 10:47 AM Linus Torvalds
> > <[email protected]> wrote:
> > >
> > > On Tue, Jan 12, 2021 at 6:43 PM Huang Ying <[email protected]> wrote:
> > > >
> > > > So in this patch, at the end of wp_page_copy(), the old unused swap
> > > > cache page will be tried to be freed.
> > >
> > > I'd much rather free it later when needed, rather than when you're in
> > > a COW section.
> >
> > Unused swap cache isn't unused file cache. Nobody can reuse them
> > directly before freeing them firstly. It will make COW a little
> > faster via keeping them. But I think the overhead to free them isn't
> > high. While keeping them in system will confuse users (users will
> > expect file cache to use free memory, but not expect unused swap cache
> > to use much free memory), make the swap space more fragmented, and add
> > system overall overhead (scanning LRU list, etc.).
>
> Couldn't we just move it to the tail of the LRU list so it's reclaimed
> first? Or is locking going to be a problem here?

Yes. That's a way to reduce the disturbance to the page reclaiming.
For LRU lock contention, is it sufficient to use another pagevec?

Best Regards,
Huang, Ying

2021-01-13 21:24:23

by Linus Torvalds

Subject: Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

On Tue, Jan 12, 2021 at 9:24 PM huang ying <[email protected]> wrote:
> >
> > Couldn't we just move it to the tail of the LRU list so it's reclaimed
> > first? Or is locking going to be a problem here?
>
> Yes. That's a way to reduce the disturbance to the page reclaiming.
> For LRU lock contention, is it sufficient to use another pagevec?

I wonder if this is really worth it. I'd like to see numbers.

Because in probably 99%+ of all cases, that LRU dance is only going to
hurt and add extra locking overhead and dirty caches.

So I'd like to see some numbers that it actually helps measurably in
whatever paging-heavy case...

Linus

2021-01-15 08:52:21

by Huang, Ying

Subject: Re: [PATCH] mm: Free unused swap cache page in write protection fault handler

Linus Torvalds <[email protected]> writes:

> On Tue, Jan 12, 2021 at 9:24 PM huang ying <[email protected]> wrote:
>> >
>> > Couldn't we just move it to the tail of the LRU list so it's reclaimed
>> > first? Or is locking going to be a problem here?
>>
>> Yes. That's a way to reduce the disturbance to the page reclaiming.
>> For LRU lock contention, is it sufficient to use another pagevec?
>
> I wonder if this is really worth it. I'd like to see numbers.
>
> Because in probably 99%+ of all cases, that LRU dance is only going to
> hurt and add extra locking overhead and dirty caches.
>
> So I'd like to see some numbers that it actually helps measurably in
> whatever paging-heavy case...

OK. I will start from a simpler version and only use a pagevec if
there's a measurable difference.

Best Regards,
Huang, Ying