2021-09-30 18:56:59

by Matthew Wilcox

Subject: [RFC] mm: Optimise put_pages_list()

Instead of calling put_page() one page at a time, pop pages off
the list if there are other refcounts and pass the remainder
to free_unref_page_list(). This should be a speed improvement,
but I have no measurements to support that. It's also not very
widely used today, so I can't say I've really tested it. I'm only
bothering with this patch because I'd like the IOMMU code to use it
https://lore.kernel.org/lkml/[email protected]/

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
mm/swap.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
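
For illustration, a caller would batch its frees roughly like this
(hypothetical sketch, not part of the patch; the allocation loop and
count are made up):

	LIST_HEAD(pages);
	struct page *page;
	int i;

	for (i = 0; i < 16; i++) {
		page = alloc_page(GFP_KERNEL);
		if (!page)
			break;
		list_add(&page->lru, &pages);
	}

	/* ... use the pages ... */

	/* Drops one reference per page; pages whose refcount hits
	 * zero are freed in a single batch. */
	put_pages_list(&pages);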

diff --git a/mm/swap.c b/mm/swap.c
index af3cad4e5378..f6b38398fa6f 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -139,13 +139,14 @@ EXPORT_SYMBOL(__put_page);
*/
void put_pages_list(struct list_head *pages)
{
-	while (!list_empty(pages)) {
-		struct page *victim;
+	struct page *page, *next;
 
-		victim = lru_to_page(pages);
-		list_del(&victim->lru);
-		put_page(victim);
+	list_for_each_entry_safe(page, next, pages, lru) {
+		if (!put_page_testzero(page))
+			list_del(&page->lru);
 	}
+
+	free_unref_page_list(pages);
}
EXPORT_SYMBOL(put_pages_list);

--
2.32.0


2021-10-04 10:08:34

by Mel Gorman

Subject: Re: [RFC] mm: Optimise put_pages_list()

On Thu, Sep 30, 2021 at 05:32:58PM +0100, Matthew Wilcox (Oracle) wrote:
> Instead of calling put_page() one page at a time, pop pages off
> the list if there are other refcounts and pass the remainder
> to free_unref_page_list(). This should be a speed improvement,
> but I have no measurements to support that. It's also not very
> widely used today, so I can't say I've really tested it. I'm only
> bothering with this patch because I'd like the IOMMU code to use it
> https://lore.kernel.org/lkml/[email protected]/
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>

I see your motivation, but you need to check that all users of
put_pages_list() (current and future) handle destroy_compound_page()
properly, or handle it within put_pages_list() itself. For example,
release_pages(), another user of free_unref_page_list(), calls
__put_compound_page() directly before freeing. put_pages_list() as it
stands will call destroy_compound_page(), but free_unref_page_list()
does not destroy compound pages in free_pages_prepare().
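
For reference, that special-casing in release_pages() looks roughly
like this (paraphrased and simplified from mm/swap.c; lruvec locking
and LRU handling omitted):

	for (i = 0; i < nr; i++) {
		struct page *page = pages[i];

		if (!put_page_testzero(page))
			continue;
		if (PageCompound(page)) {
			/* The destructor runs here, so the compound
			 * page never reaches free_unref_page_list(). */
			__put_compound_page(page);
			continue;
		}
		list_add(&page->lru, &pages_to_free);
	}

	mem_cgroup_uncharge_list(&pages_to_free);
	free_unref_page_list(&pages_to_free);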

--
Mel Gorman
SUSE Labs

2021-10-04 12:52:02

by Matthew Wilcox

Subject: Re: [RFC] mm: Optimise put_pages_list()

On Mon, Oct 04, 2021 at 10:10:37AM +0100, Mel Gorman wrote:
> On Thu, Sep 30, 2021 at 05:32:58PM +0100, Matthew Wilcox (Oracle) wrote:
> > Instead of calling put_page() one page at a time, pop pages off
> > the list if there are other refcounts and pass the remainder
> > to free_unref_page_list(). This should be a speed improvement,
> > but I have no measurements to support that. It's also not very
> > widely used today, so I can't say I've really tested it. I'm only
> > bothering with this patch because I'd like the IOMMU code to use it
> > https://lore.kernel.org/lkml/[email protected]/
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
>
> I see your motivation, but you need to check that all users of
> put_pages_list() (current and future) handle destroy_compound_page()
> properly, or handle it within put_pages_list() itself. For example,
> release_pages(), another user of free_unref_page_list(), calls
> __put_compound_page() directly before freeing. put_pages_list() as it
> stands will call destroy_compound_page(), but free_unref_page_list()
> does not destroy compound pages in free_pages_prepare().

Quite right. I was really only thinking about order-zero pages, because
none of the current callers pass compound pages to it. But of course,
we should be robust against future callers. So the obvious thing to do
is to copy what release_pages() does:

+++ b/mm/swap.c
@@ -144,6 +144,10 @@ void put_pages_list(struct list_head *pages)
 	list_for_each_entry_safe(page, next, pages, lru) {
 		if (!put_page_testzero(page))
 			list_del(&page->lru);
+		else if (PageCompound(page)) {
+			list_del(&page->lru);
+			__put_compound_page(page);
+		}
 	}
 
 	free_unref_page_list(pages);

But would it be better to have free_unref_page_list() handle compound
pages itself?

+++ b/mm/page_alloc.c
@@ -3427,6 +3427,11 @@ void free_unref_page_list(struct list_head *list)

 	/* Prepare pages for freeing */
 	list_for_each_entry_safe(page, next, list, lru) {
+		if (PageCompound(page)) {
+			__put_compound_page(page);
+			list_del(&page->lru);
+			continue;
+		}
 		pfn = page_to_pfn(page);
 		if (!free_unref_page_prepare(page, pfn, 0)) {
 			list_del(&page->lru);

(and delete the special handling from release_pages() in the same patch)
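
i.e. something like this on the release_pages() side (illustrative
only; context lines and hunk offsets elided):

--- a/mm/swap.c
+++ b/mm/swap.c
@@ ... @@ void release_pages(struct page **pages, int nr)
-		if (PageCompound(page)) {
-			if (lruvec) {
-				unlock_page_lruvec_irqrestore(lruvec, flags);
-				lruvec = NULL;
-			}
-			__put_compound_page(page);
-			continue;
-		}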

2021-10-04 21:46:46

by Mel Gorman

Subject: Re: [RFC] mm: Optimise put_pages_list()

On Mon, Oct 04, 2021 at 01:49:37PM +0100, Matthew Wilcox wrote:
> On Mon, Oct 04, 2021 at 10:10:37AM +0100, Mel Gorman wrote:
> > On Thu, Sep 30, 2021 at 05:32:58PM +0100, Matthew Wilcox (Oracle) wrote:
> > > Instead of calling put_page() one page at a time, pop pages off
> > > the list if there are other refcounts and pass the remainder
> > > to free_unref_page_list(). This should be a speed improvement,
> > > but I have no measurements to support that. It's also not very
> > > widely used today, so I can't say I've really tested it. I'm only
> > > bothering with this patch because I'd like the IOMMU code to use it
> > > https://lore.kernel.org/lkml/[email protected]/
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> >
> > I see your motivation, but you need to check that all users of
> > put_pages_list() (current and future) handle destroy_compound_page()
> > properly, or handle it within put_pages_list() itself. For example,
> > release_pages(), another user of free_unref_page_list(), calls
> > __put_compound_page() directly before freeing. put_pages_list() as it
> > stands will call destroy_compound_page(), but free_unref_page_list()
> > does not destroy compound pages in free_pages_prepare().
>
> Quite right. I was really only thinking about order-zero pages because
> there aren't any users of compound pages that call this. But of course,
> we should be robust against future callers. So the obvious thing to do
> is to copy what release_pages() does:
>
> +++ b/mm/swap.c
> @@ -144,6 +144,10 @@ void put_pages_list(struct list_head *pages)
>  	list_for_each_entry_safe(page, next, pages, lru) {
>  		if (!put_page_testzero(page))
>  			list_del(&page->lru);
> +		else if (PageCompound(page)) {
> +			list_del(&page->lru);
> +			__put_compound_page(page);
> +		}
>  	}
>  
>  	free_unref_page_list(pages);

That would be the most straight-forward

>
> But would it be better to have free_unref_page_list() handle compound
> pages itself?
>
> +++ b/mm/page_alloc.c
> @@ -3427,6 +3427,11 @@ void free_unref_page_list(struct list_head *list)
>
>  	/* Prepare pages for freeing */
>  	list_for_each_entry_safe(page, next, list, lru) {
> +		if (PageCompound(page)) {
> +			__put_compound_page(page);
> +			list_del(&page->lru);
> +			continue;
> +		}
>  		pfn = page_to_pfn(page);
>  		if (!free_unref_page_prepare(page, pfn, 0)) {
>  			list_del(&page->lru);
>
> (and delete the special handling from release_pages() in the same patch)

It's surprisingly tricky.

Minimally, that list_del should come before __put_compound_page, or
you'll clobber whatever list the compound page destructor places the
free page on. Take care with how you remove the special handling from
release_pages(), and leave a comment explaining why __put_compound_page()
is not called there and that PageLRU will be cleared when the page falls
through to be added to pages_to_free.

The really tricky part is memcg uncharging: if mem_cgroup_uncharge_list()
is called, the uncharging happens twice -- once in the destructor and
again in mem_cgroup_uncharge_list(). I guess you could use two lists and
splice them after mem_cgroup_uncharge_list() and before
free_unref_page_list().
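
Something along these lines, perhaps (untested sketch; it assumes
free_unref_page_list() is taught to call the compound destructor as in
your second hunk, and the list names are made up):

	LIST_HEAD(pages_to_free);
	LIST_HEAD(compound_pages);	/* hypothetical second list */

	/* ... pages whose refcount hit zero are added to pages_to_free,
	 * or to compound_pages if PageCompound() ... */

	/* Uncharge order-0 pages only; compound pages are uncharged
	 * once, by their destructor, so keeping them off this list
	 * avoids the double uncharge. */
	mem_cgroup_uncharge_list(&pages_to_free);

	/* Splice the compound pages back in and batch-free the lot. */
	list_splice(&compound_pages, &pages_to_free);
	free_unref_page_list(&pages_to_free);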

--
Mel Gorman
SUSE Labs