2022-08-01 18:50:28

by Kairui Song

Subject: [PATCH] mm/util: reduce stack usage of folio_mapcount

From: Kairui Song <[email protected]>

folio_entire_mapcount will call PageHeadHuge which is a function call,
and blocks the compiler from recognizing this redundant load.

After rearranging the code, stack usage is dropped from 32 to 24, and
the function size is smaller (tested on GCC 12):

Before:
Stack usage:
mm/util.c:845:5:folio_mapcount 32 static
Size:
0000000000000ea0 00000000000000c7 T folio_mapcount

After:
Stack usage:
mm/util.c:845:5:folio_mapcount 24 static
Size:
0000000000000ea0 00000000000000b0 T folio_mapcount

Signed-off-by: Kairui Song <[email protected]>
---
mm/util.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/util.c b/mm/util.c
index 0837570c9225..98a589bb89c9 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -850,10 +850,10 @@ int folio_mapcount(struct folio *folio)
 		return atomic_read(&folio->_mapcount) + 1;

 	compound = folio_entire_mapcount(folio);
-	nr = folio_nr_pages(folio);
 	if (folio_test_hugetlb(folio))
 		return compound;
 	ret = compound;
+	nr = folio_nr_pages(folio);
 	for (i = 0; i < nr; i++)
 		ret += atomic_read(&folio_page(folio, i)->_mapcount) + 1;
 	/* File pages has compound_mapcount included in _mapcount */
--
2.35.2
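
The effect described in the commit message can be illustrated outside the kernel. What follows is a minimal sketch under stated assumptions, not the kernel code: struct toy_folio, toy_test_hugetlb() and the two mapcount_*() functions are hypothetical stand-ins, the per-page counts are stored directly instead of as _mapcount + 1, and the adjustments that follow the loop in folio_mapcount() are left out. It only shows that a value loaded before an out-of-line call has to be kept alive across that call, while loading it after the early return avoids this without changing the result.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio; this is not the kernel layout. */
struct toy_folio {
	bool hugetlb;
	int entire_mapcount;
	int nr_pages;
	int page_mapcount[8];
};

/*
 * Kept out of line to mimic a real function call such as PageHeadHuge().
 * Values the caller still needs afterwards must survive this call.
 */
__attribute__((noinline)) bool toy_test_hugetlb(const struct toy_folio *f)
{
	return f->hugetlb;
}

/* Old ordering: nr is loaded before the call and stays live across it. */
int mapcount_before(const struct toy_folio *f)
{
	int i, nr, ret;
	int compound = f->entire_mapcount;

	nr = f->nr_pages;
	if (toy_test_hugetlb(f))
		return compound;
	ret = compound;
	for (i = 0; i < nr; i++)
		ret += f->page_mapcount[i];
	return ret;
}

/* New ordering: nr is loaded only on the path that actually uses it. */
int mapcount_after(const struct toy_folio *f)
{
	int i, nr, ret;
	int compound = f->entire_mapcount;

	if (toy_test_hugetlb(f))
		return compound;
	ret = compound;
	nr = f->nr_pages;
	for (i = 0; i < nr; i++)
		ret += f->page_mapcount[i];
	return ret;
}

/* The reordering must not change the value returned for any folio. */
void check_orderings_agree(const struct toy_folio *f)
{
	assert(mapcount_before(f) == mapcount_after(f));
}
```

Building a file like this with GCC's -fstack-usage option writes a .su file in the same "file:line:col:function size static" format quoted in the patch, which is one way to compare the two orderings. Whether a difference shows up depends on the compiler version, target and optimization flags.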



2022-08-11 23:17:56

by Andrew Morton

Subject: Re: [PATCH] mm/util: reduce stack usage of folio_mapcount

On Tue, 2 Aug 2022 01:31:55 +0800 Kairui Song <[email protected]> wrote:

> From: Kairui Song <[email protected]>
>
> folio_entire_mapcount will call PageHeadHuge which is a function call,
> and blocks the compiler from recognizing this redundant load.

Did you mean folio_test_hugetlb() rather than folio_entire_mapcount()?


> After rearranging the code, stack usage is dropped from 32 to 24, and
> the function size is smaller (tested on GCC 12):
>
> Before:
> Stack usage:
> mm/util.c:845:5:folio_mapcount 32 static
> Size:
> 0000000000000ea0 00000000000000c7 T folio_mapcount
>
> After:
> Stack usage:
> mm/util.c:845:5:folio_mapcount 24 static
> Size:
> 0000000000000ea0 00000000000000b0 T folio_mapcount
>
> ...
>
> @@ -850,10 +850,10 @@ int folio_mapcount(struct folio *folio)
> return atomic_read(&folio->_mapcount) + 1;
>
> compound = folio_entire_mapcount(folio);
> - nr = folio_nr_pages(folio);
> if (folio_test_hugetlb(folio))
> return compound;
> ret = compound;
> + nr = folio_nr_pages(folio);
> for (i = 0; i < nr; i++)
> ret += atomic_read(&folio_page(folio, i)->_mapcount) + 1;
> /* File pages has compound_mapcount included in _mapcount */
> --
> 2.35.2

2022-08-12 05:01:01

by Kairui Song

Subject: Re: [PATCH] mm/util: reduce stack usage of folio_mapcount

On Fri, Aug 12, 2022 at 07:07, Andrew Morton <[email protected]> wrote:
>
> On Tue, 2 Aug 2022 01:31:55 +0800 Kairui Song <[email protected]> wrote:
>
> > From: Kairui Song <[email protected]>
> >
> > folio_entire_mapcount will call PageHeadHuge which is a function call,
> > and blocks the compiler from recognizing this redundant load.
>
> Did you mean folio_test_hugetlb() rather than folio_entire_mapcount()?

Thanks for checking out this patch. Yes, I meant folio_test_hugetlb();
my mistake.

>
>
> > After rearranging the code, stack usage is dropped from 32 to 24, and
> > the function size is smaller (tested on GCC 12):
> >
> > Before:
> > Stack usage:
> > mm/util.c:845:5:folio_mapcount 32 static
> > Size:
> > 0000000000000ea0 00000000000000c7 T folio_mapcount
> >
> > After:
> > Stack usage:
> > mm/util.c:845:5:folio_mapcount 24 static
> > Size:
> > 0000000000000ea0 00000000000000b0 T folio_mapcount
> >
> > ...
> >
> > @@ -850,10 +850,10 @@ int folio_mapcount(struct folio *folio)
> > return atomic_read(&folio->_mapcount) + 1;
> >
> > compound = folio_entire_mapcount(folio);
> > - nr = folio_nr_pages(folio);
> > if (folio_test_hugetlb(folio))
> > return compound;
> > ret = compound;
> > + nr = folio_nr_pages(folio);
> > for (i = 0; i < nr; i++)
> > ret += atomic_read(&folio_page(folio, i)->_mapcount) + 1;
> > /* File pages has compound_mapcount included in _mapcount */
> > --
> > 2.35.2

Is the rest of the patch valid? Should I send a v2?