2021-01-06 03:48:18

by Liang Li

Subject: [PATCH 0/6] hugetlbfs: support free page reporting

A typical use of hugetlbfs is to reserve an amount of memory
during the kernel boot stage; the reserved pages are unlikely
to return to the buddy system. When an application needs
hugepages, the kernel allocates them from the reserved pool.
When the application terminates, the huge pages return to the
reserved pool and are kept on the hugetlbfs free list. These
free pages do not return to the buddy freelist unless the size
of the reserved pool is changed.

Free page reporting only supports buddy pages; it cannot report
the free pages reserved for hugetlbfs. On the other hand,
hugetlbfs is a good choice for systems with a huge amount of RAM,
because it helps reduce memory management overhead and
improves system performance.

This series adds support for reporting hugepages on the hugetlbfs
free list. It can be used by the virtio_balloon driver for
memory overcommit, and to pre-zero free pages to speed up
memory population and page fault handling.
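
To make the "idle reserved pool" above concrete, here is a minimal
user-space sketch (not part of the series) that estimates how much
memory sits unused in the hugetlb pool. It assumes text in the
/proc/meminfo format; the helper names meminfo_field() and
idle_hugetlb_bytes() are made up for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up a numeric field such as "HugePages_Free" in text formatted like
 * /proc/meminfo ("Key:   value [kB]", one entry per line).
 * Returns 0 and stores the value in *out on success, -1 if the key is absent.
 */
int meminfo_field(const char *text, const char *key, long *out)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* A match must start a line and be followed directly by ':'. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            *out = strtol(line + klen + 1, NULL, 10);
            return 0;
        }
        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/*
 * Bytes in the default-size hugetlb pool that are currently free.
 * Returns -1 if either required field is missing from the text.
 */
long long idle_hugetlb_bytes(const char *text)
{
    long free_pages, size_kb;

    if (meminfo_field(text, "HugePages_Free", &free_pages) ||
        meminfo_field(text, "Hugepagesize", &size_kb))
        return -1;
    return (long long)free_pages * size_kb * 1024;
}
```

HugePages_Free also counts pages that are reserved but not yet
faulted in (HugePages_Rsvd), so the result is an upper bound on what
could actually be reported.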

Most of the code is 'copied' from free page reporting because
the two work in the same way, so the code can be refined to
remove the duplication. That can be done later.

Since some people have concerns about the side effects of the
'buddy free page pre zero out' feature, I have removed it from
this series.

Liang Li (6):
  mm: Add batch size for free page reporting
  mm: let user decide page reporting option
  hugetlb: add free page reporting support
  hugetlb: avoid allocation failed when page reporting is on going
  virtio-balloon: reporting hugetlb free page to host
  hugetlb: support free hugepage pre zero out

 drivers/virtio/virtio_balloon.c |  58 +++++-
 include/linux/hugetlb.h         |   5 +
 include/linux/page-flags.h      |  12 ++
 include/linux/page_reporting.h  |   7 +
 mm/Kconfig                      |  11 ++
 mm/huge_memory.c                |   3 +-
 mm/hugetlb.c                    | 271 +++++++++++++++++++++++++++
 mm/memory.c                     |   4 +
 mm/page_reporting.c             | 315 +++++++++++++++++++++++++++++++-
 mm/page_reporting.h             |  50 ++++-
 10 files changed, 725 insertions(+), 11 deletions(-)

--
2.18.2


2021-01-06 09:45:05

by David Hildenbrand

Subject: Re: [PATCH 0/6] hugetlbfs: support free page reporting

On 06.01.21 04:46, Liang Li wrote:
> A typical use of hugetlbfs is to reserve an amount of memory
> during the kernel boot stage; the reserved pages are unlikely
> to return to the buddy system. When an application needs
> hugepages, the kernel allocates them from the reserved pool.
> When the application terminates, the huge pages return to the
> reserved pool and are kept on the hugetlbfs free list. These
> free pages do not return to the buddy freelist unless the size
> of the reserved pool is changed.
>
> Free page reporting only supports buddy pages; it cannot report
> the free pages reserved for hugetlbfs. On the other hand,
> hugetlbfs is a good choice for systems with a huge amount of RAM,
> because it helps reduce memory management overhead and
> improves system performance.
>
> This series adds support for reporting hugepages on the hugetlbfs
> free list. It can be used by the virtio_balloon driver for
> memory overcommit, and to pre-zero free pages to speed up
> memory population and page fault handling.

You should lay out the use case + measurements. Further, you should
describe what this patch set actually does, how behavior can be tuned,
pros and cons, etc... And you should most probably keep this as an RFC.

>
> Most of the code is 'copied' from free page reporting because
> the two work in the same way, so the code can be refined to
> remove the duplication. That can be done later.

Nothing speaks against getting it right from the beginning. Otherwise it
will most likely never happen.

>
> Since some people have concerns about the side effects of the
> 'buddy free page pre zero out' feature, I have removed it from
> this series.

You should really point out what changed since the last version. I
remember Alex and Mike had some pretty solid points of what they don't
want to see (especially: don't use free page reporting infrastructure
and don't temporarily allocate huge pages for processing them).

I am not convinced that we want to use the free page reporting
infrastructure for this (pre-zeroing huge pages). What speaks against a
thread simply iterating over huge pages one at a time, zeroing them? The
whole free page reporting infrastructure was invented because we have to
do expensive coordination (+ locking) when going via the hypervisor. For
the main use case of zeroing huge pages in the background, I don't see a
real need for that. If you believe this is the right thing to do, please
add a discussion regarding this.
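
For illustration only, the "thread simply iterating over huge pages one
at a time" idea can be modelled in user space roughly as below. This is
a sketch under stated assumptions, not kernel code: struct model_page,
MODEL_PAGE_SIZE, prezero_one() and prezero_all() are hypothetical names,
the pool is a plain array, and locking is left out entirely.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096  /* stand-in; real huge pages are far larger */

/* Hypothetical model of one page in the hugetlb pool. */
struct model_page {
    unsigned char data[MODEL_PAGE_SIZE];
    int is_free;    /* page is on the pool's free list */
    int is_zeroed;  /* page was already cleared by the background pass */
};

/*
 * Zero at most one free, not-yet-zeroed page. Returns 1 if a page was
 * zeroed, 0 when there is nothing left to do. A kernel version would
 * need locking around the free-list walk; this model omits it.
 */
int prezero_one(struct model_page *pool, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pool[i].is_free && !pool[i].is_zeroed) {
            memset(pool[i].data, 0, sizeof(pool[i].data));
            pool[i].is_zeroed = 1;
            return 1;
        }
    }
    return 0;
}

/* Body of the background worker: repeat until no dirty free page remains. */
size_t prezero_all(struct model_page *pool, size_t n)
{
    size_t done = 0;

    while (prezero_one(pool, n))
        done++;
    return done;
}
```

Handling one page per step keeps each unit of work small, and in-use
pages are never touched. No hypervisor coordination is involved.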

--
Thanks,

David / dhildenb

2021-01-07 01:54:37

by Liang Li

Subject: Re: [PATCH 0/6] hugetlbfs: support free page reporting

On Wed, Jan 6, 2021 at 5:41 PM David Hildenbrand wrote:
>
> On 06.01.21 04:46, Liang Li wrote:
> > A typical use of hugetlbfs is to reserve an amount of memory
> > during the kernel boot stage; the reserved pages are unlikely
> > to return to the buddy system. When an application needs
> > hugepages, the kernel allocates them from the reserved pool.
> > When the application terminates, the huge pages return to the
> > reserved pool and are kept on the hugetlbfs free list. These
> > free pages do not return to the buddy freelist unless the size
> > of the reserved pool is changed.
> >
> > Free page reporting only supports buddy pages; it cannot report
> > the free pages reserved for hugetlbfs. On the other hand,
> > hugetlbfs is a good choice for systems with a huge amount of RAM,
> > because it helps reduce memory management overhead and
> > improves system performance.
> >
> > This series adds support for reporting hugepages on the hugetlbfs
> > free list. It can be used by the virtio_balloon driver for
> > memory overcommit, and to pre-zero free pages to speed up
> > memory population and page fault handling.
>
> You should lay out the use case + measurements. Further, you should
> describe what this patch set actually does, how behavior can be tuned,
> pros and cons, etc... And you should most probably keep this as an RFC.
>
> >
> > Most of the code is 'copied' from free page reporting because
> > the two work in the same way, so the code can be refined to
> > remove the duplication. That can be done later.
>
> Nothing speaks against getting it right from the beginning. Otherwise it
> will most likely never happen.
>
> >
> > Since some people have concerns about the side effects of the
> > 'buddy free page pre zero out' feature, I have removed it from
> > this series.
>
> You should really point out what changed since the last version. I
> remember Alex and Mike had some pretty solid points of what they don't
> want to see (especially: don't use free page reporting infrastructure
> and don't temporarily allocate huge pages for processing them).
>
> I am not convinced that we want to use the free page reporting
> infrastructure for this (pre-zeroing huge pages). What speaks against a
> thread simply iterating over huge pages one at a time, zeroing them? The
> whole free page reporting infrastructure was invented because we have to
> do expensive coordination (+ locking) when going via the hypervisor. For
> the main use case of zeroing huge pages in the background, I don't see a
> real need for that. If you believe this is the right thing to do, please
> add a discussion regarding this.
>
> --
> Thanks,
>
> David / dhildenb
>
>
I will take all your advice and give more detail in the next revision.
Thanks for your comments!

Liang