2019-10-23 00:19:25

by Andrew Morton

Subject: Re: [PATCH v12 0/6] mm / virtio: Provide support for unused page reporting

On Tue, 22 Oct 2019 15:27:52 -0700 Alexander Duyck <[email protected]> wrote:

> Below are the results from various benchmarks. I primarily focused on two
> tests. The first is the will-it-scale/page_fault2 test, and the other is
> a modified version of will-it-scale/page_fault1 that was enabled to use
> THP. I did this as it allows for better visibility into different parts
> of the memory subsystem. The guest is running on one node of an E5-2630 v3
> CPU with 48G of RAM that I split up into two logical nodes in the guest
> in order to test with NUMA as well.
>
> Test                 page_fault1 (THP)      page_fault2
> Baseline          1   1256106.33 +/-0.09%    482202.67 +/-0.46%
>                  16   8864441.67 +/-0.09%   3734692.00 +/-1.23%
>
> Patches applied   1   1257096.00 +/-0.06%    477436.00 +/-0.16%
>                  16   8864677.33 +/-0.06%   3800037.00 +/-0.19%
>
> Patches enabled   1   1258420.00 +/-0.04%    480080.00 +/-0.07%
>  MADV disabled   16   8753840.00 +/-1.27%   3782764.00 +/-0.37%
>
> Patches enabled   1   1267916.33 +/-0.08%    472075.67 +/-0.39%
>                  16   8287050.33 +/-0.67%   3774500.33 +/-0.11%
>
> The results above are for a baseline linux-next-20191021 kernel, that
> kernel with this patch set applied but page reporting disabled in
> virtio-balloon, the patches applied but madvise disabled by directly
> assigning a device, and the patches applied with page reporting fully
> enabled. These results include the deviation between the average value
> reported here and the high and/or low values. I observed that during the
> test the memory usage for the first three configurations never dropped,
> whereas with the patches fully enabled the VM would drop to using only a
> few GB of the host's memory when switching from memhog to the page fault
> tests.
>
> Most of the overhead seen with this patch set fully enabled is due to the
> fact that accessing the reported pages will cause a page fault, and the
> host will have to zero the page before giving it back to the guest. The
> overall guest size is kept fairly small, only a few GB, while the test is
> running. This overhead is much more visible when using THP than with
> standard 4K pages. As such, for the case where the host memory is not
> oversubscribed this results in a performance regression; however, if the
> host memory were oversubscribed this patch set should result in a
> performance improvement, as swapping memory on the host can be avoided.
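
To make the cost being described concrete, here is a minimal userspace
sketch. It models "handing a page back to the host" with
madvise(MADV_DONTNEED), which is a simplification of the real
virtio-balloon/QEMU path and not code from the patch set: the next access
to the discarded page takes a fresh fault and the kernel must supply a
zero-filled page.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
        size_t len = 2 * 1024 * 1024;   /* one THP-sized region */
        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (buf == MAP_FAILED)
                return 1;

        memset(buf, 0xab, len);         /* fault the pages in and dirty them */

        /* Hand the memory back, roughly what happens to a reported page. */
        madvise(buf, len, MADV_DONTNEED);

        /*
         * The next access takes a new page fault and the kernel has to
         * provide a zeroed page, so the old contents are gone.
         */
        printf("first byte after MADV_DONTNEED: 0x%02x\n",
               (unsigned char)buf[0]);

        munmap(buf, len);
        return 0;
}

With THP the same fault has to populate and zero a 2MB page rather than a
4K one, which is consistent with the regression being larger in the THP
case above.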

I'm trying to understand "how valuable is this patchset" and the above
resulted in some headscratching.

Overall, how valuable is this patchset? To real users running real
workloads?

> There is currently an alternative patch set[1] that has been under
> development for some time; however, the v12 version of that patch set
> could not be tested as it triggered a kernel panic when I attempted to
> run it. It requires multiple modifications to get up and running with
> performance comparable to this patch set. A follow-on set has yet to be
> posted. As such I have not included results from that patch set, and I
> would appreciate it if we could keep this patch set the focus of any
> discussion on this thread.

Actually, the rest of us would be interested in a comparison ;)



2019-10-23 00:23:35

by Alexander Duyck

Subject: Re: [PATCH v12 0/6] mm / virtio: Provide support for unused page reporting

On Tue, 2019-10-22 at 16:01 -0700, Andrew Morton wrote:
> On Tue, 22 Oct 2019 15:27:52 -0700 Alexander Duyck <[email protected]> wrote:
>
> > Below are the results from various benchmarks. I primarily focused on two
> > tests. The first is the will-it-scale/page_fault2 test, and the other is
> > a modified version of will-it-scale/page_fault1 that was enabled to use
> > THP. I did this as it allows for better visibility into different parts
> > of the memory subsystem. The guest is running on one node of an E5-2630 v3
> > CPU with 48G of RAM that I split up into two logical nodes in the guest
> > in order to test with NUMA as well.
> >
> > Test                 page_fault1 (THP)      page_fault2
> > Baseline          1   1256106.33 +/-0.09%    482202.67 +/-0.46%
> >                  16   8864441.67 +/-0.09%   3734692.00 +/-1.23%
> >
> > Patches applied   1   1257096.00 +/-0.06%    477436.00 +/-0.16%
> >                  16   8864677.33 +/-0.06%   3800037.00 +/-0.19%
> >
> > Patches enabled   1   1258420.00 +/-0.04%    480080.00 +/-0.07%
> >  MADV disabled   16   8753840.00 +/-1.27%   3782764.00 +/-0.37%
> >
> > Patches enabled   1   1267916.33 +/-0.08%    472075.67 +/-0.39%
> >                  16   8287050.33 +/-0.67%   3774500.33 +/-0.11%
> >
> > The results above are for a baseline linux-next-20191021 kernel, that
> > kernel with this patch set applied but page reporting disabled in
> > virtio-balloon, the patches applied but madvise disabled by directly
> > assigning a device, and the patches applied with page reporting fully
> > enabled. These results include the deviation between the average value
> > reported here and the high and/or low values. I observed that during the
> > test the memory usage for the first three configurations never dropped,
> > whereas with the patches fully enabled the VM would drop to using only a
> > few GB of the host's memory when switching from memhog to the page fault
> > tests.
> >
> > Most of the overhead seen with this patch set fully enabled is due to the
> > fact that accessing the reported pages will cause a page fault, and the
> > host will have to zero the page before giving it back to the guest. The
> > overall guest size is kept fairly small, only a few GB, while the test is
> > running. This overhead is much more visible when using THP than with
> > standard 4K pages. As such, for the case where the host memory is not
> > oversubscribed this results in a performance regression; however, if the
> > host memory were oversubscribed this patch set should result in a
> > performance improvement, as swapping memory on the host can be avoided.
>
> I'm trying to understand "how valuable is this patchset" and the above
> resulted in some headscratching.
>
> Overall, how valuable is this patchset? To real users running real
> workloads?

A more detailed reply is in my response to your comments on patch 3.
Basically, the value is for host memory overcommit: we can avoid having
to go to swap nearly as often and can potentially pack the guests even
tighter with better performance.

> > There is currently an alternative patch set[1] that has been under
> > development for some time; however, the v12 version of that patch set
> > could not be tested as it triggered a kernel panic when I attempted to
> > run it. It requires multiple modifications to get up and running with
> > performance comparable to this patch set. A follow-on set has yet to be
> > posted. As such I have not included results from that patch set, and I
> > would appreciate it if we could keep this patch set the focus of any
> > discussion on this thread.
>
> Actually, the rest of us would be interested in a comparison ;)

I understand that. However, the last time I tried benchmarking that patch
set, it blew up into a thread where we kept having to fix things on it,
and by the time we were done we weren't benchmarking the v12 patch set
anymore since we had made so many modifications to it; and that assumes
Nitesh and I were in sync. Also, I don't know the current state of his
patch set, as he was working on some additional changes when we last
discussed things.

Ideally that patch set can be reposted with the necessary fixes, and then
we can go through any necessary debugging and repair and address its
limitations there.


2019-10-23 11:20:46

by Nitesh Narayan Lal

Subject: Re: [PATCH v12 0/6] mm / virtio: Provide support for unused page reporting


On 10/22/19 7:43 PM, Alexander Duyck wrote:
> On Tue, 2019-10-22 at 16:01 -0700, Andrew Morton wrote:
>> On Tue, 22 Oct 2019 15:27:52 -0700 Alexander Duyck <[email protected]> wrote:
>>
[...]
>>> There is currently an alternative patch set[1] that has been under
>>> development for some time; however, the v12 version of that patch set
>>> could not be tested as it triggered a kernel panic when I attempted to
>>> run it. It requires multiple modifications to get up and running with
>>> performance comparable to this patch set. A follow-on set has yet to be
>>> posted. As such I have not included results from that patch set, and I
>>> would appreciate it if we could keep this patch set the focus of any
>>> discussion on this thread.
>> Actually, the rest of us would be interested in a comparison ;)
> I understand that. However, the last time I tried benchmarking that patch
> set, it blew up into a thread where we kept having to fix things on it,
> and by the time we were done we weren't benchmarking the v12 patch set
> anymore since we had made so many modifications to it; and that assumes
> Nitesh and I were in sync. Also, I don't know the current state of his
> patch set, as he was working on some additional changes when we last
> discussed things.

Just an update on the current state of my patch series:

As we last discussed, I was going to try implementing Michal Hocko's
suggestion of using the page-isolation APIs. To do that I have replaced
__isolate_free_page() with start/undo_isolate_free_page_range().
However, I am running into some issues, which I am currently
investigating.

After this, I will be looking into why I was seeing degradation
specifically with (MAX_ORDER - 2) as the reporting order.
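
For anyone not following the page-isolation discussion, the rough shape of
that approach would be something like the sketch below. This is only an
illustration against the mainline page-isolation API, not the actual
patches, and report_pfn_range_to_host() is a made-up placeholder for
whatever the reporting hook ends up being; the exact arguments my series
uses may differ.

#include <linux/mmzone.h>
#include <linux/page-isolation.h>

/* Made-up placeholder for the hypervisor notification. */
extern void report_pfn_range_to_host(unsigned long start_pfn,
                                     unsigned long nr_pages);

/*
 * Sketch: report a contiguous range of free memory by pulling it out of
 * the buddy free lists first, so nothing can allocate from it while the
 * hypervisor is being notified, then return it to the allocator.
 */
static int report_free_range(unsigned long start_pfn, unsigned long nr_pages)
{
        unsigned long end_pfn = start_pfn + nr_pages;
        int ret;

        /* Isolate the range from the page allocator. */
        ret = start_isolate_page_range(start_pfn, end_pfn,
                                       MIGRATE_MOVABLE, 0);
        if (ret)
                return ret;

        /* Tell the host these pages are currently unused. */
        report_pfn_range_to_host(start_pfn, nr_pages);

        /* Give the range back to the free lists. */
        undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
        return 0;
}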

>
> Ideally that patch set can be reposted with the necessary fixes and then
> we can go through any necessary debug, repair, and addressing limitations
> there.
>
>