2020-02-06 09:26:37

by Wang, Wei W

Subject: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

There are cases where users want balloon pages to be shrunk only after
the pagecache is depleted. The conservative_shrinker option lets the
shrinker shrink balloon pages only once all the pagecache has been reclaimed.

Signed-off-by: Wei Wang <[email protected]>
---
drivers/virtio/virtio_balloon.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 93f995f6cf36..b4c5bb13a867 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -42,6 +42,10 @@
static struct vfsmount *balloon_mnt;
#endif

+static bool conservative_shrinker = true;
+module_param(conservative_shrinker, bool, 0644);
+MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
+
enum virtio_balloon_vq {
VIRTIO_BALLOON_VQ_INFLATE,
VIRTIO_BALLOON_VQ_DEFLATE,
@@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
{
unsigned long pages_freed = 0;

+ /* Balloon pages only get shrunk once the pagecache is depleted */
+ if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
+ return 0;
+
/*
* One invocation of leak_balloon can deflate at most
* VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
@@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
struct virtio_balloon, shrinker);
unsigned long count;

- count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+ if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
+ count = 0;
+ else
+ count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+
count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;

return count;
--
2.17.1
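
The gating in virtio_balloon_shrinker_count() can be sketched in user space as below. This is only a model under stated assumptions: struct balloon_model, model_shrinker_count() and the two constants are hypothetical stand-ins, not the driver's definitions, and the page cache size is passed in as a plain argument instead of being read with global_node_page_state().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not struct virtio_balloon. */
struct balloon_model {
	unsigned long num_pages;            /* balloon pages held by the guest */
	unsigned long num_free_page_blocks; /* free page hint blocks */
};

/* Assumed values, chosen only for illustration. */
#define MODEL_PAGES_PER_PAGE   1UL
#define MODEL_HINT_BLOCK_PAGES 1024UL

/* Models the patched count callback: while any page cache remains and
 * the conservative mode is on, balloon pages are reported as 0. */
unsigned long model_shrinker_count(const struct balloon_model *vb,
				   bool conservative,
				   unsigned long nr_file_pages)
{
	unsigned long count;

	if (conservative && nr_file_pages)
		count = 0;
	else
		count = vb->num_pages / MODEL_PAGES_PER_PAGE;

	/* Free page hint blocks are counted in both cases, as in the patch. */
	count += vb->num_free_page_blocks * MODEL_HINT_BLOCK_PAGES;

	return count;
}
```

Note that, as in the patch, only the balloon page count is gated; the free page hint blocks are still reported to the shrinker core.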


2020-02-06 09:35:49

by Michael S. Tsirkin

Subject: Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Thu, Feb 06, 2020 at 04:01:47PM +0800, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> Signed-off-by: Wei Wang <[email protected]>

I'd rather avoid module parameters, but otherwise looks
like a reasonable idea.
Tyler, what do you think?


> ---
> drivers/virtio/virtio_balloon.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 93f995f6cf36..b4c5bb13a867 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
> static struct vfsmount *balloon_mnt;
> #endif
>
> +static bool conservative_shrinker = true;
> +module_param(conservative_shrinker, bool, 0644);
> +MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
> +
> enum virtio_balloon_vq {
> VIRTIO_BALLOON_VQ_INFLATE,
> VIRTIO_BALLOON_VQ_DEFLATE,
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> {
> unsigned long pages_freed = 0;
>
> + /* Balloon pages only gets shrunk when the pagecache depleted */
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + return 0;
> +
> /*
> * One invocation of leak_balloon can deflate at most
> * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> @@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> struct virtio_balloon, shrinker);
> unsigned long count;
>
> - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))

I'd rather have an API for that in mm/. In particular, do we want other
shrinkers to run, not just pagecache? To pick an example I'm familiar
with, kvm mmu cache for nested virt?

> + count = 0;
> + else
> + count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
> count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
> return count;
> --
> 2.17.1
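
One way to picture the "other shrinkers first" idea is the user-space model below. It is a sketch only: struct cache_model, model_other_caches_depleted() and model_balloon_count() are hypothetical names, and no such helper is known to exist in mm/.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one reclaimable cache (page cache,
 * kvm mmu cache, ...). Not a kernel structure. */
struct cache_model {
	const char *name;
	unsigned long reclaimable; /* pages or objects still freeable */
};

/* Returns true only when every listed cache has nothing left to free. */
bool model_other_caches_depleted(const struct cache_model *caches, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (caches[i].reclaimable)
			return false;
	}
	return true;
}

/* The balloon offers its pages only as a last resort. */
unsigned long model_balloon_count(unsigned long balloon_pages,
				  const struct cache_model *caches, size_t n)
{
	return model_other_caches_depleted(caches, n) ? balloon_pages : 0;
}
```

With this shape the balloon driver would not need to know which caches exist; the list of caches to wait for would live behind a single helper.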

2020-02-06 09:39:29

by David Hildenbrand

Subject: Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On 06.02.20 09:01, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> Signed-off-by: Wei Wang <[email protected]>
> ---
> drivers/virtio/virtio_balloon.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 93f995f6cf36..b4c5bb13a867 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
> static struct vfsmount *balloon_mnt;
> #endif
>
> +static bool conservative_shrinker = true;
> +module_param(conservative_shrinker, bool, 0644);
> +MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
> +
> enum virtio_balloon_vq {
> VIRTIO_BALLOON_VQ_INFLATE,
> VIRTIO_BALLOON_VQ_DEFLATE,
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> {
> unsigned long pages_freed = 0;
>
> + /* Balloon pages only gets shrunk when the pagecache depleted */
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + return 0;
> +
> /*
> * One invocation of leak_balloon can deflate at most
> * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> @@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
> struct virtio_balloon, shrinker);
> unsigned long count;
>
> - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + count = 0;
> + else
> + count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
> count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>
> return count;
>

so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
deflate the balloon when conservative_shrinker=true?

--
Thanks,

David / dhildenb

2020-02-06 09:47:09

by Wang, Wei W

Subject: RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Thursday, February 6, 2020 5:10 PM, David Hildenbrand wrote:
> so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
> deflate the balloon when conservative_shrinker=true?
>

Should be. Need Tyler's help to test it.

Best,
Wei

2020-02-06 09:59:15

by Wang, Wei W

Subject: RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Thursday, February 6, 2020 5:31 PM, Michael S. Tsirkin wrote:
>
> How about just making this a last resort thing to be compatible with existing
> hypervisors? if someone wants to change behaviour that really should use a
> feature bit ...

Yeah, sounds good to me to control via feature bits.

Best,
Wei

2020-02-06 10:02:08

by Wang, Wei W

Subject: RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Thursday, February 6, 2020 5:32 PM, David Hildenbrand wrote:
>
> If the page cache is empty, a drop_slab() will deflate the whole balloon if I
> am not wrong.
>
> Especially, a echo 3 > /proc/sys/vm/drop_caches
>
> will first drop the page cache and then drop_slab()

Then that's a problem with "echo 3 > /proc/sys/vm/drop_caches" itself. It
invokes other shrinkers as well; if that is considered an issue, it needs
to be tweaked in mm.

Best,
Wei

2020-02-06 10:04:29

by David Hildenbrand

Subject: Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On 06.02.20 10:44, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:32 PM, David Hildenbrand wrote:
>>
>> If the page cache is empty, a drop_slab() will deflate the whole balloon if I
>> am not wrong.
>>
>> Especially, a echo 3 > /proc/sys/vm/drop_caches
>>
>> will first drop the page cache and then drop_slab()
>
> Then that's a problem with "echo 3 > /proc/sys/vm/drop_caches" itself. It
> invokes other shrinkers as well; if that is considered an issue, it needs
> to be tweaked in mm.

In short, as long as a drop_slab() can deflate the whole balloon, I don't
think this is the right approach.


--
Thanks,

David / dhildenb

2020-02-06 10:45:42

by Wang, Wei W

Subject: RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Thursday, February 6, 2020 5:04 PM, Michael S. Tsirkin wrote:
> virtio_balloon_shrinker_count(struct shrinker *shrinker,
> > struct virtio_balloon, shrinker);
> > unsigned long count;
> >
> > - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> > + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
>
> I'd rather have an API for that in mm/. In particular, do we want other
> shrinkers to run, not just pagecache? To pick an example I'm familiar
> with, kvm mmu cache for nested virt?

We could make it extendable:

#define BALLOON_SHRINKER_AFTER_PAGE_CACHE (1 << 0)
#define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE (1 << 1)
...

uint64_t conservative_shrinker;
if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) && global_node_page_state(NR_FILE_PAGES))
return 0;

For now, we probably only need BALLOON_SHRINKER_AFTER_PAGE_CACHE.

Best,
Wei
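
A compilable sketch of the flag test follows. A flag has to be tested with a bitwise AND; OR-ing the variable with a nonzero constant is always true and would make the check unconditional. The function model_must_defer() and the nr_kvm_mmu_objs argument are hypothetical, introduced only to exercise the two flags named above.

```c
#include <stdbool.h>
#include <stdint.h>

#define BALLOON_SHRINKER_AFTER_PAGE_CACHE    (1ULL << 0)
#define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE (1ULL << 1)

/* Returns true when the balloon should hold back because a cache that
 * the flags select is still non-empty. Counters are passed in by the
 * caller; in the kernel they would come from the respective subsystems. */
bool model_must_defer(uint64_t flags,
		      unsigned long nr_file_pages,
		      unsigned long nr_kvm_mmu_objs)
{
	if ((flags & BALLOON_SHRINKER_AFTER_PAGE_CACHE) && nr_file_pages)
		return true;
	if ((flags & BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE) && nr_kvm_mmu_objs)
		return true;
	return false;
}
```

With flags set to 0 the function never defers, which corresponds to the existing, non-conservative behaviour.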

2020-02-06 10:45:46

by Michael S. Tsirkin

Subject: Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Thu, Feb 06, 2020 at 09:27:04AM +0000, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:04 PM, Michael S. Tsirkin wrote:
> > virtio_balloon_shrinker_count(struct shrinker *shrinker,
> > > struct virtio_balloon, shrinker);
> > > unsigned long count;
> > >
> > > - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> > > + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> >
> > I'd rather have an API for that in mm/. In particular, do we want other
> > shrinkers to run, not just pagecache? To pick an example I'm familiar
> > with, kvm mmu cache for nested virt?
>
> We could make it extendable:
>
> #define BALLOON_SHRINKER_AFTER_PAGE_CACHE (1 << 0)
> #define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE (1 << 1)
> ...
>
> uint64_t conservative_shrinker;
> if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) && global_node_page_state(NR_FILE_PAGES))
> return 0;
>
> For now, we probably only need BALLOON_SHRINKER_AFTER_PAGE_CACHE.
>
> Best,
> Wei

How about just making this a last resort thing to be compatible with
existing hypervisors? If someone wants to change the behaviour,
that really should use a feature bit ...

--
MST

2020-02-06 10:45:49

by David Hildenbrand

Subject: Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On 06.02.20 10:28, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:10 PM, David Hildenbrand wrote:
>> so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
>> deflate the balloon when conservative_shrinker=true?
>>
>
> Should be. Need Tyler's help to test it.
>

If the page cache is empty, a drop_slab() will deflate the whole balloon
if I am not wrong.

In particular, an echo 3 > /proc/sys/vm/drop_caches

will first drop the page cache and then call drop_slab().

While I like the general idea, it looks more like a hack to me, to try
to teach the shrinker something it was not built for/does not support yet.

--
Thanks,

David / dhildenb
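
The sequence described above can be modelled in user space as follows. This is a sketch under assumptions: struct guest_model and both functions are hypothetical, and step 1 is assumed to empty the page cache completely, which a real guest with dirty or in-use pages would not necessarily do.

```c
#include <stdbool.h>

/* Hypothetical guest state; not a kernel structure. */
struct guest_model {
	unsigned long file_pages;    /* models NR_FILE_PAGES */
	unsigned long balloon_pages; /* pages currently in the balloon */
};

/* Models the patched shrink_balloon_pages(): refuse to deflate while
 * any page cache remains and the conservative mode is on. */
unsigned long model_shrink_balloon(struct guest_model *g, bool conservative)
{
	unsigned long freed;

	if (conservative && g->file_pages)
		return 0;

	freed = g->balloon_pages;
	g->balloon_pages = 0;
	return freed;
}

/* Models "echo 3 > /proc/sys/vm/drop_caches". */
unsigned long model_drop_caches_3(struct guest_model *g, bool conservative)
{
	g->file_pages = 0;                            /* 1: drop page cache */
	return model_shrink_balloon(g, conservative); /* 2: drop_slab() runs shrinkers */
}
```

In this model the conservative check is already satisfied by the time the shrinker runs, so the whole balloon is deflated even with conservative set to true.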

2020-02-06 11:31:42

by Michael S. Tsirkin

Subject: Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Thu, Feb 06, 2020 at 09:43:10AM +0000, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:31 PM, Michael S. Tsirkin wrote:
> >
> > How about just making this a last resort thing to be compatible with existing
> > hypervisors? if someone wants to change behaviour that really should use a
> > feature bit ...
>
> Yeah, sounds good to me to control via feature bits.
>
> Best,
> Wei

To clarify, shrinker use could be a feature bit. OOM behaviour was
there for years and has been used to dynamically size guests.

--
MST

2020-02-08 12:38:51

by Tetsuo Handa

Subject: Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On 2020/02/06 17:01, Wei Wang wrote:
> There are cases that users want to shrink balloon pages after the
> pagecache depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
>
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
> {
> unsigned long pages_freed = 0;
>
> + /* Balloon pages only gets shrunk when the pagecache depleted */
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + return 0;
> +

Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and node-B's
NR_FILE_PAGES is not 0, but allocation request which triggered this shrinker
wants to allocate from only node-B" happen? Can some thread keep this shrinker
nonfunctional by continually increasing NR_FILE_PAGES?

Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
the guest could start reclaiming memory based on host's request (like OOM
notifier chain) which is issued when host thinks that host is getting close
to OOM and thus guests should start returning their unused memory to host.
Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
condition" is also possible.

2020-02-10 03:15:03

by Wang, Wei W

Subject: RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Saturday, February 8, 2020 8:33 PM, Tetsuo Handa wrote:
>
> Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and
> node-B's NR_FILE_PAGES is not 0, but allocation request which triggered this
> shrinker wants to allocate from only node-B" happen?

No, it's a global counter.

>Can some thread keep
> this shrinker defunctional by keep increasing NR_FILE_PAGES?

Yes. Actually that's our intention: as long as there are pagecache pages,
balloon pages are not reclaimed.


>
> Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
> the guest could start reclaiming memory based on host's request (like OOM
> notifier chain) which is issued when host thinks that host is getting close to
> OOM and thus guests should start returning their unused memory to host.
> Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
> condition" is also possible.

That's about host usage. The host-side management software decides when to
issue a request to the balloon (either periodically or event driven), so I
don't think there is anything we need to do in the balloon driver here.

Best,
Wei

2020-02-10 03:58:26

by Tetsuo Handa

Subject: RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

Wang, Wei W wrote:
> On Saturday, February 8, 2020 8:33 PM, Tetsuo Handa wrote:
> >
> > Is this NUMA aware? Can "node-A's NR_FILE_PAGES is already 0 and
> > node-B's NR_FILE_PAGES is not 0, but allocation request which triggered this
> > shrinker wants to allocate from only node-B" happen?
>
> No, it's a global counter.
>
> >Can some thread keep
> > this shrinker defunctional by keep increasing NR_FILE_PAGES?
>
> Yes. Actually it's our intention - as long as there are pagecache pages,
> balloon pages are avoided to be reclaimed.

Then the case "node-A's NR_FILE_PAGES is already 0 and node-B's NR_FILE_PAGES
is not 0, but the allocation request which triggered this shrinker wants to
allocate from only node-A" would be hurt by this change: the pagecache pages
on the node the allocating thread is interested in are already depleted, but
the balloon cannot shrink when it should, because the pagecache pages on nodes
the thread is not interested in are not yet depleted.
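
The difference between a global gate and a per-node gate can be sketched as below. This is a user-space model with hypothetical names (MODEL_NR_NODES and the three functions); the per-node array stands in for per-node NR_FILE_PAGES counters.

```c
#include <stdbool.h>

#define MODEL_NR_NODES 2

/* Sum over all nodes, analogous to a global page cache counter. */
unsigned long model_global_file_pages(const unsigned long per_node[MODEL_NR_NODES])
{
	unsigned long sum = 0;

	for (int nid = 0; nid < MODEL_NR_NODES; nid++)
		sum += per_node[nid];
	return sum;
}

/* Gate as in the patch: blocks while any node still has page cache. */
bool model_global_gate_blocks(const unsigned long per_node[MODEL_NR_NODES])
{
	return model_global_file_pages(per_node) != 0;
}

/* Per-node gate: only looks at the node the allocation targets. */
bool model_node_gate_blocks(const unsigned long per_node[MODEL_NR_NODES], int nid)
{
	return per_node[nid] != 0;
}
```

For per_node = { 0, 500 } and an allocation restricted to node 0, the global gate blocks the balloon shrinker while the per-node gate would let it run.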

>
>
> >
> > Is this patch from "Re: Balloon pressuring page cache" thread? I hope that
> > the guest could start reclaiming memory based on host's request (like OOM
> > notifier chain) which is issued when host thinks that host is getting close to
> > OOM and thus guests should start returning their unused memory to host.
> > Maybe "periodically (e.g. 5 minutes)" in addition to "upon close to OOM
> > condition" is also possible.
>
> That's about the host usages. The host side management software decides when to
> issue a request to balloon (either periodically or event driven), I think there
> isn't anything we need to do in the balloon driver here.

Well, my comment is rather: "Do not try to reserve guest's memory. In other words,
do not try to maintain balloons on the guest side. Since host would be able to cache
file data on the host's cache, guests would be able to quickly fetch file data from
host's cache via normal I/O requests." ;-)

2020-02-10 07:27:48

by Wang, Wei W

Subject: RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On Monday, February 10, 2020 11:57 AM, Tetsuo Handa wrote:
> Then, "node-A's NR_FILE_PAGES is already 0 and node-B's NR_FILE_PAGES is
> not 0, but allocation request which triggered this shrinker wants to allocate
> from only node-A"
> would be confused by this change, for the pagecache pages for allocating
> thread's interested node are already depleted but the balloon cannot shrink
> when it should because the pagecache pages for allocating thread's
> uninterested nodes are not yet depleted.

The existing balloon isn't NUMA aware. Regarding "but the balloon cannot
shrink": even if we let the balloon shrink, it could free pages from a node
the allocation is not interested in.

When we have a NUMA-aware balloon, we could further update the shrinker
to check the per-node counter, node_page_state(pgdat, NR_FILE_PAGES).

>
> >
> Well, my comment is rather: "Do not try to reserve guest's memory. In other
> words, do not try to maintain balloons on the guest side. Since host would
> be able to cache file data on the host's cache, guests would be able to
> quickly fetch file data from host's cache via normal I/O requests." ;-)

Didn't get this one. The discussion was about guest pagecache pages vs. guest balloon pages.
Why does the host's pagecache come in here?

Best,
Wei

2020-02-11 16:37:49

by Tetsuo Handa

Subject: Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

On 2020/02/10 16:27, Wang, Wei W wrote:
>> Well, my comment is rather: "Do not try to reserve guest's memory. In other
>> words, do not try to maintain balloons on the guest side. Since host would
>> be able to cache file data on the host's cache, guests would be able to
>> quickly fetch file data from host's cache via normal I/O requests." ;-)
>
> Didn't get this one. The discussion was about guest pagecache pages vs. guest balloon pages.
> Why does the host's pagecache come in here?

I'm expecting a mode: "Guests should try to minimize pagecache pages (and teach
host to treat reclaimed pages as if POSIX_FADV_DONTNEED) instead of managing
guest balloon pages". In other words, as if

while :; do sleep 5; echo 1 > /proc/sys/vm/drop_caches; done

is running in the guest's kernel. And as if

echo 2 > /proc/sys/vm/drop_caches

is triggered in the guest's kernel when host requested guests to reclaim
memory. No long-lived balloons. Guest balloons do not need to care about
NUMA. Just leave the management of pagecache pages to the host.
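
A user-space approximation of the two triggers above could look like the sketch below. The helper request_drop_caches() is hypothetical; it simply writes the mode to a drop_caches-style file. Writing to the real /proc/sys/vm/drop_caches requires root, and 1 drops the page cache, 2 drops reclaimable slab objects such as dentries and inodes, and 3 does both.

```c
#include <stdio.h>

/* Writes mode (1, 2 or 3) to the given drop_caches-style file.
 * Returns 0 on success and -1 on any failure. */
int request_drop_caches(const char *path, int mode)
{
	FILE *f;
	int ok;

	if (mode < 1 || mode > 3)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", mode) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A periodic caller would invoke request_drop_caches("/proc/sys/vm/drop_caches", 1) every few seconds, and a handler for a host reclaim request would pass 2. This only mimics the proposed mode from user space; the proposal itself is about doing the equivalent inside the guest kernel.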