MIME-Version: 1.0
References: <DS0PR02MB90787835F5B9CB9771A20329C4C09@DS0PR02MB9078.namprd02.prod.outlook.com>
In-Reply-To: <DS0PR02MB90787835F5B9CB9771A20329C4C09@DS0PR02MB9078.namprd02.prod.outlook.com>
From:   "T.J. Alumbaugh" <talumbau@google.com>
Date:   Mon, 23 Jan 2023 13:26:36 -0800
Message-ID: <CABmGT5HiH8OM_F6K7VQj9vj1ZnSvVowwJ_d4EVox5-JDL3Eh5w@mail.gmail.com>
Subject: Re: [RFC] memory pressure detection in VMs using PSI mechanism for
 dynamically inflating/deflating VM memory
To:     "Sudarshan Rajagopalan (QUIC)" <quic_sudaraja@quicinc.com>
Cc:     David Hildenbrand <david@redhat.com>,
        Johannes Weiner <hannes@cmpxchg.org>,
        Suren Baghdasaryan <surenb@google.com>,
        Mike Rapoport <rppt@kernel.org>,
        Oscar Salvador <osalvador@suse.de>,
        Anshuman Khandual <anshuman.khandual@arm.com>,
        "mark.rutland@arm.com" <mark.rutland@arm.com>,
        "will@kernel.org" <will@kernel.org>,
        "virtualization@lists.linux-foundation.org" 
        <virtualization@lists.linux-foundation.org>,
        "linux-mm@kvack.org" <linux-mm@kvack.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "linux-arm-kernel@lists.infradead.org" 
        <linux-arm-kernel@lists.infradead.org>,
        "linux-arm-msm@vger.kernel.org" <linux-arm-msm@vger.kernel.org>,
        "Trilok Soni (QUIC)" <quic_tsoni@quicinc.com>,
        "Sukadev Bhattiprolu (QUIC)" <quic_sukadev@quicinc.com>,
        "Srivatsa Vaddagiri (QUIC)" <quic_svaddagi@quicinc.com>,
        "Patrick Daly (QUIC)" <quic_pdaly@quicinc.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Precedence: bulk

Hi Sudarshan,

I had questions about the setup and another about the use of PSI.

>
> 1. This will be a native userspace daemon that will be running only in th=
e Linux VM which will use virtio-mem driver that uses memory hotplug to add=
/remove memory. The VM (aka Secondary VM, SVM) will request for memory from=
 the host which is Primary VM, PVM via the backend hypervisor which takes c=
are of cross-VM communication.
>

In regards to the "PVM/SVM" nomenclature, is the implied setup one of
fault tolerance (i.e. the secondary is there to take over in case of
failure of the primary VM)? Generally speaking, are the PVM and SVM
part of a defined system running some workload? The context seems to
be that the situation is more intricate than "two virtual machines
running on a host", but I'm not clear how it is different from that
general notion.

>
> 5. Detecting decrease in memory pressure =E2=80=93 the reverse part where=
 we give back memory to PVM when memory is no longer needed is bit tricky. =
We look for pressure decay and see if PSI averages (avg10, avg60, avg300) g=
o down, and along with other memory stats (such as free memory etc) we make=
 an educated guess that usecase has ended and memory has been free=E2=80=99=
ed by the usecase, and this memory can be given back to PVM when its no lon=
ger needed.
>

This is also very interesting to me. Detecting a decrease in pressure
using PSI seems difficult. IIUC correctly, the approach taken in
OOMD/senpai from Meta seems to be continually applying
pressure/backing off, and then seeing the outcome of that decision on
the pressure metric to feedback to the next decision (see links
below). Is your approach similar? Do you check the metric periodically
or only when receiving PSI memory events in userspace?

https://github.com/facebookincubator/senpai/blob/main/senpai.py#L117-L148
https://github.com/facebookincubator/oomd/blob/main/src/oomd/plugins/Senpai=
.cpp#L529-L538

Very interesting proposal. Thanks for sending,

-T.J.