by SeongJae Park

Subject: Re: [PATCH v20 00/15] Introduce Data Access MONitor (DAMON)

On Wed, 23 Sep 2020 10:04:57 -0700 Shakeel Butt <[email protected]> wrote:

> On Mon, Aug 17, 2020 at 3:52 AM SeongJae Park <[email protected]> wrote:
> >
> > From: SeongJae Park <[email protected]>
> >
> > Changes from Previous Version
> > =============================
> >
> > - Place 'CREATE_TRACE_POINTS' after '#include' statements (Steven Rostedt)
> > - Support large record file (Alkaid)
> > - Place 'put_pid()' of virtual monitoring targets in 'cleanup' callback
> > - Avoid conflict between concurrent DAMON users
> > - Update evaluation result document
> >
> > Introduction
> > ============
> >
> > DAMON is a data access monitoring framework subsystem for the Linux kernel.
> > The core mechanisms of DAMON called 'region based sampling' and 'adaptive
> > regions adjustment' (refer to 'mechanisms.rst' in the 11th patch of this
> > patchset for the detail) make it
> >
> > - accurate (The monitored information is useful for DRAM level memory
> > management. It might not be appropriate for cache-level accuracy, though.),
> > - light-weight (The monitoring overhead is low enough to be applied online
> > while making no impact on the performance of the target workloads.), and
> > - scalable (the upper-bound of the instrumentation overhead is controllable
> > regardless of the size of target workloads.).
> >
> > Using this framework, therefore, the kernel's core memory management mechanisms
> > such as reclamation and THP can be optimized for better memory management.
> > Experimental memory management optimizations that incurred high
> > instrumentation overhead will be able to have another try. In user space,
> > meanwhile, users who have some special workloads will be able to write
> > personalized tools or applications for deeper understanding and specialized
> > optimizations of their systems.
> >
> > Evaluations
> > ===========
> >
> > We evaluated DAMON's overhead, monitoring quality and usefulness using 25
> > realistic workloads on my QEMU/KVM based virtual machine running a kernel with
> > the v20 DAMON patchset applied.
> >
> > DAMON is lightweight. It increases system memory usage by 0.12% and slows
> > target workloads down by 1.39%.
> >
> > DAMON is accurate and useful for memory management optimizations. An
> > experimental DAMON-based operation scheme for THP, 'ethp', removes 88.16% of
> > THP memory overheads while preserving 88.73% of THP speedup. Another
> > experimental DAMON-based 'proactive reclamation' implementation, 'prcl',
> > reduces 91.34% of resident sets and 25.59% of system memory footprint while
> > incurring only 1.58% runtime overhead in the best case (parsec3/freqmine).
> >
> > NOTE that the experimental THP optimization and proactive reclamation are not
> > for production but only proofs of concept.
> >
> > Please refer to the official document[1] or "Documentation/admin-guide/mm: Add
> > a document for DAMON" patch in this patchset for detailed evaluation setup and
> > results.
> >
> > [1] https://damonitor.github.io/doc/html/latest-damon/admin-guide/mm/damon/eval.html
> >
>
>
> Hi SeongJae,
>
> Sorry for the late response. I will start looking at this series in
> more detail in the next couple of weeks.

Thank you so much!

> I have a couple of high level comments for now.
>
> 1) Please explain in the cover letter why someone should prefer to use
> DAMON instead of Page Idle Tracking.

In short, because DAMON provides an overhead-quality tradeoff and allows use of
various monitoring primitives, not only the PG_Idle and PTE Accessed bits.
I will explain this in detail in the cover letter of the next version of this
patchset.
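
A rough sketch of the overhead side of that tradeoff, in Python. It is not
taken from the patchset: the function names, the 4 KiB page size, the limit of
1000 regions, and the assumption of one access check per region per sampling
interval are all illustrative. The point is only that checking every page costs
work proportional to the target size, while region based sampling is capped by
the configured region limit.

```python
# Illustrative arithmetic only; every constant and name here is an assumption
# made for this sketch, not something defined by the DAMON patchset.
PAGE_SIZE = 4096  # bytes, assuming 4 KiB pages


def page_granularity_checks(target_bytes: int) -> int:
    """Access checks per scan when every page is checked individually."""
    return target_bytes // PAGE_SIZE


def region_sampling_checks(target_bytes: int, max_nr_regions: int) -> int:
    """Access checks per sampling interval with one check per region.

    The count never exceeds max_nr_regions (nor the number of pages),
    no matter how large the monitoring target is.
    """
    return min(max_nr_regions, target_bytes // PAGE_SIZE)


if __name__ == "__main__":
    for gib in (1, 64, 1024):
        size = gib << 30
        print(gib, page_granularity_checks(size),
              region_sampling_checks(size, 1000))
```

With these assumptions, the per-page count grows linearly with the target size,
while the region based count stays at the chosen limit.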

>
> 2) Also add what features Page Idle Tracking provides which the first
> version of DAMON does not provide (like page level tracking, physical
> or unmapped memory tracking etc.) and tell if you plan to add such
> features to DAMON in future. Basically giving reasons to not block the
> current version of DAMON until it is feature-rich.

In short, DAMON will provide only virtual address space monitoring by default,
but I believe the lack of those features is acceptable because DAMON is
extensible to support them. Also, I will make DAMON co-exist with Idle Page
Tracking again. I will post
another RFC patchset for this soon. Again, I will describe this in detail in
the next version of the cover letter.

>
> 3) I think in the first mergeable version of DAMON, I would prefer to
> have support to control (create/delete/account) the DAMON context. You
> already have an RFC series on it. I would like to have that series part
> of this one.

Ok, I will apply it here.

Thanks,
SeongJae Park

2020-09-25 15:04:01

by SeongJae Park

Subject: Re: [PATCH v20 00/15] Introduce Data Access MONitor (DAMON)

On Mon, 31 Aug 2020 13:22:35 +0200 SeongJae Park <[email protected]> wrote:

> On Thu, 20 Aug 2020 09:27:38 +0200 SeongJae Park <[email protected]> wrote:
>
> > On Mon, 17 Aug 2020 12:51:22 +0200 SeongJae Park <[email protected]> wrote:
> >
> > > From: SeongJae Park <[email protected]>
> > >
[...]
> > > Introduction
> > > ============
> > >
> > > DAMON is a data access monitoring framework subsystem for the Linux kernel.
> > > The core mechanisms of DAMON called 'region based sampling' and 'adaptive
> > > regions adjustment' (refer to 'mechanisms.rst' in the 11th patch of this
> > > patchset for the detail) make it
> > >
> > > - accurate (The monitored information is useful for DRAM level memory
> > > management. It might not be appropriate for cache-level accuracy, though.),
> > > - light-weight (The monitoring overhead is low enough to be applied online
> > > while making no impact on the performance of the target workloads.), and
> > > - scalable (the upper-bound of the instrumentation overhead is controllable
> > > regardless of the size of target workloads.).
> > >
> > > Using this framework, therefore, the kernel's core memory management mechanisms
> > > such as reclamation and THP can be optimized for better memory management.
> > > Experimental memory management optimizations that incurred high
> > > instrumentation overhead will be able to have another try. In user space,
> > > meanwhile, users who have some special workloads will be able to write
> > > personalized tools or applications for deeper understanding and specialized
> > > optimizations of their systems.
> >
> > DAMON will be presented at LPC[1] next week. To be prepared for a screen
> > sharing error (if I get no such error, I will do a live demo), I recorded a
> > simple demo video. I would like to share it here to help you understand
> > DAMON more easily.
> >
> > https://youtu.be/l63eqbVBZRY
> >
> > [1] https://linuxplumbersconf.org/event/7/contributions/659/
>
> During the session, I introduced the list of future works and asked the
> audience to vote on the priority of the tasks:
> https://youtu.be/jOBkKMA0uF0?t=13253

I also promised to make my automated tests for DAMON available as open source.
I'm happy to announce that it is now available on GitHub[1] under the GPL v2
license. Using that, you can easily test how well DAMON works on your machine.
Hopefully, it could be used as a getting started guide for both users and
developers of DAMON.

[1] https://github.com/awslabs/damon-tests

Thanks,
SeongJae Park

>
> To summarize here, the tasks are (highest priority first):
>
> 1. Make current DAMON patchset series merged in the mainline (6 votes)
> 2. User space interface improvement (4 votes)
> - Multiple monitoring contexts
> - Charging of the monitoring threads' CPU usage
> 3. Support more address spaces (2 votes)
> - Cgroups, cached pages, specific file-backed pages, swap slots, ...
> 3. DAMON-based MM optimizations (2 votes)
> - Page reclaim, THP, compaction, NUMA balancing, ...
> 4. Optimize for special use-cases (1 vote)
> - Page granularity monitoring, accessed-or-not monitoring, ...
>
> So, I'd like to focus on polishing the current patchset so that it can be
> merged. For that, I'd like to ask for more reviews from you.
>
> While waiting for the reviews, I will start implementing other future features
> that received many votes. The support of multiple monitoring contexts for the
> user space would be the first one. Once the implementation is finished, I will
> post it as a separate RFC patchset (the user space interface will be compatible
> with the current one).
>
> Any comment is welcome.
>
>
> Thanks,
> SeongJae Park