2023-11-29 13:13:39

by cuiyangpei

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

Hi SeongJae,

We are using DAMON on the Android operating system. Monitoring starts
when an app comes to the foreground; when the app goes to the
background, monitoring stops and the results are saved.

Regarding the two methods that you mentioned:

1. tracepoint events
This method requires enabling the tracepoint event and using the
'perf-record' tool to generate the perf.data file, then parsing that
file. However, tracepoint events are not enabled on users' phones.
Additionally, the generated file is quite complex, and we only need
memory addresses and access frequency information.

2. damos
There is no Python runtime environment directly available on Android
phones.

Both of these methods provide results that are not very intuitive and
require complex parsing. We save the results in the format of starting
address, region size, and access frequency. When the available memory
reaches a threshold, user space reclaims memory with low access
frequency by calling the 'process_madvise' function.
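
As a rough illustration of the selection step described above, the sketch
below filters saved records for low access frequency. The one-region-per-line
layout "<start> <size> <access frequency>", the function name, and the use of
awk (which may not be present on every Android build) are assumptions made for
this example, not the format of the proposed patch.

```shell
#!/bin/sh
# Hypothetical record format, one region per line:
#   <start address> <region size in bytes> <access frequency>
# The field order follows the description in the mail; the exact layout of
# the real record file is an assumption.

# select_cold_regions FILE MAX_FREQ
# Print "<start> <size>" for every region whose access frequency is at
# most MAX_FREQ. Lines that do not have exactly three fields are skipped.
select_cold_regions() {
    awk -v max="$2" 'NF == 3 && ($3 + 0) <= (max + 0) { print $1, $2 }' "$1"
}
```

The printed address ranges could then be handed to a native helper that calls
process_madvise(2), for instance with MADV_PAGEOUT; that helper is not shown
here.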

Thanks.

On Tue, Nov 28, 2023 at 06:57:39PM +0000, SeongJae Park wrote:
> Hi Cuiyangpei,
>
>
> Thank you for this nice patchset.
>
> On Tue, 28 Nov 2023 15:34:39 +0800 cuiyangpei <[email protected]> wrote:
>
> > User-space users can control DAMON and get the monitoring results
> > via the 'recording' feature implemented in 'damon-sysfs'. The feature
> > can be used via the 'record' and 'state' files in the '<sysfs>/kernel/mm/
> > damon/admin/kdamonds/N/' directory.
> >
> > The ``record`` file allows users to record monitored access patterns in
> > a text file. First, users set the size of the buffer and the path of the
> > result file by writing to the ``record`` file. The recorded results
> > are then written to an in-memory buffer first and flushed to the file
> > in batch when 'record' is written to the
> > ``state`` file.
> >
> > For example, below commands set the buffer to be 4 KiB and the result
> > to be saved in ``/damon.txt``. ::
> >
> > # cd <sysfs>/kernel/mm/damon/admin/kdamonds/N
> > # echo "4096 /damon.txt" > record
> > # echo "record" > state
>
> This reminds me of the record feature of the DAMON debugfs interface[1], which
> is still not merged in the mainline. I deprioritized the patchset to have a
> better answer to Andrew's questions on the discussion (nice definition of the
> binary format and quantification of the benefit), and later I realized I don't
> have a real use case that this makes real benefit for, so I'm no longer aiming
> to get this merged into the mainline.
>
> More specifically, I'm now thinking the feature is not really needed since
> trace event based recording works, and we found no problem so far. The DAMON
> user-space tool (damo)[2] also dropped support of the in-kernel record feature,
> but we received no problem report.
>
> Also, I believe a feature like DAMOS tried regions could provide some level of
> information, since it provides a snapshot of the monitoring result, which
> contains time data, namely 'age'.
>
> Could you please further elaborate your aimed use case of this feature and the
> advantage compared to other alternatives (tracepoint-based recording or DAMOS
> tried regions based snapshot collecting) I mentioned above?
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
> [2] https://github.com/awslabs/damo
>
>
> Thanks,
> SJ
>
> >
> > Signed-off-by: cuiyangpei <[email protected]>


2023-11-29 17:11:15

by SeongJae Park

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

Hi Cuiyangpei,

On Wed, 29 Nov 2023 21:13:15 +0800 cuiyangpei <[email protected]> wrote:

> Hi SeongJae,
>
> We are using damon on the Android operating system. It starts monitoring
> when app comes to the foreground, stops monitoring and save the
> monitoring results when app goes to the background.

Thank you so much for sharing this detailed use case. This will be very
helpful for us in understanding real usage of DAMON and making it better for
such use cases together.

>
> The two methods that you mentioned,
>
> 1.tracepoint events
> This method requires opening the tracepoint event and using the
> 'perf-record' tool to generate the perf.data file. Then parsing the
> perf.data file. However, the user's phone is not enabled tracepoint
> events. Additionally, the generated file is quite complex, and we only
> need memory addresses and access frequency informations.

Those are fair points, thank you for kindly explaining this.

>
> 2. damos
> There is no direct Python runtime environment on android phones.
>
> Both of these methods provide results that are not very intuitive and
> require complex parsing. We save the results in the format of starting
> address, region size, and access frequency. When the available memory
> reaches a threshold, the user space reclaim memory with low access
> frequency by calling 'process_madvise' function.

Again, very fair points. So, if I understood correctly, you want to reclaim
cold pages proactively when the available memory reaches a threshold, right?
DAMON could do that directly on your behalf[1]. Using that, you don't need to
save and parse the access pattern; you just ask DAMON to find memory regions
of a specific access frequency range and reclaim them. Have you also considered
using that but found some problems?
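
For illustration only, a scheme of that kind could be set up roughly as
follows. This is a minimal sketch, assuming the scheme directory has already
been populated through the nr_kdamonds, nr_contexts and nr_schemes files and
that 'age' is counted in aggregation intervals; the function name and the
chosen thresholds are hypothetical.

```shell
#!/bin/sh
# set_cold_pageout_scheme SCHEME_DIR MIN_AGE
# Configure one DAMOS scheme so that regions of any size with zero observed
# accesses for at least MIN_AGE (assumed: aggregation intervals) are paged out.
# SCHEME_DIR is a scheme directory such as
# /sys/kernel/mm/damon/admin/kdamonds/0/contexts/0/schemes/0.
set_cold_pageout_scheme() {
    dir=$1
    min_age=$2
    max=18446744073709551615    # 2**64 - 1, used as "no upper bound"

    echo 0 > "$dir/access_pattern/sz/min" &&
    echo "$max" > "$dir/access_pattern/sz/max" &&
    echo 0 > "$dir/access_pattern/nr_accesses/min" &&
    echo 0 > "$dir/access_pattern/nr_accesses/max" &&
    echo "$min_age" > "$dir/access_pattern/age/min" &&
    echo "$max" > "$dir/access_pattern/age/max" &&
    echo pageout > "$dir/action"
}
```

The scheme only takes effect once the kdamond is turned on (or the new
parameters are committed) through its 'state' file.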

I understand the feature may not perfectly fit your use case, and I want to
learn from you how it could be better.

[1] https://docs.kernel.org/mm/damon/design.html#operation-schemes


Thanks,
SJ

>
> Thanks.
>
[...]

2023-11-30 09:14:50

by cuiyangpei

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

Hi SeongJae,

We also investigated the operation schemes you mentioned, but we don't
think they fit our needs.

On Android, users open many apps and switch between them as needed.
We hope to monitor apps' memory access only while they are in the
foreground and record the memory access pattern when they are switched
to the background.

When available memory reaches a threshold, we will use these access
patterns with some strategies to identify memory whose reclamation will
have little impact on user experience, and reclaim it proactively.
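
The threshold check mentioned above could look roughly like the sketch below.
It is only an assumption that the trigger is based on the MemAvailable field of
/proc/meminfo; the function names and the threshold value are hypothetical.

```shell
#!/bin/sh
# mem_available_kib [MEMINFO_FILE]
# Print the MemAvailable value (in KiB) from a /proc/meminfo style file.
# Defaults to /proc/meminfo when no file is given.
mem_available_kib() {
    awk '$1 == "MemAvailable:" { print $2; exit }' "${1:-/proc/meminfo}"
}

# below_threshold THRESHOLD_KIB [MEMINFO_FILE]
# Succeed (exit status 0) when available memory is below THRESHOLD_KIB.
below_threshold() {
    avail=$(mem_available_kib "$2")
    [ -n "$avail" ] && [ "$avail" -lt "$1" ]
}
```

A caller might run something like `below_threshold 524288 && start_reclaim`,
where start_reclaim stands for whatever reclaim strategy is applied to the
saved access patterns.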

I'm not sure I have explained this clearly. If you still have questions
on this, please let us know.

Thanks.

On Wed, Nov 29, 2023 at 05:10:58PM +0000, SeongJae Park wrote:
> Hi Cuiyangpei,
>
> On Wed, 29 Nov 2023 21:13:15 +0800 cuiyangpei <[email protected]> wrote:
>
> > Hi SeongJae,
> >
> > We are using damon on the Android operating system. It starts monitoring
> > when app comes to the foreground, stops monitoring and save the
> > monitoring results when app goes to the background.
>
> Thank you so much for sharing this detailed use case. This will be very
> helpful for us at understanding real usage of DAMON and making it better for
> that together.
>
> >
> > The two methods that you mentioned,
> >
> > 1.tracepoint events
> > This method requires opening the tracepoint event and using the
> > 'perf-record' tool to generate the perf.data file. Then parsing the
> > perf.data file. However, the user's phone is not enabled tracepoint
> > events. Additionally, the generated file is quite complex, and we only
> > need memory addresses and access frequency informations.
>
> That's fair points, thank you for kindly explaining this.
>
> >
> > 2. damos
> > There is no direct Python runtime environment on android phones.
> >
> > Both of these methods provide results that are not very intuitive and
> > require complex parsing. We save the results in the format of starting
> > address, region size, and access frequency. When the available memory
> > reaches a threshold, the user space reclaim memory with low access
> > frequency by calling 'process_madvise' function.
>
> Again, very fair points. So, if I understood correctly, you want to reclaim
> cold pages proactively when the available memory reaches a threshold, right?
> DAMON could do that directly instead of you[1]. Using that, you don't need to
> save the access pattern and parse but just ask DAMON to find memory regions of
> specific access frequency range and reclaim. Have you also considered using
> that but found some problems?
>
> I understand the feature may not perfectly fit for your use case, and I want to
> learn from you how it could be better.
>
> [1] https://docs.kernel.org/mm/damon/design.html#operation-schemes
>
>
> Thanks,
> SJ
>
> >
> > Thanks.
> >
> [...]

2023-11-30 19:44:34

by SeongJae Park

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

Hi Cuiyangpei,

On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:

> Hi SeongJae,
>
> We also investigated the operation schemes you mentioned, but we don't
> think it can fit our needs.
>
> On android, user will open many apps and switch between these apps as
> needs. We hope to monitor apps' memory access only when they are on
> foreground and record the memory access pattern when they are switched
> to the background.
>
> When avaliable memory reaches a threshold, we will use these access
> patterns with some strategies to recognize those memory that will have
> little impact on user experience and to reclaim them proactively.
>
> I'm not sure I have clarified it clearly, if you still have questions
> on this, please let us know.

So, to my understanding, you expect applications to keep a similar access
pattern whenever they are in the foreground, but to have a different, less
aggressive access pattern in the background, and therefore you reclaim memory
based on the foreground access pattern, right?

Very interesting idea, thank you for sharing!

Then, yes, I agree current DAMOS might not be that helpful for the situation,
and this record feature could be useful for your case.

That said, do you really need full recording of the monitoring results? If
not, DAMOS provides the DAMOS tried regions feature[1], which allows users to
get a monitoring results snapshot that includes both frequency and recency of
all regions in an efficient way. If a single snapshot does not have enough
information for you, you could collect multiple snapshots.

You mentioned the absence of Python on Android as a blocker of DAMOS use in the
previous reply[2], but the DAMOS tried regions feature does not depend on
tracepoints or Python.

Of course, I think you might have already surveyed it but found some problems.
Could you please share those in detail if so?

[1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions
[2] https://lore.kernel.org/damon/20231129131315.GB12957@cuiyangpei/


Thanks,
SJ

>
> Thanks.

2023-12-01 12:25:55

by cuiyangpei

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> Hi Cuiyangpei,
>
> On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
>
> > Hi SeongJae,
> >
> > We also investigated the operation schemes you mentioned, but we don't
> > think it can fit our needs.
> >
> > On android, user will open many apps and switch between these apps as
> > needs. We hope to monitor apps' memory access only when they are on
> > foreground and record the memory access pattern when they are switched
> > to the background.
> >
> > When avaliable memory reaches a threshold, we will use these access
> > patterns with some strategies to recognize those memory that will have
> > little impact on user experience and to reclaim them proactively.
> >
> > I'm not sure I have clarified it clearly, if you still have questions
> > on this, please let us know.
>
> So, to my understanding, you expect applications may keep similar access
> pattern when they are in foreground, but have a different, less aggressive
> access pattern in background, and therefore reclaim memory based on the
> foreground-access pattern, right?
>

Different apps may have different access patterns. On Android, apps join
the freezer cgroup and are frozen after they switch to the background. So we
monitor apps' memory access only when they are in the foreground.

> Very interesting idea, thank you for sharing!
>
> Then, yes, I agree current DAMOS might not that helpful for the situation, and
> this record feature could be useful for your case.
>
> That said, do you really need full recording of the monitoring results? If
> not, DAMOS provides DAMOS tried regions feature[1], which allows users get the
> monitoring results snapshot that include both frequency and recency of all
> regions in an efficient way. If single snapshot is not having enough
> information for you, you could collect multiple snapshots.
>
> You mentioned absence of Python on Android as a blocker of DAMOS use on the
> previous reply[2], but DAMOS tried regions feature is not depend on tracepoints
> or Python.
>
> Of course, I think you might already surveyed it but found some problems.
> Could you please share that in detail if so?
>
The DAMOS tried regions feature you mentioned is not fully applicable. It needs
to apply schemes on regions, and there is no available scheme we can use for our
use case. What we need is to return regions with access frequency and recency
to userspace for later use.

> [1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions
> [2] https://lore.kernel.org/damon/20231129131315.GB12957@cuiyangpei/
>
>
> Thanks,
> SJ
>
> >
> > Thanks.

2023-12-01 17:31:31

by SeongJae Park

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

Hi Cuiyangpei,

On Fri, 1 Dec 2023 20:25:07 +0800 cuiyangpei <[email protected]> wrote:

> On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> > Hi Cuiyangpei,
> >
> > On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
> >
> > > Hi SeongJae,
> > >
> > > We also investigated the operation schemes you mentioned, but we don't
> > > think it can fit our needs.
> > >
> > > On android, user will open many apps and switch between these apps as
> > > needs. We hope to monitor apps' memory access only when they are on
> > > foreground and record the memory access pattern when they are switched
> > > to the background.
> > >
> > > When avaliable memory reaches a threshold, we will use these access
> > > patterns with some strategies to recognize those memory that will have
> > > little impact on user experience and to reclaim them proactively.
> > >
> > > I'm not sure I have clarified it clearly, if you still have questions
> > > on this, please let us know.
> >
> > So, to my understanding, you expect applications may keep similar access
> > pattern when they are in foreground, but have a different, less aggressive
> > access pattern in background, and therefore reclaim memory based on the
> > foreground-access pattern, right?
> >
>
> Different apps may have different access pattern. On android, the apps will
> join in freeze cgroup and be frozen after switch to the background. So we
> monitor apps' memory access only when they are in foreground.

Thank you for enlightening me on this :)

> > Very interesting idea, thank you for sharing!
> >
> > Then, yes, I agree current DAMOS might not that helpful for the situation, and
> > this record feature could be useful for your case.
> >
> > That said, do you really need full recording of the monitoring results? If
> > not, DAMOS provides DAMOS tried regions feature[1], which allows users get the
> > monitoring results snapshot that include both frequency and recency of all
> > regions in an efficient way. If single snapshot is not having enough
> > information for you, you could collect multiple snapshots.
> >
> > You mentioned absence of Python on Android as a blocker of DAMOS use on the
> > previous reply[2], but DAMOS tried regions feature is not depend on tracepoints
> > or Python.
> >
> > Of course, I think you might already surveyed it but found some problems.
> > Could you please share that in detail if so?
> >
> DAMOS tried regions feature you mentioned is not fully applicable. It needs to
> apply schemes on regions. There is no available scheme we can use for our use
> case. What we need is to return regions with access frequency and recency to
> userspace for later use.


Thank you for the answer, I understand your concern. One of the available
DAMOS actions is 'stat'[1], which does nothing but count the statistics.
Using a DAMOS scheme that matches any access pattern with the 'stat' action,
you can extract the access pattern via the DAMOS tried regions feature of the
DAMON sysfs interface, without making any unnecessary impact on the workload.

Quote from [2]:

The expected usage of this directory is investigations of schemes' behaviors,
and query-like efficient data access monitoring results retrievals. For the
latter use case, in particular, users can set the action as stat and set the
access pattern as their interested pattern that they want to query.

For example, you could

# cd /sys/kernel/mm/damon/admin
#
# # populate directories
# echo 1 > kdamonds/nr_kdamonds; echo 1 > kdamonds/0/contexts/nr_contexts;
# echo 1 > kdamonds/0/contexts/0/schemes/nr_schemes
# cd kdamonds/0/contexts/0/schemes/0
#
# # set the access pattern for any case (max as 2**64 - 1), and action as stat
# echo 0 > access_pattern/sz/min
# echo 18446744073709551615 > access_pattern/sz/max
# echo 0 > access_pattern/nr_accesses/min
# echo 18446744073709551615 > access_pattern/nr_accesses/max
# echo 0 > access_pattern/age/min
# echo 18446744073709551615 > access_pattern/age/max
# echo stat > action

This is how the DAMON user-space tool gets the snapshot for the 'damo show'
command[3].
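
Once such a 'stat' scheme is running, the snapshot can be read back from the
scheme's tried_regions directory. The following is a minimal sketch, assuming
each region directory exposes the start, end, nr_accesses and age files
described in the usage document; the function name and output format are
hypothetical.

```shell
#!/bin/sh
# dump_tried_regions SCHEME_DIR
# Print one line per region: "<start> <size> <nr_accesses> <age>".
# SCHEME_DIR is a scheme directory, e.g.
# /sys/kernel/mm/damon/admin/kdamonds/0/contexts/0/schemes/0.
# On a live system the snapshot is refreshed beforehand by writing
# "update_schemes_tried_regions" to the kdamond's "state" file.
dump_tried_regions() {
    # The trailing slash makes the glob match region directories only,
    # so plain files next to them are ignored. Directories are visited
    # in lexical order, not numeric order.
    for region in "$1"/tried_regions/*/; do
        [ -r "${region}start" ] || continue
        start=$(cat "${region}start")
        end=$(cat "${region}end")
        nr_accesses=$(cat "${region}nr_accesses")
        age=$(cat "${region}age")
        echo "$start $((end - start)) $nr_accesses $age"
    done
}
```

This needs neither tracepoints nor Python, only a shell and read access to
sysfs.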

Could this be used for your case? Please ask if you have any questions :)

[1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n
[2] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions
[3] https://github.com/awslabs/damo/blob/next/USAGE.md#damo-show


Thanks,
SJ

> > [1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions
> > [2] https://lore.kernel.org/damon/20231129131315.GB12957@cuiyangpei/
> >
> >
> > Thanks,
> > SJ
> >
> > >
> > > Thanks.

2023-12-03 06:02:54

by cuiyangpei

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Fri, Dec 01, 2023 at 05:31:12PM +0000, SeongJae Park wrote:
> Hi Cuiyangpei,
>
> On Fri, 1 Dec 2023 20:25:07 +0800 cuiyangpei <[email protected]> wrote:
>
> > On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> > > Hi Cuiyangpei,
> > >
> > > On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
> > >
> > > > Hi SeongJae,
> > > >
> > > > We also investigated the operation schemes you mentioned, but we don't
> > > > think it can fit our needs.
> > > >
> > > > On android, user will open many apps and switch between these apps as
> > > > needs. We hope to monitor apps' memory access only when they are on
> > > > foreground and record the memory access pattern when they are switched
> > > > to the background.
> > > >
> > > > When avaliable memory reaches a threshold, we will use these access
> > > > patterns with some strategies to recognize those memory that will have
> > > > little impact on user experience and to reclaim them proactively.
> > > >
> > > > I'm not sure I have clarified it clearly, if you still have questions
> > > > on this, please let us know.
> > >
> > > So, to my understanding, you expect applications may keep similar access
> > > pattern when they are in foreground, but have a different, less aggressive
> > > access pattern in background, and therefore reclaim memory based on the
> > > foreground-access pattern, right?
> > >
> >
> > Different apps may have different access pattern. On android, the apps will
> > join in freeze cgroup and be frozen after switch to the background. So we
> > monitor apps' memory access only when they are in foreground.
>
> Thank you for this enlightening me :)
>
> > > Very interesting idea, thank you for sharing!
> > >
> > > Then, yes, I agree current DAMOS might not that helpful for the situation, and
> > > this record feature could be useful for your case.
> > >
> > > That said, do you really need full recording of the monitoring results? If
> > > not, DAMOS provides DAMOS tried regions feature[1], which allows users get the
> > > monitoring results snapshot that include both frequency and recency of all
> > > regions in an efficient way. If single snapshot is not having enough
> > > information for you, you could collect multiple snapshots.
> > >
> > > You mentioned absence of Python on Android as a blocker of DAMOS use on the
> > > previous reply[2], but DAMOS tried regions feature is not depend on tracepoints
> > > or Python.
> > >
> > > Of course, I think you might already surveyed it but found some problems.
> > > Could you please share that in detail if so?
> > >
> > DAMOS tried regions feature you mentioned is not fully applicable. It needs to
> > apply schemes on regions. There is no available scheme we can use for our use
> > case. What we need is to return regions with access frequency and recency to
> > userspace for later use.
>
>
> Thank you for the answer, I understand your concern. One of the available
> DAMOS action is 'stat'[1], which does nothing but just count the statistic.
> Using DAMOS scheme for any access pattern with 'stat' action, you can extract
> the access pattern via DAMOS tried regions feature of DAMON sysfs interface,
> without making any unnecessary impact to the workload.
>
> Quote from [2]:
>
> The expected usage of this directory is investigations of schemes' behaviors,
> and query-like efficient data access monitoring results retrievals. For the
> latter use case, in particular, users can set the action as stat and set the
> access pattern as their interested pattern that they want to query.
>
> For example, you could
>
> # cd /sys/kernel/mm/damon/admin
> #
> # # populate directories
> # echo 1 > kdamonds/nr_kdamonds; echo 1 > kdamonds/0/contexts/nr_contexts;
> # echo 1 > kdamonds/0/contexts/0/schemes/nr_schemes
> # cd kdamonds/0/contexts/0/schemes/0
> #
> # # set the access pattern for any case (max as 2**64 - 1), and action as stat
> # echo 0 > access_pattern/sz/min
> # echo 18446744073709551615 > access_pattern/sz/max
> # echo 0 > access_pattern/nr_accesses/min
> # echo 18446744073709551615 > access_pattern/nr_accesses/max
> # echo 0 > access_pattern/age/min
> # echo 18446744073709551615 > access_pattern/age/max
> # echo stat > action
>
> And this is how DAMON user-space tool is getting the snapshot with 'damo show'
> command[3].
>
> Could this be used for your case? Please ask any question if you have :)
>
> [1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n
> [2] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions,
> [3] https://github.com/awslabs/damo/blob/next/USAGE.md#damo-show

Thank you for your detailed response; it is very helpful to us. We will look
into it and contact you if we have any questions.

>
>
> Thanks,
> SJ
>
> > > [1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions
> > > [2] https://lore.kernel.org/damon/20231129131315.GB12957@cuiyangpei/
> > >
> > >
> > > Thanks,
> > > SJ
> > >
> > > >
> > > > Thanks.

2023-12-03 19:39:02

by SeongJae Park

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On 2023-12-03T13:43:13+08:00 cuiyangpei <[email protected]> wrote:

> On Fri, Dec 01, 2023 at 05:31:12PM +0000, SeongJae Park wrote:
> > Hi Cuiyangpei,
> >
> > On Fri, 1 Dec 2023 20:25:07 +0800 cuiyangpei <[email protected]> wrote:
> >
> > > On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> > > > Hi Cuiyangpei,
> > > >
> > > > On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
> > > >
> > > > > Hi SeongJae,
> > > > >
> > > > > We also investigated the operation schemes you mentioned, but we don't
> > > > > think it can fit our needs.
> > > > >
> > > > > On android, user will open many apps and switch between these apps as
> > > > > needs. We hope to monitor apps' memory access only when they are on
> > > > > foreground and record the memory access pattern when they are switched
> > > > > to the background.
> > > > >
> > > > > When avaliable memory reaches a threshold, we will use these access
> > > > > patterns with some strategies to recognize those memory that will have
> > > > > little impact on user experience and to reclaim them proactively.
> > > > >
> > > > > I'm not sure I have clarified it clearly, if you still have questions
> > > > > on this, please let us know.
> > > >
> > > > So, to my understanding, you expect applications may keep similar access
> > > > pattern when they are in foreground, but have a different, less aggressive
> > > > access pattern in background, and therefore reclaim memory based on the
> > > > foreground-access pattern, right?
> > > >
> > >
> > > Different apps may have different access pattern. On android, the apps will
> > > join in freeze cgroup and be frozen after switch to the background. So we
> > > monitor apps' memory access only when they are in foreground.
> >
> > Thank you for this enlightening me :)
> >
> > > > Very interesting idea, thank you for sharing!
> > > >
> > > > Then, yes, I agree current DAMOS might not that helpful for the situation, and
> > > > this record feature could be useful for your case.
> > > >
> > > > That said, do you really need full recording of the monitoring results? If
> > > > not, DAMOS provides DAMOS tried regions feature[1], which allows users get the
> > > > monitoring results snapshot that include both frequency and recency of all
> > > > regions in an efficient way. If single snapshot is not having enough
> > > > information for you, you could collect multiple snapshots.
> > > >
> > > > You mentioned absence of Python on Android as a blocker of DAMOS use on the
> > > > previous reply[2], but DAMOS tried regions feature is not depend on tracepoints
> > > > or Python.
> > > >
> > > > Of course, I think you might already surveyed it but found some problems.
> > > > Could you please share that in detail if so?
> > > >
> > > DAMOS tried regions feature you mentioned is not fully applicable. It needs to
> > > apply schemes on regions. There is no available scheme we can use for our use
> > > case. What we need is to return regions with access frequency and recency to
> > > userspace for later use.
> >
> >
> > Thank you for the answer, I understand your concern. One of the available
> > DAMOS action is 'stat'[1], which does nothing but just count the statistic.
> > Using DAMOS scheme for any access pattern with 'stat' action, you can extract
> > the access pattern via DAMOS tried regions feature of DAMON sysfs interface,
> > without making any unnecessary impact to the workload.
> >
> > Quote from [2]:
> >
> > The expected usage of this directory is investigations of schemes' behaviors,
> > and query-like efficient data access monitoring results retrievals. For the
> > latter use case, in particular, users can set the action as stat and set the
> > access pattern as their interested pattern that they want to query.
> >
> > For example, you could
> >
> > # cd /sys/kernel/mm/damon/admin
> > #
> > # # populate directories
> > # echo 1 > kdamonds/nr_kdamonds; echo 1 > kdamonds/0/contexts/nr_contexts;
> > # echo 1 > kdamonds/0/contexts/0/schemes/nr_schemes
> > # cd kdamonds/0/contexts/0/schemes/0
> > #
> > # # set the access pattern for any case (max as 2**64 - 1), and action as stat
> > # echo 0 > access_pattern/sz/min
> > # echo 18446744073709551615 > access_pattern/sz/max
> > # echo 0 > access_pattern/nr_accesses/min
> > # echo 18446744073709551615 > access_pattern/nr_accesses/max
> > # echo 0 > access_pattern/age/min
> > # echo 18446744073709551615 > access_pattern/age/max
> > # echo stat > action
> >
> > And this is how DAMON user-space tool is getting the snapshot with 'damo show'
> > command[3].
> >
> > Could this be used for your case? Please ask any question if you have :)
> >
> > [1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n
> > [2] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions,
> > [3] https://github.com/awslabs/damo/blob/next/USAGE.md#damo-show
>
> Thank you for your detailed response, it is very helpful to us. We will look into it
> and contact you if we have any questions.

So glad to hear this. Please let me know if you have any questions or need any
help :)


Thanks,
SJ

>
> >
> >
> > Thanks,
> > SJ
> >
> > > > [1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions
> > > > [2] https://lore.kernel.org/damon/20231129131315.GB12957@cuiyangpei/
> > > >
> > > >
> > > > Thanks,
> > > > SJ
> > > >
> > > > >
> > > > > Thanks.

2024-01-22 05:46:48

by cuiyangpei

Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Sun, Dec 03, 2023 at 07:37:45PM +0000, SeongJae Park wrote:
> On 2023-12-03T13:43:13+08:00 cuiyangpei <[email protected]> wrote:
>
> > On Fri, Dec 01, 2023 at 05:31:12PM +0000, SeongJae Park wrote:
> > > Hi Cuiyangpei,
> > >
> > > On Fri, 1 Dec 2023 20:25:07 +0800 cuiyangpei <[email protected]> wrote:
> > >
> > > > On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> > > > > Hi Cuiyangpei,
> > > > >
> > > > > On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
> > > > >
> > > > > > Hi SeongJae,
> > > > > >
> > > > > > We also investigated the operation schemes you mentioned, but we don't
> > > > > > think it can fit our needs.
> > > > > >
> > > > > > On android, user will open many apps and switch between these apps as
> > > > > > needs. We hope to monitor apps' memory access only when they are on
> > > > > > foreground and record the memory access pattern when they are switched
> > > > > > to the background.
> > > > > >
> > > > > > When avaliable memory reaches a threshold, we will use these access
> > > > > > patterns with some strategies to recognize those memory that will have
> > > > > > little impact on user experience and to reclaim them proactively.
> > > > > >
> > > > > > I'm not sure I have clarified it clearly, if you still have questions
> > > > > > on this, please let us know.
> > > > >
> > > > > So, to my understanding, you expect applications may keep similar access
> > > > > pattern when they are in foreground, but have a different, less aggressive
> > > > > access pattern in background, and therefore reclaim memory based on the
> > > > > foreground-access pattern, right?
> > > > >
> > > >
> > > > Different apps may have different access pattern. On android, the apps will
> > > > join in freeze cgroup and be frozen after switch to the background. So we
> > > > monitor apps' memory access only when they are in foreground.
> > >
> > > Thank you for this enlightening me :)
> > >
> > > > > Very interesting idea, thank you for sharing!
> > > > >
> > > > > Then, yes, I agree current DAMOS might not that helpful for the situation, and
> > > > > this record feature could be useful for your case.
> > > > >
> > > > > That said, do you really need full recording of the monitoring results? If
> > > > > not, DAMOS provides DAMOS tried regions feature[1], which allows users get the
> > > > > monitoring results snapshot that include both frequency and recency of all
> > > > > regions in an efficient way. If single snapshot is not having enough
> > > > > information for you, you could collect multiple snapshots.
> > > > >
> > > > > You mentioned absence of Python on Android as a blocker of DAMOS use on the
> > > > > previous reply[2], but DAMOS tried regions feature is not depend on tracepoints
> > > > > or Python.
> > > > >
> > > > > Of course, I think you might already surveyed it but found some problems.
> > > > > Could you please share that in detail if so?
> > > > >
> > > > DAMOS tried regions feature you mentioned is not fully applicable. It needs to
> > > > apply schemes on regions. There is no available scheme we can use for our use
> > > > case. What we need is to return regions with access frequency and recency to
> > > > userspace for later use.
> > >
> > >
> > > Thank you for the answer, I understand your concern. One of the available
> > > DAMOS action is 'stat'[1], which does nothing but just count the statistic.
> > > Using DAMOS scheme for any access pattern with 'stat' action, you can extract
> > > the access pattern via DAMOS tried regions feature of DAMON sysfs interface,
> > > without making any unnecessary impact to the workload.
> > >
> > > Quote from [2]:
> > >
> > > The expected usage of this directory is investigations of schemes' behaviors,
> > > and query-like efficient data access monitoring results retrievals. For the
> > > latter use case, in particular, users can set the action as stat and set the
> > > access pattern as their interested pattern that they want to query.
> > >
> > > For example, you could
> > >
> > > # cd /sys/kernel/mm/damon/admin
> > > #
> > > # # populate directories
> > > # echo 1 > kdamonds/nr_kdamonds; echo 1 > kdamonds/0/contexts/nr_contexts;
> > > # echo 1 > kdamonds/0/contexts/0/schemes/nr_schemes
> > > # cd kdamonds/0/contexts/0/schemes/0
> > > #
> > > # # set the access pattern for any case (max as 2**64 - 1), and action as stat
> > > # echo 0 > access_pattern/sz/min
> > > # echo 18446744073709551615 > access_pattern/sz/max
> > > # echo 0 > access_pattern/nr_accesses/min
> > > # echo 18446744073709551615 > access_pattern/nr_accesses/max
> > > # echo 0 > access_pattern/age/min
> > > # echo 18446744073709551615 > access_pattern/age/max
> > > # echo stat > action
> > >
> > > And this is how DAMON user-space tool is getting the snapshot with 'damo show'
> > > command[3].
> > >
> > > Could this be used for your case? Please ask any question if you have :)
> > >
> > > [1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n
> > > [2] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions
> > > [3] https://github.com/awslabs/damo/blob/next/USAGE.md#damo-show
> >
> > Thank you for your detailed response, it is very helpful to us. We will look into it
> > and contact you if we have any questions.
>
> So glad to hear this. Please let me know if you have any questions or need any
> help :)
>
>
> Thanks,
> SJ
>
> >
> > >
> > >
> > > Thanks,
> > > SJ
> > >
> > > > > [1] https://docs.kernel.org/admin-guide/mm/damon/usage.html#schemes-n-tried-regions
> > > > > [2] https://lore.kernel.org/damon/20231129131315.GB12957@cuiyangpei/
> > > > >
> > > > >
> > > > > Thanks,
> > > > > SJ
> > > > >
> > > > > >
> > > > > > Thanks.

Hi SeongJae,

We set 'access_pattern' and the 'stat' action in the scheme while an app is in
the foreground, and record the app's memory access pattern with the
'update_schemes_tried_regions' state command when it is switched to the
background. But the snapshot is only taken after the next aggregation
interval. DAMON keeps sampling between the moment the app switches to the
background and the next aggregation, which can change the value of "age". The
sampling results from this period do not accurately reflect the app's
foreground access pattern.
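
As a rough sketch of the flow described above (the helper names and the use of
kdamond, context and scheme index 0 are assumptions for illustration, not the
actual Android code), the snapshot could be requested and dumped as start
address, region size, access frequency and age like this:

```shell
#!/bin/sh
# Root of the DAMON sysfs tree; overridable so the helpers can be tried
# against a mock directory.
DAMON_SYSFS="${DAMON_SYSFS:-/sys/kernel/mm/damon/admin}"

# Ask kdamond 0 to refresh the tried_regions/ directories of its schemes.
request_snapshot() {
    echo update_schemes_tried_regions > "$DAMON_SYSFS/kdamonds/0/state"
}

# Print "<start> <size> <nr_accesses> <age>" for each tried region of
# scheme 0.  Only the per-region subdirectories are visited.
dump_tried_regions() {
    regions="$DAMON_SYSFS/kdamonds/0/contexts/0/schemes/0/tried_regions"
    for dir in "$regions"/*/; do
        # Skip the unexpanded glob when there are no region directories.
        [ -f "$dir/start" ] || continue
        start=$(cat "$dir/start")
        end=$(cat "$dir/end")
        echo "$start $((end - start)) $(cat "$dir/nr_accesses") $(cat "$dir/age")"
    done
}
```

Calling request_snapshot and then dump_tried_regions yields one line per
region, which is the format we want to keep for later use.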

Is there any way to catch the sampling result immediately after setting the
"update_schemes_tried_regions" state? Alternatively, could the
"last_nr_accesses" and "last_age" values be returned in the tried_regions/<N>
directory?

Do you have any other suggestions?

Thanks.


2024-01-22 18:30:32

by SeongJae Park

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

Hi cuiyangpei,

On Mon, 22 Jan 2024 13:46:31 +0800 cuiyangpei <[email protected]> wrote:

> On Sun, Dec 03, 2023 at 07:37:45PM +0000, SeongJae Park wrote:
> > On 2023-12-03T13:43:13+08:00 cuiyangpei <[email protected]> wrote:
> >
> > > On Fri, Dec 01, 2023 at 05:31:12PM +0000, SeongJae Park wrote:
> > > > Hi Cuiyangpei,
> > > >
> > > > On Fri, 1 Dec 2023 20:25:07 +0800 cuiyangpei <[email protected]> wrote:
> > > >
> > > > > On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> > > > > > Hi Cuiyangpei,
> > > > > >
> > > > > > On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
[...]
>
> Hi SeongJae,
>
> We set 'access_pattern' and 'stat' action in schemes when apps are on
> foreground, record apps' memory access pattern when they are switched
> to the background with 'update_schemes_tried_regions' state. But it
> catch the snapshot after next aggregation interval. DAMON is still
> sampling during the app switches to the background and the next
> aggregation time, which can cause the value of "age" to change. The
> sampling results during this period cannot accurately reflect the app's
> foreground access pattern.
>
> Is there any way to catch sampling result immediately after setting the
> "update_schemes_tried_regions" state?

There is no way to do exactly this.  You would need to proactively collect
snapshots while the app is in the foreground, and use the latest one collected
before the app goes to the background, as a recording-based approach would do.

I think recent DAMON changes might make an alternative approach available,
though.  Since v6.7, DAMON provides a pseudo-moving-average monitoring result
in sampling interval granularity, thanks to the patchset "mm/damon: provide
pseudo-moving sum based access rate".  A followup patchset, "mm/damon:
implement DAMOS apply intervals", has made DAMOS work in sampling interval
granularity.  Both patchsets were merged into v6.7-rc1.

Hence, I think you could use 'update_schemes_tried_regions' once you notice
the app's state transition, with the DAMOS apply interval set to one sampling
interval.  Then you will get the monitoring results after one sampling
interval.  Of course, the snapshot may contain some of the background access
pattern, but that shouldn't change it significantly unless you set the
aggregation interval too short.
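
A minimal sketch of that setup, assuming kdamond 0, context 0 and scheme 0 are
already populated and that the kernel provides the 'apply_interval_us' scheme
file (v6.7 or later); the function name is made up for illustration:

```shell
#!/bin/sh
# Root of the DAMON sysfs tree; overridable for trying this on a mock tree.
DAMON_SYSFS="${DAMON_SYSFS:-/sys/kernel/mm/damon/admin}"

# Set the apply interval of scheme 0 to the context's sampling interval,
# then ask the kdamond to re-read the sysfs inputs.
match_apply_interval_to_sampling() {
    ctx="$DAMON_SYSFS/kdamonds/0/contexts/0"
    sample_us=$(cat "$ctx/monitoring_attrs/intervals/sample_us")
    echo "$sample_us" > "$ctx/schemes/0/apply_interval_us"
    # 'commit' makes a running kdamond pick up the changed inputs.
    echo commit > "$DAMON_SYSFS/kdamonds/0/state"
}
```

The 'commit' write is only needed when the kdamond is already running; inputs
written before turning it 'on' are read at start.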

> Alternatively, can it return the "last_nr_accesses" and "last_age" values in
> tried_regions/<N> directory?

This could also be a good alternative, I think.  Nice idea.  But because the
previously mentioned alternative is already available while this one requires
small but additional changes, could we first check whether the previous one
makes sense and works?  We could revisit this idea if the previous alternative
turns out to be insufficient.

>
> Do you have any other suggestions?

As I mentioned above, I'd suggest a DAMOS apply interval of a single sampling
interval for now.


Thanks,
SJ

>
> Thanks.

2024-01-26 08:53:17

by cuiyangpei

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Mon, Jan 22, 2024 at 09:56:11AM -0800, SeongJae Park wrote:
> Hi cuiyangpei,
>
> On Mon, 22 Jan 2024 13:46:31 +0800 cuiyangpei <[email protected]> wrote:
>
> > On Sun, Dec 03, 2023 at 07:37:45PM +0000, SeongJae Park wrote:
> > > On 2023-12-03T13:43:13+08:00 cuiyangpei <[email protected]> wrote:
> > >
> > > > On Fri, Dec 01, 2023 at 05:31:12PM +0000, SeongJae Park wrote:
> > > > > Hi Cuiyangpei,
> > > > >
> > > > > On Fri, 1 Dec 2023 20:25:07 +0800 cuiyangpei <[email protected]> wrote:
> > > > >
> > > > > > On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> > > > > > > Hi Cuiyangpei,
> > > > > > >
> > > > > > > On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
> [...]
> >
> > Hi SeongJae,
> >
> > We set 'access_pattern' and 'stat' action in schemes when apps are on
> > foreground, record apps' memory access pattern when they are switched
> > to the background with 'update_schemes_tried_regions' state. But it
> > catch the snapshot after next aggregation interval. DAMON is still
> > sampling during the app switches to the background and the next
> > aggregation time, which can cause the value of "age" to change. The
> > sampling results during this period cannot accurately reflect the app's
> > foreground access pattern.
> >
> > Is there any way to catch sampling result immediately after setting the
> > "update_schemes_tried_regions" state?
>
> There is no way for exactly doing this. You would need to proactively collect
> snapshots while the app is foreground, and use the latest one that collected
> before the app goes background, like recording-based approach would do.
>
> I think recent DAMON changes might make an alternative approach available,
> though. From v6.7, DAMON provides pseudo-moving-average monitoring result in
> sampling interval granularity, since patchset "mm/damon: provide pseudo-moving
> sum based access rate". And a followup patchset, namely "mm/damon: implement
> DAMOS apply intervals", has made DAMOS works in the sampling interval
> granularity. Both patchsets are merged into v6.7-rc1.
>
> Hence, I think you could use 'update_schemes_tried_regions' after you noticed
> the app's state transition, with DAMOS apply interval of one sampling interval.
> Then you will get the monitoring results after one sampling interval. Of
> course, the snapshot may contain some of background access pattern, but
> wouldn't made it changed significantly, unless you set aggregation interval too
> short.

With that setting, all actions except `stat` are applied at one sampling
interval.

We use 'update_schemes_tried_regions' after the app switches to the background.
The before_damos_apply callback is only set when the next aggregation interval
arrives, and `tried_regions` is only updated after that callback is set.

DAMON keeps sampling from the time 'update_schemes_tried_regions' is written
until the next aggregation, which is not what we expected. The
pseudo-moving-average monitoring result can reduce the nr_accesses inaccuracy,
but age is still modified during this time, so it does not solve this issue.

Please let me know if my understanding is incorrect. Thank you.
>
> > Alternatively, can it return the "last_nr_accesses" and "last_age" values in
> > tried_regions/<N> directory?
>
> This could also be a good alternative in my think. Nice idea. But, because
> the previously mentioned alternative is already available while this require a
> bit small but additional changes, could we check if the previously one make
> sense and works first? We could revisit this idea if it turns out the previous
> alternative is not suffice in my opinion.
>
Could you consider adding two files, "last_nr_accesses" and "last_age", to the
'tried_regions/<N>' directory?

Thanks.
> >
> > Do you have any other suggestions?
>
> As I mentioned above, I'd suggest the DAMOS apply interval of single sampling
> interval for now.
>
>
> Thanks,
> SJ
>
> >
> > Thanks.

2024-01-26 09:11:19

by SeongJae Park

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Fri, 26 Jan 2024 14:57:06 +0800 cuiyangpei <[email protected]> wrote:

> On Mon, Jan 22, 2024 at 09:56:11AM -0800, SeongJae Park wrote:
> > Hi cuiyangpei,
> >
> > On Mon, 22 Jan 2024 13:46:31 +0800 cuiyangpei <[email protected]> wrote:
> >
> > > On Sun, Dec 03, 2023 at 07:37:45PM +0000, SeongJae Park wrote:
> > > > On 2023-12-03T13:43:13+08:00 cuiyangpei <[email protected]> wrote:
> > > >
> > > > > On Fri, Dec 01, 2023 at 05:31:12PM +0000, SeongJae Park wrote:
> > > > > > Hi Cuiyangpei,
> > > > > >
> > > > > > On Fri, 1 Dec 2023 20:25:07 +0800 cuiyangpei <[email protected]> wrote:
> > > > > >
> > > > > > > On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> > > > > > > > Hi Cuiyangpei,
> > > > > > > >
> > > > > > > > On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
[...]
> > > Is there any way to catch sampling result immediately after setting the
> > > "update_schemes_tried_regions" state?
> >
> > There is no way for exactly doing this. You would need to proactively collect
> > snapshots while the app is foreground, and use the latest one that collected
> > before the app goes background, like recording-based approach would do.
> >
> > I think recent DAMON changes might make an alternative approach available,
> > though. From v6.7, DAMON provides pseudo-moving-average monitoring result in
> > sampling interval granularity, since patchset "mm/damon: provide pseudo-moving
> > sum based access rate". And a followup patchset, namely "mm/damon: implement
> > DAMOS apply intervals", has made DAMOS works in the sampling interval
> > granularity. Both patchsets are merged into v6.7-rc1.
> >
> > Hence, I think you could use 'update_schemes_tried_regions' after you noticed
> > the app's state transition, with DAMOS apply interval of one sampling interval.
> > Then you will get the monitoring results after one sampling interval. Of
> > course, the snapshot may contain some of background access pattern, but
> > wouldn't made it changed significantly, unless you set aggregation interval too
> > short.
>
> All other actions will apply at one sampling interval except for the
> `stat` action.
>
> We use 'update_schemes_tried_regions' after switch to the background. The
> before_damos_apply callback function will only be set when the next aggregation
> interval arrives. The `tried_regions` will only be updated after setting the
> callback function.
>
> DAMON is still sampling during setting 'update_schemes_tried_regions' to the next
> aggregation time, which is not what we expected. The pseudo-moving-average
> monitoring result can reduce nr_accesses inaccuracy, but age is still being modified
> during this time, so it can't improve this issue.
>
> Please let me know if my understanding is incorrect. Thank you.

So, the 'update_schemes_tried_regions' command is first handled by
'damon_sysfs_cmd_request_callback()', which is registered as the
after_wmarks_check() and after_aggregation() callbacks.  Hence the
'update_schemes_tried_regions' command still effectively works in aggregation
interval granularity.  I think this is what you found, right?

If I understand your point correctly, I think the concern is valid.  I think we
should make it work in sampling interval granularity.  I will try to do so.
Would that work for your use case?

> >
> > > Alternatively, can it return the "last_nr_accesses" and "last_age" values in
> > > tried_regions/<N> directory?
> >
> > This could also be a good alternative in my think. Nice idea. But, because
> > the previously mentioned alternative is already available while this require a
> > bit small but additional changes, could we check if the previously one make
> > sense and works first? We could revisit this idea if it turns out the previous
> > alternative is not suffice in my opinion.
> >
> Can you consider adding "last_nr_accesses" and "last_age" two files in
> 'tried_regions/<N>' directory?

Actually, we don't have a 'last_age' field, right?  And 'last_nr_accesses' is a
hidden private field, since it is intended to be accessed only by DAMON core
code.  Exposing it to users means exposing implementation details, and a
mechanism coupled with an exposed interface is hard to change, and so becomes
inflexible.  Hence I'd prefer making 'update_schemes_tried_regions' work in
sampling interval granularity over exposing the two values, if that works for
your use case.


Thanks,
SJ

[...]

2024-01-28 09:13:15

by cuiyangpei

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Fri, Jan 26, 2024 at 12:04:54AM -0800, SeongJae Park wrote:
> On Fri, 26 Jan 2024 14:57:06 +0800 cuiyangpei <[email protected]> wrote:
>
> > On Mon, Jan 22, 2024 at 09:56:11AM -0800, SeongJae Park wrote:
> > > Hi cuiyangpei,
> > >
> > > On Mon, 22 Jan 2024 13:46:31 +0800 cuiyangpei <[email protected]> wrote:
> > >
> > > > On Sun, Dec 03, 2023 at 07:37:45PM +0000, SeongJae Park wrote:
> > > > > On 2023-12-03T13:43:13+08:00 cuiyangpei <[email protected]> wrote:
> > > > >
> > > > > > On Fri, Dec 01, 2023 at 05:31:12PM +0000, SeongJae Park wrote:
> > > > > > > Hi Cuiyangpei,
> > > > > > >
> > > > > > > On Fri, 1 Dec 2023 20:25:07 +0800 cuiyangpei <[email protected]> wrote:
> > > > > > >
> > > > > > > > On Thu, Nov 30, 2023 at 07:44:20PM +0000, SeongJae Park wrote:
> > > > > > > > > Hi Cuiyangpei,
> > > > > > > > >
> > > > > > > > > On Thu, 30 Nov 2023 17:14:26 +0800 cuiyangpei <[email protected]> wrote:
> [...]
> > > > Is there any way to catch sampling result immediately after setting the
> > > > "update_schemes_tried_regions" state?
> > >
> > > There is no way for exactly doing this. You would need to proactively collect
> > > snapshots while the app is foreground, and use the latest one that collected
> > > before the app goes background, like recording-based approach would do.
> > >
> > > I think recent DAMON changes might make an alternative approach available,
> > > though. From v6.7, DAMON provides pseudo-moving-average monitoring result in
> > > sampling interval granularity, since patchset "mm/damon: provide pseudo-moving
> > > sum based access rate". And a followup patchset, namely "mm/damon: implement
> > > DAMOS apply intervals", has made DAMOS works in the sampling interval
> > > granularity. Both patchsets are merged into v6.7-rc1.
> > >
> > > Hence, I think you could use 'update_schemes_tried_regions' after you noticed
> > > the app's state transition, with DAMOS apply interval of one sampling interval.
> > > Then you will get the monitoring results after one sampling interval. Of
> > > course, the snapshot may contain some of background access pattern, but
> > > wouldn't made it changed significantly, unless you set aggregation interval too
> > > short.
> >
> > All other actions will apply at one sampling interval except for the
> > `stat` action.
> >
> > We use 'update_schemes_tried_regions' after switch to the background. The
> > before_damos_apply callback function will only be set when the next aggregation
> > interval arrives. The `tried_regions` will only be updated after setting the
> > callback function.
> >
> > DAMON is still sampling during setting 'update_schemes_tried_regions' to the next
> > aggregation time, which is not what we expected. The pseudo-moving-average
> > monitoring result can reduce nr_accesses inaccuracy, but age is still being modified
> > during this time, so it can't improve this issue.
> >
> > Please let me know if my understanding is incorrect. Thank you.
>
> So, 'update_schemes_tried_regions' command is firstly handled by
> 'damon_sysfs_cmd_request_callback()', which is registered as
> after_wmarks_check() and after_aggregation() callback. Hence
> 'update_schemes_tried_regions' command is still effectively working in
> aggregation interval granularity. I think this is what you found, right?
>
Yes.
> If I'm not wrongly understanding your point, I think the concern is valid. I
> think we should make it works in sampling interval granularity. I will try to
> make so. Would that work for your use case?
>
It's much better than working in aggregation interval granularity.

I have a question. Why does the 'update_schemes_tried_regions' command need to
work at sampling time or aggregation time? 'update_schemes_tried_regions' is a
relatively special state that updates the regions of the corresponding
operation scheme. Could it be separated from the other states and controlled by
a sysfs node that responds immediately after being written?

> > >
> > > > Alternatively, can it return the "last_nr_accesses" and "last_age" values in
> > > > tried_regions/<N> directory?
> > >
> > > This could also be a good alternative in my think. Nice idea. But, because
> > > the previously mentioned alternative is already available while this require a
> > > bit small but additional changes, could we check if the previously one make
> > > sense and works first? We could revisit this idea if it turns out the previous
> > > alternative is not suffice in my opinion.
> > >
> > Can you consider adding "last_nr_accesses" and "last_age" two files in
> > 'tried_regions/<N>' directory?
>
> Actually we don't have 'last_age' field, right? And in case of
> 'last_nr_accesses', it is a hidden private field, since it is intended to be
> accessed by only DAMON core code. Making it exposed to user means exposing
> implementation details, and the mechanism that coupled with an exposed
> interface is hard to be changed, so be unflexible. Hence I'd prefer making
> 'update_schemes_tried_regions' works in sampling interval granularity, more
> than exposing the two information if it works for your use case.
>
Ok, I get it.
> Thanks,
> SJ
>
> [...]

2024-01-28 16:51:57

by SeongJae Park

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Sun, 28 Jan 2024 17:13:00 +0800 cuiyangpei <[email protected]> wrote:

> On Fri, Jan 26, 2024 at 12:04:54AM -0800, SeongJae Park wrote:
[...]
> > So, 'update_schemes_tried_regions' command is firstly handled by
> > 'damon_sysfs_cmd_request_callback()', which is registered as
> > after_wmarks_check() and after_aggregation() callback. Hence
> > 'update_schemes_tried_regions' command is still effectively working in
> > aggregation interval granularity. I think this is what you found, right?
> >
> Yes.
> > If I'm not wrongly understanding your point, I think the concern is valid. I
> > think we should make it works in sampling interval granularity. I will try to
> > make so. Would that work for your use case?
> >
> It's much better than working in aggregation interval.

Thank you for confirming. I will start working on this.

>
> I have a question. Why does the 'update_schemes_tried_regions' command need to work
> in the sampling time or aggregation time? 'update_schemes_tried_regions' is a
> relatively special state that updates the regions that corresponding operation scheme.
> Can it be separated from other states and controlled by sysfs node to respond immediately
> after being written?

Mainly because the region data is updated by a kdamond thread.  To safely
access the regions, the accessor has to do some kind of synchronization with
the kdamond thread.  To minimize such synchronization overhead, DAMON lets API
users (kernel components) register callbacks, which kdamond invokes at specific
events such as 'after_sampling' or 'after_aggregation'.  Because the callbacks
are executed in the kdamond thread, they can safely access the data without
additional synchronization.  The DAMON sysfs interface uses this callback
mechanism, and hence needs to work at sampling or aggregation times.


Thanks,
SJ

[...]

2024-01-29 12:14:04

by cuiyangpei

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Sun, Jan 28, 2024 at 08:28:04AM -0800, SeongJae Park wrote:
> On Sun, 28 Jan 2024 17:13:00 +0800 cuiyangpei <[email protected]> wrote:
>
> > On Fri, Jan 26, 2024 at 12:04:54AM -0800, SeongJae Park wrote:
> [...]
> > > So, 'update_schemes_tried_regions' command is firstly handled by
> > > 'damon_sysfs_cmd_request_callback()', which is registered as
> > > after_wmarks_check() and after_aggregation() callback. Hence
> > > 'update_schemes_tried_regions' command is still effectively working in
> > > aggregation interval granularity. I think this is what you found, right?
> > >
> > Yes.
> > > If I'm not wrongly understanding your point, I think the concern is valid. I
> > > think we should make it works in sampling interval granularity. I will try to
> > > make so. Would that work for your use case?
> > >
> > It's much better than working in aggregation interval.
>
> Thank you for confirming. I will start working on this.
>

Great, looking forward to seeing the progress.

> >
> > I have a question. Why does the 'update_schemes_tried_regions' command need to work
> > in the sampling time or aggregation time? 'update_schemes_tried_regions' is a
> > relatively special state that updates the regions that corresponding operation scheme.
> > Can it be separated from other states and controlled by sysfs node to respond immediately
> > after being written?
>
> Mainly because the region data is updated by a kdamond thread. To safely
> access the region, the accessor should do some kind of synchronization with the
> kdamond thread. To minimize such synchronization overhead, DAMON let the API
> users (kernel components) to register callbacks which kdamond invokes under
> specific events including 'after_sampling' or 'after_aggregation'. Because the
> callback is executed in the kdamond thread, callbacks can safely access the
> data without additional synchronization. DAMON sysfs interface is using the
> callback mechanism, and hence need to work in the sampling or aggregation
> times.
>
Thank you for the detailed explanation.

> Thanks,
> SJ
>
> [...]

2024-02-06 02:57:07

by SeongJae Park

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

Hi Cuiyangpei,

On Mon, 29 Jan 2024 20:13:16 +0800 cuiyangpei <[email protected]> wrote:

> On Sun, Jan 28, 2024 at 08:28:04AM -0800, SeongJae Park wrote:
> > On Sun, 28 Jan 2024 17:13:00 +0800 cuiyangpei <[email protected]> wrote:
> >
> > > On Fri, Jan 26, 2024 at 12:04:54AM -0800, SeongJae Park wrote:
> > [...]
> > > > So, 'update_schemes_tried_regions' command is firstly handled by
> > > > 'damon_sysfs_cmd_request_callback()', which is registered as
> > > > after_wmarks_check() and after_aggregation() callback. Hence
> > > > 'update_schemes_tried_regions' command is still effectively working in
> > > > aggregation interval granularity. I think this is what you found, right?
> > > >
> > > Yes.
> > > > If I'm not wrongly understanding your point, I think the concern is valid. I
> > > > think we should make it works in sampling interval granularity. I will try to
> > > > make so. Would that work for your use case?
> > > >
> > > It's much better than working in aggregation interval.
> >
> > Thank you for confirming. I will start working on this.
> >
>
> Great, looking forward to seeing the progress.

Just sent a patch[1] for this.

I also updated the DAMON user-space tool, damo, to use this improvement[2].  I
hope that helps others who use DAMON with their own tools to easily understand
how they can benefit from this patch.

Also, please feel free to ask any questions and/or for help.

[1] https://lore.kernel.org/r/[email protected]
[2] https://github.com/awslabs/damo/commit/75af3a1c0b3e79cd3207f0f8df5b5ac39f887450


Thanks,
SJ

[...]

2024-02-06 03:27:13

by cuiyangpei

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm/damon/sysfs: Implement recording feature

On Mon, Feb 05, 2024 at 06:56:59PM -0800, SeongJae Park wrote:
> Hi Cuiyangpei,
>
> On Mon, 29 Jan 2024 20:13:16 +0800 cuiyangpei <[email protected]> wrote:
>
> > On Sun, Jan 28, 2024 at 08:28:04AM -0800, SeongJae Park wrote:
> > > On Sun, 28 Jan 2024 17:13:00 +0800 cuiyangpei <[email protected]> wrote:
> > >
> > > > On Fri, Jan 26, 2024 at 12:04:54AM -0800, SeongJae Park wrote:
> > > [...]
> > > > > So, 'update_schemes_tried_regions' command is firstly handled by
> > > > > 'damon_sysfs_cmd_request_callback()', which is registered as
> > > > > after_wmarks_check() and after_aggregation() callback. Hence
> > > > > 'update_schemes_tried_regions' command is still effectively working in
> > > > > aggregation interval granularity. I think this is what you found, right?
> > > > >
> > > > Yes.
> > > > > If I'm not wrongly understanding your point, I think the concern is valid. I
> > > > > think we should make it works in sampling interval granularity. I will try to
> > > > > make so. Would that work for your use case?
> > > > >
> > > > It's much better than working in aggregation interval.
> > >
> > > Thank you for confirming. I will start working on this.
> > >
> >
> > Great, looking forward to seeing the progress.
>
> Just sent a patch[1] for this.
>
> I also updated DAMON user-space tool, damo, to use this improvement[2]. I hope
> that to help others who using DAMON with their own tool to easily understand
> how they can get the improvement from this patch.
>
> Also, please feel free to ask any questions and/or help.
>
> [1] https://lore.kernel.org/r/[email protected]
> [2] https://github.com/awslabs/damo/commit/75af3a1c0b3e79cd3207f0f8df5b5ac39f887450
>
>
> Thanks,
> SJ
>
> [...]

Hi SeongJae,

Thank you for sending the patch. I will verify this feature on the phone and reach out
if I have any questions or require assistance.

Thanks.