From: SeongJae Park <[email protected]>
NOTE: This is an RFC for a future change of the DAMON patchsets[1,2,3], which
are not merged in the mainline yet.  The aim of this RFC is to show how the
patchsets would be changed in their next versions.  So, if you are interested
in this RFC, please consider reviewing the DAMON patchsets as well.
Currently, DAMON is configured to be exclusive with Idle Page Tracking, because
both subsystems use the PG_idle flag and there is no way to synchronize with
Idle Page Tracking.  Though there are many use cases that DAMON handles better
than Idle Page Tracking, DAMON cannot fully replace Idle Page Tracking, since
- DAMON doesn't support all features of Idle Page Tracking from the beginning
(e.g., the physical address space is supported only from the third DAMON
patchset[3]), and
- there are some use cases for which Idle Page Tracking could be more efficient
(e.g., page size granularity working set size calculation).
Therefore, this patchset makes DAMON coexist with Idle Page Tracking.  As the
initial decision to make DAMON exclusive was not a good idea, this change will
be folded into the next versions of the original patchsets[1,2,3].  Therefore,
you could skip the details of the changes and just wait for the postings of the
next versions of the patchsets, except for the 4th patch.
The changes significantly refactor the code, especially 'damon.c' and
'damon-test.h'.  Though the refactoring changes are straightforward, if you
gave a 'Reviewed-by' before and want to drop it due to the changes, please let
me know.
[1] https://lore.kernel.org/linux-mm/[email protected]/
[2] https://lore.kernel.org/linux-mm/[email protected]/
[3] https://lore.kernel.org/linux-mm/[email protected]/
Sequence of Patches
===================
The 1st patch separates the DAMON components that are unnecessarily implemented
in one source file and depend on one config option (CONFIG_DAMON) into multiple
files, and applies fine-grained dependencies.  As a result, the core framework
part of DAMON becomes able to coexist with Idle Page Tracking.  The following
two patches further refactor the code for cleaner boundaries between the
components.
The 4th patch implements a synchronization infrastructure for PG_idle flag
users.  We implement it to eventually be used by DAMON, but the change is
independent of DAMON and also required for Idle Page Tracking itself.  This
could be picked up before the DAMON patchsets are merged.
Finally, the 5th patch updates DAMON to use the PG_idle synchronization
infrastructure, so that it fully coexists with Idle Page Tracking.
Baseline and Complete Git Trees
===============================
The patches are based on v5.8 plus the DAMON v20 patchset[1], RFC v14 of the
DAMOS patchset[2], RFC v8 of the physical address space support patchset[3],
RFC v1 of the user space improvement patchset[4], and a few more trivial fixes
(s/snprintf/scnprintf).  You can also clone the complete git tree:
$ git clone git://github.com/sjp38/linux -b damon-usi/rfc/v1
The tree is also available via the web:
https://github.com/sjp38/linux/releases/tag/damon-usi/rfc/v1
[1] https://lore.kernel.org/linux-mm/[email protected]/
[2] https://lore.kernel.org/linux-mm/[email protected]/
[3] https://lore.kernel.org/linux-mm/[email protected]/
[4] https://lore.kernel.org/linux-mm/[email protected]/
SeongJae Park (5):
mm/damon: Separate components and apply fine-grained dependencies
mm/damon: Separate DAMON schemes application to primitives
mm/damon: Move recording feature from core to dbgfs
mm/page_idle: Avoid interferences from concurrent users
mm/damon/primitives: Make coexistable with Idle Page Tracking
.../admin-guide/mm/idle_page_tracking.rst | 22 +-
MAINTAINERS | 3 +-
include/linux/damon.h | 109 +-
include/linux/page_idle.h | 2 +
mm/Kconfig | 25 +-
mm/Makefile | 2 +-
mm/damon-test.h | 724 -----
mm/damon.c | 2754 -----------------
mm/damon/Kconfig | 68 +
mm/damon/Makefile | 5 +
mm/damon/core-test.h | 253 ++
mm/damon/core.c | 860 +++++
mm/damon/damon.h | 7 +
mm/damon/dbgfs-test.h | 264 ++
mm/damon/dbgfs.c | 1158 +++++++
mm/damon/primitives-test.h | 328 ++
mm/damon/primitives.c | 896 ++++++
mm/page_idle.c | 40 +
18 files changed, 3982 insertions(+), 3538 deletions(-)
delete mode 100644 mm/damon-test.h
delete mode 100644 mm/damon.c
create mode 100644 mm/damon/Kconfig
create mode 100644 mm/damon/Makefile
create mode 100644 mm/damon/core-test.h
create mode 100644 mm/damon/core.c
create mode 100644 mm/damon/damon.h
create mode 100644 mm/damon/dbgfs-test.h
create mode 100644 mm/damon/dbgfs.c
create mode 100644 mm/damon/primitives-test.h
create mode 100644 mm/damon/primitives.c
--
2.17.1
From: SeongJae Park <[email protected]>
Concurrent Idle Page Tracking users can interfere with each other because the
interface doesn't provide a central rule for synchronization between the
users.  Users could implement their own synchronization rule, but even
in that case, applications developed by different users would not know
how to synchronize with one another.  To help with this situation, this commit
introduces a centralized synchronization infrastructure for Idle Page
Tracking.
In detail, this commit introduces a mutex lock for Idle Page Tracking,
called 'page_idle_lock'.  It is exposed to user space via a new bool
sysfs file, '/sys/kernel/mm/page_idle/lock'.  By writing to and reading
from the file, users can hold/release the mutex and read its status.
Writes to the Idle Page Tracking 'bitmap' file fail if the lock is not
held, while reads of the file can be done regardless of the lock status.
Note that users could still interfere with each other if they ignore this
locking rule.  Nevertheless, this change will let them be aware of the rule.
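For illustration, a minimal user space sketch following the rule could look
like below.  This is only a usage illustration, not part of the patch; error
handling is mostly omitted and root privileges are assumed.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            int lock_fd, bitmap_fd;
            unsigned long long chunk;

            lock_fd = open("/sys/kernel/mm/page_idle/lock", O_WRONLY);
            if (write(lock_fd, "1", 1) != 1)        /* fails if already held */
                    return 1;

            bitmap_fd = open("/sys/kernel/mm/page_idle/bitmap", O_RDWR);
            read(bitmap_fd, &chunk, sizeof(chunk)); /* idleness of PFNs 0..63 */
            printf("first bitmap chunk: %llx\n", chunk);
            close(bitmap_fd);

            write(lock_fd, "0", 1);                 /* release the lock */
            close(lock_fd);
            return 0;
    }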
Signed-off-by: SeongJae Park <[email protected]>
---
.../admin-guide/mm/idle_page_tracking.rst | 22 +++++++---
mm/page_idle.c | 40 +++++++++++++++++++
2 files changed, 56 insertions(+), 6 deletions(-)
diff --git a/Documentation/admin-guide/mm/idle_page_tracking.rst b/Documentation/admin-guide/mm/idle_page_tracking.rst
index df9394fb39c2..3f5e7a8b5b78 100644
--- a/Documentation/admin-guide/mm/idle_page_tracking.rst
+++ b/Documentation/admin-guide/mm/idle_page_tracking.rst
@@ -21,13 +21,13 @@ User API
========
The idle page tracking API is located at ``/sys/kernel/mm/page_idle``.
-Currently, it consists of the only read-write file,
-``/sys/kernel/mm/page_idle/bitmap``.
+Currently, it consists of two read-write files,
+``/sys/kernel/mm/page_idle/bitmap`` and ``/sys/kernel/mm/page_idle/lock``.
-The file implements a bitmap where each bit corresponds to a memory page. The
-bitmap is represented by an array of 8-byte integers, and the page at PFN #i is
-mapped to bit #i%64 of array element #i/64, byte order is native. When a bit is
-set, the corresponding page is idle.
+The ``bitmap`` file implements a bitmap where each bit corresponds to a memory
+page. The bitmap is represented by an array of 8-byte integers, and the page at
+PFN #i is mapped to bit #i%64 of array element #i/64, byte order is native.
+When a bit is set, the corresponding page is idle.
A page is considered idle if it has not been accessed since it was marked idle
(for more details on what "accessed" actually means see the :ref:`Implementation
@@ -74,6 +74,16 @@ See :ref:`Documentation/admin-guide/mm/pagemap.rst <pagemap>` for more
information about ``/proc/pid/pagemap``, ``/proc/kpageflags``, and
``/proc/kpagecgroup``.
+The ``lock`` file is for avoiding interference between concurrent users.  If
+the content of the ``lock`` file is ``1``, it means the ``bitmap`` file is
+currently being used by someone.  While the content of the ``lock`` file is
+``1``, writing ``1`` to the file fails.  Therefore, users should first
+successfully write ``1`` to the ``lock`` file before starting to use the
+``bitmap`` file, and write ``0`` to the ``lock`` file after they have finished
+using the ``bitmap`` file.  If a user writes to the ``bitmap`` file while the
+``lock`` is ``0``, the write fails.  Meanwhile, reads of the ``bitmap`` file
+succeed regardless of the ``lock`` status.
+
.. _impl_details:
Implementation Details
diff --git a/mm/page_idle.c b/mm/page_idle.c
index 144fb4ed961d..0aa45f848570 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -16,6 +16,8 @@
#define BITMAP_CHUNK_SIZE sizeof(u64)
#define BITMAP_CHUNK_BITS (BITMAP_CHUNK_SIZE * BITS_PER_BYTE)
+static DEFINE_MUTEX(page_idle_lock);
+
/*
* Idle page tracking only considers user memory pages, for other types of
* pages the idle flag is always unset and an attempt to set it is silently
@@ -169,6 +171,9 @@ static ssize_t page_idle_bitmap_write(struct file *file, struct kobject *kobj,
unsigned long pfn, end_pfn;
int bit;
+ if (!mutex_is_locked(&page_idle_lock))
+ return -EPERM;
+
if (pos % BITMAP_CHUNK_SIZE || count % BITMAP_CHUNK_SIZE)
return -EINVAL;
@@ -197,17 +202,52 @@ static ssize_t page_idle_bitmap_write(struct file *file, struct kobject *kobj,
return (char *)in - buf;
}
+static ssize_t page_idle_lock_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ return sprintf(buf, "%d\n", mutex_is_locked(&page_idle_lock));
+}
+
+static ssize_t page_idle_lock_store(struct kobject *kobj,
+ struct kobj_attribute *attr, const char *buf, size_t count)
+{
+ bool do_lock;
+ int ret;
+
+ ret = kstrtobool(buf, &do_lock);
+ if (ret < 0)
+ return ret;
+
+ if (do_lock) {
+ if (!mutex_trylock(&page_idle_lock))
+ return -EBUSY;
+ } else {
+ mutex_unlock(&page_idle_lock);
+ }
+
+ return count;
+}
+
static struct bin_attribute page_idle_bitmap_attr =
__BIN_ATTR(bitmap, 0600,
page_idle_bitmap_read, page_idle_bitmap_write, 0);
+static struct kobj_attribute page_idle_lock_attr =
+ __ATTR(lock, 0600, page_idle_lock_show, page_idle_lock_store);
+
static struct bin_attribute *page_idle_bin_attrs[] = {
&page_idle_bitmap_attr,
NULL,
};
+static struct attribute *page_idle_lock_attrs[] = {
+ &page_idle_lock_attr.attr,
+ NULL,
+};
+
static const struct attribute_group page_idle_attr_group = {
.bin_attrs = page_idle_bin_attrs,
+ .attrs = page_idle_lock_attrs,
.name = "page_idle",
};
--
2.17.1
From: SeongJae Park <[email protected]>
DAMON's reference 'primitives' internally use the PG_idle flag.  Because
the flag is also used by Idle Page Tracking and there was no way to
synchronize with it, the 'primitives' were previously configured to be
exclusive with Idle Page Tracking.  However, as we can now synchronize
with Idle Page Tracking using 'page_idle_lock', this commit makes the
primitives do the synchronization and coexist with Idle Page Tracking.
In more detail, the 'primitives' only require their users to do the
synchronization by themselves.  The real synchronization is done by the
DAMON debugfs interface, which is the only user of the 'primitives' as
of now.
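For example, a hypothetical other user of the reference 'primitives' (the
'my_start' name below is made up for illustration) would be expected to follow
the same rule the debugfs interface follows in this patch:

    #include <linux/damon.h>
    #include <linux/page_idle.h>

    /* Hypothetical caller of the reference 'primitives'. */
    static int my_start(struct damon_ctx **ctxs, int nr_ctxs)
    {
            int err;

            if (!mutex_trylock(&page_idle_lock))
                    return -EBUSY;  /* PG_idle is in use by someone else */

            err = damon_start_ctx_ptrs(ctxs, nr_ctxs);
            if (err)
                    mutex_unlock(&page_idle_lock);
            return err;
    }

    /* To be set as ctx->cleanup, called just before kdamond terminates. */
    static void my_cleanup(struct damon_ctx *ctx)
    {
            mutex_unlock(&page_idle_lock);
    }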
Signed-off-by: SeongJae Park <[email protected]>
---
include/linux/damon.h | 1 +
include/linux/page_idle.h | 2 ++
mm/damon/Kconfig | 2 +-
mm/damon/dbgfs.c | 32 +++++++++++++++++++++++++++++++-
mm/damon/primitives.c | 16 +++++++++++++++-
mm/page_idle.c | 2 +-
6 files changed, 51 insertions(+), 4 deletions(-)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index 606e59f785a2..12200a1171a8 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -312,6 +312,7 @@ void kdamond_init_phys_regions(struct damon_ctx *ctx);
void kdamond_update_phys_regions(struct damon_ctx *ctx);
void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
+bool kdamond_phys_target_valid(struct damon_target *t);
void damon_set_paddr_primitives(struct damon_ctx *ctx);
#endif /* CONFIG_DAMON_PRIMITIVES */
diff --git a/include/linux/page_idle.h b/include/linux/page_idle.h
index d8a6aecf99cb..bcbb965b566c 100644
--- a/include/linux/page_idle.h
+++ b/include/linux/page_idle.h
@@ -8,6 +8,8 @@
#ifdef CONFIG_PAGE_IDLE_FLAG
+extern struct mutex page_idle_lock;
+
#ifdef CONFIG_64BIT
static inline bool page_is_young(struct page *page)
{
diff --git a/mm/damon/Kconfig b/mm/damon/Kconfig
index 8b3f3dd3bd32..64d69a239408 100644
--- a/mm/damon/Kconfig
+++ b/mm/damon/Kconfig
@@ -26,7 +26,7 @@ config DAMON_KUNIT_TEST
config DAMON_PRIMITIVES
bool "DAMON primitives for virtual/physical address spaces monitoring"
- depends on DAMON && MMU && !IDLE_PAGE_TRACKING
+ depends on DAMON && MMU
select PAGE_EXTENSION if !64BIT
select PAGE_IDLE_FLAG
help
diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
index 7a6c279690f8..ce12e92e1667 100644
--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -12,6 +12,7 @@
#include <linux/file.h>
#include <linux/mm.h>
#include <linux/module.h>
+#include <linux/page_idle.h>
#include <linux/slab.h>
#define MIN_RECORD_BUFFER_LEN 1024
@@ -28,6 +29,7 @@ struct debugfs_recorder {
/* Monitoring contexts for debugfs interface users. */
static struct damon_ctx **debugfs_ctxs;
static int debugfs_nr_ctxs = 1;
+static int debugfs_nr_terminated_ctxs;
static DEFINE_MUTEX(damon_dbgfs_lock);
@@ -106,9 +108,20 @@ static void debugfs_init_vm_regions(struct damon_ctx *ctx)
kdamond_init_vm_regions(ctx);
}
+static void debugfs_unlock_page_idle_lock(void)
+{
+ mutex_lock(&damon_dbgfs_lock);
+ if (++debugfs_nr_terminated_ctxs == debugfs_nr_ctxs) {
+ debugfs_nr_terminated_ctxs = 0;
+ mutex_unlock(&page_idle_lock);
+ }
+ mutex_unlock(&damon_dbgfs_lock);
+}
+
static void debugfs_vm_cleanup(struct damon_ctx *ctx)
{
debugfs_flush_rbuffer(ctx->private);
+ debugfs_unlock_page_idle_lock();
kdamond_vm_cleanup(ctx);
}
@@ -120,6 +133,8 @@ static void debugfs_init_phys_regions(struct damon_ctx *ctx)
static void debugfs_phys_cleanup(struct damon_ctx *ctx)
{
debugfs_flush_rbuffer(ctx->private);
+ debugfs_unlock_page_idle_lock();
+
}
/*
@@ -197,6 +212,21 @@ static char *user_input_str(const char __user *buf, size_t count, loff_t *ppos)
return kbuf;
}
+static int debugfs_start_ctx_ptrs(struct damon_ctx **ctxs, int nr_ctxs)
+{
+ int rc;
+
+ if (!mutex_trylock(&page_idle_lock))
+ return -EBUSY;
+
+ rc = damon_start_ctx_ptrs(ctxs, nr_ctxs);
+ if (rc)
+ mutex_unlock(&page_idle_lock);
+
+ return rc;
+}
+
+
static ssize_t debugfs_monitor_on_write(struct file *file,
const char __user *buf, size_t count, loff_t *ppos)
{
@@ -212,7 +242,7 @@ static ssize_t debugfs_monitor_on_write(struct file *file,
if (sscanf(kbuf, "%s", kbuf) != 1)
return -EINVAL;
if (!strncmp(kbuf, "on", count))
- err = damon_start_ctx_ptrs(debugfs_ctxs, debugfs_nr_ctxs);
+ err = debugfs_start_ctx_ptrs(debugfs_ctxs, debugfs_nr_ctxs);
else if (!strncmp(kbuf, "off", count))
err = damon_stop_ctx_ptrs(debugfs_ctxs, debugfs_nr_ctxs);
else
diff --git a/mm/damon/primitives.c b/mm/damon/primitives.c
index e762dc8a5f2e..442b41b79b82 100644
--- a/mm/damon/primitives.c
+++ b/mm/damon/primitives.c
@@ -30,6 +30,10 @@
#include "damon.h"
+#ifndef CONFIG_IDLE_PAGE_TRACKING
+DEFINE_MUTEX(page_idle_lock);
+#endif
+
/* Minimal region size. Every damon_region is aligned by this. */
#ifndef CONFIG_DAMON_KUNIT_TEST
#define MIN_REGION PAGE_SIZE
@@ -776,6 +780,9 @@ bool kdamond_vm_target_valid(struct damon_target *t)
{
struct task_struct *task;
+ if (!mutex_is_locked(&page_idle_lock))
+ return false;
+
task = damon_get_task_struct(t);
if (task) {
put_task_struct(task);
@@ -795,6 +802,13 @@ void kdamond_vm_cleanup(struct damon_ctx *ctx)
}
}
+bool kdamond_phys_target_valid(struct damon_target *t)
+{
+ if (!mutex_is_locked(&page_idle_lock))
+ return false;
+ return true;
+}
+
#ifndef CONFIG_ADVISE_SYSCALLS
static int damos_madvise(struct damon_target *target, struct damon_region *r,
int behavior)
@@ -874,7 +888,7 @@ void damon_set_paddr_primitives(struct damon_ctx *ctx)
ctx->update_target_regions = kdamond_update_phys_regions;
ctx->prepare_access_checks = kdamond_prepare_phys_access_checks;
ctx->check_accesses = kdamond_check_phys_accesses;
- ctx->target_valid = NULL;
+ ctx->target_valid = kdamond_phys_target_valid;
ctx->cleanup = NULL;
ctx->apply_scheme = NULL;
}
diff --git a/mm/page_idle.c b/mm/page_idle.c
index 0aa45f848570..958dcc18f6cd 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -16,7 +16,7 @@
#define BITMAP_CHUNK_SIZE sizeof(u64)
#define BITMAP_CHUNK_BITS (BITMAP_CHUNK_SIZE * BITS_PER_BYTE)
-static DEFINE_MUTEX(page_idle_lock);
+DEFINE_MUTEX(page_idle_lock);
/*
* Idle page tracking only considers user memory pages, for other types of
--
2.17.1
From: SeongJae Park <[email protected]>
The DAMON-based operation schemes feature is implemented inside the DAMON
'core'.  Though the access-pattern-based target region tracking part of
the schemes makes sense to reside in the 'core', applying the scheme
action would better reside in the 'primitives', as that work highly
depends on the type of the target region.
For this reason, this commit moves that part to the 'primitives' by
adding one more context callback, 'apply_scheme', and implementing it in
the reference 'primitives' implementation for virtual address spaces.
Note that this doesn't add an implementation for the physical address
space, as one didn't exist before.  Nonetheless, the extension for the
physical address space could easily be done in this way in the future.
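As an illustration of the new callback, a hypothetical primitives
implementation that only logs the requested actions (the 'my_*' names below
are made up) could plug in like this:

    /* Hypothetical scheme-applying callback for a custom primitives set. */
    static int my_apply_scheme(struct damon_ctx *ctx, struct damon_target *t,
                    struct damon_region *r, struct damos *scheme)
    {
            pr_debug("action %d requested for region [%lu, %lu)\n",
                     scheme->action, r->ar.start, r->ar.end);
            return 0;
    }

    static void my_set_primitives(struct damon_ctx *ctx)
    {
            /* ... set the other monitoring callbacks ... */
            ctx->apply_scheme = my_apply_scheme;
    }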
Signed-off-by: SeongJae Park <[email protected]>
---
include/linux/damon.h | 8 +++++
mm/damon/core.c | 65 ++------------------------------------
mm/damon/damon.h | 28 -----------------
mm/damon/primitives.c | 73 ++++++++++++++++++++++++++++++++++++++++++-
4 files changed, 82 insertions(+), 92 deletions(-)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index 264958a62c02..505e6261cefa 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -170,6 +170,7 @@ struct damos {
* @check_accesses: Checks the access of target regions.
* @target_valid: Determine if the target is valid.
* @cleanup: Cleans up the context.
+ * @apply_scheme: Apply a DAMON-based operation scheme.
* @sample_cb: Called for each sampling interval.
* @aggregate_cb: Called for each aggregation interval.
*
@@ -193,6 +194,9 @@ struct damos {
* monitoring.
* @cleanup is called from @kdamond just before its termination. After this
* call, only @kdamond_lock and @kdamond will be touched.
+ * @apply_scheme is called from @kdamond when a region for user provided
+ * DAMON-based operation scheme is found. It should apply the scheme's action
+ * to the region.
*
* @sample_cb and @aggregate_cb are called from @kdamond for each of the
* sampling intervals and aggregation intervals, respectively. Therefore,
@@ -229,6 +233,8 @@ struct damon_ctx {
unsigned int (*check_accesses)(struct damon_ctx *context);
bool (*target_valid)(struct damon_target *target);
void (*cleanup)(struct damon_ctx *context);
+ int (*apply_scheme)(struct damon_ctx *context, struct damon_target *t,
+ struct damon_region *r, struct damos *scheme);
void (*sample_cb)(struct damon_ctx *context);
void (*aggregate_cb)(struct damon_ctx *context);
};
@@ -312,6 +318,8 @@ void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
bool kdamond_vm_target_valid(struct damon_target *t);
void kdamond_vm_cleanup(struct damon_ctx *ctx);
+int kdamond_vm_apply_scheme(struct damon_ctx *context, struct damon_target *t,
+ struct damon_region *r, struct damos *scheme);
void damon_set_vaddr_primitives(struct damon_ctx *ctx);
/* Reference callback implementations for physical memory */
diff --git a/mm/damon/core.c b/mm/damon/core.c
index d85ade7b5e23..ba52421a2673 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -701,68 +701,6 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
}
}
-#ifndef CONFIG_ADVISE_SYSCALLS
-static int damos_madvise(struct damon_target *target, struct damon_region *r,
- int behavior)
-{
- return -EINVAL;
-}
-#else
-static int damos_madvise(struct damon_target *target, struct damon_region *r,
- int behavior)
-{
- struct task_struct *t;
- struct mm_struct *mm;
- int ret = -ENOMEM;
-
- t = damon_get_task_struct(target);
- if (!t)
- goto out;
- mm = damon_get_mm(target);
- if (!mm)
- goto put_task_out;
-
- ret = do_madvise(t, mm, PAGE_ALIGN(r->ar.start),
- PAGE_ALIGN(r->ar.end - r->ar.start), behavior);
- mmput(mm);
-put_task_out:
- put_task_struct(t);
-out:
- return ret;
-}
-#endif /* CONFIG_ADVISE_SYSCALLS */
-
-static int damos_do_action(struct damon_target *target, struct damon_region *r,
- enum damos_action action)
-{
- int madv_action;
-
- switch (action) {
- case DAMOS_WILLNEED:
- madv_action = MADV_WILLNEED;
- break;
- case DAMOS_COLD:
- madv_action = MADV_COLD;
- break;
- case DAMOS_PAGEOUT:
- madv_action = MADV_PAGEOUT;
- break;
- case DAMOS_HUGEPAGE:
- madv_action = MADV_HUGEPAGE;
- break;
- case DAMOS_NOHUGEPAGE:
- madv_action = MADV_NOHUGEPAGE;
- break;
- case DAMOS_STAT:
- return 0;
- default:
- pr_warn("Wrong action %d\n", action);
- return -EINVAL;
- }
-
- return damos_madvise(target, r, madv_action);
-}
-
static void damon_do_apply_schemes(struct damon_ctx *c,
struct damon_target *t,
struct damon_region *r)
@@ -781,7 +719,8 @@ static void damon_do_apply_schemes(struct damon_ctx *c,
continue;
s->stat_count++;
s->stat_sz += sz;
- damos_do_action(t, r, s->action);
+ if (c->apply_scheme)
+ c->apply_scheme(c, t, r, s);
if (s->action != DAMOS_STAT)
r->age = 0;
}
diff --git a/mm/damon/damon.h b/mm/damon/damon.h
index fc565fff4953..4315dadcca8a 100644
--- a/mm/damon/damon.h
+++ b/mm/damon/damon.h
@@ -5,31 +5,3 @@
/* Get a random number in [l, r) */
#define damon_rand(l, r) (l + prandom_u32() % (r - l))
-
-/*
- * 't->id' should be the pointer to the relevant 'struct pid' having reference
- * count. Caller must put the returned task, unless it is NULL.
- */
-#define damon_get_task_struct(t) \
- (get_pid_task((struct pid *)t->id, PIDTYPE_PID))
-
-/*
- * Get the mm_struct of the given target
- *
- * Caller _must_ put the mm_struct after use, unless it is NULL.
- *
- * Returns the mm_struct of the target on success, NULL on failure
- */
-static inline struct mm_struct *damon_get_mm(struct damon_target *t)
-{
- struct task_struct *task;
- struct mm_struct *mm;
-
- task = damon_get_task_struct(t);
- if (!task)
- return NULL;
-
- mm = get_task_mm(task);
- put_task_struct(task);
- return mm;
-}
diff --git a/mm/damon/primitives.c b/mm/damon/primitives.c
index d7796cbffbd8..e762dc8a5f2e 100644
--- a/mm/damon/primitives.c
+++ b/mm/damon/primitives.c
@@ -38,8 +38,11 @@
#endif
/*
- * Functions for the initial monitoring target regions construction
+ * 't->id' should be the pointer to the relevant 'struct pid' having reference
+ * count. Caller must put the returned task, unless it is NULL.
*/
+#define damon_get_task_struct(t) \
+ (get_pid_task((struct pid *)t->id, PIDTYPE_PID))
/*
* Get the mm_struct of the given target
@@ -62,6 +65,10 @@ struct mm_struct *damon_get_mm(struct damon_target *t)
return mm;
}
+/*
+ * Functions for the initial monitoring target regions construction
+ */
+
/*
* Size-evenly split a region into 'nr_pieces' small regions
*
@@ -788,6 +795,68 @@ void kdamond_vm_cleanup(struct damon_ctx *ctx)
}
}
+#ifndef CONFIG_ADVISE_SYSCALLS
+static int damos_madvise(struct damon_target *target, struct damon_region *r,
+ int behavior)
+{
+ return -EINVAL;
+}
+#else
+static int damos_madvise(struct damon_target *target, struct damon_region *r,
+ int behavior)
+{
+ struct task_struct *t;
+ struct mm_struct *mm;
+ int ret = -ENOMEM;
+
+ t = damon_get_task_struct(target);
+ if (!t)
+ goto out;
+ mm = damon_get_mm(target);
+ if (!mm)
+ goto put_task_out;
+
+ ret = do_madvise(t, mm, PAGE_ALIGN(r->ar.start),
+ PAGE_ALIGN(r->ar.end - r->ar.start), behavior);
+ mmput(mm);
+put_task_out:
+ put_task_struct(t);
+out:
+ return ret;
+}
+#endif /* CONFIG_ADVISE_SYSCALLS */
+
+int kdamond_vm_apply_scheme(struct damon_ctx *ctx, struct damon_target *t,
+ struct damon_region *r, struct damos *scheme)
+{
+ int madv_action;
+
+ switch (scheme->action) {
+ case DAMOS_WILLNEED:
+ madv_action = MADV_WILLNEED;
+ break;
+ case DAMOS_COLD:
+ madv_action = MADV_COLD;
+ break;
+ case DAMOS_PAGEOUT:
+ madv_action = MADV_PAGEOUT;
+ break;
+ case DAMOS_HUGEPAGE:
+ madv_action = MADV_HUGEPAGE;
+ break;
+ case DAMOS_NOHUGEPAGE:
+ madv_action = MADV_NOHUGEPAGE;
+ break;
+ case DAMOS_STAT:
+ return 0;
+ default:
+ pr_warn("Wrong action %d\n", scheme->action);
+ return -EINVAL;
+ }
+
+ return damos_madvise(t, r, madv_action);
+}
+
void damon_set_vaddr_primitives(struct damon_ctx *ctx)
{
ctx->init_target_regions = kdamond_init_vm_regions;
@@ -796,6 +865,7 @@ void damon_set_vaddr_primitives(struct damon_ctx *ctx)
ctx->check_accesses = kdamond_check_vm_accesses;
ctx->target_valid = kdamond_vm_target_valid;
ctx->cleanup = kdamond_vm_cleanup;
+ ctx->apply_scheme = kdamond_vm_apply_scheme;
}
void damon_set_paddr_primitives(struct damon_ctx *ctx)
@@ -806,6 +876,7 @@ void damon_set_paddr_primitives(struct damon_ctx *ctx)
ctx->check_accesses = kdamond_check_phys_accesses;
ctx->target_valid = NULL;
ctx->cleanup = NULL;
+ ctx->apply_scheme = NULL;
}
#include "primitives-test.h"
--
2.17.1
From: SeongJae Park <[email protected]>
DAMON passes the monitoring results to user space in two ways: 1) a
tracepoint and 2) its recording feature.  The recording feature is for
users who want the simplest usage.
However, as the feature is for user space only while the core is
fundamentally a framework for kernel space, keeping the feature in the
core makes no sense.  Therefore, this commit moves the feature to the
debugfs interface of DAMON.
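Conceptually, the debugfs interface now hangs its recorder on the generic
'private' pointer of the context and hooks the 'aggregate_cb' callback, both
of which are already provided by the 'core'.  A simplified sketch of what the
change below does:

    /* Simplified sketch; see the actual diff for the full version. */
    struct debugfs_recorder {
            unsigned char *rbuf;
            unsigned int rbuf_len;
            unsigned int rbuf_offset;
            char *rfile_path;
    };

    static void debugfs_aggregate_cb(struct damon_ctx *c)
    {
            struct debugfs_recorder *rec = c->private;

            /* serialize the aggregated results of 'c' into rec->rbuf */
    }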
Signed-off-by: SeongJae Park <[email protected]>
---
include/linux/damon.h | 23 +---
mm/damon/core-test.h | 57 ++-------
mm/damon/core.c | 150 +-----------------------
mm/damon/dbgfs-test.h | 87 +++++++++++++-
mm/damon/dbgfs.c | 264 ++++++++++++++++++++++++++++++++++++++++--
5 files changed, 359 insertions(+), 222 deletions(-)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index 505e6261cefa..606e59f785a2 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -134,14 +134,6 @@ struct damos {
* in case of virtual memory monitoring) and applies the changes for each
* @regions_update_interval. All time intervals are in micro-seconds.
*
- * @rbuf: In-memory buffer for monitoring result recording.
- * @rbuf_len: The length of @rbuf.
- * @rbuf_offset: The offset for next write to @rbuf.
- * @rfile_path: Record file path.
- *
- * If @rbuf, @rbuf_len, and @rfile_path are set, the monitored results are
- * automatically stored in @rfile_path file.
- *
* @kdamond: Kernel thread who does the monitoring.
* @kdamond_stop: Notifies whether kdamond should stop.
* @kdamond_lock: Mutex for the synchronizations with @kdamond.
@@ -164,6 +156,8 @@ struct damos {
* @targets_list: Head of monitoring targets (&damon_target) list.
* @schemes_list: Head of schemes (&damos) list.
*
+ * @private:	Private user data.
+ *
* @init_target_regions: Constructs initial monitoring target regions.
* @update_target_regions: Updates monitoring target regions.
* @prepare_access_checks: Prepares next access check of target regions.
@@ -214,11 +208,6 @@ struct damon_ctx {
struct timespec64 last_aggregation;
struct timespec64 last_regions_update;
- unsigned char *rbuf;
- unsigned int rbuf_len;
- unsigned int rbuf_offset;
- char *rfile_path;
-
struct task_struct *kdamond;
bool kdamond_stop;
struct mutex kdamond_lock;
@@ -226,6 +215,8 @@ struct damon_ctx {
struct list_head targets_list; /* 'damon_target' objects */
struct list_head schemes_list; /* 'damos' objects */
+ void *private;
+
/* callbacks */
void (*init_target_regions)(struct damon_ctx *context);
void (*update_target_regions)(struct damon_ctx *context);
@@ -241,10 +232,6 @@ struct damon_ctx {
#ifdef CONFIG_DAMON
-#define MIN_RECORD_BUFFER_LEN 1024
-#define MAX_RECORD_BUFFER_LEN (4 * 1024 * 1024)
-#define MAX_RFILE_PATH_LEN 256
-
#define damon_next_region(r) \
(container_of(r->list.next, struct damon_region, list))
@@ -298,8 +285,6 @@ int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
unsigned long min_nr_reg, unsigned long max_nr_reg);
int damon_set_schemes(struct damon_ctx *ctx,
struct damos **schemes, ssize_t nr_schemes);
-int damon_set_recording(struct damon_ctx *ctx,
- unsigned int rbuf_len, char *rfile_path);
int damon_nr_running_ctxs(void);
int damon_start(struct damon_ctx *ctxs, int nr_ctxs);
diff --git a/mm/damon/core-test.h b/mm/damon/core-test.h
index c916d773397a..b815dfbfb5fd 100644
--- a/mm/damon/core-test.h
+++ b/mm/damon/core-test.h
@@ -36,6 +36,17 @@ static void damon_test_regions(struct kunit *test)
damon_free_target(t);
}
+static unsigned int nr_damon_targets(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ unsigned int nr_targets = 0;
+
+ damon_for_each_target(t, ctx)
+ nr_targets++;
+
+ return nr_targets;
+}
+
static void damon_test_target(struct kunit *test)
{
struct damon_ctx *c = damon_new_ctx();
@@ -54,23 +65,6 @@ static void damon_test_target(struct kunit *test)
damon_destroy_ctx(c);
}
-static void damon_test_set_recording(struct kunit *test)
-{
- struct damon_ctx *ctx = damon_new_ctx();
- int err;
-
- err = damon_set_recording(ctx, 42, "foo");
- KUNIT_EXPECT_EQ(test, err, -EINVAL);
- damon_set_recording(ctx, 4242, "foo.bar");
- KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 4242u);
- KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo.bar");
- damon_set_recording(ctx, 424242, "foo");
- KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 424242u);
- KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo");
-
- damon_destroy_ctx(ctx);
-}
-
/*
* Test kdamond_reset_aggregated()
*
@@ -91,9 +85,7 @@ static void damon_test_aggregate(struct kunit *test)
struct damon_target *t;
struct damon_region *r;
int it, ir;
- ssize_t sz, sr, sp;
- damon_set_recording(ctx, 4242, "damon.data");
damon_set_targets(ctx, target_ids, 3);
it = 0;
@@ -121,31 +113,6 @@ static void damon_test_aggregate(struct kunit *test)
/* targets also should be preserved */
KUNIT_EXPECT_EQ(test, 3, it);
- /* The aggregated information should be written in the buffer */
- sr = sizeof(r->ar.start) + sizeof(r->ar.end) + sizeof(r->nr_accesses);
- sp = sizeof(t->id) + sizeof(unsigned int) + 3 * sr;
- sz = sizeof(struct timespec64) + sizeof(unsigned int) + 3 * sp;
- KUNIT_EXPECT_EQ(test, (unsigned int)sz, ctx->rbuf_offset);
-
- damon_destroy_ctx(ctx);
-}
-
-static void damon_test_write_rbuf(struct kunit *test)
-{
- struct damon_ctx *ctx = damon_new_ctx();
- char *data;
-
- damon_set_recording(ctx, 4242, "damon.data");
-
- data = "hello";
- damon_write_rbuf(ctx, data, strnlen(data, 256));
- KUNIT_EXPECT_EQ(test, ctx->rbuf_offset, 5u);
-
- damon_write_rbuf(ctx, data, 0);
- KUNIT_EXPECT_EQ(test, ctx->rbuf_offset, 5u);
-
- KUNIT_EXPECT_STREQ(test, (char *)ctx->rbuf, data);
-
damon_destroy_ctx(ctx);
}
@@ -267,9 +234,7 @@ static void damon_test_split_regions_of(struct kunit *test)
static struct kunit_case damon_test_cases[] = {
KUNIT_CASE(damon_test_target),
KUNIT_CASE(damon_test_regions),
- KUNIT_CASE(damon_test_set_recording),
KUNIT_CASE(damon_test_aggregate),
- KUNIT_CASE(damon_test_write_rbuf),
KUNIT_CASE(damon_test_split_at),
KUNIT_CASE(damon_test_merge_two),
KUNIT_CASE(damon_test_merge_regions_of),
diff --git a/mm/damon/core.c b/mm/damon/core.c
index ba52421a2673..ba0035d7a27a 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -208,11 +208,6 @@ struct damon_ctx *damon_new_ctx(void)
ktime_get_coarse_ts64(&ctx->last_aggregation);
ctx->last_regions_update = ctx->last_aggregation;
- if (damon_set_recording(ctx, 0, "none")) {
- kfree(ctx);
- return NULL;
- }
-
mutex_init(&ctx->kdamond_lock);
INIT_LIST_HEAD(&ctx->targets_list);
@@ -328,54 +323,6 @@ int damon_set_schemes(struct damon_ctx *ctx, struct damos **schemes,
return 0;
}
-/**
- * damon_set_recording() - Set attributes for the recording.
- * @ctx: target kdamond context
- * @rbuf_len: length of the result buffer
- * @rfile_path: path to the monitor result files
- *
- * Setting 'rbuf_len' 0 disables recording.
- *
- * This function should not be called while the kdamond is running.
- *
- * Return: 0 on success, negative error code otherwise.
- */
-int damon_set_recording(struct damon_ctx *ctx,
- unsigned int rbuf_len, char *rfile_path)
-{
- size_t rfile_path_len;
-
- if (rbuf_len && (rbuf_len > MAX_RECORD_BUFFER_LEN ||
- rbuf_len < MIN_RECORD_BUFFER_LEN)) {
- pr_err("result buffer size (%u) is out of [%d,%d]\n",
- rbuf_len, MIN_RECORD_BUFFER_LEN,
- MAX_RECORD_BUFFER_LEN);
- return -EINVAL;
- }
- rfile_path_len = strnlen(rfile_path, MAX_RFILE_PATH_LEN);
- if (rfile_path_len >= MAX_RFILE_PATH_LEN) {
- pr_err("too long (>%d) result file path %s\n",
- MAX_RFILE_PATH_LEN, rfile_path);
- return -EINVAL;
- }
- ctx->rbuf_len = rbuf_len;
- kfree(ctx->rbuf);
- ctx->rbuf = NULL;
- kfree(ctx->rfile_path);
- ctx->rfile_path = NULL;
-
- if (rbuf_len) {
- ctx->rbuf = kvmalloc(rbuf_len, GFP_KERNEL);
- if (!ctx->rbuf)
- return -ENOMEM;
- }
- ctx->rfile_path = kmalloc(rfile_path_len + 1, GFP_KERNEL);
- if (!ctx->rfile_path)
- return -ENOMEM;
- strncpy(ctx->rfile_path, rfile_path, rfile_path_len + 1);
- return 0;
-}
-
/**
* damon_nr_running_ctxs() - Return number of currently running contexts.
*/
@@ -390,17 +337,6 @@ int damon_nr_running_ctxs(void)
return nr_ctxs;
}
-static unsigned int nr_damon_targets(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- unsigned int nr_targets = 0;
-
- damon_for_each_target(t, ctx)
- nr_targets++;
-
- return nr_targets;
-}
-
/* Returns the size upper limit for each monitoring region */
static unsigned long damon_region_sz_limit(struct damon_ctx *ctx)
{
@@ -613,87 +549,18 @@ static bool kdamond_aggregate_interval_passed(struct damon_ctx *ctx)
}
/*
- * Flush the content in the result buffer to the result file
- */
-static void damon_flush_rbuffer(struct damon_ctx *ctx)
-{
- ssize_t sz;
- loff_t pos = 0;
- struct file *rfile;
-
- if (!ctx->rbuf_offset)
- return;
-
- rfile = filp_open(ctx->rfile_path,
- O_CREAT | O_RDWR | O_APPEND | O_LARGEFILE, 0644);
- if (IS_ERR(rfile)) {
- pr_err("Cannot open the result file %s\n",
- ctx->rfile_path);
- return;
- }
-
- while (ctx->rbuf_offset) {
- sz = kernel_write(rfile, ctx->rbuf, ctx->rbuf_offset, &pos);
- if (sz < 0)
- break;
- ctx->rbuf_offset -= sz;
- }
- filp_close(rfile, NULL);
-}
-
-/*
- * Write a data into the result buffer
- */
-static void damon_write_rbuf(struct damon_ctx *ctx, void *data, ssize_t size)
-{
- if (!ctx->rbuf_len || !ctx->rbuf || !ctx->rfile_path)
- return;
- if (ctx->rbuf_offset + size > ctx->rbuf_len)
- damon_flush_rbuffer(ctx);
- if (ctx->rbuf_offset + size > ctx->rbuf_len) {
- pr_warn("%s: flush failed, or wrong size given(%u, %zu)\n",
- __func__, ctx->rbuf_offset, size);
- return;
- }
-
- memcpy(&ctx->rbuf[ctx->rbuf_offset], data, size);
- ctx->rbuf_offset += size;
-}
-
-/*
- * Flush the aggregated monitoring results to the result buffer
- *
- * Stores current tracking results to the result buffer and reset 'nr_accesses'
- * of each region. The format for the result buffer is as below:
- *
- * <time> <number of targets> <array of target infos>
- *
- * target info: <id> <number of regions> <array of region infos>
- * region info: <start address> <end address> <nr_accesses>
+ * Reset the aggregated monitoring results ('nr_accesses' of each region).
*/
static void kdamond_reset_aggregated(struct damon_ctx *c)
{
struct damon_target *t;
- struct timespec64 now;
unsigned int nr;
- ktime_get_coarse_ts64(&now);
-
- damon_write_rbuf(c, &now, sizeof(now));
- nr = nr_damon_targets(c);
- damon_write_rbuf(c, &nr, sizeof(nr));
-
damon_for_each_target(t, c) {
struct damon_region *r;
- damon_write_rbuf(c, &t->id, sizeof(t->id));
nr = damon_nr_regions(t);
- damon_write_rbuf(c, &nr, sizeof(nr));
damon_for_each_region(r, t) {
- damon_write_rbuf(c, &r->ar.start, sizeof(r->ar.start));
- damon_write_rbuf(c, &r->ar.end, sizeof(r->ar.end));
- damon_write_rbuf(c, &r->nr_accesses,
- sizeof(r->nr_accesses));
trace_damon_aggregated(t, r, nr);
r->last_nr_accesses = r->nr_accesses;
r->nr_accesses = 0;
@@ -927,14 +794,6 @@ static bool kdamond_need_stop(struct damon_ctx *ctx)
return true;
}
-static void kdamond_write_record_header(struct damon_ctx *ctx)
-{
- int recfmt_ver = 2;
-
- damon_write_rbuf(ctx, "damon_recfmt_ver", 16);
- damon_write_rbuf(ctx, &recfmt_ver, sizeof(recfmt_ver));
-}
-
/*
* The monitoring daemon that runs as a kernel thread
*/
@@ -951,8 +810,6 @@ static int kdamond_fn(void *data)
ctx->init_target_regions(ctx);
sz_limit = damon_region_sz_limit(ctx);
- kdamond_write_record_header(ctx);
-
while (!kdamond_need_stop(ctx)) {
if (ctx->prepare_access_checks)
ctx->prepare_access_checks(ctx);
@@ -965,10 +822,10 @@ static int kdamond_fn(void *data)
max_nr_accesses = ctx->check_accesses(ctx);
if (kdamond_aggregate_interval_passed(ctx)) {
- if (ctx->aggregate_cb)
- ctx->aggregate_cb(ctx);
kdamond_merge_regions(ctx, max_nr_accesses / 10,
sz_limit);
+ if (ctx->aggregate_cb)
+ ctx->aggregate_cb(ctx);
kdamond_apply_schemes(ctx);
kdamond_reset_aggregated(ctx);
kdamond_split_regions(ctx);
@@ -980,7 +837,6 @@ static int kdamond_fn(void *data)
sz_limit = damon_region_sz_limit(ctx);
}
}
- damon_flush_rbuffer(ctx);
damon_for_each_target(t, ctx) {
damon_for_each_region_safe(r, next, t)
damon_destroy_region(r);
diff --git a/mm/damon/dbgfs-test.h b/mm/damon/dbgfs-test.h
index dffb9f70e399..426adf5dadc2 100644
--- a/mm/damon/dbgfs-test.h
+++ b/mm/damon/dbgfs-test.h
@@ -78,7 +78,7 @@ static void damon_dbgfs_test_str_to_target_ids(struct kunit *test)
static void damon_dbgfs_test_set_targets(struct kunit *test)
{
- struct damon_ctx *ctx = damon_new_ctx();
+ struct damon_ctx *ctx = debugfs_new_ctx();
unsigned long ids[] = {1, 2, 3};
char buf[64];
@@ -105,9 +105,91 @@ static void damon_dbgfs_test_set_targets(struct kunit *test)
sprint_target_ids(ctx, buf, 64);
KUNIT_EXPECT_STREQ(test, (char *)buf, "\n");
+ debugfs_destroy_ctx(ctx);
+}
+
+static void damon_dbgfs_test_set_recording(struct kunit *test)
+{
+ struct damon_ctx *ctx = debugfs_new_ctx();
+ struct debugfs_recorder *rec = ctx->private;
+ int err;
+
+ err = debugfs_set_recording(ctx, 42, "foo");
+ KUNIT_EXPECT_EQ(test, err, -EINVAL);
+ debugfs_set_recording(ctx, 4242, "foo.bar");
+ KUNIT_EXPECT_EQ(test, rec->rbuf_len, 4242u);
+ KUNIT_EXPECT_STREQ(test, rec->rfile_path, "foo.bar");
+ debugfs_set_recording(ctx, 424242, "foo");
+ KUNIT_EXPECT_EQ(test, rec->rbuf_len, 424242u);
+ KUNIT_EXPECT_STREQ(test, rec->rfile_path, "foo");
+
+ debugfs_destroy_ctx(ctx);
+}
+
+static void damon_dbgfs_test_write_rbuf(struct kunit *test)
+{
+ struct damon_ctx *ctx = debugfs_new_ctx();
+ struct debugfs_recorder *rec = ctx->private;
+ char *data;
+
+ debugfs_set_recording(ctx, 4242, "damon.data");
+
+ data = "hello";
+ debugfs_write_rbuf(ctx, data, strnlen(data, 256));
+ KUNIT_EXPECT_EQ(test, rec->rbuf_offset, 5u);
+
+ debugfs_write_rbuf(ctx, data, 0);
+ KUNIT_EXPECT_EQ(test, rec->rbuf_offset, 5u);
+
+ KUNIT_EXPECT_STREQ(test, (char *)rec->rbuf, data);
+
+ debugfs_destroy_ctx(ctx);
+}
+
+/*
+ * Test debugfs_aggregate_cb()
+ *
+ * dbgfs sets debugfs_aggregate_cb() as aggregate callback. It stores the
+ * aggregated monitoring information ('->nr_accesses' of each regions) to the
+ * result buffer.
+ */
+static void damon_dbgfs_test_aggregate(struct kunit *test)
+{
+ struct damon_ctx *ctx = debugfs_new_ctx();
+ struct debugfs_recorder *rec = ctx->private;
+ unsigned long target_ids[] = {1, 2, 3};
+ unsigned long saddr[][3] = {{10, 20, 30}, {5, 42, 49}, {13, 33, 55} };
+ unsigned long eaddr[][3] = {{15, 27, 40}, {31, 45, 55}, {23, 44, 66} };
+ unsigned long accesses[][3] = {{42, 95, 84}, {10, 20, 30}, {0, 1, 2} };
+ struct damon_target *t;
+ struct damon_region *r;
+ int it, ir;
+ ssize_t sz, sr, sp;
+
+ debugfs_set_recording(ctx, 4242, "damon.data");
+ damon_set_targets(ctx, target_ids, 3);
+
+ it = 0;
+ damon_for_each_target(t, ctx) {
+ for (ir = 0; ir < 3; ir++) {
+ r = damon_new_region(saddr[it][ir], eaddr[it][ir]);
+ r->nr_accesses = accesses[it][ir];
+ damon_add_region(r, t);
+ }
+ it++;
+ }
+ debugfs_aggregate_cb(ctx);
+
+ /* The aggregated information should be written in the buffer */
+ sr = sizeof(r->ar.start) + sizeof(r->ar.end) + sizeof(r->nr_accesses);
+ sp = sizeof(t->id) + sizeof(unsigned int) + 3 * sr;
+ sz = sizeof(struct timespec64) + sizeof(unsigned int) + 3 * sp;
+ KUNIT_EXPECT_EQ(test, (unsigned int)sz, rec->rbuf_offset);
+
damon_destroy_ctx(ctx);
}
+
static void damon_dbgfs_test_set_init_regions(struct kunit *test)
{
struct damon_ctx *ctx = damon_new_ctx();
@@ -164,6 +246,9 @@ static void damon_dbgfs_test_set_init_regions(struct kunit *test)
static struct kunit_case damon_test_cases[] = {
KUNIT_CASE(damon_dbgfs_test_str_to_target_ids),
KUNIT_CASE(damon_dbgfs_test_set_targets),
+ KUNIT_CASE(damon_dbgfs_test_set_recording),
+ KUNIT_CASE(damon_dbgfs_test_write_rbuf),
+ KUNIT_CASE(damon_dbgfs_test_aggregate),
KUNIT_CASE(damon_dbgfs_test_set_init_regions),
{},
};
diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
index 646a492100ff..7a6c279690f8 100644
--- a/mm/damon/dbgfs.c
+++ b/mm/damon/dbgfs.c
@@ -10,15 +10,155 @@
#include <linux/damon.h>
#include <linux/debugfs.h>
#include <linux/file.h>
+#include <linux/mm.h>
#include <linux/module.h>
#include <linux/slab.h>
+#define MIN_RECORD_BUFFER_LEN 1024
+#define MAX_RECORD_BUFFER_LEN (4 * 1024 * 1024)
+#define MAX_RFILE_PATH_LEN 256
+
+struct debugfs_recorder {
+ unsigned char *rbuf;
+ unsigned int rbuf_len;
+ unsigned int rbuf_offset;
+ char *rfile_path;
+};
+
/* Monitoring contexts for debugfs interface users. */
static struct damon_ctx **debugfs_ctxs;
static int debugfs_nr_ctxs = 1;
static DEFINE_MUTEX(damon_dbgfs_lock);
+static unsigned int nr_damon_targets(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ unsigned int nr_targets = 0;
+
+ damon_for_each_target(t, ctx)
+ nr_targets++;
+
+ return nr_targets;
+}
+
+/*
+ * Flush the content in the result buffer to the result file
+ */
+static void debugfs_flush_rbuffer(struct debugfs_recorder *rec)
+{
+ ssize_t sz;
+ loff_t pos = 0;
+ struct file *rfile;
+
+ if (!rec->rbuf_offset)
+ return;
+
+ rfile = filp_open(rec->rfile_path,
+ O_CREAT | O_RDWR | O_APPEND | O_LARGEFILE, 0644);
+ if (IS_ERR(rfile)) {
+ pr_err("Cannot open the result file %s\n",
+ rec->rfile_path);
+ return;
+ }
+
+ while (rec->rbuf_offset) {
+ sz = kernel_write(rfile, rec->rbuf, rec->rbuf_offset, &pos);
+ if (sz < 0)
+ break;
+ rec->rbuf_offset -= sz;
+ }
+ filp_close(rfile, NULL);
+}
+
+/*
+ * Write a data into the result buffer
+ */
+static void debugfs_write_rbuf(struct damon_ctx *ctx, void *data, ssize_t size)
+{
+ struct debugfs_recorder *rec = (struct debugfs_recorder *)ctx->private;
+
+ if (!rec->rbuf_len || !rec->rbuf || !rec->rfile_path)
+ return;
+ if (rec->rbuf_offset + size > rec->rbuf_len)
+ debugfs_flush_rbuffer(ctx->private);
+ if (rec->rbuf_offset + size > rec->rbuf_len) {
+ pr_warn("%s: flush failed, or wrong size given(%u, %zu)\n",
+ __func__, rec->rbuf_offset, size);
+ return;
+ }
+
+ memcpy(&rec->rbuf[rec->rbuf_offset], data, size);
+ rec->rbuf_offset += size;
+}
+
+static void debugfs_write_record_header(struct damon_ctx *ctx)
+{
+ int recfmt_ver = 2;
+
+ debugfs_write_rbuf(ctx, "damon_recfmt_ver", 16);
+ debugfs_write_rbuf(ctx, &recfmt_ver, sizeof(recfmt_ver));
+}
+
+static void debugfs_init_vm_regions(struct damon_ctx *ctx)
+{
+ debugfs_write_record_header(ctx);
+ kdamond_init_vm_regions(ctx);
+}
+
+static void debugfs_vm_cleanup(struct damon_ctx *ctx)
+{
+ debugfs_flush_rbuffer(ctx->private);
+ kdamond_vm_cleanup(ctx);
+}
+
+static void debugfs_init_phys_regions(struct damon_ctx *ctx)
+{
+ debugfs_write_record_header(ctx);
+}
+
+static void debugfs_phys_cleanup(struct damon_ctx *ctx)
+{
+ debugfs_flush_rbuffer(ctx->private);
+}
+
+/*
+ * Store the aggregated monitoring results to the result buffer
+ *
+ * The format for the result buffer is as below:
+ *
+ * <time> <number of targets> <array of target infos>
+ *
+ * target info: <id> <number of regions> <array of region infos>
+ * region info: <start address> <end address> <nr_accesses>
+ */
+static void debugfs_aggregate_cb(struct damon_ctx *c)
+{
+ struct damon_target *t;
+ struct timespec64 now;
+ unsigned int nr;
+
+ ktime_get_coarse_ts64(&now);
+
+ debugfs_write_rbuf(c, &now, sizeof(now));
+ nr = nr_damon_targets(c);
+ debugfs_write_rbuf(c, &nr, sizeof(nr));
+
+ damon_for_each_target(t, c) {
+ struct damon_region *r;
+
+ debugfs_write_rbuf(c, &t->id, sizeof(t->id));
+ nr = damon_nr_regions(t);
+ debugfs_write_rbuf(c, &nr, sizeof(nr));
+ damon_for_each_region(r, t) {
+ debugfs_write_rbuf(c, &r->ar.start, sizeof(r->ar.start));
+ debugfs_write_rbuf(c, &r->ar.end, sizeof(r->ar.end));
+ debugfs_write_rbuf(c, &r->nr_accesses,
+ sizeof(r->nr_accesses));
+ }
+ }
+}
+
static ssize_t debugfs_monitor_on_read(struct file *file,
char __user *buf, size_t count, loff_t *ppos)
{
@@ -330,6 +470,20 @@ static struct pid *damon_get_pidfd_pid(unsigned int pidfd)
return pid;
}
+static void debugfs_set_vaddr_primitives(struct damon_ctx *ctx)
+{
+ damon_set_vaddr_primitives(ctx);
+ ctx->init_target_regions = debugfs_init_vm_regions;
+ ctx->cleanup = debugfs_vm_cleanup;
+}
+
+static void debugfs_set_paddr_primitives(struct damon_ctx *ctx)
+{
+ damon_set_paddr_primitives(ctx);
+ ctx->init_target_regions = debugfs_init_phys_regions;
+ ctx->cleanup = debugfs_phys_cleanup;
+}
+
static ssize_t debugfs_target_ids_write(struct file *file,
const char __user *buf, size_t count, loff_t *ppos)
{
@@ -349,12 +503,13 @@ static ssize_t debugfs_target_ids_write(struct file *file,
nrs = kbuf;
if (!strncmp(kbuf, "paddr\n", count)) {
/* Configure the context for physical memory monitoring */
- damon_set_paddr_primitives(ctx);
+ debugfs_set_paddr_primitives(ctx);
/* target id is meaningless here, but we set it just for fun */
scnprintf(kbuf, count, "42 ");
} else {
/* Configure the context for virtual memory monitoring */
- damon_set_vaddr_primitives(ctx);
+ debugfs_set_vaddr_primitives(ctx);
+
if (!strncmp(kbuf, "pidfd ", 6)) {
received_pidfds = true;
nrs = &kbuf[6];
@@ -398,16 +553,76 @@ static ssize_t debugfs_record_read(struct file *file,
char __user *buf, size_t count, loff_t *ppos)
{
struct damon_ctx *ctx = file->private_data;
+ struct debugfs_recorder *rec = ctx->private;
char record_buf[20 + MAX_RFILE_PATH_LEN];
int ret;
mutex_lock(&ctx->kdamond_lock);
ret = scnprintf(record_buf, ARRAY_SIZE(record_buf), "%u %s\n",
- ctx->rbuf_len, ctx->rfile_path);
+ rec->rbuf_len, rec->rfile_path);
mutex_unlock(&ctx->kdamond_lock);
return simple_read_from_buffer(buf, count, ppos, record_buf, ret);
}
+/*
+ * debugfs_set_recording() - Set attributes for the recording.
+ * @ctx: target kdamond context
+ * @rbuf_len: length of the result buffer
+ * @rfile_path: path to the monitor result files
+ *
+ * Setting 'rbuf_len' 0 disables recording.
+ *
+ * This function should not be called while the kdamond is running.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+static int debugfs_set_recording(struct damon_ctx *ctx,
+ unsigned int rbuf_len, char *rfile_path)
+{
+ struct debugfs_recorder *recorder;
+ size_t rfile_path_len;
+
+ if (rbuf_len && (rbuf_len > MAX_RECORD_BUFFER_LEN ||
+ rbuf_len < MIN_RECORD_BUFFER_LEN)) {
+ pr_err("result buffer size (%u) is out of [%d,%d]\n",
+ rbuf_len, MIN_RECORD_BUFFER_LEN,
+ MAX_RECORD_BUFFER_LEN);
+ return -EINVAL;
+ }
+ rfile_path_len = strnlen(rfile_path, MAX_RFILE_PATH_LEN);
+ if (rfile_path_len >= MAX_RFILE_PATH_LEN) {
+ pr_err("too long (>%d) result file path %s\n",
+ MAX_RFILE_PATH_LEN, rfile_path);
+ return -EINVAL;
+ }
+
+ recorder = ctx->private;
+ if (!recorder) {
+ recorder = kzalloc(sizeof(*recorder), GFP_KERNEL);
+ if (!recorder)
+ return -ENOMEM;
+ ctx->private = recorder;
+ }
+
+ recorder->rbuf_len = rbuf_len;
+ kfree(recorder->rbuf);
+ recorder->rbuf = NULL;
+ kfree(recorder->rfile_path);
+ recorder->rfile_path = NULL;
+
+ if (rbuf_len) {
+ recorder->rbuf = kvmalloc(rbuf_len, GFP_KERNEL);
+ if (!recorder->rbuf)
+ return -ENOMEM;
+ }
+ recorder->rfile_path = kmalloc(rfile_path_len + 1, GFP_KERNEL);
+ if (!recorder->rfile_path)
+ return -ENOMEM;
+ strncpy(recorder->rfile_path, rfile_path, rfile_path_len + 1);
+
+ return 0;
+}
+
static ssize_t debugfs_record_write(struct file *file,
const char __user *buf, size_t count, loff_t *ppos)
{
@@ -434,7 +649,7 @@ static ssize_t debugfs_record_write(struct file *file,
goto unlock_out;
}
- err = damon_set_recording(ctx, rbuf_len, rfile_path);
+ err = debugfs_set_recording(ctx, rbuf_len, rfile_path);
if (err)
ret = err;
unlock_out:
@@ -654,6 +869,38 @@ static struct dentry **debugfs_dirs;
static int debugfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx);
+static void debugfs_free_recorder(struct debugfs_recorder *recorder)
+{
+ kfree(recorder->rbuf);
+ kfree(recorder->rfile_path);
+ kfree(recorder);
+}
+
+static struct damon_ctx *debugfs_new_ctx(void)
+{
+ struct damon_ctx *ctx;
+
+ ctx = damon_new_ctx();
+ if (!ctx)
+ return NULL;
+
+ if (debugfs_set_recording(ctx, 0, "none")) {
+ damon_destroy_ctx(ctx);
+ return NULL;
+ }
+
+ debugfs_set_vaddr_primitives(ctx);
+ ctx->aggregate_cb = debugfs_aggregate_cb;
+ return ctx;
+}
+
+static void debugfs_destroy_ctx(struct damon_ctx *ctx)
+{
+ debugfs_free_recorder(ctx->private);
+ damon_destroy_ctx(ctx);
+}
+
+
static ssize_t debugfs_nr_contexts_write(struct file *file,
const char __user *buf, size_t count, loff_t *ppos)
{
@@ -689,7 +936,7 @@ static ssize_t debugfs_nr_contexts_write(struct file *file,
for (i = nr_contexts; i < debugfs_nr_ctxs; i++) {
debugfs_remove(debugfs_dirs[i]);
- damon_destroy_ctx(debugfs_ctxs[i]);
+ debugfs_destroy_ctx(debugfs_ctxs[i]);
}
new_dirs = kmalloc_array(nr_contexts, sizeof(*new_dirs), GFP_KERNEL);
@@ -729,13 +976,13 @@ static ssize_t debugfs_nr_contexts_write(struct file *file,
break;
}
- debugfs_ctxs[i] = damon_new_ctx();
+ debugfs_ctxs[i] = debugfs_new_ctx();
if (!debugfs_ctxs[i]) {
pr_err("ctx for %s creation failed\n", dirname);
+ debugfs_remove(debugfs_dirs[i]);
ret = -ENOMEM;
break;
}
- damon_set_vaddr_primitives(debugfs_ctxs[i]);
if (debugfs_fill_ctx_dir(debugfs_dirs[i], debugfs_ctxs[i])) {
ret = -ENOMEM;
@@ -865,10 +1112,9 @@ static int __init damon_dbgfs_init(void)
int rc;
debugfs_ctxs = kmalloc(sizeof(*debugfs_ctxs), GFP_KERNEL);
- debugfs_ctxs[0] = damon_new_ctx();
+ debugfs_ctxs[0] = debugfs_new_ctx();
if (!debugfs_ctxs[0])
return -ENOMEM;
- damon_set_vaddr_primitives(debugfs_ctxs[0]);
rc = damon_debugfs_init();
if (rc)
--
2.17.1
From: SeongJae Park <[email protected]>
DAMON depends on one kernel configuration option, 'CONFIG_DAMON', and is
implemented in one source file, 'damon.c'.  There are three independent
components in the file, though:
- the core logic for the overhead-accuracy tradeoff,
- the reference implementations of low level monitoring primitives for
  the virtual and physical address spaces, and
- the debugfs interface for user space.
Only the core logic is the essence of DAMON, which is an extensible
framework.  The other two components are default extensions built on top
of the framework interface, implemented to let users use DAMON for usual
use cases without writing their own extensions.  Those are also intended
to serve as reference applications of the DAMON framework.
Putting these independent components under one configuration option is
not only unnecessary but also makes their boundaries unclear and even
creates false dependencies.  For example, because the 'primitives' depend
on 'CONFIG_PAGE_IDLE_FLAG', the 'core' also depends on it and as a
result cannot co-exist with Idle Page Tracking.
Though the default 'primitives' could cover many Idle Page Tracking
use cases more efficiently, some use cases that Idle Page Tracking can
handle better (e.g., page size granularity working set size calculation)
also exist.  In some cases, someone could want to extend DAMON with
their own primitives implementation that can co-exist with Idle Page
Tracking.  Therefore, making the DAMON 'core' exclusive with it makes
no sense.
For this reason, this commit separates the parts into independent files
and applies fine-grained config dependencies.  After this commit, the
'core' depends on nothing and therefore can co-exist with Idle Page
Tracking.
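To illustrate the intended fine-grained dependency, after this change a
kernel-space user could depend on CONFIG_DAMON only and plug its own
monitoring primitives into the 'core' via the callbacks, without selecting
CONFIG_DAMON_PRIMITIVES or PAGE_IDLE_FLAG.  The 'my_*' names below are
hypothetical, for illustration only:

    #include <linux/damon.h>
    #include <linux/module.h>

    static void my_init_regions(struct damon_ctx *ctx) { /* ... */ }

    static unsigned int my_check_accesses(struct damon_ctx *ctx)
    {
            /* use any access check mechanism; PG_idle is not required here */
            return 0;
    }

    static struct damon_ctx *my_ctx;

    static int __init my_monitor_init(void)
    {
            my_ctx = damon_new_ctx();
            if (!my_ctx)
                    return -ENOMEM;
            my_ctx->init_target_regions = my_init_regions;
            my_ctx->check_accesses = my_check_accesses;
            /* ... set the remaining callbacks and monitoring attributes ... */
            return damon_start_ctx_ptrs(&my_ctx, 1);
    }
    module_init(my_monitor_init);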
Signed-off-by: SeongJae Park <[email protected]>
---
MAINTAINERS | 3 +-
include/linux/damon.h | 87 +-
mm/Kconfig | 25 +-
mm/Makefile | 2 +-
mm/damon-test.h | 724 ----------
mm/damon.c | 2754 ------------------------------------
mm/damon/Kconfig | 68 +
mm/damon/Makefile | 5 +
mm/damon/core-test.h | 288 ++++
mm/damon/core.c | 1065 ++++++++++++++
mm/damon/damon.h | 35 +
mm/damon/dbgfs-test.h | 179 +++
mm/damon/dbgfs.c | 882 ++++++++++++
mm/damon/primitives-test.h | 328 +++++
mm/damon/primitives.c | 811 +++++++++++
15 files changed, 3738 insertions(+), 3518 deletions(-)
delete mode 100644 mm/damon-test.h
delete mode 100644 mm/damon.c
create mode 100644 mm/damon/Kconfig
create mode 100644 mm/damon/Makefile
create mode 100644 mm/damon/core-test.h
create mode 100644 mm/damon/core.c
create mode 100644 mm/damon/damon.h
create mode 100644 mm/damon/dbgfs-test.h
create mode 100644 mm/damon/dbgfs.c
create mode 100644 mm/damon/primitives-test.h
create mode 100644 mm/damon/primitives.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 3d6050d693e3..69bfeb648854 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4759,8 +4759,7 @@ F: Documentation/admin-guide/mm/damon/*
F: Documentation/vm/damon/*
F: include/linux/damon.h
F: include/trace/events/damon.h
-F: mm/damon-test.h
-F: mm/damon.c
+F: mm/damon/*
F: tools/damon/*
F: tools/testing/selftests/damon/*
diff --git a/include/linux/damon.h b/include/linux/damon.h
index be391e7df9cf..264958a62c02 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -10,7 +10,6 @@
#ifndef _DAMON_H_
#define _DAMON_H_
-#include <linux/random.h>
#include <linux/mutex.h>
#include <linux/time64.h>
#include <linux/types.h>
@@ -234,20 +233,58 @@ struct damon_ctx {
void (*aggregate_cb)(struct damon_ctx *context);
};
-/* Reference callback implementations for virtual memory */
-void kdamond_init_vm_regions(struct damon_ctx *ctx);
-void kdamond_update_vm_regions(struct damon_ctx *ctx);
-void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
-unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
-bool kdamond_vm_target_valid(struct damon_target *t);
-void kdamond_vm_cleanup(struct damon_ctx *ctx);
+#ifdef CONFIG_DAMON
-/* Reference callback implementations for physical memory */
-void kdamond_init_phys_regions(struct damon_ctx *ctx);
-void kdamond_update_phys_regions(struct damon_ctx *ctx);
-void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
-unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
+#define MIN_RECORD_BUFFER_LEN 1024
+#define MAX_RECORD_BUFFER_LEN (4 * 1024 * 1024)
+#define MAX_RFILE_PATH_LEN 256
+
+#define damon_next_region(r) \
+ (container_of(r->list.next, struct damon_region, list))
+
+#define damon_prev_region(r) \
+ (container_of(r->list.prev, struct damon_region, list))
+
+#define damon_for_each_region(r, t) \
+ list_for_each_entry(r, &t->regions_list, list)
+
+#define damon_for_each_region_safe(r, next, t) \
+ list_for_each_entry_safe(r, next, &t->regions_list, list)
+
+#define damon_for_each_target(t, ctx) \
+ list_for_each_entry(t, &(ctx)->targets_list, list)
+
+#define damon_for_each_target_safe(t, next, ctx) \
+ list_for_each_entry_safe(t, next, &(ctx)->targets_list, list)
+
+#define damon_for_each_scheme(s, ctx) \
+ list_for_each_entry(s, &(ctx)->schemes_list, list)
+
+#define damon_for_each_scheme_safe(s, next, ctx) \
+ list_for_each_entry_safe(s, next, &(ctx)->schemes_list, list)
+
+struct damon_region *damon_new_region(unsigned long start, unsigned long end);
+inline void damon_insert_region(struct damon_region *r,
+ struct damon_region *prev, struct damon_region *next);
+void damon_add_region(struct damon_region *r, struct damon_target *t);
+void damon_destroy_region(struct damon_region *r);
+struct damos *damon_new_scheme(
+ unsigned long min_sz_region, unsigned long max_sz_region,
+ unsigned int min_nr_accesses, unsigned int max_nr_accesses,
+ unsigned int min_age_region, unsigned int max_age_region,
+ enum damos_action action);
+void damon_add_scheme(struct damon_ctx *ctx, struct damos *s);
+void damon_destroy_scheme(struct damos *s);
+
+struct damon_target *damon_new_target(unsigned long id);
+void damon_add_target(struct damon_ctx *ctx, struct damon_target *t);
+void damon_free_target(struct damon_target *t);
+void damon_destroy_target(struct damon_target *t);
+unsigned int damon_nr_regions(struct damon_target *t);
+
+struct damon_ctx *damon_new_ctx(void);
+void damon_destroy_ctx(struct damon_ctx *ctx);
int damon_set_targets(struct damon_ctx *ctx,
unsigned long *ids, ssize_t nr_ids);
int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
@@ -257,9 +294,33 @@ int damon_set_schemes(struct damon_ctx *ctx,
struct damos **schemes, ssize_t nr_schemes);
int damon_set_recording(struct damon_ctx *ctx,
unsigned int rbuf_len, char *rfile_path);
+int damon_nr_running_ctxs(void);
+
int damon_start(struct damon_ctx *ctxs, int nr_ctxs);
int damon_stop(struct damon_ctx *ctxs, int nr_ctxs);
int damon_start_ctx_ptrs(struct damon_ctx **ctxs, int nr_ctxs);
int damon_stop_ctx_ptrs(struct damon_ctx **ctxs, int nr_ctxs);
+#endif /* CONFIG_DAMON */
+
+#ifdef CONFIG_DAMON_PRIMITIVES
+
+/* Reference callback implementations for virtual memory */
+void kdamond_init_vm_regions(struct damon_ctx *ctx);
+void kdamond_update_vm_regions(struct damon_ctx *ctx);
+void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
+unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
+bool kdamond_vm_target_valid(struct damon_target *t);
+void kdamond_vm_cleanup(struct damon_ctx *ctx);
+void damon_set_vaddr_primitives(struct damon_ctx *ctx);
+
+/* Reference callback implementations for physical memory */
+void kdamond_init_phys_regions(struct damon_ctx *ctx);
+void kdamond_update_phys_regions(struct damon_ctx *ctx);
+void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
+unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
+void damon_set_paddr_primitives(struct damon_ctx *ctx);
+
+#endif /* CONFIG_DAMON_PRIMITIVES */
+
#endif
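
For reference, below is a minimal sketch of how an in-kernel caller could wire
the split interfaces above together.  The 'example_start_monitoring()' helper
and the attribute-free setup are illustrative only, not part of this patchset;
the target id encoding (a referenced 'struct pid' pointer) follows the
convention the existing vaddr primitives use, and the vaddr callbacks are
installed explicitly whether or not 'damon_new_ctx()' still does so by default
after the split.

#include <linux/damon.h>
#include <linux/pid.h>

/*
 * Hypothetical caller: monitor the virtual address space of one process.
 * Assumes the caller holds a reference on @pid, which the vaddr primitives
 * expect as the target id.
 */
static int example_start_monitoring(struct pid *pid)
{
        struct damon_ctx *ctx;
        unsigned long id = (unsigned long)pid;
        int err;

        ctx = damon_new_ctx();
        if (!ctx)
                return -ENOMEM;

        /* install the reference vaddr callbacks (CONFIG_DAMON_PRIMITIVES) */
        damon_set_vaddr_primitives(ctx);

        err = damon_set_targets(ctx, &id, 1);
        if (err)
                goto free_out;

        /* hand the context over to a kdamond */
        err = damon_start_ctx_ptrs(&ctx, 1);
        if (err)
                goto free_out;
        return 0;

free_out:
        damon_destroy_ctx(ctx);
        return err;
}

The same context could monitor the physical address space instead by calling
'damon_set_paddr_primitives()' and setting the initial regions by hand, since
the reference 'kdamond_init_phys_regions()' is a no-op.
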
diff --git a/mm/Kconfig b/mm/Kconfig
index d7be006813f2..c43e1092099e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -880,29 +880,6 @@ config ARCH_HAS_HUGEPD
config MAPPING_DIRTY_HELPERS
bool
-config DAMON
- bool "Data Access Monitor"
- depends on MMU && !IDLE_PAGE_TRACKING
- select PAGE_EXTENSION if !64BIT
- select PAGE_IDLE_FLAG
- help
- This feature allows to monitor access frequency of each memory
- region. The information can be useful for performance-centric DRAM
- level memory management.
-
- See https://damonitor.github.io/doc/html/latest-damon/index.html for
- more information.
- If unsure, say N.
-
-config DAMON_KUNIT_TEST
- bool "Test for damon"
- depends on DAMON=y && KUNIT
- help
- This builds the DAMON Kunit test suite.
-
- For more information on KUnit and unit tests in general, please refer
- to the KUnit documentation.
-
- If unsure, say N.
+source "mm/damon/Kconfig"
endmenu
diff --git a/mm/Makefile b/mm/Makefile
index 30c5dba52fb2..a6f10848633e 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -121,4 +121,4 @@ obj-$(CONFIG_MEMFD_CREATE) += memfd.o
obj-$(CONFIG_MAPPING_DIRTY_HELPERS) += mapping_dirty_helpers.o
obj-$(CONFIG_PTDUMP_CORE) += ptdump.o
obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
-obj-$(CONFIG_DAMON) += damon.o
+obj-$(CONFIG_DAMON) += damon/
diff --git a/mm/damon-test.h b/mm/damon-test.h
deleted file mode 100644
index 681adead0339..000000000000
--- a/mm/damon-test.h
+++ /dev/null
@@ -1,724 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Data Access Monitor Unit Tests
- *
- * Copyright 2019 Amazon.com, Inc. or its affiliates. All rights reserved.
- *
- * Author: SeongJae Park <[email protected]>
- */
-
-#ifdef CONFIG_DAMON_KUNIT_TEST
-
-#ifndef _DAMON_TEST_H
-#define _DAMON_TEST_H
-
-#include <kunit/test.h>
-
-static void damon_test_str_to_target_ids(struct kunit *test)
-{
- char *question;
- unsigned long *answers;
- unsigned long expected[] = {12, 35, 46};
- ssize_t nr_integers = 0, i;
-
- question = "123";
- answers = str_to_target_ids(question, strnlen(question, 128),
- &nr_integers);
- KUNIT_EXPECT_EQ(test, (ssize_t)1, nr_integers);
- KUNIT_EXPECT_EQ(test, 123ul, answers[0]);
- kfree(answers);
-
- question = "123abc";
- answers = str_to_target_ids(question, strnlen(question, 128),
- &nr_integers);
- KUNIT_EXPECT_EQ(test, (ssize_t)1, nr_integers);
- KUNIT_EXPECT_EQ(test, 123ul, answers[0]);
- kfree(answers);
-
- question = "a123";
- answers = str_to_target_ids(question, strnlen(question, 128),
- &nr_integers);
- KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
- kfree(answers);
-
- question = "12 35";
- answers = str_to_target_ids(question, strnlen(question, 128),
- &nr_integers);
- KUNIT_EXPECT_EQ(test, (ssize_t)2, nr_integers);
- for (i = 0; i < nr_integers; i++)
- KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
- kfree(answers);
-
- question = "12 35 46";
- answers = str_to_target_ids(question, strnlen(question, 128),
- &nr_integers);
- KUNIT_EXPECT_EQ(test, (ssize_t)3, nr_integers);
- for (i = 0; i < nr_integers; i++)
- KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
- kfree(answers);
-
- question = "12 35 abc 46";
- answers = str_to_target_ids(question, strnlen(question, 128),
- &nr_integers);
- KUNIT_EXPECT_EQ(test, (ssize_t)2, nr_integers);
- for (i = 0; i < 2; i++)
- KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
- kfree(answers);
-
- question = "";
- answers = str_to_target_ids(question, strnlen(question, 128),
- &nr_integers);
- KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
- kfree(answers);
-
- question = "\n";
- answers = str_to_target_ids(question, strnlen(question, 128),
- &nr_integers);
- KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
- kfree(answers);
-}
-
-static void damon_test_regions(struct kunit *test)
-{
- struct damon_region *r;
- struct damon_target *t;
-
- r = damon_new_region(1, 2);
- KUNIT_EXPECT_EQ(test, 1ul, r->ar.start);
- KUNIT_EXPECT_EQ(test, 2ul, r->ar.end);
- KUNIT_EXPECT_EQ(test, 0u, r->nr_accesses);
-
- t = damon_new_target(42);
- KUNIT_EXPECT_EQ(test, 0u, nr_damon_regions(t));
-
- damon_add_region(r, t);
- KUNIT_EXPECT_EQ(test, 1u, nr_damon_regions(t));
-
- damon_del_region(r);
- KUNIT_EXPECT_EQ(test, 0u, nr_damon_regions(t));
-
- damon_free_target(t);
-}
-
-static void damon_test_target(struct kunit *test)
-{
- struct damon_ctx *c = debugfs_ctxs[0];
- struct damon_target *t;
-
- t = damon_new_target(42);
- KUNIT_EXPECT_EQ(test, 42ul, t->id);
- KUNIT_EXPECT_EQ(test, 0u, nr_damon_targets(c));
-
- damon_add_target(c, t);
- KUNIT_EXPECT_EQ(test, 1u, nr_damon_targets(c));
-
- damon_destroy_target(t);
- KUNIT_EXPECT_EQ(test, 0u, nr_damon_targets(c));
-}
-
-static void damon_test_set_targets(struct kunit *test)
-{
- struct damon_ctx *ctx = debugfs_ctxs[0];
- unsigned long ids[] = {1, 2, 3};
- char buf[64];
-
- /* Make DAMON consider target id as plain number */
- ctx->target_valid = NULL;
-
- damon_set_targets(ctx, ids, 3);
- sprint_target_ids(ctx, buf, 64);
- KUNIT_EXPECT_STREQ(test, (char *)buf, "1 2 3\n");
-
- damon_set_targets(ctx, NULL, 0);
- sprint_target_ids(ctx, buf, 64);
- KUNIT_EXPECT_STREQ(test, (char *)buf, "\n");
-
- damon_set_targets(ctx, (unsigned long []){1, 2}, 2);
- sprint_target_ids(ctx, buf, 64);
- KUNIT_EXPECT_STREQ(test, (char *)buf, "1 2\n");
-
- damon_set_targets(ctx, (unsigned long []){2}, 1);
- sprint_target_ids(ctx, buf, 64);
- KUNIT_EXPECT_STREQ(test, (char *)buf, "2\n");
-
- damon_set_targets(ctx, NULL, 0);
- sprint_target_ids(ctx, buf, 64);
- KUNIT_EXPECT_STREQ(test, (char *)buf, "\n");
-}
-
-static void damon_test_set_recording(struct kunit *test)
-{
- struct damon_ctx *ctx = debugfs_ctxs[0];
- int err;
-
- err = damon_set_recording(ctx, 42, "foo");
- KUNIT_EXPECT_EQ(test, err, -EINVAL);
- damon_set_recording(ctx, 4242, "foo.bar");
- KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 4242u);
- KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo.bar");
- damon_set_recording(ctx, 424242, "foo");
- KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 424242u);
- KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo");
-}
-
-static void damon_test_set_init_regions(struct kunit *test)
-{
- struct damon_ctx *ctx = debugfs_ctxs[0];
- unsigned long ids[] = {1, 2, 3};
- /* Each line represents one region in ``<target id> <start> <end>`` */
- char * const valid_inputs[] = {"2 10 20\n 2 20 30\n2 35 45",
- "2 10 20\n",
- "2 10 20\n1 39 59\n1 70 134\n 2 20 25\n",
- ""};
- /* Reading the file again will show sorted, clean output */
- char * const valid_expects[] = {"2 10 20\n2 20 30\n2 35 45\n",
- "2 10 20\n",
- "1 39 59\n1 70 134\n2 10 20\n2 20 25\n",
- ""};
- char * const invalid_inputs[] = {"4 10 20\n", /* target not exists */
- "2 10 20\n 2 14 26\n", /* regions overlap */
- "1 10 20\n2 30 40\n 1 5 8"}; /* not sorted by address */
- char *input, *expect;
- int i, rc;
- char buf[256];
-
- damon_set_targets(ctx, ids, 3);
-
- /* Put valid inputs and check the results */
- for (i = 0; i < ARRAY_SIZE(valid_inputs); i++) {
- input = valid_inputs[i];
- expect = valid_expects[i];
-
- rc = set_init_regions(ctx, input, strnlen(input, 256));
- KUNIT_EXPECT_EQ(test, rc, 0);
-
- memset(buf, 0, 256);
- sprint_init_regions(ctx, buf, 256);
-
- KUNIT_EXPECT_STREQ(test, (char *)buf, expect);
- }
- /* Put invalid inputs and check the return error code */
- for (i = 0; i < ARRAY_SIZE(invalid_inputs); i++) {
- input = invalid_inputs[i];
- pr_info("input: %s\n", input);
- rc = set_init_regions(ctx, input, strnlen(input, 256));
- KUNIT_EXPECT_EQ(test, rc, -EINVAL);
-
- memset(buf, 0, 256);
- sprint_init_regions(ctx, buf, 256);
-
- KUNIT_EXPECT_STREQ(test, (char *)buf, "");
- }
-
- damon_set_targets(ctx, NULL, 0);
-}
-
-static void __link_vmas(struct vm_area_struct *vmas, ssize_t nr_vmas)
-{
- int i, j;
- unsigned long largest_gap, gap;
-
- if (!nr_vmas)
- return;
-
- for (i = 0; i < nr_vmas - 1; i++) {
- vmas[i].vm_next = &vmas[i + 1];
-
- vmas[i].vm_rb.rb_left = NULL;
- vmas[i].vm_rb.rb_right = &vmas[i + 1].vm_rb;
-
- largest_gap = 0;
- for (j = i; j < nr_vmas; j++) {
- if (j == 0)
- continue;
- gap = vmas[j].vm_start - vmas[j - 1].vm_end;
- if (gap > largest_gap)
- largest_gap = gap;
- }
- vmas[i].rb_subtree_gap = largest_gap;
- }
- vmas[i].vm_next = NULL;
- vmas[i].vm_rb.rb_right = NULL;
- vmas[i].rb_subtree_gap = 0;
-}
-
-/*
- * Test damon_three_regions_in_vmas() function
- *
- * In the case of virtual memory address space monitoring, DAMON converts the
- * complex and dynamic memory mappings of each target task to three
- * discontiguous regions which cover every mapped area. However, the three
- * regions should not include the two biggest unmapped areas in the original
- * mapping, because the two biggest areas are normally the areas between 1) the
- * heap and the mmap()-ed regions, and 2) the mmap()-ed regions and the stack.
- * Because these two unmapped areas are very huge but obviously never accessed,
- * covering them is just a waste.
- *
- * 'damon_three_regions_in_vmas()' receives an address space of a process. It
- * first identifies the start of the mappings, the end of the mappings, and the
- * two biggest unmapped areas. After that, based on the information, it
- * constructs the three regions and returns them. For more detail, refer to the
- * comment on the 'damon_init_vm_regions_of()' function definition in the
- * 'mm/damon.c' file.
- *
- * For example, suppose virtual address ranges of 10-20, 20-25, 200-210,
- * 210-220, 300-305, and 307-330 (other comments represent these mappings in
- * the shorter form 10-20-25, 200-210-220, 300-305, 307-330) of a process are
- * mapped. To cover every mapping, the three regions should start with 10
- * and end with 330. The process also has three unmapped areas, 25-200,
- * 220-300, and 305-307. Among those, 25-200 and 220-300 are the biggest two
- * unmapped areas, and thus it should be converted to three regions of 10-25,
- * 200-220, and 300-330.
- */
-static void damon_test_three_regions_in_vmas(struct kunit *test)
-{
- struct damon_addr_range regions[3] = {0,};
- /* 10-20-25, 200-210-220, 300-305, 307-330 */
- struct vm_area_struct vmas[] = {
- (struct vm_area_struct) {.vm_start = 10, .vm_end = 20},
- (struct vm_area_struct) {.vm_start = 20, .vm_end = 25},
- (struct vm_area_struct) {.vm_start = 200, .vm_end = 210},
- (struct vm_area_struct) {.vm_start = 210, .vm_end = 220},
- (struct vm_area_struct) {.vm_start = 300, .vm_end = 305},
- (struct vm_area_struct) {.vm_start = 307, .vm_end = 330},
- };
-
- __link_vmas(vmas, 6);
-
- damon_three_regions_in_vmas(&vmas[0], regions);
-
- KUNIT_EXPECT_EQ(test, 10ul, regions[0].start);
- KUNIT_EXPECT_EQ(test, 25ul, regions[0].end);
- KUNIT_EXPECT_EQ(test, 200ul, regions[1].start);
- KUNIT_EXPECT_EQ(test, 220ul, regions[1].end);
- KUNIT_EXPECT_EQ(test, 300ul, regions[2].start);
- KUNIT_EXPECT_EQ(test, 330ul, regions[2].end);
-}
-
-/* Clean up global state of damon */
-static void damon_cleanup_global_state(void)
-{
- struct damon_target *t, *next;
-
- damon_for_each_target_safe(t, next, debugfs_ctxs[0])
- damon_destroy_target(t);
-
- debugfs_ctxs[0]->rbuf_offset = 0;
-}
-
-/*
- * Test kdamond_reset_aggregated()
- *
- * DAMON checks access to each region and aggregates this information as the
- * access frequency of each region. In detail, it increases '->nr_accesses' of
- * the regions in which an access has been confirmed. 'kdamond_reset_aggregated()'
- * flushes the aggregated information ('->nr_accesses' of each region) to the
- * result buffer. As a result of the flushing, the '->nr_accesses' of the
- * regions are initialized to zero.
- */
-static void damon_test_aggregate(struct kunit *test)
-{
- struct damon_ctx *ctx = debugfs_ctxs[0];
- unsigned long target_ids[] = {1, 2, 3};
- unsigned long saddr[][3] = {{10, 20, 30}, {5, 42, 49}, {13, 33, 55} };
- unsigned long eaddr[][3] = {{15, 27, 40}, {31, 45, 55}, {23, 44, 66} };
- unsigned long accesses[][3] = {{42, 95, 84}, {10, 20, 30}, {0, 1, 2} };
- struct damon_target *t;
- struct damon_region *r;
- int it, ir;
- ssize_t sz, sr, sp;
-
- damon_set_recording(ctx, 4242, "damon.data");
- damon_set_targets(ctx, target_ids, 3);
-
- it = 0;
- damon_for_each_target(t, ctx) {
- for (ir = 0; ir < 3; ir++) {
- r = damon_new_region(saddr[it][ir], eaddr[it][ir]);
- r->nr_accesses = accesses[it][ir];
- damon_add_region(r, t);
- }
- it++;
- }
- kdamond_reset_aggregated(ctx);
- it = 0;
- damon_for_each_target(t, ctx) {
- ir = 0;
- /* '->nr_accesses' should be zeroed */
- damon_for_each_region(r, t) {
- KUNIT_EXPECT_EQ(test, 0u, r->nr_accesses);
- ir++;
- }
- /* regions should be preserved */
- KUNIT_EXPECT_EQ(test, 3, ir);
- it++;
- }
- /* targets also should be preserved */
- KUNIT_EXPECT_EQ(test, 3, it);
-
- /* The aggregated information should be written in the buffer */
- sr = sizeof(r->ar.start) + sizeof(r->ar.end) + sizeof(r->nr_accesses);
- sp = sizeof(t->id) + sizeof(unsigned int) + 3 * sr;
- sz = sizeof(struct timespec64) + sizeof(unsigned int) + 3 * sp;
- KUNIT_EXPECT_EQ(test, (unsigned int)sz, ctx->rbuf_offset);
-
- damon_set_recording(ctx, 0, "damon.data");
- damon_cleanup_global_state();
-}
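
For reference, on a typical 64-bit build the size checked above works out to
sr = 8 + 8 + 4 = 20 bytes per region, sp = 8 + 4 + 3 * 20 = 72 bytes per
target, and sz = 16 + 4 + 3 * 72 = 236 bytes in total, assuming an 8-byte
'unsigned long' and a 16-byte 'struct timespec64'.
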
-
-static void damon_test_write_rbuf(struct kunit *test)
-{
- struct damon_ctx *ctx = debugfs_ctxs[0];
- char *data;
-
- damon_set_recording(debugfs_ctxs[0], 4242, "damon.data");
-
- data = "hello";
- damon_write_rbuf(ctx, data, strnlen(data, 256));
- KUNIT_EXPECT_EQ(test, ctx->rbuf_offset, 5u);
-
- damon_write_rbuf(ctx, data, 0);
- KUNIT_EXPECT_EQ(test, ctx->rbuf_offset, 5u);
-
- KUNIT_EXPECT_STREQ(test, (char *)ctx->rbuf, data);
- damon_set_recording(debugfs_ctxs[0], 0, "damon.data");
-}
-
-static struct damon_region *__nth_region_of(struct damon_target *t, int idx)
-{
- struct damon_region *r;
- unsigned int i = 0;
-
- damon_for_each_region(r, t) {
- if (i++ == idx)
- return r;
- }
-
- return NULL;
-}
-
-/*
- * Test 'damon_apply_three_regions()'
- *
- * test kunit object
- * regions an array containing start/end addresses of current
- * monitoring target regions
- * nr_regions the number of the addresses in 'regions'
- * three_regions The three regions that need to be applied now
- * expected start/end addresses of monitoring target regions that
- * 'three_regions' are applied
- * nr_expected the number of addresses in 'expected'
- *
- * The memory mapping of the target processes changes dynamically. To follow
- * the change, DAMON periodically reads the mappings, simplifies it to the
- * three regions, and updates the monitoring target regions to fit in the three
- * regions. The update of current target regions is the role of
- * 'damon_apply_three_regions()'.
- *
- * This test passes the given target regions and the new three regions that
- * need to be applied to the function and checks whether it updates the regions
- * as expected.
- */
-static void damon_do_test_apply_three_regions(struct kunit *test,
- unsigned long *regions, int nr_regions,
- struct damon_addr_range *three_regions,
- unsigned long *expected, int nr_expected)
-{
- struct damon_target *t;
- struct damon_region *r;
- int i;
-
- t = damon_new_target(42);
- for (i = 0; i < nr_regions / 2; i++) {
- r = damon_new_region(regions[i * 2], regions[i * 2 + 1]);
- damon_add_region(r, t);
- }
- damon_add_target(debugfs_ctxs[0], t);
-
- damon_apply_three_regions(debugfs_ctxs[0], t, three_regions);
-
- for (i = 0; i < nr_expected / 2; i++) {
- r = __nth_region_of(t, i);
- KUNIT_EXPECT_EQ(test, r->ar.start, expected[i * 2]);
- KUNIT_EXPECT_EQ(test, r->ar.end, expected[i * 2 + 1]);
- }
-
- damon_cleanup_global_state();
-}
-
-/*
- * This function tests the most common case, where the three big regions are
- * only slightly changed. Target regions should adjust their boundaries
- * (10-20-30, 50-55, 70-80, 90-100) to fit the new big regions, or be removed
- * (55-57, 57-59) if they are now out of the three regions.
- */
-static void damon_test_apply_three_regions1(struct kunit *test)
-{
- /* 10-20-30, 50-55-57-59, 70-80-90-100 */
- unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
- 70, 80, 80, 90, 90, 100};
- /* 5-27, 45-55, 73-104 */
- struct damon_addr_range new_three_regions[3] = {
- (struct damon_addr_range){.start = 5, .end = 27},
- (struct damon_addr_range){.start = 45, .end = 55},
- (struct damon_addr_range){.start = 73, .end = 104} };
- /* 5-20-27, 45-55, 73-80-90-104 */
- unsigned long expected[] = {5, 20, 20, 27, 45, 55,
- 73, 80, 80, 90, 90, 104};
-
- damon_do_test_apply_three_regions(test, regions, ARRAY_SIZE(regions),
- new_three_regions, expected, ARRAY_SIZE(expected));
-}
-
-/*
- * Test a slightly bigger change. Similar to the above, but the second big
- * region now requires two target regions (50-55, 57-59) to be removed.
- */
-static void damon_test_apply_three_regions2(struct kunit *test)
-{
- /* 10-20-30, 50-55-57-59, 70-80-90-100 */
- unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
- 70, 80, 80, 90, 90, 100};
- /* 5-27, 56-57, 65-104 */
- struct damon_addr_range new_three_regions[3] = {
- (struct damon_addr_range){.start = 5, .end = 27},
- (struct damon_addr_range){.start = 56, .end = 57},
- (struct damon_addr_range){.start = 65, .end = 104} };
- /* 5-20-27, 56-57, 65-80-90-104 */
- unsigned long expected[] = {5, 20, 20, 27, 56, 57,
- 65, 80, 80, 90, 90, 104};
-
- damon_do_test_apply_three_regions(test, regions, ARRAY_SIZE(regions),
- new_three_regions, expected, ARRAY_SIZE(expected));
-}
-
-/*
- * Test a big change. The second big region has been totally freed and mapped
- * to a different area (50-59 -> 61-63). The target regions which were in the
- * old second big region (50-55-57-59) should be removed, and a new target
- * region covering the new second big region (61-63) should be created.
- */
-static void damon_test_apply_three_regions3(struct kunit *test)
-{
- /* 10-20-30, 50-55-57-59, 70-80-90-100 */
- unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
- 70, 80, 80, 90, 90, 100};
- /* 5-27, 61-63, 65-104 */
- struct damon_addr_range new_three_regions[3] = {
- (struct damon_addr_range){.start = 5, .end = 27},
- (struct damon_addr_range){.start = 61, .end = 63},
- (struct damon_addr_range){.start = 65, .end = 104} };
- /* 5-20-27, 61-63, 65-80-90-104 */
- unsigned long expected[] = {5, 20, 20, 27, 61, 63,
- 65, 80, 80, 90, 90, 104};
-
- damon_do_test_apply_three_regions(test, regions, ARRAY_SIZE(regions),
- new_three_regions, expected, ARRAY_SIZE(expected));
-}
-
-/*
- * Test another big change. Both the second and third big regions (50-59
- * and 70-100) have been totally freed and mapped to different areas (30-32 and
- * 65-68). The target regions which were in the old second and third big
- * regions should now be removed, and new target regions covering the new
- * second and third big regions should be created.
- */
-static void damon_test_apply_three_regions4(struct kunit *test)
-{
- /* 10-20-30, 50-55-57-59, 70-80-90-100 */
- unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
- 70, 80, 80, 90, 90, 100};
- /* 5-7, 30-32, 65-68 */
- struct damon_addr_range new_three_regions[3] = {
- (struct damon_addr_range){.start = 5, .end = 7},
- (struct damon_addr_range){.start = 30, .end = 32},
- (struct damon_addr_range){.start = 65, .end = 68} };
- /* expect 5-7, 30-32, 65-68 */
- unsigned long expected[] = {5, 7, 30, 32, 65, 68};
-
- damon_do_test_apply_three_regions(test, regions, ARRAY_SIZE(regions),
- new_three_regions, expected, ARRAY_SIZE(expected));
-}
-
-static void damon_test_split_evenly(struct kunit *test)
-{
- struct damon_ctx *c = debugfs_ctxs[0];
- struct damon_target *t;
- struct damon_region *r;
- unsigned long i;
-
- KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, NULL, 5), -EINVAL);
-
- t = damon_new_target(42);
- r = damon_new_region(0, 100);
- KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, r, 0), -EINVAL);
-
- damon_add_region(r, t);
- KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, r, 10), 0);
- KUNIT_EXPECT_EQ(test, nr_damon_regions(t), 10u);
-
- i = 0;
- damon_for_each_region(r, t) {
- KUNIT_EXPECT_EQ(test, r->ar.start, i++ * 10);
- KUNIT_EXPECT_EQ(test, r->ar.end, i * 10);
- }
- damon_free_target(t);
-
- t = damon_new_target(42);
- r = damon_new_region(5, 59);
- damon_add_region(r, t);
- KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, r, 5), 0);
- KUNIT_EXPECT_EQ(test, nr_damon_regions(t), 5u);
-
- i = 0;
- damon_for_each_region(r, t) {
- if (i == 4)
- break;
- KUNIT_EXPECT_EQ(test, r->ar.start, 5 + 10 * i++);
- KUNIT_EXPECT_EQ(test, r->ar.end, 5 + 10 * i);
- }
- KUNIT_EXPECT_EQ(test, r->ar.start, 5 + 10 * i);
- KUNIT_EXPECT_EQ(test, r->ar.end, 59ul);
- damon_free_target(t);
-
- t = damon_new_target(42);
- r = damon_new_region(5, 6);
- damon_add_region(r, t);
- KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, r, 2), -EINVAL);
- KUNIT_EXPECT_EQ(test, nr_damon_regions(t), 1u);
-
- damon_for_each_region(r, t) {
- KUNIT_EXPECT_EQ(test, r->ar.start, 5ul);
- KUNIT_EXPECT_EQ(test, r->ar.end, 6ul);
- }
- damon_free_target(t);
-}
-
-static void damon_test_split_at(struct kunit *test)
-{
- struct damon_target *t;
- struct damon_region *r;
-
- t = damon_new_target(42);
- r = damon_new_region(0, 100);
- damon_add_region(r, t);
- damon_split_region_at(debugfs_ctxs[0], r, 25);
- KUNIT_EXPECT_EQ(test, r->ar.start, 0ul);
- KUNIT_EXPECT_EQ(test, r->ar.end, 25ul);
-
- r = damon_next_region(r);
- KUNIT_EXPECT_EQ(test, r->ar.start, 25ul);
- KUNIT_EXPECT_EQ(test, r->ar.end, 100ul);
-
- damon_free_target(t);
-}
-
-static void damon_test_merge_two(struct kunit *test)
-{
- struct damon_target *t;
- struct damon_region *r, *r2, *r3;
- int i;
-
- t = damon_new_target(42);
- r = damon_new_region(0, 100);
- r->nr_accesses = 10;
- damon_add_region(r, t);
- r2 = damon_new_region(100, 300);
- r2->nr_accesses = 20;
- damon_add_region(r2, t);
-
- damon_merge_two_regions(r, r2);
- KUNIT_EXPECT_EQ(test, r->ar.start, 0ul);
- KUNIT_EXPECT_EQ(test, r->ar.end, 300ul);
- KUNIT_EXPECT_EQ(test, r->nr_accesses, 16u);
-
- i = 0;
- damon_for_each_region(r3, t) {
- KUNIT_EXPECT_PTR_EQ(test, r, r3);
- i++;
- }
- KUNIT_EXPECT_EQ(test, i, 1);
-
- damon_free_target(t);
-}
-
-static void damon_test_merge_regions_of(struct kunit *test)
-{
- struct damon_target *t;
- struct damon_region *r;
- unsigned long sa[] = {0, 100, 114, 122, 130, 156, 170, 184};
- unsigned long ea[] = {100, 112, 122, 130, 156, 170, 184, 230};
- unsigned int nrs[] = {0, 0, 10, 10, 20, 30, 1, 2};
-
- unsigned long saddrs[] = {0, 114, 130, 156, 170};
- unsigned long eaddrs[] = {112, 130, 156, 170, 230};
- int i;
-
- t = damon_new_target(42);
- for (i = 0; i < ARRAY_SIZE(sa); i++) {
- r = damon_new_region(sa[i], ea[i]);
- r->nr_accesses = nrs[i];
- damon_add_region(r, t);
- }
-
- damon_merge_regions_of(t, 9, 9999);
- /* 0-112, 114-130, 130-156, 156-170 */
- KUNIT_EXPECT_EQ(test, nr_damon_regions(t), 5u);
- for (i = 0; i < 5; i++) {
- r = __nth_region_of(t, i);
- KUNIT_EXPECT_EQ(test, r->ar.start, saddrs[i]);
- KUNIT_EXPECT_EQ(test, r->ar.end, eaddrs[i]);
- }
- damon_free_target(t);
-}
-
-static void damon_test_split_regions_of(struct kunit *test)
-{
- struct damon_target *t;
- struct damon_region *r;
-
- t = damon_new_target(42);
- r = damon_new_region(0, 22);
- damon_add_region(r, t);
- damon_split_regions_of(debugfs_ctxs[0], t, 2);
- KUNIT_EXPECT_EQ(test, nr_damon_regions(t), 2u);
- damon_free_target(t);
-
- t = damon_new_target(42);
- r = damon_new_region(0, 220);
- damon_add_region(r, t);
- damon_split_regions_of(debugfs_ctxs[0], t, 4);
- KUNIT_EXPECT_EQ(test, nr_damon_regions(t), 4u);
- damon_free_target(t);
-}
-
-static struct kunit_case damon_test_cases[] = {
- KUNIT_CASE(damon_test_str_to_target_ids),
- KUNIT_CASE(damon_test_target),
- KUNIT_CASE(damon_test_regions),
- KUNIT_CASE(damon_test_set_targets),
- KUNIT_CASE(damon_test_set_recording),
- KUNIT_CASE(damon_test_set_init_regions),
- KUNIT_CASE(damon_test_three_regions_in_vmas),
- KUNIT_CASE(damon_test_aggregate),
- KUNIT_CASE(damon_test_write_rbuf),
- KUNIT_CASE(damon_test_apply_three_regions1),
- KUNIT_CASE(damon_test_apply_three_regions2),
- KUNIT_CASE(damon_test_apply_three_regions3),
- KUNIT_CASE(damon_test_apply_three_regions4),
- KUNIT_CASE(damon_test_split_evenly),
- KUNIT_CASE(damon_test_split_at),
- KUNIT_CASE(damon_test_merge_two),
- KUNIT_CASE(damon_test_merge_regions_of),
- KUNIT_CASE(damon_test_split_regions_of),
- {},
-};
-
-static struct kunit_suite damon_test_suite = {
- .name = "damon",
- .test_cases = damon_test_cases,
-};
-kunit_test_suite(damon_test_suite);
-
-#endif /* _DAMON_TEST_H */
-
-#endif /* CONFIG_DAMON_KUNIT_TEST */
diff --git a/mm/damon.c b/mm/damon.c
deleted file mode 100644
index c2adfcc1444c..000000000000
--- a/mm/damon.c
+++ /dev/null
@@ -1,2754 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Data Access Monitor
- *
- * Copyright 2019-2020 Amazon.com, Inc. or its affiliates.
- *
- * Author: SeongJae Park <[email protected]>
- *
- * This file is organized into the parts below.
- *
- * - Functions and macros for DAMON data structures
- * - Functions for the initial monitoring target regions construction
- * - Functions for the dynamic monitoring target regions update
- * - Functions for the access checking of the regions
- * - Functions for the target validity check and cleanup
- * - Functions for DAMON core logics and features
- * - Functions for the DAMON programming interface
- * - Functions for the DAMON debugfs interface
- * - Functions for the initialization
- */
-
-#define pr_fmt(fmt) "damon: " fmt
-
-#include <asm-generic/mman-common.h>
-#include <linux/damon.h>
-#include <linux/debugfs.h>
-#include <linux/delay.h>
-#include <linux/kthread.h>
-#include <linux/memory_hotplug.h>
-#include <linux/mm.h>
-#include <linux/mmu_notifier.h>
-#include <linux/module.h>
-#include <linux/page_idle.h>
-#include <linux/pagemap.h>
-#include <linux/random.h>
-#include <linux/rmap.h>
-#include <linux/sched/mm.h>
-#include <linux/sched/task.h>
-#include <linux/slab.h>
-
-#define CREATE_TRACE_POINTS
-#include <trace/events/damon.h>
-
-/* Minimal region size. Every damon_region is aligned by this. */
-#ifndef CONFIG_DAMON_KUNIT_TEST
-#define MIN_REGION PAGE_SIZE
-#else
-#define MIN_REGION 1
-#endif
-
-/*
- * Functions and macros for DAMON data structures
- */
-
-#define damon_next_region(r) \
- (container_of(r->list.next, struct damon_region, list))
-
-#define damon_prev_region(r) \
- (container_of(r->list.prev, struct damon_region, list))
-
-#define damon_for_each_region(r, t) \
- list_for_each_entry(r, &t->regions_list, list)
-
-#define damon_for_each_region_safe(r, next, t) \
- list_for_each_entry_safe(r, next, &t->regions_list, list)
-
-#define damon_for_each_target(t, ctx) \
- list_for_each_entry(t, &(ctx)->targets_list, list)
-
-#define damon_for_each_target_safe(t, next, ctx) \
- list_for_each_entry_safe(t, next, &(ctx)->targets_list, list)
-
-#define damon_for_each_scheme(s, ctx) \
- list_for_each_entry(s, &(ctx)->schemes_list, list)
-
-#define damon_for_each_scheme_safe(s, next, ctx) \
- list_for_each_entry_safe(s, next, &(ctx)->schemes_list, list)
-
-#define MIN_RECORD_BUFFER_LEN 1024
-#define MAX_RECORD_BUFFER_LEN (4 * 1024 * 1024)
-#define MAX_RFILE_PATH_LEN 256
-
-/* Get a random number in [l, r) */
-#define damon_rand(l, r) (l + prandom_u32() % (r - l))
-
-static DEFINE_MUTEX(damon_lock);
-static int nr_running_ctxs;
-
-/*
- * Construct a damon_region struct
- *
- * Returns the pointer to the new struct if success, or NULL otherwise
- */
-static struct damon_region *damon_new_region(unsigned long start,
- unsigned long end)
-{
- struct damon_region *region;
-
- region = kmalloc(sizeof(*region), GFP_KERNEL);
- if (!region)
- return NULL;
-
- region->ar.start = start;
- region->ar.end = end;
- region->nr_accesses = 0;
- INIT_LIST_HEAD(®ion->list);
-
- region->age = 0;
- region->last_nr_accesses = 0;
-
- return region;
-}
-
-/*
- * Add a region between two other regions
- */
-static inline void damon_insert_region(struct damon_region *r,
- struct damon_region *prev, struct damon_region *next)
-{
- __list_add(&r->list, &prev->list, &next->list);
-}
-
-static void damon_add_region(struct damon_region *r, struct damon_target *t)
-{
- list_add_tail(&r->list, &t->regions_list);
-}
-
-static void damon_del_region(struct damon_region *r)
-{
- list_del(&r->list);
-}
-
-static void damon_free_region(struct damon_region *r)
-{
- kfree(r);
-}
-
-static void damon_destroy_region(struct damon_region *r)
-{
- damon_del_region(r);
- damon_free_region(r);
-}
-
-/*
- * Construct a damon_target struct
- *
- * Returns the pointer to the new struct if success, or NULL otherwise
- */
-static struct damon_target *damon_new_target(unsigned long id)
-{
- struct damon_target *t;
-
- t = kmalloc(sizeof(*t), GFP_KERNEL);
- if (!t)
- return NULL;
-
- t->id = id;
- INIT_LIST_HEAD(&t->regions_list);
-
- return t;
-}
-
-static void damon_add_target(struct damon_ctx *ctx, struct damon_target *t)
-{
- list_add_tail(&t->list, &ctx->targets_list);
-}
-
-static void damon_del_target(struct damon_target *t)
-{
- list_del(&t->list);
-}
-
-static void damon_free_target(struct damon_target *t)
-{
- struct damon_region *r, *next;
-
- damon_for_each_region_safe(r, next, t)
- damon_free_region(r);
- kfree(t);
-}
-
-static void damon_destroy_target(struct damon_target *t)
-{
- damon_del_target(t);
- damon_free_target(t);
-}
-
-static struct damos *damon_new_scheme(
- unsigned long min_sz_region, unsigned long max_sz_region,
- unsigned int min_nr_accesses, unsigned int max_nr_accesses,
- unsigned int min_age_region, unsigned int max_age_region,
- enum damos_action action)
-{
- struct damos *scheme;
-
- scheme = kmalloc(sizeof(*scheme), GFP_KERNEL);
- if (!scheme)
- return NULL;
- scheme->min_sz_region = min_sz_region;
- scheme->max_sz_region = max_sz_region;
- scheme->min_nr_accesses = min_nr_accesses;
- scheme->max_nr_accesses = max_nr_accesses;
- scheme->min_age_region = min_age_region;
- scheme->max_age_region = max_age_region;
- scheme->action = action;
- scheme->stat_count = 0;
- scheme->stat_sz = 0;
- INIT_LIST_HEAD(&scheme->list);
-
- return scheme;
-}
-
-static void damon_add_scheme(struct damon_ctx *ctx, struct damos *s)
-{
- list_add_tail(&s->list, &ctx->schemes_list);
-}
-
-static void damon_del_scheme(struct damos *s)
-{
- list_del(&s->list);
-}
-
-static void damon_free_scheme(struct damos *s)
-{
- kfree(s);
-}
-
-static void damon_destroy_scheme(struct damos *s)
-{
- damon_del_scheme(s);
- damon_free_scheme(s);
-}
-
-static void damon_set_vaddr_primitives(struct damon_ctx *ctx)
-{
- ctx->init_target_regions = kdamond_init_vm_regions;
- ctx->update_target_regions = kdamond_update_vm_regions;
- ctx->prepare_access_checks = kdamond_prepare_vm_access_checks;
- ctx->check_accesses = kdamond_check_vm_accesses;
- ctx->target_valid = kdamond_vm_target_valid;
- ctx->cleanup = kdamond_vm_cleanup;
-}
-
-static void damon_set_paddr_primitives(struct damon_ctx *ctx)
-{
- ctx->init_target_regions = kdamond_init_phys_regions;
- ctx->update_target_regions = kdamond_update_phys_regions;
- ctx->prepare_access_checks = kdamond_prepare_phys_access_checks;
- ctx->check_accesses = kdamond_check_phys_accesses;
- ctx->target_valid = NULL;
- ctx->cleanup = NULL;
-}
-
-static struct damon_ctx *damon_new_ctx(void)
-{
- struct damon_ctx *ctx;
-
- ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
- if (!ctx)
- return NULL;
-
- ctx->sample_interval = 5 * 1000;
- ctx->aggr_interval = 100 * 1000;
- ctx->regions_update_interval = 1000 * 1000;
- ctx->min_nr_regions = 10;
- ctx->max_nr_regions = 1000;
-
- damon_set_vaddr_primitives(ctx);
-
- ktime_get_coarse_ts64(&ctx->last_aggregation);
- ctx->last_regions_update = ctx->last_aggregation;
-
- if (damon_set_recording(ctx, 0, "none")) {
- kfree(ctx);
- return NULL;
- }
-
- mutex_init(&ctx->kdamond_lock);
-
- INIT_LIST_HEAD(&ctx->targets_list);
- INIT_LIST_HEAD(&ctx->schemes_list);
-
- return ctx;
-}
-
-static void damon_destroy_ctx(struct damon_ctx *ctx)
-{
- struct damon_target *t, *next_t;
- struct damos *s, *next_s;
-
- damon_for_each_target_safe(t, next_t, ctx)
- damon_destroy_target(t);
-
- damon_for_each_scheme_safe(s, next_s, ctx)
- damon_destroy_scheme(s);
-
- kfree(ctx);
-}
-
-static unsigned int nr_damon_targets(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- unsigned int nr_targets = 0;
-
- damon_for_each_target(t, ctx)
- nr_targets++;
-
- return nr_targets;
-}
-
-static unsigned int nr_damon_regions(struct damon_target *t)
-{
- struct damon_region *r;
- unsigned int nr_regions = 0;
-
- damon_for_each_region(r, t)
- nr_regions++;
-
- return nr_regions;
-}
-
-/* Returns the size upper limit for each monitoring region */
-static unsigned long damon_region_sz_limit(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- struct damon_region *r;
- unsigned long sz = 0;
-
- damon_for_each_target(t, ctx) {
- damon_for_each_region(r, t)
- sz += r->ar.end - r->ar.start;
- }
-
- if (ctx->min_nr_regions)
- sz /= ctx->min_nr_regions;
- if (sz < MIN_REGION)
- sz = MIN_REGION;
-
- return sz;
-}
-
-/*
- * Functions for the initial monitoring target regions construction
- */
-
-/*
- * 't->id' should be the pointer to the relevant 'struct pid', with its
- * reference count held. The caller must put the returned task, unless it is NULL.
- */
-#define damon_get_task_struct(t) \
- (get_pid_task((struct pid *)t->id, PIDTYPE_PID))
-
-/*
- * Get the mm_struct of the given target
- *
- * Caller _must_ put the mm_struct after use, unless it is NULL.
- *
- * Returns the mm_struct of the target on success, NULL on failure
- */
-static struct mm_struct *damon_get_mm(struct damon_target *t)
-{
- struct task_struct *task;
- struct mm_struct *mm;
-
- task = damon_get_task_struct(t);
- if (!task)
- return NULL;
-
- mm = get_task_mm(task);
- put_task_struct(task);
- return mm;
-}
-
-/*
- * Size-evenly split a region into 'nr_pieces' small regions
- *
- * Returns 0 on success, or negative error code otherwise.
- */
-static int damon_split_region_evenly(struct damon_ctx *ctx,
- struct damon_region *r, unsigned int nr_pieces)
-{
- unsigned long sz_orig, sz_piece, orig_end;
- struct damon_region *n = NULL, *next;
- unsigned long start;
-
- if (!r || !nr_pieces)
- return -EINVAL;
-
- orig_end = r->ar.end;
- sz_orig = r->ar.end - r->ar.start;
- sz_piece = ALIGN_DOWN(sz_orig / nr_pieces, MIN_REGION);
-
- if (!sz_piece)
- return -EINVAL;
-
- r->ar.end = r->ar.start + sz_piece;
- next = damon_next_region(r);
- for (start = r->ar.end; start + sz_piece <= orig_end;
- start += sz_piece) {
- n = damon_new_region(start, start + sz_piece);
- if (!n)
- return -ENOMEM;
- damon_insert_region(n, r, next);
- r = n;
- }
- /* complement last region for possible rounding error */
- if (n)
- n->ar.end = orig_end;
-
- return 0;
-}
-
-static unsigned long sz_range(struct damon_addr_range *r)
-{
- return r->end - r->start;
-}
-
-static void swap_ranges(struct damon_addr_range *r1,
- struct damon_addr_range *r2)
-{
- struct damon_addr_range tmp;
-
- tmp = *r1;
- *r1 = *r2;
- *r2 = tmp;
-}
-
-/*
- * Find three regions separated by two biggest unmapped regions
- *
- * vma the head vma of the target address space
- * regions an array of three address ranges that results will be saved
- *
- * This function receives an address space and finds three regions in it which
- * are separated by the two biggest unmapped regions in the space. Please refer
- * to the comments of the 'damon_init_vm_regions_of()' function below to see
- * why this is necessary.
- *
- * Returns 0 if success, or negative error code otherwise.
- */
-static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
- struct damon_addr_range regions[3])
-{
- struct damon_addr_range gap = {0}, first_gap = {0}, second_gap = {0};
- struct vm_area_struct *last_vma = NULL;
- unsigned long start = 0;
- struct rb_root rbroot;
-
- /* Find two biggest gaps so that first_gap > second_gap > others */
- for (; vma; vma = vma->vm_next) {
- if (!last_vma) {
- start = vma->vm_start;
- goto next;
- }
-
- if (vma->rb_subtree_gap <= sz_range(&second_gap)) {
- rbroot.rb_node = &vma->vm_rb;
- vma = rb_entry(rb_last(&rbroot),
- struct vm_area_struct, vm_rb);
- goto next;
- }
-
- gap.start = last_vma->vm_end;
- gap.end = vma->vm_start;
- if (sz_range(&gap) > sz_range(&second_gap)) {
- swap_ranges(&gap, &second_gap);
- if (sz_range(&second_gap) > sz_range(&first_gap))
- swap_ranges(&second_gap, &first_gap);
- }
-next:
- last_vma = vma;
- }
-
- if (!sz_range(&second_gap) || !sz_range(&first_gap))
- return -EINVAL;
-
- /* Sort the two biggest gaps by address */
- if (first_gap.start > second_gap.start)
- swap_ranges(&first_gap, &second_gap);
-
- /* Store the result */
- regions[0].start = ALIGN(start, MIN_REGION);
- regions[0].end = ALIGN(first_gap.start, MIN_REGION);
- regions[1].start = ALIGN(first_gap.end, MIN_REGION);
- regions[1].end = ALIGN(second_gap.start, MIN_REGION);
- regions[2].start = ALIGN(second_gap.end, MIN_REGION);
- regions[2].end = ALIGN(last_vma->vm_end, MIN_REGION);
-
- return 0;
-}
-
-/*
- * Get the three regions in the given target (task)
- *
- * Returns 0 on success, negative error code otherwise.
- */
-static int damon_three_regions_of(struct damon_target *t,
- struct damon_addr_range regions[3])
-{
- struct mm_struct *mm;
- int rc;
-
- mm = damon_get_mm(t);
- if (!mm)
- return -EINVAL;
-
- mmap_read_lock(mm);
- rc = damon_three_regions_in_vmas(mm->mmap, regions);
- mmap_read_unlock(mm);
-
- mmput(mm);
- return rc;
-}
-
-/*
- * Initialize the monitoring target regions for the given target (task)
- *
- * t the given target
- *
- * Because only a few small portions of the entire address space are actually
- * mapped to memory and accessed, monitoring the unmapped regions is wasteful.
- * However, because we can tolerate small noise, tracking every mapping is not
- * strictly required; it could even incur high overhead if the mappings change
- * frequently or the number of mappings is high. The adaptive region
- * adjustment mechanism will further help to deal with the noise by simply
- * identifying the unmapped areas as regions that have no access.
- * Moreover, applying the real mappings, which would have many unmapped areas
- * inside, would make the adaptive mechanism quite complex. For these reasons,
- * excessively huge unmapped areas inside the monitoring target should be
- * removed so that the adaptive mechanism does not spend time on them.
- *
- * For the reason, we convert the complex mappings to three distinct regions
- * that cover every mapped area of the address space. Also the two gaps
- * between the three regions are the two biggest unmapped areas in the given
- * address space. In detail, this function first identifies the start and the
- * end of the mappings and the two biggest unmapped areas of the address space.
- * Then, it constructs the three regions as below:
- *
- * [mappings[0]->start, big_two_unmapped_areas[0]->start)
- * [big_two_unmapped_areas[0]->end, big_two_unmapped_areas[1]->start)
- * [big_two_unmapped_areas[1]->end, mappings[nr_mappings - 1]->end)
- *
- * As the usual memory map of processes is as below, the gap between the heap
- * and the uppermost mmap()-ed region, and the gap between the lowermost
- * mmap()-ed region and the stack, will be the two biggest unmapped regions.
- * Because these gaps are exceptionally huge in usual address spaces, excluding
- * these two biggest unmapped regions is sufficient as a trade-off.
- *
- * <heap>
- * <BIG UNMAPPED REGION 1>
- * <uppermost mmap()-ed region>
- * (other mmap()-ed regions and small unmapped regions)
- * <lowermost mmap()-ed region>
- * <BIG UNMAPPED REGION 2>
- * <stack>
- */
-static void damon_init_vm_regions_of(struct damon_ctx *c,
- struct damon_target *t)
-{
- struct damon_region *r;
- struct damon_addr_range regions[3];
- unsigned long sz = 0, nr_pieces;
- int i;
-
- if (damon_three_regions_of(t, regions)) {
- pr_err("Failed to get three regions of target %lu\n", t->id);
- return;
- }
-
- for (i = 0; i < 3; i++)
- sz += regions[i].end - regions[i].start;
- if (c->min_nr_regions)
- sz /= c->min_nr_regions;
- if (sz < MIN_REGION)
- sz = MIN_REGION;
-
- /* Set the initial three regions of the target */
- for (i = 0; i < 3; i++) {
- r = damon_new_region(regions[i].start, regions[i].end);
- if (!r) {
- pr_err("%d'th init region creation failed\n", i);
- return;
- }
- damon_add_region(r, t);
-
- nr_pieces = (regions[i].end - regions[i].start) / sz;
- damon_split_region_evenly(c, r, nr_pieces);
- }
-}
-
-/* Initialize '->regions_list' of every target (task) */
-void kdamond_init_vm_regions(struct damon_ctx *ctx)
-{
- struct damon_target *t;
-
- damon_for_each_target(t, ctx) {
- /* the user may set the target regions as they want */
- if (!nr_damon_regions(t))
- damon_init_vm_regions_of(ctx, t);
- }
-}
-
-/*
- * The initial regions construction function for the physical address space.
- *
- * This default version actually does nothing. Users should set the initial
- * regions by themselves before passing their damon_ctx to 'damon_start()', or
- * implement their own version of this and set '->init_target_regions' of their
- * damon_ctx to point to it.
- */
-void kdamond_init_phys_regions(struct damon_ctx *ctx)
-{
-}
-
-/*
- * Functions for the dynamic monitoring target regions update
- */
-
-/*
- * Check whether a region is intersecting an address range
- *
- * Returns true if it is.
- */
-static bool damon_intersect(struct damon_region *r, struct damon_addr_range *re)
-{
- return !(r->ar.end <= re->start || re->end <= r->ar.start);
-}
-
-/*
- * Update damon regions for the three big regions of the given target
- *
- * t the given target
- * bregions the three big regions of the target
- */
-static void damon_apply_three_regions(struct damon_ctx *ctx,
- struct damon_target *t, struct damon_addr_range bregions[3])
-{
- struct damon_region *r, *next;
- unsigned int i = 0;
-
- /* Remove regions which are not in the three big regions now */
- damon_for_each_region_safe(r, next, t) {
- for (i = 0; i < 3; i++) {
- if (damon_intersect(r, &bregions[i]))
- break;
- }
- if (i == 3)
- damon_destroy_region(r);
- }
-
- /* Adjust intersecting regions to fit with the three big regions */
- for (i = 0; i < 3; i++) {
- struct damon_region *first = NULL, *last;
- struct damon_region *newr;
- struct damon_addr_range *br;
-
- br = &bregions[i];
- /* Get the first and last regions which intersects with br */
- damon_for_each_region(r, t) {
- if (damon_intersect(r, br)) {
- if (!first)
- first = r;
- last = r;
- }
- if (r->ar.start >= br->end)
- break;
- }
- if (!first) {
- /* no damon_region intersects with this big region */
- newr = damon_new_region(
- ALIGN_DOWN(br->start, MIN_REGION),
- ALIGN(br->end, MIN_REGION));
- if (!newr)
- continue;
- damon_insert_region(newr, damon_prev_region(r), r);
- } else {
- first->ar.start = ALIGN_DOWN(br->start, MIN_REGION);
- last->ar.end = ALIGN(br->end, MIN_REGION);
- }
- }
-}
-
-/*
- * Update regions for current memory mappings
- */
-void kdamond_update_vm_regions(struct damon_ctx *ctx)
-{
- struct damon_addr_range three_regions[3];
- struct damon_target *t;
-
- damon_for_each_target(t, ctx) {
- if (damon_three_regions_of(t, three_regions))
- continue;
- damon_apply_three_regions(ctx, t, three_regions);
- }
-}
-
-/*
- * The dynamic monitoring target regions update function for the physical
- * address space.
- *
- * This default version actually does nothing. Users should update the
- * regions in other callbacks such as '->aggregate_cb', or implement their own
- * version of this and set the '->update_target_regions' of their damon_ctx to
- * point to it.
- */
-void kdamond_update_phys_regions(struct damon_ctx *ctx)
-{
-}
-
-/*
- * Functions for the access checking of the regions
- */
-
-static void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm,
- unsigned long addr)
-{
- bool referenced = false;
- struct page *page = pte_page(*pte);
-
- if (pte_young(*pte)) {
- referenced = true;
- *pte = pte_mkold(*pte);
- }
-
-#ifdef CONFIG_MMU_NOTIFIER
- if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE))
- referenced = true;
-#endif /* CONFIG_MMU_NOTIFIER */
-
- if (referenced)
- set_page_young(page);
-
- set_page_idle(page);
-}
-
-static void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm,
- unsigned long addr)
-{
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
- bool referenced = false;
- struct page *page = pmd_page(*pmd);
-
- if (pmd_young(*pmd)) {
- referenced = true;
- *pmd = pmd_mkold(*pmd);
- }
-
-#ifdef CONFIG_MMU_NOTIFIER
- if (mmu_notifier_clear_young(mm, addr,
- addr + ((1UL) << HPAGE_PMD_SHIFT)))
- referenced = true;
-#endif /* CONFIG_MMU_NOTIFIER */
-
- if (referenced)
- set_page_young(page);
-
- set_page_idle(page);
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
-}
-
-static void damon_mkold(struct mm_struct *mm, unsigned long addr)
-{
- pte_t *pte = NULL;
- pmd_t *pmd = NULL;
- spinlock_t *ptl;
-
- if (follow_pte_pmd(mm, addr, NULL, &pte, &pmd, &ptl))
- return;
-
- if (pte) {
- damon_ptep_mkold(pte, mm, addr);
- pte_unmap_unlock(pte, ptl);
- } else {
- damon_pmdp_mkold(pmd, mm, addr);
- spin_unlock(ptl);
- }
-}
-
-static void damon_prepare_vm_access_check(struct damon_ctx *ctx,
- struct mm_struct *mm, struct damon_region *r)
-{
- r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
-
- damon_mkold(mm, r->sampling_addr);
-}
-
-void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- struct mm_struct *mm;
- struct damon_region *r;
-
- damon_for_each_target(t, ctx) {
- mm = damon_get_mm(t);
- if (!mm)
- continue;
- damon_for_each_region(r, t)
- damon_prepare_vm_access_check(ctx, mm, r);
- mmput(mm);
- }
-}
-
-static bool damon_young(struct mm_struct *mm, unsigned long addr,
- unsigned long *page_sz)
-{
- pte_t *pte = NULL;
- pmd_t *pmd = NULL;
- spinlock_t *ptl;
- bool young = false;
-
- if (follow_pte_pmd(mm, addr, NULL, &pte, &pmd, &ptl))
- return false;
-
- *page_sz = PAGE_SIZE;
- if (pte) {
- young = pte_young(*pte);
- if (!young)
- young = !page_is_idle(pte_page(*pte));
- pte_unmap_unlock(pte, ptl);
- return young;
- }
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
- young = pmd_young(*pmd);
- if (!young)
- young = !page_is_idle(pmd_page(*pmd));
- spin_unlock(ptl);
- *page_sz = ((1UL) << HPAGE_PMD_SHIFT);
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
-
- return young;
-}
-
-/*
- * Check whether the region was accessed after the last preparation
- *
- * mm 'mm_struct' for the given virtual address space
- * r the region to be checked
- */
-static void damon_check_vm_access(struct damon_ctx *ctx,
- struct mm_struct *mm, struct damon_region *r)
-{
- static struct mm_struct *last_mm;
- static unsigned long last_addr;
- static unsigned long last_page_sz = PAGE_SIZE;
- static bool last_accessed;
-
- /* If the region is in the last checked page, reuse the result */
- if (mm == last_mm && (ALIGN_DOWN(last_addr, last_page_sz) ==
- ALIGN_DOWN(r->sampling_addr, last_page_sz))) {
- if (last_accessed)
- r->nr_accesses++;
- return;
- }
-
- last_accessed = damon_young(mm, r->sampling_addr, &last_page_sz);
- if (last_accessed)
- r->nr_accesses++;
-
- last_mm = mm;
- last_addr = r->sampling_addr;
-}
-
-unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- struct mm_struct *mm;
- struct damon_region *r;
- unsigned int max_nr_accesses = 0;
-
- damon_for_each_target(t, ctx) {
- mm = damon_get_mm(t);
- if (!mm)
- continue;
- damon_for_each_region(r, t) {
- damon_check_vm_access(ctx, mm, r);
- max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
- }
- mmput(mm);
- }
-
- return max_nr_accesses;
-}
-
-/* access check functions for physical address based regions */
-
-/*
- * Get a page by pfn if it is in the LRU list. Otherwise, returns NULL.
- *
- * The body of this function is stolen from 'page_idle_get_page()'. We
- * steal rather than reuse it because the code is quite simple.
- */
-static struct page *damon_phys_get_page(unsigned long pfn)
-{
- struct page *page = pfn_to_online_page(pfn);
- pg_data_t *pgdat;
-
- if (!page || !PageLRU(page) ||
- !get_page_unless_zero(page))
- return NULL;
-
- pgdat = page_pgdat(page);
- spin_lock_irq(&pgdat->lru_lock);
- if (unlikely(!PageLRU(page))) {
- put_page(page);
- page = NULL;
- }
- spin_unlock_irq(&pgdat->lru_lock);
- return page;
-}
-
-static bool damon_page_mkold(struct page *page, struct vm_area_struct *vma,
- unsigned long addr, void *arg)
-{
- damon_mkold(vma->vm_mm, addr);
- return true;
-}
-
-static void damon_phys_mkold(unsigned long paddr)
-{
- struct page *page = damon_phys_get_page(PHYS_PFN(paddr));
- struct rmap_walk_control rwc = {
- .rmap_one = damon_page_mkold,
- .anon_lock = page_lock_anon_vma_read,
- };
- bool need_lock;
-
- if (!page)
- return;
-
- if (!page_mapped(page) || !page_rmapping(page)) {
- set_page_idle(page);
- put_page(page);
- return;
- }
-
- need_lock = !PageAnon(page) || PageKsm(page);
- if (need_lock && !trylock_page(page)) {
- put_page(page);
- return;
- }
-
- rmap_walk(page, &rwc);
-
- if (need_lock)
- unlock_page(page);
- put_page(page);
-}
-
-static void damon_prepare_phys_access_check(struct damon_ctx *ctx,
- struct damon_region *r)
-{
- r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
-
- damon_phys_mkold(r->sampling_addr);
-}
-
-void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- struct damon_region *r;
-
- damon_for_each_target(t, ctx) {
- damon_for_each_region(r, t)
- damon_prepare_phys_access_check(ctx, r);
- }
-}
-
-struct damon_phys_access_chk_result {
- unsigned long page_sz;
- bool accessed;
-};
-
-static bool damon_page_accessed(struct page *page, struct vm_area_struct *vma,
- unsigned long addr, void *arg)
-{
- struct damon_phys_access_chk_result *result = arg;
-
- result->accessed = damon_young(vma->vm_mm, addr, &result->page_sz);
-
- /* If accessed, stop walking */
- return !result->accessed;
-}
-
-static bool damon_phys_young(unsigned long paddr, unsigned long *page_sz)
-{
- struct page *page = damon_phys_get_page(PHYS_PFN(paddr));
- struct damon_phys_access_chk_result result = {
- .page_sz = PAGE_SIZE,
- .accessed = false,
- };
- struct rmap_walk_control rwc = {
- .arg = &result,
- .rmap_one = damon_page_accessed,
- .anon_lock = page_lock_anon_vma_read,
- };
- bool need_lock;
-
- if (!page)
- return false;
-
- if (!page_mapped(page) || !page_rmapping(page)) {
- if (page_is_idle(page))
- result.accessed = false;
- else
- result.accessed = true;
- put_page(page);
- goto out;
- }
-
- need_lock = !PageAnon(page) || PageKsm(page);
- if (need_lock && !trylock_page(page)) {
- put_page(page);
- return false;
- }
-
- rmap_walk(page, &rwc);
-
- if (need_lock)
- unlock_page(page);
- put_page(page);
-
-out:
- *page_sz = result.page_sz;
- return result.accessed;
-}
-
-/*
- * Check whether the region was accessed after the last preparation
- *
- * r the region of the physical address space that needs to be checked
- */
-static void damon_check_phys_access(struct damon_ctx *ctx,
- struct damon_region *r)
-{
- static unsigned long last_addr;
- static unsigned long last_page_sz = PAGE_SIZE;
- static bool last_accessed;
-
- /* If the region is in the last checked page, reuse the result */
- if (ALIGN_DOWN(last_addr, last_page_sz) ==
- ALIGN_DOWN(r->sampling_addr, last_page_sz)) {
- if (last_accessed)
- r->nr_accesses++;
- return;
- }
-
- last_accessed = damon_phys_young(r->sampling_addr, &last_page_sz);
- if (last_accessed)
- r->nr_accesses++;
-
- last_addr = r->sampling_addr;
-}
-
-unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- struct damon_region *r;
- unsigned int max_nr_accesses = 0;
-
- damon_for_each_target(t, ctx) {
- damon_for_each_region(r, t) {
- damon_check_phys_access(ctx, r);
- max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
- }
- }
-
- return max_nr_accesses;
-}
-
-/*
- * Functions for the target validity check and cleanup
- */
-
-bool kdamond_vm_target_valid(struct damon_target *t)
-{
- struct task_struct *task;
-
- task = damon_get_task_struct(t);
- if (task) {
- put_task_struct(task);
- return true;
- }
-
- return false;
-}
-
-void kdamond_vm_cleanup(struct damon_ctx *ctx)
-{
- struct damon_target *t, *next;
-
- damon_for_each_target_safe(t, next, ctx) {
- put_pid((struct pid *)t->id);
- damon_destroy_target(t);
- }
-}
-
-/*
- * Functions for DAMON core logics and features
- */
-
-/*
- * damon_check_reset_time_interval() - Check if a time interval is elapsed.
- * @baseline: the time to check whether the interval has elapsed since
- * @interval: the time interval (microseconds)
- *
- * See whether the given time interval has passed since the given baseline
- * time. If so, it also updates the baseline to current time for next check.
- *
- * Return: true if the time interval has passed, or false otherwise.
- */
-static bool damon_check_reset_time_interval(struct timespec64 *baseline,
- unsigned long interval)
-{
- struct timespec64 now;
-
- ktime_get_coarse_ts64(&now);
- if ((timespec64_to_ns(&now) - timespec64_to_ns(baseline)) <
- interval * 1000)
- return false;
- *baseline = now;
- return true;
-}
-
-/*
- * Check whether it is time to flush the aggregated information
- */
-static bool kdamond_aggregate_interval_passed(struct damon_ctx *ctx)
-{
- return damon_check_reset_time_interval(&ctx->last_aggregation,
- ctx->aggr_interval);
-}
-
-/*
- * Flush the content in the result buffer to the result file
- */
-static void damon_flush_rbuffer(struct damon_ctx *ctx)
-{
- ssize_t sz;
- loff_t pos = 0;
- struct file *rfile;
-
- if (!ctx->rbuf_offset)
- return;
-
- rfile = filp_open(ctx->rfile_path,
- O_CREAT | O_RDWR | O_APPEND | O_LARGEFILE, 0644);
- if (IS_ERR(rfile)) {
- pr_err("Cannot open the result file %s\n",
- ctx->rfile_path);
- return;
- }
-
- while (ctx->rbuf_offset) {
- sz = kernel_write(rfile, ctx->rbuf, ctx->rbuf_offset, &pos);
- if (sz < 0)
- break;
- ctx->rbuf_offset -= sz;
- }
- filp_close(rfile, NULL);
-}
-
-/*
- * Write a data into the result buffer
- */
-static void damon_write_rbuf(struct damon_ctx *ctx, void *data, ssize_t size)
-{
- if (!ctx->rbuf_len || !ctx->rbuf || !ctx->rfile_path)
- return;
- if (ctx->rbuf_offset + size > ctx->rbuf_len)
- damon_flush_rbuffer(ctx);
- if (ctx->rbuf_offset + size > ctx->rbuf_len) {
- pr_warn("%s: flush failed, or wrong size given(%u, %zu)\n",
- __func__, ctx->rbuf_offset, size);
- return;
- }
-
- memcpy(&ctx->rbuf[ctx->rbuf_offset], data, size);
- ctx->rbuf_offset += size;
-}
-
-/*
- * Flush the aggregated monitoring results to the result buffer
- *
- * Stores the current tracking results to the result buffer and resets
- * 'nr_accesses' of each region. The format of the result buffer is as below:
- *
- * <time> <number of targets> <array of target infos>
- *
- * target info: <id> <number of regions> <array of region infos>
- * region info: <start address> <end address> <nr_accesses>
- */
-static void kdamond_reset_aggregated(struct damon_ctx *c)
-{
- struct damon_target *t;
- struct timespec64 now;
- unsigned int nr;
-
- ktime_get_coarse_ts64(&now);
-
- damon_write_rbuf(c, &now, sizeof(now));
- nr = nr_damon_targets(c);
- damon_write_rbuf(c, &nr, sizeof(nr));
-
- damon_for_each_target(t, c) {
- struct damon_region *r;
-
- damon_write_rbuf(c, &t->id, sizeof(t->id));
- nr = nr_damon_regions(t);
- damon_write_rbuf(c, &nr, sizeof(nr));
- damon_for_each_region(r, t) {
- damon_write_rbuf(c, &r->ar.start, sizeof(r->ar.start));
- damon_write_rbuf(c, &r->ar.end, sizeof(r->ar.end));
- damon_write_rbuf(c, &r->nr_accesses,
- sizeof(r->nr_accesses));
- trace_damon_aggregated(t, r, nr);
- r->last_nr_accesses = r->nr_accesses;
- r->nr_accesses = 0;
- }
- }
-}
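
The record file is therefore a flat binary stream of the tuples above.  A
minimal user space sketch for reading it back might look as below; the
'parse_one_record()' and 'read_exact()' helpers are hypothetical, and the
sketch assumes a 64-bit kernel (8-byte 'unsigned long' fields, a
'struct timespec64' written as two 64-bit values) and that the file is read on
a machine of the same endianness.

#include <stdint.h>
#include <stdio.h>

/* read exactly sz bytes, or fail */
static int read_exact(FILE *f, void *buf, size_t sz)
{
        return fread(buf, 1, sz, f) == sz ? 0 : -1;
}

static int parse_one_record(FILE *f)
{
        uint64_t sec, nsec, id, start, end;
        uint32_t nr_targets, nr_regions, nr_accesses, ti, ri;

        /* <time> <number of targets> */
        if (read_exact(f, &sec, 8) || read_exact(f, &nsec, 8) ||
            read_exact(f, &nr_targets, 4))
                return -1;      /* end of file or truncated record */
        printf("%llu.%09llu: %u targets\n", (unsigned long long)sec,
               (unsigned long long)nsec, nr_targets);

        for (ti = 0; ti < nr_targets; ti++) {
                /* <id> <number of regions> */
                if (read_exact(f, &id, 8) || read_exact(f, &nr_regions, 4))
                        return -1;
                for (ri = 0; ri < nr_regions; ri++) {
                        /* <start address> <end address> <nr_accesses> */
                        if (read_exact(f, &start, 8) ||
                            read_exact(f, &end, 8) ||
                            read_exact(f, &nr_accesses, 4))
                                return -1;
                        printf("  target %llu: %llu-%llu: %u\n",
                               (unsigned long long)id,
                               (unsigned long long)start,
                               (unsigned long long)end, nr_accesses);
                }
        }
        return 0;
}

int main(void)
{
        while (!parse_one_record(stdin))
                ;
        return 0;
}

Real tooling would of course want to handle partial flushes and configuration
differences more carefully.
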
-
-#ifndef CONFIG_ADVISE_SYSCALLS
-static int damos_madvise(struct damon_target *target, struct damon_region *r,
- int behavior)
-{
- return -EINVAL;
-}
-#else
-static int damos_madvise(struct damon_target *target, struct damon_region *r,
- int behavior)
-{
- struct task_struct *t;
- struct mm_struct *mm;
- int ret = -ENOMEM;
-
- t = damon_get_task_struct(target);
- if (!t)
- goto out;
- mm = damon_get_mm(target);
- if (!mm)
- goto put_task_out;
-
- ret = do_madvise(t, mm, PAGE_ALIGN(r->ar.start),
- PAGE_ALIGN(r->ar.end - r->ar.start), behavior);
- mmput(mm);
-put_task_out:
- put_task_struct(t);
-out:
- return ret;
-}
-#endif /* CONFIG_ADVISE_SYSCALLS */
-
-static int damos_do_action(struct damon_target *target, struct damon_region *r,
- enum damos_action action)
-{
- int madv_action;
-
- switch (action) {
- case DAMOS_WILLNEED:
- madv_action = MADV_WILLNEED;
- break;
- case DAMOS_COLD:
- madv_action = MADV_COLD;
- break;
- case DAMOS_PAGEOUT:
- madv_action = MADV_PAGEOUT;
- break;
- case DAMOS_HUGEPAGE:
- madv_action = MADV_HUGEPAGE;
- break;
- case DAMOS_NOHUGEPAGE:
- madv_action = MADV_NOHUGEPAGE;
- break;
- case DAMOS_STAT:
- return 0;
- default:
- pr_warn("Wrong action %d\n", action);
- return -EINVAL;
- }
-
- return damos_madvise(target, r, madv_action);
-}
-
-static void damon_do_apply_schemes(struct damon_ctx *c,
- struct damon_target *t,
- struct damon_region *r)
-{
- struct damos *s;
- unsigned long sz;
-
- damon_for_each_scheme(s, c) {
- sz = r->ar.end - r->ar.start;
- if (sz < s->min_sz_region || s->max_sz_region < sz)
- continue;
- if (r->nr_accesses < s->min_nr_accesses ||
- s->max_nr_accesses < r->nr_accesses)
- continue;
- if (r->age < s->min_age_region || s->max_age_region < r->age)
- continue;
- s->stat_count++;
- s->stat_sz += sz;
- damos_do_action(t, r, s->action);
- if (s->action != DAMOS_STAT)
- r->age = 0;
- }
-}
-
-static void kdamond_apply_schemes(struct damon_ctx *c)
-{
- struct damon_target *t;
- struct damon_region *r;
-
- damon_for_each_target(t, c) {
- damon_for_each_region(r, t)
- damon_do_apply_schemes(c, t, r);
- }
-}
-
-#define sz_damon_region(r) (r->ar.end - r->ar.start)
-
-/*
- * Merge two adjacent regions into one region
- */
-static void damon_merge_two_regions(struct damon_region *l,
- struct damon_region *r)
-{
- unsigned long sz_l = sz_damon_region(l), sz_r = sz_damon_region(r);
-
- l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) /
- (sz_l + sz_r);
- l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r);
- l->ar.end = r->ar.end;
- damon_destroy_region(r);
-}
-
-#define diff_of(a, b) (a > b ? a - b : b - a)
-
-/*
- * Merge adjacent regions having similar access frequencies
- *
- * t target affected by this merge operation
- * thres '->nr_accesses' diff threshold for the merge
- * sz_limit size upper limit of each region
- */
-static void damon_merge_regions_of(struct damon_target *t, unsigned int thres,
- unsigned long sz_limit)
-{
- struct damon_region *r, *prev = NULL, *next;
-
- damon_for_each_region_safe(r, next, t) {
- if (diff_of(r->nr_accesses, r->last_nr_accesses) > thres)
- r->age = 0;
- else
- r->age++;
-
- if (prev && prev->ar.end == r->ar.start &&
- diff_of(prev->nr_accesses, r->nr_accesses) <= thres &&
- sz_damon_region(prev) + sz_damon_region(r) <= sz_limit)
- damon_merge_two_regions(prev, r);
- else
- prev = r;
- }
-}
-
-/*
- * Merge adjacent regions having similar access frequencies
- *
- * threshold '->nr_accesses' diff threshold for the merge
- * sz_limit size upper limit of each region
- *
- * This function merges monitoring target regions which are adjacent and their
- * access frequencies are similar. This is for minimizing the monitoring
- * overhead under the dynamically changeable access pattern. If a merge was
- * unnecessarily made, later 'kdamond_split_regions()' will revert it.
- */
-static void kdamond_merge_regions(struct damon_ctx *c, unsigned int threshold,
- unsigned long sz_limit)
-{
- struct damon_target *t;
-
- damon_for_each_target(t, c)
- damon_merge_regions_of(t, threshold, sz_limit);
-}
-
-/*
- * Split a region in two
- *
- * r the region to be split
- * sz_r size of the first sub-region that will be made
- */
-static void damon_split_region_at(struct damon_ctx *ctx,
- struct damon_region *r, unsigned long sz_r)
-{
- struct damon_region *new;
-
- new = damon_new_region(r->ar.start + sz_r, r->ar.end);
- r->ar.end = new->ar.start;
-
- new->age = r->age;
- new->last_nr_accesses = r->last_nr_accesses;
-
- damon_insert_region(new, r, damon_next_region(r));
-}
-
-/* Split every region in the given target into 'nr_subs' regions */
-static void damon_split_regions_of(struct damon_ctx *ctx,
- struct damon_target *t, int nr_subs)
-{
- struct damon_region *r, *next;
- unsigned long sz_region, sz_sub = 0;
- int i;
-
- damon_for_each_region_safe(r, next, t) {
- sz_region = r->ar.end - r->ar.start;
-
- for (i = 0; i < nr_subs - 1 &&
- sz_region > 2 * MIN_REGION; i++) {
- /*
- * Randomly select size of left sub-region to be at
- * least 10 percent and at most 90% of original region
- */
- sz_sub = ALIGN_DOWN(damon_rand(1, 10) *
- sz_region / 10, MIN_REGION);
- /* Do not allow blank region */
- if (sz_sub == 0 || sz_sub >= sz_region)
- continue;
-
- damon_split_region_at(ctx, r, sz_sub);
- sz_region = sz_sub;
- }
- }
-}
-
-/*
- * Split every target region into randomly-sized small regions
- *
- * This function splits every target region into random-sized small regions if
- * current total number of the regions is equal or smaller than half of the
- * user-specified maximum number of regions. This is for maximizing the
- * monitoring accuracy under the dynamically changeable access patterns. If a
- * split was unnecessarily made, later 'kdamond_merge_regions()' will revert
- * it.
- */
-static void kdamond_split_regions(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- unsigned int nr_regions = 0;
- static unsigned int last_nr_regions;
- int nr_subregions = 2;
-
- damon_for_each_target(t, ctx)
- nr_regions += nr_damon_regions(t);
-
- if (nr_regions > ctx->max_nr_regions / 2)
- return;
-
- /* Maybe the middle of the region has different access frequency */
- if (last_nr_regions == nr_regions &&
- nr_regions < ctx->max_nr_regions / 3)
- nr_subregions = 3;
-
- damon_for_each_target(t, ctx)
- damon_split_regions_of(ctx, t, nr_subregions);
-
- last_nr_regions = nr_regions;
-}
-
-/*
- * Check whether it is time to check and apply the target monitoring regions
- *
- * Returns true if it is.
- */
-static bool kdamond_need_update_regions(struct damon_ctx *ctx)
-{
- return damon_check_reset_time_interval(&ctx->last_regions_update,
- ctx->regions_update_interval);
-}
-
-/*
- * Check whether current monitoring should be stopped
- *
- * The monitoring is stopped when either the user requested to stop, or all
- * monitoring targets are invalid.
- *
- * Returns true if need to stop current monitoring.
- */
-static bool kdamond_need_stop(struct damon_ctx *ctx)
-{
- struct damon_target *t;
- bool stop;
-
- mutex_lock(&ctx->kdamond_lock);
- stop = ctx->kdamond_stop;
- mutex_unlock(&ctx->kdamond_lock);
- if (stop)
- return true;
-
- if (!ctx->target_valid)
- return false;
-
- damon_for_each_target(t, ctx) {
- if (ctx->target_valid(t))
- return false;
- }
-
- return true;
-}
-
-static void kdamond_write_record_header(struct damon_ctx *ctx)
-{
- int recfmt_ver = 2;
-
- damon_write_rbuf(ctx, "damon_recfmt_ver", 16);
- damon_write_rbuf(ctx, &recfmt_ver, sizeof(recfmt_ver));
-}
-
-/*
- * The monitoring daemon that runs as a kernel thread
- */
-static int kdamond_fn(void *data)
-{
- struct damon_ctx *ctx = (struct damon_ctx *)data;
- struct damon_target *t;
- struct damon_region *r, *next;
- unsigned int max_nr_accesses = 0;
- unsigned long sz_limit = 0;
-
- pr_info("kdamond (%d) starts\n", ctx->kdamond->pid);
- if (ctx->init_target_regions)
- ctx->init_target_regions(ctx);
- sz_limit = damon_region_sz_limit(ctx);
-
- kdamond_write_record_header(ctx);
-
- while (!kdamond_need_stop(ctx)) {
- if (ctx->prepare_access_checks)
- ctx->prepare_access_checks(ctx);
- if (ctx->sample_cb)
- ctx->sample_cb(ctx);
-
- usleep_range(ctx->sample_interval, ctx->sample_interval + 1);
-
- if (ctx->check_accesses)
- max_nr_accesses = ctx->check_accesses(ctx);
-
- if (kdamond_aggregate_interval_passed(ctx)) {
- if (ctx->aggregate_cb)
- ctx->aggregate_cb(ctx);
- kdamond_merge_regions(ctx, max_nr_accesses / 10,
- sz_limit);
- kdamond_apply_schemes(ctx);
- kdamond_reset_aggregated(ctx);
- kdamond_split_regions(ctx);
- }
-
- if (kdamond_need_update_regions(ctx)) {
- if (ctx->update_target_regions)
- ctx->update_target_regions(ctx);
- sz_limit = damon_region_sz_limit(ctx);
- }
- }
- damon_flush_rbuffer(ctx);
- damon_for_each_target(t, ctx) {
- damon_for_each_region_safe(r, next, t)
- damon_destroy_region(r);
- }
-
- if (ctx->cleanup)
- ctx->cleanup(ctx);
-
- pr_debug("kdamond (%d) finishes\n", ctx->kdamond->pid);
- mutex_lock(&ctx->kdamond_lock);
- ctx->kdamond = NULL;
- mutex_unlock(&ctx->kdamond_lock);
-
- mutex_lock(&damon_lock);
- nr_running_ctxs--;
- mutex_unlock(&damon_lock);
-
- do_exit(0);
-}
-
-/*
- * Functions for the DAMON programming interface
- */
-
-static bool damon_kdamond_running(struct damon_ctx *ctx)
-{
- bool running;
-
- mutex_lock(&ctx->kdamond_lock);
- running = ctx->kdamond != NULL;
- mutex_unlock(&ctx->kdamond_lock);
-
- return running;
-}
-
-/*
- * __damon_start() - Starts monitoring with given context.
- * @ctx: monitoring context
- *
- * This function should be called while damon_lock is held.
- *
- * Return: 0 on success, negative error code otherwise.
- */
-static int __damon_start(struct damon_ctx *ctx)
-{
- int err = -EBUSY;
-
- mutex_lock(&ctx->kdamond_lock);
- if (!ctx->kdamond) {
- err = 0;
- ctx->kdamond_stop = false;
- ctx->kdamond = kthread_create(kdamond_fn, ctx, "kdamond.%d",
- nr_running_ctxs);
- if (IS_ERR(ctx->kdamond))
- err = PTR_ERR(ctx->kdamond);
- else
- wake_up_process(ctx->kdamond);
- }
- mutex_unlock(&ctx->kdamond_lock);
-
- return err;
-}
-
-/**
- * damon_start() - Starts monitoring for a given group of contexts.
- * @ctxs: an array of the contexts to start monitoring
- * @nr_ctxs: size of @ctxs
- *
- * This function starts a group of monitoring threads for a group of monitoring
- * contexts. One thread per context is created and run concurrently. The
- * caller should handle synchronization between the threads by itself. If a
- * group of threads created by another 'damon_start()' call is currently
- * running, this function does nothing but return -EBUSY.
- *
- * Return: 0 on success, negative error code otherwise.
- */
-int damon_start(struct damon_ctx *ctxs, int nr_ctxs)
-{
- int i;
- int err = 0;
-
- mutex_lock(&damon_lock);
- if (nr_running_ctxs) {
- mutex_unlock(&damon_lock);
- return -EBUSY;
- }
-
- for (i = 0; i < nr_ctxs; i++) {
- err = __damon_start(&ctxs[i]);
- if (err)
- break;
- nr_running_ctxs++;
- }
- mutex_unlock(&damon_lock);
-
- return err;
-}
-
-int damon_start_ctx_ptrs(struct damon_ctx **ctxs, int nr_ctxs)
-{
- int i;
- int err = 0;
-
- mutex_lock(&damon_lock);
- if (nr_running_ctxs) {
- mutex_unlock(&damon_lock);
- return -EBUSY;
- }
-
- for (i = 0; i < nr_ctxs; i++) {
- err = __damon_start(ctxs[i]);
- if (err)
- break;
- nr_running_ctxs++;
- }
- mutex_unlock(&damon_lock);
-
- return err;
-}
-
-/*
- * __damon_stop() - Stops monitoring of given context.
- * @ctx: monitoring context
- *
- * Return: 0 on success, negative error code otherwise.
- */
-static int __damon_stop(struct damon_ctx *ctx)
-{
- mutex_lock(&ctx->kdamond_lock);
- if (ctx->kdamond) {
- ctx->kdamond_stop = true;
- mutex_unlock(&ctx->kdamond_lock);
- while (damon_kdamond_running(ctx))
- usleep_range(ctx->sample_interval,
- ctx->sample_interval * 2);
- return 0;
- }
- mutex_unlock(&ctx->kdamond_lock);
-
- return -EPERM;
-}
-
-/**
- * damon_stop() - Stops monitoring for a given group of contexts.
- * @ctxs: an array of the contexts to stop monitoring
- * @nr_ctxs: size of @ctxs
- *
- * Return: 0 on success, negative error code otherwise.
- */
-int damon_stop(struct damon_ctx *ctxs, int nr_ctxs)
-{
- int i, err = 0;
-
- for (i = 0; i < nr_ctxs; i++) {
- /* nr_running_ctxs is decremented in kdamond_fn */
- err = __damon_stop(&ctxs[i]);
- if (err)
- return err;
- }
-
- return err;
-}
-
-int damon_stop_ctx_ptrs(struct damon_ctx **ctxs, int nr_ctxs)
-{
- int i, err = 0;
-
- for (i = 0; i < nr_ctxs; i++) {
- /* nr_running_ctxs is decremented in kdamond_fn */
- err = __damon_stop(ctxs[i]);
- if (err)
- return err;
- }
-
- return err;
-}
-
-/**
- * damon_set_schemes() - Set data access monitoring based operation schemes.
- * @ctx: monitoring context
- * @schemes: array of the schemes
- * @nr_schemes: number of entries in @schemes
- *
- * This function should not be called while the kdamond of the context is
- * running.
- *
- * Return: 0 on success, or negative error code otherwise.
- */
-int damon_set_schemes(struct damon_ctx *ctx, struct damos **schemes,
- ssize_t nr_schemes)
-{
- struct damos *s, *next;
- ssize_t i;
-
- damon_for_each_scheme_safe(s, next, ctx)
- damon_destroy_scheme(s);
- for (i = 0; i < nr_schemes; i++)
- damon_add_scheme(ctx, schemes[i]);
- return 0;
-}
-
-/**
- * damon_set_targets() - Set monitoring targets.
- * @ctx: monitoring context
- * @ids: array of target ids
- * @nr_ids: number of entries in @ids
- *
- * This function should not be called while the kdamond is running.
- *
- * Return: 0 on success, negative error code otherwise.
- */
-int damon_set_targets(struct damon_ctx *ctx,
- unsigned long *ids, ssize_t nr_ids)
-{
- ssize_t i;
- struct damon_target *t, *next;
-
- damon_for_each_target_safe(t, next, ctx)
- damon_destroy_target(t);
-
- for (i = 0; i < nr_ids; i++) {
- t = damon_new_target(ids[i]);
- if (!t) {
- pr_err("Failed to alloc damon_target\n");
- return -ENOMEM;
- }
- damon_add_target(ctx, t);
- }
-
- return 0;
-}
-
-/**
- * damon_set_recording() - Set attributes for the recording.
- * @ctx: target kdamond context
- * @rbuf_len: length of the result buffer
- * @rfile_path: path to the monitor result files
- *
- * Setting 'rbuf_len' to 0 disables recording.
- *
- * This function should not be called while the kdamond is running.
- *
- * Return: 0 on success, negative error code otherwise.
- */
-int damon_set_recording(struct damon_ctx *ctx,
- unsigned int rbuf_len, char *rfile_path)
-{
- size_t rfile_path_len;
-
- if (rbuf_len && (rbuf_len > MAX_RECORD_BUFFER_LEN ||
- rbuf_len < MIN_RECORD_BUFFER_LEN)) {
- pr_err("result buffer size (%u) is out of [%d,%d]\n",
- rbuf_len, MIN_RECORD_BUFFER_LEN,
- MAX_RECORD_BUFFER_LEN);
- return -EINVAL;
- }
- rfile_path_len = strnlen(rfile_path, MAX_RFILE_PATH_LEN);
- if (rfile_path_len >= MAX_RFILE_PATH_LEN) {
- pr_err("too long (>%d) result file path %s\n",
- MAX_RFILE_PATH_LEN, rfile_path);
- return -EINVAL;
- }
- ctx->rbuf_len = rbuf_len;
- kfree(ctx->rbuf);
- ctx->rbuf = NULL;
- kfree(ctx->rfile_path);
- ctx->rfile_path = NULL;
-
- if (rbuf_len) {
- ctx->rbuf = kvmalloc(rbuf_len, GFP_KERNEL);
- if (!ctx->rbuf)
- return -ENOMEM;
- }
- ctx->rfile_path = kmalloc(rfile_path_len + 1, GFP_KERNEL);
- if (!ctx->rfile_path)
- return -ENOMEM;
- strncpy(ctx->rfile_path, rfile_path, rfile_path_len + 1);
- return 0;
-}
-
-/**
- * damon_set_attrs() - Set attributes for the monitoring.
- * @ctx: monitoring context
- * @sample_int: time interval between samplings
- * @regions_update_int: time interval between target regions update
- * @aggr_int: time interval between aggregations
- * @min_nr_reg: minimal number of regions
- * @max_nr_reg: maximum number of regions
- *
- * This function should not be called while the kdamond is running.
- * Every time interval is in micro-seconds.
- *
- * Return: 0 on success, negative error code otherwise.
- */
-int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
- unsigned long aggr_int, unsigned long regions_update_int,
- unsigned long min_nr_reg, unsigned long max_nr_reg)
-{
- if (min_nr_reg < 3) {
- pr_err("min_nr_regions (%lu) must be at least 3\n",
- min_nr_reg);
- return -EINVAL;
- }
- if (min_nr_reg > max_nr_reg) {
- pr_err("invalid nr_regions. min (%lu) > max (%lu)\n",
- min_nr_reg, max_nr_reg);
- return -EINVAL;
- }
-
- ctx->sample_interval = sample_int;
- ctx->aggr_interval = aggr_int;
- ctx->regions_update_interval = regions_update_int;
- ctx->min_nr_regions = min_nr_reg;
- ctx->max_nr_regions = max_nr_reg;
-
- return 0;
-}
-
-/*
- * Functions for the DAMON debugfs interface
- */
-
-/* Monitoring contexts for debugfs interface users. */
-static struct damon_ctx **debugfs_ctxs;
-static int debugfs_nr_ctxs = 1;
-
-static ssize_t debugfs_monitor_on_read(struct file *file,
- char __user *buf, size_t count, loff_t *ppos)
-{
- char monitor_on_buf[5];
- bool monitor_on;
- int len;
-
- mutex_lock(&damon_lock);
- monitor_on = nr_running_ctxs != 0;
- mutex_unlock(&damon_lock);
-
- len = scnprintf(monitor_on_buf, 5, monitor_on ? "on\n" : "off\n");
-
- return simple_read_from_buffer(buf, count, ppos, monitor_on_buf, len);
-}
-
-/*
- * Returns non-empty string on success, negative error code otherwise.
- */
-static char *user_input_str(const char __user *buf, size_t count, loff_t *ppos)
-{
- char *kbuf;
- ssize_t ret;
-
- /* We do not accept continuous write */
- if (*ppos)
- return ERR_PTR(-EINVAL);
-
- kbuf = kmalloc(count + 1, GFP_KERNEL);
- if (!kbuf)
- return ERR_PTR(-ENOMEM);
-
- ret = simple_write_to_buffer(kbuf, count + 1, ppos, buf, count);
- if (ret != count) {
- kfree(kbuf);
- return ERR_PTR(-EIO);
- }
- kbuf[ret] = '\0';
-
- return kbuf;
-}
-
-static ssize_t debugfs_monitor_on_write(struct file *file,
- const char __user *buf, size_t count, loff_t *ppos)
-{
- ssize_t ret = count;
- char *kbuf;
- int err;
-
- kbuf = user_input_str(buf, count, ppos);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
-
- /* Remove white space */
- if (sscanf(kbuf, "%s", kbuf) != 1)
- return -EINVAL;
- if (!strncmp(kbuf, "on", count))
- err = damon_start_ctx_ptrs(debugfs_ctxs, debugfs_nr_ctxs);
- else if (!strncmp(kbuf, "off", count))
- err = damon_stop_ctx_ptrs(debugfs_ctxs, debugfs_nr_ctxs);
- else
- return -EINVAL;
-
- if (err)
- ret = err;
- return ret;
-}
-
-static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
-{
- struct damos *s;
- int written = 0;
- int rc;
-
- damon_for_each_scheme(s, c) {
- rc = scnprintf(&buf[written], len - written,
- "%lu %lu %u %u %u %u %d %lu %lu\n",
- s->min_sz_region, s->max_sz_region,
- s->min_nr_accesses, s->max_nr_accesses,
- s->min_age_region, s->max_age_region,
- s->action, s->stat_count, s->stat_sz);
- if (!rc)
- return -ENOMEM;
-
- written += rc;
- }
- return written;
-}
-
-static ssize_t debugfs_schemes_read(struct file *file, char __user *buf,
- size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- char *kbuf;
- ssize_t len;
-
- kbuf = kmalloc(count, GFP_KERNEL);
- if (!kbuf)
- return -ENOMEM;
-
- mutex_lock(&ctx->kdamond_lock);
- len = sprint_schemes(ctx, kbuf, count);
- mutex_unlock(&ctx->kdamond_lock);
- if (len < 0)
- goto out;
- len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
-
-out:
- kfree(kbuf);
- return len;
-}
-
-static void free_schemes_arr(struct damos **schemes, ssize_t nr_schemes)
-{
- ssize_t i;
-
- for (i = 0; i < nr_schemes; i++)
- kfree(schemes[i]);
- kfree(schemes);
-}
-
-static bool damos_action_valid(int action)
-{
- switch (action) {
- case DAMOS_WILLNEED:
- case DAMOS_COLD:
- case DAMOS_PAGEOUT:
- case DAMOS_HUGEPAGE:
- case DAMOS_NOHUGEPAGE:
- case DAMOS_STAT:
- return true;
- default:
- return false;
- }
-}
-
-/*
- * Converts a string into an array of struct damos pointers
- *
- * Returns an array of the converted struct damos pointers if the conversion
- * succeeds, or NULL otherwise.
- */
-static struct damos **str_to_schemes(const char *str, ssize_t len,
- ssize_t *nr_schemes)
-{
- struct damos *scheme, **schemes;
- const int max_nr_schemes = 256;
- int pos = 0, parsed, ret;
- unsigned long min_sz, max_sz;
- unsigned int min_nr_a, max_nr_a, min_age, max_age;
- unsigned int action;
-
- schemes = kmalloc_array(max_nr_schemes, sizeof(scheme),
- GFP_KERNEL);
- if (!schemes)
- return NULL;
-
- *nr_schemes = 0;
- while (pos < len && *nr_schemes < max_nr_schemes) {
- ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n",
- &min_sz, &max_sz, &min_nr_a, &max_nr_a,
- &min_age, &max_age, &action, &parsed);
- if (ret != 7)
- break;
- if (!damos_action_valid(action)) {
- pr_err("wrong action %d\n", action);
- goto fail;
- }
-
- pos += parsed;
- scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
- min_age, max_age, action);
- if (!scheme)
- goto fail;
-
- schemes[*nr_schemes] = scheme;
- *nr_schemes += 1;
- }
- return schemes;
-fail:
- free_schemes_arr(schemes, *nr_schemes);
- return NULL;
-}
-
-static ssize_t debugfs_schemes_write(struct file *file, const char __user *buf,
- size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- char *kbuf;
- struct damos **schemes;
- ssize_t nr_schemes = 0, ret = count;
- int err;
-
- kbuf = user_input_str(buf, count, ppos);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
-
- schemes = str_to_schemes(kbuf, ret, &nr_schemes);
- if (!schemes) {
- ret = -EINVAL;
- goto out;
- }
-
- mutex_lock(&ctx->kdamond_lock);
- if (ctx->kdamond) {
- ret = -EBUSY;
- goto unlock_out;
- }
-
- err = damon_set_schemes(ctx, schemes, nr_schemes);
- if (err)
- ret = err;
- else
- nr_schemes = 0;
-unlock_out:
- mutex_unlock(&ctx->kdamond_lock);
- free_schemes_arr(schemes, nr_schemes);
-out:
- kfree(kbuf);
- return ret;
-}
-
-#define targetid_is_pid(ctx) \
- (ctx->target_valid == kdamond_vm_target_valid)
-
-static ssize_t sprint_target_ids(struct damon_ctx *ctx, char *buf, ssize_t len)
-{
- struct damon_target *t;
- unsigned long id;
- int written = 0;
- int rc;
-
- damon_for_each_target(t, ctx) {
- id = t->id;
- if (targetid_is_pid(ctx))
- /* Show pid numbers to debugfs users */
- id = (unsigned long)pid_vnr((struct pid *)id);
-
- rc = scnprintf(&buf[written], len - written, "%lu ", id);
- if (!rc)
- return -ENOMEM;
- written += rc;
- }
- if (written)
- written -= 1;
- written += scnprintf(&buf[written], len - written, "\n");
- return written;
-}
-
-static ssize_t debugfs_target_ids_read(struct file *file,
- char __user *buf, size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- ssize_t len;
- char ids_buf[320];
-
- mutex_lock(&ctx->kdamond_lock);
- len = sprint_target_ids(ctx, ids_buf, 320);
- mutex_unlock(&ctx->kdamond_lock);
- if (len < 0)
- return len;
-
- return simple_read_from_buffer(buf, count, ppos, ids_buf, len);
-}
-
-/*
- * Converts a string into an array of unsigned long integers
- *
- * Returns an array of unsigned long integers if the conversion succeeds, or
- * NULL otherwise.
- */
-static unsigned long *str_to_target_ids(const char *str, ssize_t len,
- ssize_t *nr_ids)
-{
- unsigned long *ids;
- const int max_nr_ids = 32;
- unsigned long id;
- int pos = 0, parsed, ret;
-
- *nr_ids = 0;
- ids = kmalloc_array(max_nr_ids, sizeof(id), GFP_KERNEL);
- if (!ids)
- return NULL;
- while (*nr_ids < max_nr_ids && pos < len) {
- ret = sscanf(&str[pos], "%lu%n", &id, &parsed);
- pos += parsed;
- if (ret != 1)
- break;
- ids[*nr_ids] = id;
- *nr_ids += 1;
- }
-
- return ids;
-}
-
-/* Returns pid for the given pidfd if it's valid, or NULL otherwise. */
-static struct pid *damon_get_pidfd_pid(unsigned int pidfd)
-{
- struct fd f;
- struct pid *pid;
-
- f = fdget(pidfd);
- if (!f.file)
- return NULL;
-
- pid = pidfd_pid(f.file);
- if (!IS_ERR(pid))
- get_pid(pid);
- else
- pid = NULL;
-
- fdput(f);
- return pid;
-}
-
-static ssize_t debugfs_target_ids_write(struct file *file,
- const char __user *buf, size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- char *kbuf, *nrs;
- bool received_pidfds = false;
- unsigned long *targets;
- ssize_t nr_targets;
- ssize_t ret = count;
- int i;
- int err;
-
- kbuf = user_input_str(buf, count, ppos);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
-
- nrs = kbuf;
- if (!strncmp(kbuf, "paddr\n", count)) {
- /* Configure the context for physical memory monitoring */
- damon_set_paddr_primitives(ctx);
- /* target id is meaningless here, but we set it just for fun */
- scnprintf(kbuf, count, "42 ");
- } else {
- /* Configure the context for virtual memory monitoring */
- damon_set_vaddr_primitives(ctx);
- if (!strncmp(kbuf, "pidfd ", 6)) {
- received_pidfds = true;
- nrs = &kbuf[6];
- }
- }
-
- targets = str_to_target_ids(nrs, ret, &nr_targets);
- if (!targets) {
- ret = -ENOMEM;
- goto out;
- }
-
- if (received_pidfds) {
- for (i = 0; i < nr_targets; i++)
- targets[i] = (unsigned long)damon_get_pidfd_pid(
- (unsigned int)targets[i]);
- } else if (targetid_is_pid(ctx)) {
- for (i = 0; i < nr_targets; i++)
- targets[i] = (unsigned long)find_get_pid(
- (int)targets[i]);
- }
-
- mutex_lock(&ctx->kdamond_lock);
- if (ctx->kdamond) {
- ret = -EINVAL;
- goto unlock_out;
- }
-
- err = damon_set_targets(ctx, targets, nr_targets);
- if (err)
- ret = err;
-unlock_out:
- mutex_unlock(&ctx->kdamond_lock);
- kfree(targets);
-out:
- kfree(kbuf);
- return ret;
-}
-
-static ssize_t debugfs_record_read(struct file *file,
- char __user *buf, size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- char record_buf[20 + MAX_RFILE_PATH_LEN];
- int ret;
-
- mutex_lock(&ctx->kdamond_lock);
- ret = scnprintf(record_buf, ARRAY_SIZE(record_buf), "%u %s\n",
- ctx->rbuf_len, ctx->rfile_path);
- mutex_unlock(&ctx->kdamond_lock);
- return simple_read_from_buffer(buf, count, ppos, record_buf, ret);
-}
-
-static ssize_t debugfs_record_write(struct file *file,
- const char __user *buf, size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- char *kbuf;
- unsigned int rbuf_len;
- char rfile_path[MAX_RFILE_PATH_LEN];
- ssize_t ret = count;
- int err;
-
- kbuf = user_input_str(buf, count, ppos);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
-
- if (sscanf(kbuf, "%u %s",
- &rbuf_len, rfile_path) != 2) {
- ret = -EINVAL;
- goto out;
- }
-
- mutex_lock(&ctx->kdamond_lock);
- if (ctx->kdamond) {
- ret = -EBUSY;
- goto unlock_out;
- }
-
- err = damon_set_recording(ctx, rbuf_len, rfile_path);
- if (err)
- ret = err;
-unlock_out:
- mutex_unlock(&ctx->kdamond_lock);
-out:
- kfree(kbuf);
- return ret;
-}
-
-static ssize_t sprint_init_regions(struct damon_ctx *c, char *buf, ssize_t len)
-{
- struct damon_target *t;
- struct damon_region *r;
- int written = 0;
- int rc;
-
- damon_for_each_target(t, c) {
- damon_for_each_region(r, t) {
- rc = scnprintf(&buf[written], len - written,
- "%lu %lu %lu\n",
- t->id, r->ar.start, r->ar.end);
- if (!rc)
- return -ENOMEM;
- written += rc;
- }
- }
- return written;
-}
-
-static ssize_t debugfs_init_regions_read(struct file *file, char __user *buf,
- size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- char *kbuf;
- ssize_t len;
-
- kbuf = kmalloc(count, GFP_KERNEL);
- if (!kbuf)
- return -ENOMEM;
-
- mutex_lock(&ctx->kdamond_lock);
- if (ctx->kdamond) {
- mutex_unlock(&ctx->kdamond_lock);
- return -EBUSY;
- }
-
- len = sprint_init_regions(ctx, kbuf, count);
- mutex_unlock(&ctx->kdamond_lock);
- if (len < 0)
- goto out;
- len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
-
-out:
- kfree(kbuf);
- return len;
-}
-
-static int add_init_region(struct damon_ctx *c,
- unsigned long target_id, struct damon_addr_range *ar)
-{
- struct damon_target *t;
- struct damon_region *r, *prev;
- int rc = -EINVAL;
-
- if (ar->start >= ar->end)
- return -EINVAL;
-
- damon_for_each_target(t, c) {
- if (t->id == target_id) {
- r = damon_new_region(ar->start, ar->end);
- if (!r)
- return -ENOMEM;
- damon_add_region(r, t);
- if (nr_damon_regions(t) > 1) {
- prev = damon_prev_region(r);
- if (prev->ar.end > r->ar.start) {
- damon_destroy_region(r);
- return -EINVAL;
- }
- }
- rc = 0;
- }
- }
- return rc;
-}
-
-static int set_init_regions(struct damon_ctx *c, const char *str, ssize_t len)
-{
- struct damon_target *t;
- struct damon_region *r, *next;
- int pos = 0, parsed, ret;
- unsigned long target_id;
- struct damon_addr_range ar;
- int err;
-
- damon_for_each_target(t, c) {
- damon_for_each_region_safe(r, next, t)
- damon_destroy_region(r);
- }
-
- while (pos < len) {
- ret = sscanf(&str[pos], "%lu %lu %lu%n",
- &target_id, &ar.start, &ar.end, &parsed);
- if (ret != 3)
- break;
- err = add_init_region(c, target_id, &ar);
- if (err)
- goto fail;
- pos += parsed;
- }
-
- return 0;
-
-fail:
- damon_for_each_target(t, c) {
- damon_for_each_region_safe(r, next, t)
- damon_destroy_region(r);
- }
- return err;
-}
-
-static ssize_t debugfs_init_regions_write(struct file *file,
- const char __user *buf, size_t count,
- loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- char *kbuf;
- ssize_t ret = count;
- int err;
-
- kbuf = user_input_str(buf, count, ppos);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
-
- mutex_lock(&ctx->kdamond_lock);
- if (ctx->kdamond) {
- ret = -EBUSY;
- goto unlock_out;
- }
-
- err = set_init_regions(ctx, kbuf, ret);
- if (err)
- ret = err;
-
-unlock_out:
- mutex_unlock(&ctx->kdamond_lock);
- kfree(kbuf);
- return ret;
-}
-
-static ssize_t debugfs_attrs_read(struct file *file,
- char __user *buf, size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- char kbuf[128];
- int ret;
-
- mutex_lock(&ctx->kdamond_lock);
- ret = scnprintf(kbuf, ARRAY_SIZE(kbuf), "%lu %lu %lu %lu %lu\n",
- ctx->sample_interval, ctx->aggr_interval,
- ctx->regions_update_interval, ctx->min_nr_regions,
- ctx->max_nr_regions);
- mutex_unlock(&ctx->kdamond_lock);
-
- return simple_read_from_buffer(buf, count, ppos, kbuf, ret);
-}
-
-static ssize_t debugfs_attrs_write(struct file *file,
- const char __user *buf, size_t count, loff_t *ppos)
-{
- struct damon_ctx *ctx = file->private_data;
- unsigned long s, a, r, minr, maxr;
- char *kbuf;
- ssize_t ret = count;
- int err;
-
- kbuf = user_input_str(buf, count, ppos);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
-
- if (sscanf(kbuf, "%lu %lu %lu %lu %lu",
- &s, &a, &r, &minr, &maxr) != 5) {
- ret = -EINVAL;
- goto out;
- }
-
- mutex_lock(&ctx->kdamond_lock);
- if (ctx->kdamond) {
- ret = -EBUSY;
- goto unlock_out;
- }
-
- err = damon_set_attrs(ctx, s, a, r, minr, maxr);
- if (err)
- ret = err;
-unlock_out:
- mutex_unlock(&ctx->kdamond_lock);
-out:
- kfree(kbuf);
- return ret;
-}
-
-static ssize_t debugfs_nr_contexts_read(struct file *file,
- char __user *buf, size_t count, loff_t *ppos)
-{
- char kbuf[32];
- int ret;
-
- mutex_lock(&damon_lock);
- ret = scnprintf(kbuf, ARRAY_SIZE(kbuf), "%d\n", debugfs_nr_ctxs);
- mutex_unlock(&damon_lock);
-
- return simple_read_from_buffer(buf, count, ppos, kbuf, ret);
-}
-
-static struct dentry **debugfs_dirs;
-
-static int debugfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx);
-
-static ssize_t debugfs_nr_contexts_write(struct file *file,
- const char __user *buf, size_t count, loff_t *ppos)
-{
- char *kbuf;
- ssize_t ret = count;
- int nr_contexts, i;
- char dirname[32];
- struct dentry *root;
- struct dentry **new_dirs;
- struct damon_ctx **new_ctxs;
-
- kbuf = user_input_str(buf, count, ppos);
- if (IS_ERR(kbuf))
- return PTR_ERR(kbuf);
-
- if (sscanf(kbuf, "%d", &nr_contexts) != 1) {
- ret = -EINVAL;
- goto out;
- }
- if (nr_contexts < 1) {
- pr_err("nr_contexts should be >=1\n");
- ret = -EINVAL;
- goto out;
- }
- if (nr_contexts == debugfs_nr_ctxs)
- goto out;
-
- mutex_lock(&damon_lock);
- if (nr_running_ctxs) {
- ret = -EBUSY;
- goto unlock_out;
- }
-
- for (i = nr_contexts; i < debugfs_nr_ctxs; i++) {
- debugfs_remove(debugfs_dirs[i]);
- damon_destroy_ctx(debugfs_ctxs[i]);
- }
-
- new_dirs = kmalloc_array(nr_contexts, sizeof(*new_dirs), GFP_KERNEL);
- if (!new_dirs) {
- ret = -ENOMEM;
- goto unlock_out;
- }
-
- new_ctxs = kmalloc_array(nr_contexts, sizeof(*debugfs_ctxs),
- GFP_KERNEL);
- if (!new_ctxs) {
- ret = -ENOMEM;
- goto unlock_out;
- }
-
- for (i = 0; i < debugfs_nr_ctxs && i < nr_contexts; i++) {
- new_dirs[i] = debugfs_dirs[i];
- new_ctxs[i] = debugfs_ctxs[i];
- }
- kfree(debugfs_dirs);
- debugfs_dirs = new_dirs;
- kfree(debugfs_ctxs);
- debugfs_ctxs = new_ctxs;
-
- root = debugfs_dirs[0];
- if (!root) {
- ret = -ENOENT;
- goto unlock_out;
- }
-
- for (i = debugfs_nr_ctxs; i < nr_contexts; i++) {
- scnprintf(dirname, sizeof(dirname), "ctx%d", i);
- debugfs_dirs[i] = debugfs_create_dir(dirname, root);
- if (!debugfs_dirs[i]) {
- pr_err("dir %s creation failed\n", dirname);
- ret = -ENOMEM;
- break;
- }
-
- debugfs_ctxs[i] = damon_new_ctx();
- if (!debugfs_ctxs[i]) {
- pr_err("ctx for %s creation failed\n", dirname);
- ret = -ENOMEM;
- break;
- }
-
- if (debugfs_fill_ctx_dir(debugfs_dirs[i], debugfs_ctxs[i])) {
- ret = -ENOMEM;
- break;
- }
- }
-
- debugfs_nr_ctxs = i;
-
-unlock_out:
- mutex_unlock(&damon_lock);
-
-out:
- kfree(kbuf);
- return ret;
-}
-
-static int damon_debugfs_open(struct inode *inode, struct file *file)
-{
- file->private_data = inode->i_private;
-
- return nonseekable_open(inode, file);
-}
-
-static const struct file_operations monitor_on_fops = {
- .owner = THIS_MODULE,
- .read = debugfs_monitor_on_read,
- .write = debugfs_monitor_on_write,
-};
-
-static const struct file_operations target_ids_fops = {
- .owner = THIS_MODULE,
- .open = damon_debugfs_open,
- .read = debugfs_target_ids_read,
- .write = debugfs_target_ids_write,
-};
-
-static const struct file_operations schemes_fops = {
- .owner = THIS_MODULE,
- .open = damon_debugfs_open,
- .read = debugfs_schemes_read,
- .write = debugfs_schemes_write,
-};
-
-static const struct file_operations record_fops = {
- .owner = THIS_MODULE,
- .open = damon_debugfs_open,
- .read = debugfs_record_read,
- .write = debugfs_record_write,
-};
-
-static const struct file_operations init_regions_fops = {
- .owner = THIS_MODULE,
- .open = damon_debugfs_open,
- .read = debugfs_init_regions_read,
- .write = debugfs_init_regions_write,
-};
-
-static const struct file_operations attrs_fops = {
- .owner = THIS_MODULE,
- .open = damon_debugfs_open,
- .read = debugfs_attrs_read,
- .write = debugfs_attrs_write,
-};
-
-static const struct file_operations nr_contexts_fops = {
- .owner = THIS_MODULE,
- .read = debugfs_nr_contexts_read,
- .write = debugfs_nr_contexts_write,
-};
-
-static int debugfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx)
-{
- const char * const file_names[] = {"attrs", "init_regions", "record",
- "schemes", "target_ids"};
- const struct file_operations *fops[] = {&attrs_fops,
- &init_regions_fops, &record_fops, &schemes_fops,
- &target_ids_fops};
- int i;
-
- for (i = 0; i < ARRAY_SIZE(file_names); i++) {
- if (!debugfs_create_file(file_names[i], 0600, dir,
- ctx, fops[i])) {
- pr_err("failed to create %s file\n", file_names[i]);
- return -ENOMEM;
- }
- }
-
- return 0;
-}
-
-static int __init damon_debugfs_init(void)
-{
- struct dentry *debugfs_root;
- const char * const file_names[] = {"nr_contexts", "monitor_on"};
- const struct file_operations *fops[] = {&nr_contexts_fops,
- &monitor_on_fops};
- int i;
-
- debugfs_root = debugfs_create_dir("damon", NULL);
- if (!debugfs_root) {
- pr_err("failed to create the debugfs dir\n");
- return -ENOMEM;
- }
-
- for (i = 0; i < ARRAY_SIZE(file_names); i++) {
- if (!debugfs_create_file(file_names[i], 0600, debugfs_root,
- NULL, fops[i])) {
- pr_err("failed to create %s file\n", file_names[i]);
- return -ENOMEM;
- }
- }
- debugfs_fill_ctx_dir(debugfs_root, debugfs_ctxs[0]);
-
- debugfs_dirs = kmalloc_array(1, sizeof(debugfs_root), GFP_KERNEL);
- debugfs_dirs[0] = debugfs_root;
-
- return 0;
-}
-
-/*
- * Functions for the initialization
- */
-
-static int __init damon_init(void)
-{
- int rc;
-
- debugfs_ctxs = kmalloc(sizeof(*debugfs_ctxs), GFP_KERNEL);
- debugfs_ctxs[0] = damon_new_ctx();
- if (!debugfs_ctxs[0])
- return -ENOMEM;
-
- rc = damon_debugfs_init();
- if (rc)
- pr_err("%s: debugfs init failed\n", __func__);
-
- return rc;
-}
-
-module_init(damon_init);
-
-#include "damon-test.h"
diff --git a/mm/damon/Kconfig b/mm/damon/Kconfig
new file mode 100644
index 000000000000..8b3f3dd3bd32
--- /dev/null
+++ b/mm/damon/Kconfig
@@ -0,0 +1,68 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+menu "Data Access Monitoring"
+
+config DAMON
+ bool "Data Access Monitor"
+ help
+	  This feature allows you to monitor the access frequency of each
+	  memory region. The information can be useful for performance-centric
+	  DRAM level memory management.
+
+ See https://damonitor.github.io/doc/html/latest-damon/index.html for
+ more information.
+ If unsure, say N.
+
+config DAMON_KUNIT_TEST
+ bool "Test for damon"
+ depends on DAMON && KUNIT
+ help
+	  This builds the DAMON KUnit test suite.
+
+ For more information on KUnit and unit tests in general, please refer
+ to the KUnit documentation.
+
+ If unsure, say N.
+
+config DAMON_PRIMITIVES
+ bool "DAMON primitives for virtual/physical address spaces monitoring"
+ depends on DAMON && MMU && !IDLE_PAGE_TRACKING
+ select PAGE_EXTENSION if !64BIT
+ select PAGE_IDLE_FLAG
+ help
+ This builds the default data access monitoring primitives for DAMON.
+	  The primitives support virtual address spaces and physical address
+	  spaces using the PG_idle flag.
+
+config DAMON_PRIMITIVES_KUNIT_TEST
+ bool "Test for DAMON primitives"
+ depends on DAMON_PRIMITIVES && KUNIT
+ help
+	  This builds the DAMON primitives KUnit test suite.
+
+ For more information on KUnit and unit tests in general, please refer
+ to the KUnit documentation.
+
+ If unsure, say N.
+
+config DAMON_DBGFS
+ bool "DAMON debugfs interface"
+ depends on DAMON_PRIMITIVES && DEBUG_FS
+ help
+	  This builds the debugfs interface for DAMON. User space admins can
+	  use the interface for arbitrary data access monitoring.
+
+ If unsure, say N.
+
+config DAMON_DBGFS_KUNIT_TEST
+ bool "Test for damon debugfs interface"
+ depends on DAMON_DBGFS && KUNIT
+ help
+	  This builds the DAMON debugfs interface KUnit test suite.
+
+ For more information on KUnit and unit tests in general, please refer
+ to the KUnit documentation.
+
+ If unsure, say N.
+
+endmenu
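Note that, among the options above, only DAMON_PRIMITIVES selects PAGE_IDLE_FLAG
and conflicts with IDLE_PAGE_TRACKING; CONFIG_DAMON alone builds just the core
framework, whose users wire up their own primitives through the monitoring
context callbacks. A rough sketch of such wiring is shown below (the callback
names are those referenced by kdamond_fn() in core.c; the exact prototypes live
in damon.h, which is not part of this hunk, so treat the signatures here as
assumptions):

static void my_prepare_access_checks(struct damon_ctx *ctx)
{
	/* arm whatever access check mechanism this user relies on */
}

static unsigned int my_check_accesses(struct damon_ctx *ctx)
{
	/* update '->nr_accesses' of the regions; return the maximum seen */
	return 0;
}

static bool my_target_valid(struct damon_target *t)
{
	/* kdamond stops once every target becomes invalid */
	return true;
}

static void setup_custom_primitives(struct damon_ctx *ctx)
{
	ctx->prepare_access_checks = my_prepare_access_checks;
	ctx->check_accesses = my_check_accesses;
	ctx->target_valid = my_target_valid;
}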
diff --git a/mm/damon/Makefile b/mm/damon/Makefile
new file mode 100644
index 000000000000..2295deb2fe0e
--- /dev/null
+++ b/mm/damon/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_DAMON) := core.o
+obj-$(CONFIG_DAMON_PRIMITIVES) += primitives.o
+obj-$(CONFIG_DAMON_DBGFS) += dbgfs.o
diff --git a/mm/damon/core-test.h b/mm/damon/core-test.h
new file mode 100644
index 000000000000..c916d773397a
--- /dev/null
+++ b/mm/damon/core-test.h
@@ -0,0 +1,288 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Data Access Monitor Unit Tests
+ *
+ * Copyright 2019 Amazon.com, Inc. or its affiliates. All rights reserved.
+ *
+ * Author: SeongJae Park <[email protected]>
+ */
+
+#ifdef CONFIG_DAMON_KUNIT_TEST
+
+#ifndef _DAMON_CORE_TEST_H
+#define _DAMON_CORE_TEST_H
+
+#include <kunit/test.h>
+
+static void damon_test_regions(struct kunit *test)
+{
+ struct damon_region *r;
+ struct damon_target *t;
+
+ r = damon_new_region(1, 2);
+ KUNIT_EXPECT_EQ(test, 1ul, r->ar.start);
+ KUNIT_EXPECT_EQ(test, 2ul, r->ar.end);
+ KUNIT_EXPECT_EQ(test, 0u, r->nr_accesses);
+
+ t = damon_new_target(42);
+ KUNIT_EXPECT_EQ(test, 0u, damon_nr_regions(t));
+
+ damon_add_region(r, t);
+ KUNIT_EXPECT_EQ(test, 1u, damon_nr_regions(t));
+
+ damon_del_region(r);
+ KUNIT_EXPECT_EQ(test, 0u, damon_nr_regions(t));
+
+ damon_free_target(t);
+}
+
+static void damon_test_target(struct kunit *test)
+{
+ struct damon_ctx *c = damon_new_ctx();
+ struct damon_target *t;
+
+ t = damon_new_target(42);
+ KUNIT_EXPECT_EQ(test, 42ul, t->id);
+ KUNIT_EXPECT_EQ(test, 0u, nr_damon_targets(c));
+
+ damon_add_target(c, t);
+ KUNIT_EXPECT_EQ(test, 1u, nr_damon_targets(c));
+
+ damon_destroy_target(t);
+ KUNIT_EXPECT_EQ(test, 0u, nr_damon_targets(c));
+
+ damon_destroy_ctx(c);
+}
+
+static void damon_test_set_recording(struct kunit *test)
+{
+ struct damon_ctx *ctx = damon_new_ctx();
+ int err;
+
+ err = damon_set_recording(ctx, 42, "foo");
+ KUNIT_EXPECT_EQ(test, err, -EINVAL);
+ damon_set_recording(ctx, 4242, "foo.bar");
+ KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 4242u);
+ KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo.bar");
+ damon_set_recording(ctx, 424242, "foo");
+ KUNIT_EXPECT_EQ(test, ctx->rbuf_len, 424242u);
+ KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo");
+
+ damon_destroy_ctx(ctx);
+}
+
+/*
+ * Test kdamond_reset_aggregated()
+ *
+ * DAMON checks access to each region and aggregates this information as the
+ * access frequency of each region. In detail, it increases '->nr_accesses' of
+ * each region in which an access has been detected.
+ * 'kdamond_reset_aggregated()' flushes the aggregated information
+ * ('->nr_accesses' of each region) to the result buffer. As a result of the
+ * flushing, '->nr_accesses' of every region is reset to zero.
+ */
+static void damon_test_aggregate(struct kunit *test)
+{
+ struct damon_ctx *ctx = damon_new_ctx();
+ unsigned long target_ids[] = {1, 2, 3};
+ unsigned long saddr[][3] = {{10, 20, 30}, {5, 42, 49}, {13, 33, 55} };
+ unsigned long eaddr[][3] = {{15, 27, 40}, {31, 45, 55}, {23, 44, 66} };
+ unsigned long accesses[][3] = {{42, 95, 84}, {10, 20, 30}, {0, 1, 2} };
+ struct damon_target *t;
+ struct damon_region *r;
+ int it, ir;
+ ssize_t sz, sr, sp;
+
+ damon_set_recording(ctx, 4242, "damon.data");
+ damon_set_targets(ctx, target_ids, 3);
+
+ it = 0;
+ damon_for_each_target(t, ctx) {
+ for (ir = 0; ir < 3; ir++) {
+ r = damon_new_region(saddr[it][ir], eaddr[it][ir]);
+ r->nr_accesses = accesses[it][ir];
+ damon_add_region(r, t);
+ }
+ it++;
+ }
+ kdamond_reset_aggregated(ctx);
+ it = 0;
+ damon_for_each_target(t, ctx) {
+ ir = 0;
+ /* '->nr_accesses' should be zeroed */
+ damon_for_each_region(r, t) {
+ KUNIT_EXPECT_EQ(test, 0u, r->nr_accesses);
+ ir++;
+ }
+ /* regions should be preserved */
+ KUNIT_EXPECT_EQ(test, 3, ir);
+ it++;
+ }
+ /* targets also should be preserved */
+ KUNIT_EXPECT_EQ(test, 3, it);
+
+ /* The aggregated information should be written in the buffer */
+ sr = sizeof(r->ar.start) + sizeof(r->ar.end) + sizeof(r->nr_accesses);
+ sp = sizeof(t->id) + sizeof(unsigned int) + 3 * sr;
+ sz = sizeof(struct timespec64) + sizeof(unsigned int) + 3 * sp;
+ KUNIT_EXPECT_EQ(test, (unsigned int)sz, ctx->rbuf_offset);
+
+ damon_destroy_ctx(ctx);
+}
+
+static void damon_test_write_rbuf(struct kunit *test)
+{
+ struct damon_ctx *ctx = damon_new_ctx();
+ char *data;
+
+ damon_set_recording(ctx, 4242, "damon.data");
+
+ data = "hello";
+ damon_write_rbuf(ctx, data, strnlen(data, 256));
+ KUNIT_EXPECT_EQ(test, ctx->rbuf_offset, 5u);
+
+ damon_write_rbuf(ctx, data, 0);
+ KUNIT_EXPECT_EQ(test, ctx->rbuf_offset, 5u);
+
+ KUNIT_EXPECT_STREQ(test, (char *)ctx->rbuf, data);
+
+ damon_destroy_ctx(ctx);
+}
+
+static void damon_test_split_at(struct kunit *test)
+{
+ struct damon_ctx *c = damon_new_ctx();
+ struct damon_target *t;
+ struct damon_region *r;
+
+ t = damon_new_target(42);
+ r = damon_new_region(0, 100);
+ damon_add_region(r, t);
+ damon_split_region_at(c, r, 25);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 0ul);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 25ul);
+
+ r = damon_next_region(r);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 25ul);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 100ul);
+
+ damon_free_target(t);
+ damon_destroy_ctx(c);
+}
+
+static void damon_test_merge_two(struct kunit *test)
+{
+ struct damon_target *t;
+ struct damon_region *r, *r2, *r3;
+ int i;
+
+ t = damon_new_target(42);
+ r = damon_new_region(0, 100);
+ r->nr_accesses = 10;
+ damon_add_region(r, t);
+ r2 = damon_new_region(100, 300);
+ r2->nr_accesses = 20;
+ damon_add_region(r2, t);
+
+ damon_merge_two_regions(r, r2);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 0ul);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 300ul);
+ KUNIT_EXPECT_EQ(test, r->nr_accesses, 16u);
+
+ i = 0;
+ damon_for_each_region(r3, t) {
+ KUNIT_EXPECT_PTR_EQ(test, r, r3);
+ i++;
+ }
+ KUNIT_EXPECT_EQ(test, i, 1);
+
+ damon_free_target(t);
+}
+
+static struct damon_region *__nth_region_of(struct damon_target *t, int idx)
+{
+ struct damon_region *r;
+ unsigned int i = 0;
+
+ damon_for_each_region(r, t) {
+ if (i++ == idx)
+ return r;
+ }
+
+ return NULL;
+}
+
+static void damon_test_merge_regions_of(struct kunit *test)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+ unsigned long sa[] = {0, 100, 114, 122, 130, 156, 170, 184};
+ unsigned long ea[] = {100, 112, 122, 130, 156, 170, 184, 230};
+ unsigned int nrs[] = {0, 0, 10, 10, 20, 30, 1, 2};
+
+ unsigned long saddrs[] = {0, 114, 130, 156, 170};
+ unsigned long eaddrs[] = {112, 130, 156, 170, 230};
+ int i;
+
+ t = damon_new_target(42);
+ for (i = 0; i < ARRAY_SIZE(sa); i++) {
+ r = damon_new_region(sa[i], ea[i]);
+ r->nr_accesses = nrs[i];
+ damon_add_region(r, t);
+ }
+
+ damon_merge_regions_of(t, 9, 9999);
+	/* 0-112, 114-130, 130-156, 156-170, 170-230 */
+ KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 5u);
+ for (i = 0; i < 5; i++) {
+ r = __nth_region_of(t, i);
+ KUNIT_EXPECT_EQ(test, r->ar.start, saddrs[i]);
+ KUNIT_EXPECT_EQ(test, r->ar.end, eaddrs[i]);
+ }
+ damon_free_target(t);
+}
+
+static void damon_test_split_regions_of(struct kunit *test)
+{
+ struct damon_ctx *c = damon_new_ctx();
+ struct damon_target *t;
+ struct damon_region *r;
+
+ t = damon_new_target(42);
+ r = damon_new_region(0, 22);
+ damon_add_region(r, t);
+ damon_split_regions_of(c, t, 2);
+ KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 2u);
+ damon_free_target(t);
+
+ t = damon_new_target(42);
+ r = damon_new_region(0, 220);
+ damon_add_region(r, t);
+ damon_split_regions_of(c, t, 4);
+ KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 4u);
+ damon_free_target(t);
+ damon_destroy_ctx(c);
+}
+
+static struct kunit_case damon_test_cases[] = {
+ KUNIT_CASE(damon_test_target),
+ KUNIT_CASE(damon_test_regions),
+ KUNIT_CASE(damon_test_set_recording),
+ KUNIT_CASE(damon_test_aggregate),
+ KUNIT_CASE(damon_test_write_rbuf),
+ KUNIT_CASE(damon_test_split_at),
+ KUNIT_CASE(damon_test_merge_two),
+ KUNIT_CASE(damon_test_merge_regions_of),
+ KUNIT_CASE(damon_test_split_regions_of),
+ {},
+};
+
+static struct kunit_suite damon_test_suite = {
+ .name = "damon",
+ .test_cases = damon_test_cases,
+};
+kunit_test_suite(damon_test_suite);
+
+#endif /* _DAMON_CORE_TEST_H */
+
+#endif /* CONFIG_DAMON_KUNIT_TEST */
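As a side note on damon_test_aggregate() above: the expected 'ctx->rbuf_offset'
simply follows the record layout that kdamond_reset_aggregated() flushes. On a
typical 64-bit build (type sizes assumed here for illustration; the test itself
only relies on sizeof()), the arithmetic works out as:

	/* one region record:      start (8) + end (8) + nr_accesses (4)     =  20 bytes */
	/* one target record:      id (8) + nr_regions (4) + 3 * 20          =  72 bytes */
	/* one aggregation record: timespec64 (16) + nr_targets (4) + 3 * 72 = 236 bytes */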
diff --git a/mm/damon/core.c b/mm/damon/core.c
new file mode 100644
index 000000000000..d85ade7b5e23
--- /dev/null
+++ b/mm/damon/core.c
@@ -0,0 +1,1065 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Data Access Monitor
+ *
+ * Author: SeongJae Park <[email protected]>
+ */
+
+#define pr_fmt(fmt) "damon: " fmt
+
+#include <asm-generic/mman-common.h>
+#include <linux/damon.h>
+#include <linux/delay.h>
+#include <linux/kthread.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/random.h>
+#include <linux/sched/mm.h>
+#include <linux/slab.h>
+
+#include "damon.h"
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/damon.h>
+
+/* Minimal region size. Every damon_region is aligned to this. */
+#ifndef CONFIG_DAMON_KUNIT_TEST
+#define MIN_REGION PAGE_SIZE
+#else
+#define MIN_REGION 1
+#endif
+
+/*
+ * Functions and macros for DAMON data structures
+ */
+
+static DEFINE_MUTEX(damon_lock);
+static int nr_running_ctxs;
+
+/*
+ * Construct a damon_region struct
+ *
+ * Returns the pointer to the new struct on success, or NULL otherwise
+ */
+struct damon_region *damon_new_region(unsigned long start, unsigned long end)
+{
+ struct damon_region *region;
+
+ region = kmalloc(sizeof(*region), GFP_KERNEL);
+ if (!region)
+ return NULL;
+
+ region->ar.start = start;
+ region->ar.end = end;
+ region->nr_accesses = 0;
+	INIT_LIST_HEAD(&region->list);
+
+ region->age = 0;
+ region->last_nr_accesses = 0;
+
+ return region;
+}
+
+/*
+ * Add a region between two other regions
+ */
+inline void damon_insert_region(struct damon_region *r,
+ struct damon_region *prev, struct damon_region *next)
+{
+ __list_add(&r->list, &prev->list, &next->list);
+}
+
+void damon_add_region(struct damon_region *r, struct damon_target *t)
+{
+ list_add_tail(&r->list, &t->regions_list);
+}
+
+static void damon_del_region(struct damon_region *r)
+{
+ list_del(&r->list);
+}
+
+static void damon_free_region(struct damon_region *r)
+{
+ kfree(r);
+}
+
+void damon_destroy_region(struct damon_region *r)
+{
+ damon_del_region(r);
+ damon_free_region(r);
+}
+
+struct damos *damon_new_scheme(
+ unsigned long min_sz_region, unsigned long max_sz_region,
+ unsigned int min_nr_accesses, unsigned int max_nr_accesses,
+ unsigned int min_age_region, unsigned int max_age_region,
+ enum damos_action action)
+{
+ struct damos *scheme;
+
+ scheme = kmalloc(sizeof(*scheme), GFP_KERNEL);
+ if (!scheme)
+ return NULL;
+ scheme->min_sz_region = min_sz_region;
+ scheme->max_sz_region = max_sz_region;
+ scheme->min_nr_accesses = min_nr_accesses;
+ scheme->max_nr_accesses = max_nr_accesses;
+ scheme->min_age_region = min_age_region;
+ scheme->max_age_region = max_age_region;
+ scheme->action = action;
+ scheme->stat_count = 0;
+ scheme->stat_sz = 0;
+ INIT_LIST_HEAD(&scheme->list);
+
+ return scheme;
+}
+
+void damon_add_scheme(struct damon_ctx *ctx, struct damos *s)
+{
+ list_add_tail(&s->list, &ctx->schemes_list);
+}
+
+static void damon_del_scheme(struct damos *s)
+{
+ list_del(&s->list);
+}
+
+static void damon_free_scheme(struct damos *s)
+{
+ kfree(s);
+}
+
+void damon_destroy_scheme(struct damos *s)
+{
+ damon_del_scheme(s);
+ damon_free_scheme(s);
+}
+
+/*
+ * Construct a damon_target struct
+ *
+ * Returns the pointer to the new struct on success, or NULL otherwise
+ */
+struct damon_target *damon_new_target(unsigned long id)
+{
+ struct damon_target *t;
+
+ t = kmalloc(sizeof(*t), GFP_KERNEL);
+ if (!t)
+ return NULL;
+
+ t->id = id;
+ INIT_LIST_HEAD(&t->regions_list);
+
+ return t;
+}
+
+void damon_add_target(struct damon_ctx *ctx, struct damon_target *t)
+{
+ list_add_tail(&t->list, &ctx->targets_list);
+}
+
+static void damon_del_target(struct damon_target *t)
+{
+ list_del(&t->list);
+}
+
+void damon_free_target(struct damon_target *t)
+{
+ struct damon_region *r, *next;
+
+ damon_for_each_region_safe(r, next, t)
+ damon_free_region(r);
+ kfree(t);
+}
+
+void damon_destroy_target(struct damon_target *t)
+{
+ damon_del_target(t);
+ damon_free_target(t);
+}
+
+unsigned int damon_nr_regions(struct damon_target *t)
+{
+ struct damon_region *r;
+ unsigned int nr_regions = 0;
+
+ damon_for_each_region(r, t)
+ nr_regions++;
+
+ return nr_regions;
+}
+
+struct damon_ctx *damon_new_ctx(void)
+{
+ struct damon_ctx *ctx;
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx)
+ return NULL;
+
+ ctx->sample_interval = 5 * 1000;
+ ctx->aggr_interval = 100 * 1000;
+ ctx->regions_update_interval = 1000 * 1000;
+ ctx->min_nr_regions = 10;
+ ctx->max_nr_regions = 1000;
+
+ ktime_get_coarse_ts64(&ctx->last_aggregation);
+ ctx->last_regions_update = ctx->last_aggregation;
+
+ if (damon_set_recording(ctx, 0, "none")) {
+ kfree(ctx);
+ return NULL;
+ }
+
+ mutex_init(&ctx->kdamond_lock);
+
+ INIT_LIST_HEAD(&ctx->targets_list);
+ INIT_LIST_HEAD(&ctx->schemes_list);
+
+ return ctx;
+}
+
+void damon_destroy_ctx(struct damon_ctx *ctx)
+{
+ struct damon_target *t, *next_t;
+ struct damos *s, *next_s;
+
+ damon_for_each_target_safe(t, next_t, ctx)
+ damon_destroy_target(t);
+
+ damon_for_each_scheme_safe(s, next_s, ctx)
+ damon_destroy_scheme(s);
+
+ kfree(ctx);
+}
+
+/**
+ * damon_set_targets() - Set monitoring targets.
+ * @ctx: monitoring context
+ * @ids: array of target ids
+ * @nr_ids: number of entries in @ids
+ *
+ * This function should not be called while the kdamond is running.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int damon_set_targets(struct damon_ctx *ctx,
+ unsigned long *ids, ssize_t nr_ids)
+{
+ ssize_t i;
+ struct damon_target *t, *next;
+
+ damon_for_each_target_safe(t, next, ctx)
+ damon_destroy_target(t);
+
+ for (i = 0; i < nr_ids; i++) {
+ t = damon_new_target(ids[i]);
+ if (!t) {
+ pr_err("Failed to alloc damon_target\n");
+ return -ENOMEM;
+ }
+ damon_add_target(ctx, t);
+ }
+
+ return 0;
+}
+
+/**
+ * damon_set_attrs() - Set attributes for the monitoring.
+ * @ctx: monitoring context
+ * @sample_int: time interval between samplings
+ * @aggr_int: time interval between aggregations
+ * @regions_update_int: time interval between target regions update
+ * @min_nr_reg: minimal number of regions
+ * @max_nr_reg: maximum number of regions
+ *
+ * This function should not be called while the kdamond is running.
+ * Every time interval is in micro-seconds.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
+ unsigned long aggr_int, unsigned long regions_update_int,
+ unsigned long min_nr_reg, unsigned long max_nr_reg)
+{
+ if (min_nr_reg < 3) {
+ pr_err("min_nr_regions (%lu) must be at least 3\n",
+ min_nr_reg);
+ return -EINVAL;
+ }
+ if (min_nr_reg > max_nr_reg) {
+ pr_err("invalid nr_regions. min (%lu) > max (%lu)\n",
+ min_nr_reg, max_nr_reg);
+ return -EINVAL;
+ }
+
+ ctx->sample_interval = sample_int;
+ ctx->aggr_interval = aggr_int;
+ ctx->regions_update_interval = regions_update_int;
+ ctx->min_nr_regions = min_nr_reg;
+ ctx->max_nr_regions = max_nr_reg;
+
+ return 0;
+}
+
+/**
+ * damon_set_schemes() - Set data access monitoring based operation schemes.
+ * @ctx: monitoring context
+ * @schemes: array of the schemes
+ * @nr_schemes: number of entries in @schemes
+ *
+ * This function should not be called while the kdamond of the context is
+ * running.
+ *
+ * Return: 0 on success, or negative error code otherwise.
+ */
+int damon_set_schemes(struct damon_ctx *ctx, struct damos **schemes,
+ ssize_t nr_schemes)
+{
+ struct damos *s, *next;
+ ssize_t i;
+
+ damon_for_each_scheme_safe(s, next, ctx)
+ damon_destroy_scheme(s);
+ for (i = 0; i < nr_schemes; i++)
+ damon_add_scheme(ctx, schemes[i]);
+ return 0;
+}
+
+/**
+ * damon_set_recording() - Set attributes for the recording.
+ * @ctx: target kdamond context
+ * @rbuf_len: length of the result buffer
+ * @rfile_path: path to the monitor result files
+ *
+ * Setting 'rbuf_len' to 0 disables recording.
+ *
+ * This function should not be called while the kdamond is running.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int damon_set_recording(struct damon_ctx *ctx,
+ unsigned int rbuf_len, char *rfile_path)
+{
+ size_t rfile_path_len;
+
+ if (rbuf_len && (rbuf_len > MAX_RECORD_BUFFER_LEN ||
+ rbuf_len < MIN_RECORD_BUFFER_LEN)) {
+ pr_err("result buffer size (%u) is out of [%d,%d]\n",
+ rbuf_len, MIN_RECORD_BUFFER_LEN,
+ MAX_RECORD_BUFFER_LEN);
+ return -EINVAL;
+ }
+ rfile_path_len = strnlen(rfile_path, MAX_RFILE_PATH_LEN);
+ if (rfile_path_len >= MAX_RFILE_PATH_LEN) {
+ pr_err("too long (>%d) result file path %s\n",
+ MAX_RFILE_PATH_LEN, rfile_path);
+ return -EINVAL;
+ }
+ ctx->rbuf_len = rbuf_len;
+ kfree(ctx->rbuf);
+ ctx->rbuf = NULL;
+ kfree(ctx->rfile_path);
+ ctx->rfile_path = NULL;
+
+ if (rbuf_len) {
+ ctx->rbuf = kvmalloc(rbuf_len, GFP_KERNEL);
+ if (!ctx->rbuf)
+ return -ENOMEM;
+ }
+ ctx->rfile_path = kmalloc(rfile_path_len + 1, GFP_KERNEL);
+ if (!ctx->rfile_path)
+ return -ENOMEM;
+ strncpy(ctx->rfile_path, rfile_path, rfile_path_len + 1);
+ return 0;
+}
+
+/**
+ * damon_nr_running_ctxs() - Return number of currently running contexts.
+ */
+int damon_nr_running_ctxs(void)
+{
+ int nr_ctxs;
+
+ mutex_lock(&damon_lock);
+ nr_ctxs = nr_running_ctxs;
+ mutex_unlock(&damon_lock);
+
+ return nr_ctxs;
+}
+
+static unsigned int nr_damon_targets(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ unsigned int nr_targets = 0;
+
+ damon_for_each_target(t, ctx)
+ nr_targets++;
+
+ return nr_targets;
+}
+
+/* Returns the size upper limit for each monitoring region */
+static unsigned long damon_region_sz_limit(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+ unsigned long sz = 0;
+
+ damon_for_each_target(t, ctx) {
+ damon_for_each_region(r, t)
+ sz += r->ar.end - r->ar.start;
+ }
+
+ if (ctx->min_nr_regions)
+ sz /= ctx->min_nr_regions;
+ if (sz < MIN_REGION)
+ sz = MIN_REGION;
+
+ return sz;
+}
+
+static bool damon_kdamond_running(struct damon_ctx *ctx)
+{
+ bool running;
+
+ mutex_lock(&ctx->kdamond_lock);
+ running = ctx->kdamond != NULL;
+ mutex_unlock(&ctx->kdamond_lock);
+
+ return running;
+}
+
+static int kdamond_fn(void *data);
+
+/*
+ * __damon_start() - Starts monitoring with given context.
+ * @ctx: monitoring context
+ *
+ * This function should be called while damon_lock is held.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+static int __damon_start(struct damon_ctx *ctx)
+{
+ int err = -EBUSY;
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (!ctx->kdamond) {
+ err = 0;
+ ctx->kdamond_stop = false;
+ ctx->kdamond = kthread_create(kdamond_fn, ctx, "kdamond.%d",
+ nr_running_ctxs);
+ if (IS_ERR(ctx->kdamond))
+ err = PTR_ERR(ctx->kdamond);
+ else
+ wake_up_process(ctx->kdamond);
+ }
+ mutex_unlock(&ctx->kdamond_lock);
+
+ return err;
+}
+
+/**
+ * damon_start() - Starts monitoring for a given group of contexts.
+ * @ctxs: an array of the contexts to start monitoring
+ * @nr_ctxs: size of @ctxs
+ *
+ * This function starts a group of monitoring threads for a group of monitoring
+ * contexts. One thread per context is created and run concurrently. The
+ * caller should handle synchronization between the threads by itself. If a
+ * group of threads created by another 'damon_start()' call is currently
+ * running, this function does nothing but return -EBUSY.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int damon_start(struct damon_ctx *ctxs, int nr_ctxs)
+{
+ int i;
+ int err = 0;
+
+ mutex_lock(&damon_lock);
+ if (nr_running_ctxs) {
+ mutex_unlock(&damon_lock);
+ return -EBUSY;
+ }
+
+ for (i = 0; i < nr_ctxs; i++) {
+ err = __damon_start(&ctxs[i]);
+ if (err)
+ break;
+ nr_running_ctxs++;
+ }
+ mutex_unlock(&damon_lock);
+
+ return err;
+}
+
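+/*
+ * damon_start_ctx_ptrs() - Same as damon_start(), but receives an array of
+ * pointers to the contexts rather than an array of the contexts.
+ */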
+int damon_start_ctx_ptrs(struct damon_ctx **ctxs, int nr_ctxs)
+{
+ int i;
+ int err = 0;
+
+ mutex_lock(&damon_lock);
+ if (nr_running_ctxs) {
+ mutex_unlock(&damon_lock);
+ return -EBUSY;
+ }
+
+ for (i = 0; i < nr_ctxs; i++) {
+ err = __damon_start(ctxs[i]);
+ if (err)
+ break;
+ nr_running_ctxs++;
+ }
+ mutex_unlock(&damon_lock);
+
+ return err;
+}
+
+/*
+ * __damon_stop() - Stops monitoring of given context.
+ * @ctx: monitoring context
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+static int __damon_stop(struct damon_ctx *ctx)
+{
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ ctx->kdamond_stop = true;
+ mutex_unlock(&ctx->kdamond_lock);
+ while (damon_kdamond_running(ctx))
+ usleep_range(ctx->sample_interval,
+ ctx->sample_interval * 2);
+ return 0;
+ }
+ mutex_unlock(&ctx->kdamond_lock);
+
+ return -EPERM;
+}
+
+/**
+ * damon_stop() - Stops the monitorings for a given group of contexts.
+ * @ctxs: an array of the contexts to stop monitoring
+ * @nr_ctxs: size of @ctxs
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int damon_stop(struct damon_ctx *ctxs, int nr_ctxs)
+{
+ int i, err = 0;
+
+ for (i = 0; i < nr_ctxs; i++) {
+ /* nr_running_ctxs is decremented in kdamond_fn */
+ err = __damon_stop(&ctxs[i]);
+ if (err)
+ return err;
+ }
+
+ return err;
+}
+
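+/*
+ * damon_stop_ctx_ptrs() - Same as damon_stop(), but receives an array of
+ * pointers to the contexts rather than an array of the contexts.
+ */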
+int damon_stop_ctx_ptrs(struct damon_ctx **ctxs, int nr_ctxs)
+{
+ int i, err = 0;
+
+ for (i = 0; i < nr_ctxs; i++) {
+ /* nr_running_ctxs is decremented in kdamond_fn */
+ err = __damon_stop(ctxs[i]);
+ if (err)
+ return err;
+ }
+
+ return err;
+}
+
+/*
+ * Functions for DAMON core logics
+ */
+
+/*
+ * damon_check_reset_time_interval() - Check if a time interval has elapsed.
+ * @baseline: the time to check whether the interval has elapsed since
+ * @interval: the time interval (microseconds)
+ *
+ * See whether the given time interval has passed since the given baseline
+ * time. If so, it also updates the baseline to the current time for the next
+ * check.
+ *
+ * Return: true if the time interval has passed, or false otherwise.
+ */
+static bool damon_check_reset_time_interval(struct timespec64 *baseline,
+ unsigned long interval)
+{
+ struct timespec64 now;
+
+ ktime_get_coarse_ts64(&now);
+ if ((timespec64_to_ns(&now) - timespec64_to_ns(baseline)) <
+ interval * 1000)
+ return false;
+ *baseline = now;
+ return true;
+}
+
+/*
+ * Check whether it is time to flush the aggregated information
+ */
+static bool kdamond_aggregate_interval_passed(struct damon_ctx *ctx)
+{
+ return damon_check_reset_time_interval(&ctx->last_aggregation,
+ ctx->aggr_interval);
+}
+
+/*
+ * Flush the content in the result buffer to the result file
+ */
+static void damon_flush_rbuffer(struct damon_ctx *ctx)
+{
+ ssize_t sz;
+ loff_t pos = 0;
+ struct file *rfile;
+
+ if (!ctx->rbuf_offset)
+ return;
+
+ rfile = filp_open(ctx->rfile_path,
+ O_CREAT | O_RDWR | O_APPEND | O_LARGEFILE, 0644);
+ if (IS_ERR(rfile)) {
+ pr_err("Cannot open the result file %s\n",
+ ctx->rfile_path);
+ return;
+ }
+
+ while (ctx->rbuf_offset) {
+ sz = kernel_write(rfile, ctx->rbuf, ctx->rbuf_offset, &pos);
+ if (sz < 0)
+ break;
+ ctx->rbuf_offset -= sz;
+ }
+ filp_close(rfile, NULL);
+}
+
+/*
+ * Write data into the result buffer
+ */
+static void damon_write_rbuf(struct damon_ctx *ctx, void *data, ssize_t size)
+{
+ if (!ctx->rbuf_len || !ctx->rbuf || !ctx->rfile_path)
+ return;
+ if (ctx->rbuf_offset + size > ctx->rbuf_len)
+ damon_flush_rbuffer(ctx);
+ if (ctx->rbuf_offset + size > ctx->rbuf_len) {
+		pr_warn("%s: flush failed, or wrong size given (%u, %zu)\n",
+ __func__, ctx->rbuf_offset, size);
+ return;
+ }
+
+ memcpy(&ctx->rbuf[ctx->rbuf_offset], data, size);
+ ctx->rbuf_offset += size;
+}
+
+/*
+ * Flush the aggregated monitoring results to the result buffer
+ *
+ * Stores the current monitoring results in the result buffer and resets
+ * 'nr_accesses' of each region. The format for the result buffer is as below:
+ *
+ * <time> <number of targets> <array of target infos>
+ *
+ * target info: <id> <number of regions> <array of region infos>
+ * region info: <start address> <end address> <nr_accesses>
+ */
+static void kdamond_reset_aggregated(struct damon_ctx *c)
+{
+ struct damon_target *t;
+ struct timespec64 now;
+ unsigned int nr;
+
+ ktime_get_coarse_ts64(&now);
+
+ damon_write_rbuf(c, &now, sizeof(now));
+ nr = nr_damon_targets(c);
+ damon_write_rbuf(c, &nr, sizeof(nr));
+
+ damon_for_each_target(t, c) {
+ struct damon_region *r;
+
+ damon_write_rbuf(c, &t->id, sizeof(t->id));
+ nr = damon_nr_regions(t);
+ damon_write_rbuf(c, &nr, sizeof(nr));
+ damon_for_each_region(r, t) {
+ damon_write_rbuf(c, &r->ar.start, sizeof(r->ar.start));
+ damon_write_rbuf(c, &r->ar.end, sizeof(r->ar.end));
+ damon_write_rbuf(c, &r->nr_accesses,
+ sizeof(r->nr_accesses));
+ trace_damon_aggregated(t, r, nr);
+ r->last_nr_accesses = r->nr_accesses;
+ r->nr_accesses = 0;
+ }
+ }
+}
+
+#ifndef CONFIG_ADVISE_SYSCALLS
+static int damos_madvise(struct damon_target *target, struct damon_region *r,
+ int behavior)
+{
+ return -EINVAL;
+}
+#else
+static int damos_madvise(struct damon_target *target, struct damon_region *r,
+ int behavior)
+{
+ struct task_struct *t;
+ struct mm_struct *mm;
+ int ret = -ENOMEM;
+
+ t = damon_get_task_struct(target);
+ if (!t)
+ goto out;
+ mm = damon_get_mm(target);
+ if (!mm)
+ goto put_task_out;
+
+ ret = do_madvise(t, mm, PAGE_ALIGN(r->ar.start),
+ PAGE_ALIGN(r->ar.end - r->ar.start), behavior);
+ mmput(mm);
+put_task_out:
+ put_task_struct(t);
+out:
+ return ret;
+}
+#endif /* CONFIG_ADVISE_SYSCALLS */
+
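+/*
+ * Apply a DAMOS action to a region
+ *
+ * Translates the given scheme action into the corresponding madvise()
+ * behavior and applies it to the region of the target.  DAMOS_STAT requires
+ * no action, so it simply returns 0.
+ */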
+static int damos_do_action(struct damon_target *target, struct damon_region *r,
+ enum damos_action action)
+{
+ int madv_action;
+
+ switch (action) {
+ case DAMOS_WILLNEED:
+ madv_action = MADV_WILLNEED;
+ break;
+ case DAMOS_COLD:
+ madv_action = MADV_COLD;
+ break;
+ case DAMOS_PAGEOUT:
+ madv_action = MADV_PAGEOUT;
+ break;
+ case DAMOS_HUGEPAGE:
+ madv_action = MADV_HUGEPAGE;
+ break;
+ case DAMOS_NOHUGEPAGE:
+ madv_action = MADV_NOHUGEPAGE;
+ break;
+ case DAMOS_STAT:
+ return 0;
+ default:
+ pr_warn("Wrong action %d\n", action);
+ return -EINVAL;
+ }
+
+ return damos_madvise(target, r, madv_action);
+}
+
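+/*
+ * Apply the schemes having a target access pattern that the region fits in
+ *
+ * For each scheme of the context, checks whether the size, access frequency,
+ * and age of the region match the scheme's target access pattern.  If so,
+ * updates the scheme's statistics and applies its action to the region.
+ */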
+static void damon_do_apply_schemes(struct damon_ctx *c,
+ struct damon_target *t,
+ struct damon_region *r)
+{
+ struct damos *s;
+ unsigned long sz;
+
+ damon_for_each_scheme(s, c) {
+ sz = r->ar.end - r->ar.start;
+ if (sz < s->min_sz_region || s->max_sz_region < sz)
+ continue;
+ if (r->nr_accesses < s->min_nr_accesses ||
+ s->max_nr_accesses < r->nr_accesses)
+ continue;
+ if (r->age < s->min_age_region || s->max_age_region < r->age)
+ continue;
+ s->stat_count++;
+ s->stat_sz += sz;
+ damos_do_action(t, r, s->action);
+ if (s->action != DAMOS_STAT)
+ r->age = 0;
+ }
+}
+
+static void kdamond_apply_schemes(struct damon_ctx *c)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+
+ damon_for_each_target(t, c) {
+ damon_for_each_region(r, t)
+ damon_do_apply_schemes(c, t, r);
+ }
+}
+
+#define sz_damon_region(r) (r->ar.end - r->ar.start)
+
+/*
+ * Merge two adjacent regions into one region
+ */
+static void damon_merge_two_regions(struct damon_region *l,
+ struct damon_region *r)
+{
+ unsigned long sz_l = sz_damon_region(l), sz_r = sz_damon_region(r);
+
+ l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) /
+ (sz_l + sz_r);
+ l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r);
+ l->ar.end = r->ar.end;
+ damon_destroy_region(r);
+}
+
+#define diff_of(a, b) (a > b ? a - b : b - a)
+
+/*
+ * Merge adjacent regions having similar access frequencies
+ *
+ * t target affected by this merge operation
+ * thres '->nr_accesses' diff threshold for the merge
+ * sz_limit size upper limit of each region
+ */
+static void damon_merge_regions_of(struct damon_target *t, unsigned int thres,
+ unsigned long sz_limit)
+{
+ struct damon_region *r, *prev = NULL, *next;
+
+ damon_for_each_region_safe(r, next, t) {
+ if (diff_of(r->nr_accesses, r->last_nr_accesses) > thres)
+ r->age = 0;
+ else
+ r->age++;
+
+ if (prev && prev->ar.end == r->ar.start &&
+ diff_of(prev->nr_accesses, r->nr_accesses) <= thres &&
+ sz_damon_region(prev) + sz_damon_region(r) <= sz_limit)
+ damon_merge_two_regions(prev, r);
+ else
+ prev = r;
+ }
+}
+
+/*
+ * Merge adjacent regions having similar access frequencies
+ *
+ * threshold '->nr_accesses' diff threshold for the merge
+ * sz_limit size upper limit of each region
+ *
+ * This function merges monitoring target regions which are adjacent and their
+ * access frequencies are similar. This is for minimizing the monitoring
+ * overhead under the dynamically changeable access pattern. If a merge was
+ * unnecessarily made, later 'kdamond_split_regions()' will revert it.
+ */
+static void kdamond_merge_regions(struct damon_ctx *c, unsigned int threshold,
+ unsigned long sz_limit)
+{
+ struct damon_target *t;
+
+ damon_for_each_target(t, c)
+ damon_merge_regions_of(t, threshold, sz_limit);
+}
+
+/*
+ * Split a region in two
+ *
+ * r the region to be split
+ * sz_r size of the first sub-region that will be made
+ */
+static void damon_split_region_at(struct damon_ctx *ctx,
+ struct damon_region *r, unsigned long sz_r)
+{
+ struct damon_region *new;
+
+ new = damon_new_region(r->ar.start + sz_r, r->ar.end);
+ r->ar.end = new->ar.start;
+
+ new->age = r->age;
+ new->last_nr_accesses = r->last_nr_accesses;
+
+ damon_insert_region(new, r, damon_next_region(r));
+}
+
+/* Split every region in the given target into 'nr_subs' regions */
+static void damon_split_regions_of(struct damon_ctx *ctx,
+ struct damon_target *t, int nr_subs)
+{
+ struct damon_region *r, *next;
+ unsigned long sz_region, sz_sub = 0;
+ int i;
+
+ damon_for_each_region_safe(r, next, t) {
+ sz_region = r->ar.end - r->ar.start;
+
+ for (i = 0; i < nr_subs - 1 &&
+ sz_region > 2 * MIN_REGION; i++) {
+ /*
+ * Randomly select size of left sub-region to be at
+ * least 10 percent and at most 90% of original region
+ */
+ sz_sub = ALIGN_DOWN(damon_rand(1, 10) *
+ sz_region / 10, MIN_REGION);
+ /* Do not allow blank region */
+ if (sz_sub == 0 || sz_sub >= sz_region)
+ continue;
+
+ damon_split_region_at(ctx, r, sz_sub);
+ sz_region = sz_sub;
+ }
+ }
+}
+
+/*
+ * Split every target region into randomly-sized small regions
+ *
+ * This function splits every target region into random-sized small regions if
+ * current total number of the regions is equal or smaller than half of the
+ * user-specified maximum number of regions. This is for maximizing the
+ * monitoring accuracy under the dynamically changeable access patterns. If a
+ * split was unnecessarily made, later 'kdamond_merge_regions()' will revert
+ * it.
+ */
+static void kdamond_split_regions(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ unsigned int nr_regions = 0;
+ static unsigned int last_nr_regions;
+ int nr_subregions = 2;
+
+ damon_for_each_target(t, ctx)
+ nr_regions += damon_nr_regions(t);
+
+ if (nr_regions > ctx->max_nr_regions / 2)
+ return;
+
+ /* Maybe the middle of the region has different access frequency */
+ if (last_nr_regions == nr_regions &&
+ nr_regions < ctx->max_nr_regions / 3)
+ nr_subregions = 3;
+
+ damon_for_each_target(t, ctx)
+ damon_split_regions_of(ctx, t, nr_subregions);
+
+ last_nr_regions = nr_regions;
+}
+
+/*
+ * Check whether it is time to update the monitoring target regions
+ *
+ * Returns true if it is.
+ */
+static bool kdamond_need_update_regions(struct damon_ctx *ctx)
+{
+ return damon_check_reset_time_interval(&ctx->last_regions_update,
+ ctx->regions_update_interval);
+}
+
+/*
+ * Check whether current monitoring should be stopped
+ *
+ * The monitoring is stopped when either the user requested to stop, or all
+ * monitoring targets are invalid.
+ *
+ * Returns true if need to stop current monitoring.
+ */
+static bool kdamond_need_stop(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ bool stop;
+
+ mutex_lock(&ctx->kdamond_lock);
+ stop = ctx->kdamond_stop;
+ mutex_unlock(&ctx->kdamond_lock);
+ if (stop)
+ return true;
+
+ if (!ctx->target_valid)
+ return false;
+
+ damon_for_each_target(t, ctx) {
+ if (ctx->target_valid(t))
+ return false;
+ }
+
+ return true;
+}
+
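+/*
+ * Write the header of the record file
+ *
+ * The header consists of the 'damon_recfmt_ver' magic string followed by the
+ * format version number.
+ */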
+static void kdamond_write_record_header(struct damon_ctx *ctx)
+{
+ int recfmt_ver = 2;
+
+ damon_write_rbuf(ctx, "damon_recfmt_ver", 16);
+ damon_write_rbuf(ctx, &recfmt_ver, sizeof(recfmt_ver));
+}
+
+/*
+ * The monitoring daemon that runs as a kernel thread
+ */
+static int kdamond_fn(void *data)
+{
+ struct damon_ctx *ctx = (struct damon_ctx *)data;
+ struct damon_target *t;
+ struct damon_region *r, *next;
+ unsigned int max_nr_accesses = 0;
+ unsigned long sz_limit = 0;
+
+ pr_info("kdamond (%d) starts\n", ctx->kdamond->pid);
+ if (ctx->init_target_regions)
+ ctx->init_target_regions(ctx);
+ sz_limit = damon_region_sz_limit(ctx);
+
+ kdamond_write_record_header(ctx);
+
+ while (!kdamond_need_stop(ctx)) {
+ if (ctx->prepare_access_checks)
+ ctx->prepare_access_checks(ctx);
+ if (ctx->sample_cb)
+ ctx->sample_cb(ctx);
+
+ usleep_range(ctx->sample_interval, ctx->sample_interval + 1);
+
+ if (ctx->check_accesses)
+ max_nr_accesses = ctx->check_accesses(ctx);
+
+ if (kdamond_aggregate_interval_passed(ctx)) {
+ if (ctx->aggregate_cb)
+ ctx->aggregate_cb(ctx);
+ kdamond_merge_regions(ctx, max_nr_accesses / 10,
+ sz_limit);
+ kdamond_apply_schemes(ctx);
+ kdamond_reset_aggregated(ctx);
+ kdamond_split_regions(ctx);
+ }
+
+ if (kdamond_need_update_regions(ctx)) {
+ if (ctx->update_target_regions)
+ ctx->update_target_regions(ctx);
+ sz_limit = damon_region_sz_limit(ctx);
+ }
+ }
+ damon_flush_rbuffer(ctx);
+ damon_for_each_target(t, ctx) {
+ damon_for_each_region_safe(r, next, t)
+ damon_destroy_region(r);
+ }
+
+ if (ctx->cleanup)
+ ctx->cleanup(ctx);
+
+ pr_debug("kdamond (%d) finishes\n", ctx->kdamond->pid);
+ mutex_lock(&ctx->kdamond_lock);
+ ctx->kdamond = NULL;
+ mutex_unlock(&ctx->kdamond_lock);
+
+ mutex_lock(&damon_lock);
+ nr_running_ctxs--;
+ mutex_unlock(&damon_lock);
+
+ do_exit(0);
+}
+
+#include "core-test.h"
diff --git a/mm/damon/damon.h b/mm/damon/damon.h
new file mode 100644
index 000000000000..fc565fff4953
--- /dev/null
+++ b/mm/damon/damon.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Author: SeongJae Park <[email protected]>
+ */
+
+/* Get a random number in [l, r) */
+#define damon_rand(l, r) (l + prandom_u32() % (r - l))
+
+/*
+ * 't->id' should be the pointer to the relevant 'struct pid' having reference
+ * count. Caller must put the returned task, unless it is NULL.
+ */
+#define damon_get_task_struct(t) \
+ (get_pid_task((struct pid *)t->id, PIDTYPE_PID))
+
+/*
+ * Get the mm_struct of the given target
+ *
+ * Caller _must_ put the mm_struct after use, unless it is NULL.
+ *
+ * Returns the mm_struct of the target on success, NULL on failure
+ */
+static inline struct mm_struct *damon_get_mm(struct damon_target *t)
+{
+ struct task_struct *task;
+ struct mm_struct *mm;
+
+ task = damon_get_task_struct(t);
+ if (!task)
+ return NULL;
+
+ mm = get_task_mm(task);
+ put_task_struct(task);
+ return mm;
+}
diff --git a/mm/damon/dbgfs-test.h b/mm/damon/dbgfs-test.h
new file mode 100644
index 000000000000..dffb9f70e399
--- /dev/null
+++ b/mm/damon/dbgfs-test.h
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * DAMON Debugfs Interface Unit Tests
+ *
+ * Author: SeongJae Park <[email protected]>
+ */
+
+#ifdef CONFIG_DAMON_DBGFS_KUNIT_TEST
+
+#ifndef _DAMON_DBGFS_TEST_H
+#define _DAMON_DBGFS_TEST_H
+
+#include <kunit/test.h>
+
+static void damon_dbgfs_test_str_to_target_ids(struct kunit *test)
+{
+ char *question;
+ unsigned long *answers;
+ unsigned long expected[] = {12, 35, 46};
+ ssize_t nr_integers = 0, i;
+
+ question = "123";
+ answers = str_to_target_ids(question, strnlen(question, 128),
+ &nr_integers);
+ KUNIT_EXPECT_EQ(test, (ssize_t)1, nr_integers);
+ KUNIT_EXPECT_EQ(test, 123ul, answers[0]);
+ kfree(answers);
+
+ question = "123abc";
+ answers = str_to_target_ids(question, strnlen(question, 128),
+ &nr_integers);
+ KUNIT_EXPECT_EQ(test, (ssize_t)1, nr_integers);
+ KUNIT_EXPECT_EQ(test, 123ul, answers[0]);
+ kfree(answers);
+
+ question = "a123";
+ answers = str_to_target_ids(question, strnlen(question, 128),
+ &nr_integers);
+ KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
+ kfree(answers);
+
+ question = "12 35";
+ answers = str_to_target_ids(question, strnlen(question, 128),
+ &nr_integers);
+ KUNIT_EXPECT_EQ(test, (ssize_t)2, nr_integers);
+ for (i = 0; i < nr_integers; i++)
+ KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
+ kfree(answers);
+
+ question = "12 35 46";
+ answers = str_to_target_ids(question, strnlen(question, 128),
+ &nr_integers);
+ KUNIT_EXPECT_EQ(test, (ssize_t)3, nr_integers);
+ for (i = 0; i < nr_integers; i++)
+ KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
+ kfree(answers);
+
+ question = "12 35 abc 46";
+ answers = str_to_target_ids(question, strnlen(question, 128),
+ &nr_integers);
+ KUNIT_EXPECT_EQ(test, (ssize_t)2, nr_integers);
+ for (i = 0; i < 2; i++)
+ KUNIT_EXPECT_EQ(test, expected[i], answers[i]);
+ kfree(answers);
+
+ question = "";
+ answers = str_to_target_ids(question, strnlen(question, 128),
+ &nr_integers);
+ KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
+ kfree(answers);
+
+ question = "\n";
+ answers = str_to_target_ids(question, strnlen(question, 128),
+ &nr_integers);
+ KUNIT_EXPECT_EQ(test, (ssize_t)0, nr_integers);
+ kfree(answers);
+}
+
+static void damon_dbgfs_test_set_targets(struct kunit *test)
+{
+ struct damon_ctx *ctx = damon_new_ctx();
+ unsigned long ids[] = {1, 2, 3};
+ char buf[64];
+
+ /* Make DAMON consider target id as plain number */
+ ctx->target_valid = NULL;
+
+ damon_set_targets(ctx, ids, 3);
+ sprint_target_ids(ctx, buf, 64);
+ KUNIT_EXPECT_STREQ(test, (char *)buf, "1 2 3\n");
+
+ damon_set_targets(ctx, NULL, 0);
+ sprint_target_ids(ctx, buf, 64);
+ KUNIT_EXPECT_STREQ(test, (char *)buf, "\n");
+
+ damon_set_targets(ctx, (unsigned long []){1, 2}, 2);
+ sprint_target_ids(ctx, buf, 64);
+ KUNIT_EXPECT_STREQ(test, (char *)buf, "1 2\n");
+
+ damon_set_targets(ctx, (unsigned long []){2}, 1);
+ sprint_target_ids(ctx, buf, 64);
+ KUNIT_EXPECT_STREQ(test, (char *)buf, "2\n");
+
+ damon_set_targets(ctx, NULL, 0);
+ sprint_target_ids(ctx, buf, 64);
+ KUNIT_EXPECT_STREQ(test, (char *)buf, "\n");
+
+ damon_destroy_ctx(ctx);
+}
+
+static void damon_dbgfs_test_set_init_regions(struct kunit *test)
+{
+ struct damon_ctx *ctx = damon_new_ctx();
+ unsigned long ids[] = {1, 2, 3};
+ /* Each line represents one region in ``<target id> <start> <end>`` */
+ char * const valid_inputs[] = {"2 10 20\n 2 20 30\n2 35 45",
+ "2 10 20\n",
+ "2 10 20\n1 39 59\n1 70 134\n 2 20 25\n",
+ ""};
+ /* Reading the file again will show sorted, clean output */
+ char * const valid_expects[] = {"2 10 20\n2 20 30\n2 35 45\n",
+ "2 10 20\n",
+ "1 39 59\n1 70 134\n2 10 20\n2 20 25\n",
+ ""};
+ char * const invalid_inputs[] = {"4 10 20\n", /* target not exists */
+ "2 10 20\n 2 14 26\n", /* regions overlap */
+ "1 10 20\n2 30 40\n 1 5 8"}; /* not sorted by address */
+ char *input, *expect;
+ int i, rc;
+ char buf[256];
+
+ damon_set_targets(ctx, ids, 3);
+
+ /* Put valid inputs and check the results */
+ for (i = 0; i < ARRAY_SIZE(valid_inputs); i++) {
+ input = valid_inputs[i];
+ expect = valid_expects[i];
+
+ rc = set_init_regions(ctx, input, strnlen(input, 256));
+ KUNIT_EXPECT_EQ(test, rc, 0);
+
+ memset(buf, 0, 256);
+ sprint_init_regions(ctx, buf, 256);
+
+ KUNIT_EXPECT_STREQ(test, (char *)buf, expect);
+ }
+	/* Put invalid inputs and check the return error code */
+ for (i = 0; i < ARRAY_SIZE(invalid_inputs); i++) {
+ input = invalid_inputs[i];
+ pr_info("input: %s\n", input);
+ rc = set_init_regions(ctx, input, strnlen(input, 256));
+ KUNIT_EXPECT_EQ(test, rc, -EINVAL);
+
+ memset(buf, 0, 256);
+ sprint_init_regions(ctx, buf, 256);
+
+ KUNIT_EXPECT_STREQ(test, (char *)buf, "");
+ }
+
+ damon_set_targets(ctx, NULL, 0);
+ damon_destroy_ctx(ctx);
+}
+
+static struct kunit_case damon_test_cases[] = {
+ KUNIT_CASE(damon_dbgfs_test_str_to_target_ids),
+ KUNIT_CASE(damon_dbgfs_test_set_targets),
+ KUNIT_CASE(damon_dbgfs_test_set_init_regions),
+ {},
+};
+
+static struct kunit_suite damon_test_suite = {
+ .name = "damon-dbgfs",
+ .test_cases = damon_test_cases,
+};
+kunit_test_suite(damon_test_suite);
+
+#endif /* _DAMON_DBGFS_TEST_H */
+
+#endif /* CONFIG_DAMON_DBGFS_KUNIT_TEST */
diff --git a/mm/damon/dbgfs.c b/mm/damon/dbgfs.c
new file mode 100644
index 000000000000..646a492100ff
--- /dev/null
+++ b/mm/damon/dbgfs.c
@@ -0,0 +1,882 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DAMON Debugfs Interface
+ *
+ * Author: SeongJae Park <[email protected]>
+ */
+
+#define pr_fmt(fmt) "damon-dbgfs: " fmt
+
+#include <linux/damon.h>
+#include <linux/debugfs.h>
+#include <linux/file.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+/* Monitoring contexts for debugfs interface users. */
+static struct damon_ctx **debugfs_ctxs;
+static int debugfs_nr_ctxs = 1;
+
+static DEFINE_MUTEX(damon_dbgfs_lock);
+
+static ssize_t debugfs_monitor_on_read(struct file *file,
+ char __user *buf, size_t count, loff_t *ppos)
+{
+ char monitor_on_buf[5];
+ bool monitor_on = damon_nr_running_ctxs() != 0;
+ int len;
+
+ len = scnprintf(monitor_on_buf, 5, monitor_on ? "on\n" : "off\n");
+
+ return simple_read_from_buffer(buf, count, ppos, monitor_on_buf, len);
+}
+
+/*
+ * Returns the user input in a kernel buffer on success, or an error pointer
+ * otherwise.
+ */
+static char *user_input_str(const char __user *buf, size_t count, loff_t *ppos)
+{
+ char *kbuf;
+ ssize_t ret;
+
+ /* We do not accept continuous write */
+ if (*ppos)
+ return ERR_PTR(-EINVAL);
+
+ kbuf = kmalloc(count + 1, GFP_KERNEL);
+ if (!kbuf)
+ return ERR_PTR(-ENOMEM);
+
+ ret = simple_write_to_buffer(kbuf, count + 1, ppos, buf, count);
+ if (ret != count) {
+ kfree(kbuf);
+ return ERR_PTR(-EIO);
+ }
+ kbuf[ret] = '\0';
+
+ return kbuf;
+}
+
+static ssize_t debugfs_monitor_on_write(struct file *file,
+ const char __user *buf, size_t count, loff_t *ppos)
+{
+ ssize_t ret = count;
+ char *kbuf;
+ int err;
+
+ kbuf = user_input_str(buf, count, ppos);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+	/* Remove white space */
+	if (sscanf(kbuf, "%s", kbuf) != 1) {
+		kfree(kbuf);
+		return -EINVAL;
+	}
+	if (!strncmp(kbuf, "on", count))
+		err = damon_start_ctx_ptrs(debugfs_ctxs, debugfs_nr_ctxs);
+	else if (!strncmp(kbuf, "off", count))
+		err = damon_stop_ctx_ptrs(debugfs_ctxs, debugfs_nr_ctxs);
+	else
+		err = -EINVAL;
+
+	if (err)
+		ret = err;
+	kfree(kbuf);
+	return ret;
+}
+
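+/*
+ * Print the schemes of the context into the buffer
+ *
+ * Each line shows the min/max size, min/max access frequency, min/max age,
+ * and action of one scheme, followed by the scheme's statistics (the number
+ * of regions and the total size of the regions that the action was applied
+ * to).
+ */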
+static ssize_t sprint_schemes(struct damon_ctx *c, char *buf, ssize_t len)
+{
+ struct damos *s;
+ int written = 0;
+ int rc;
+
+ damon_for_each_scheme(s, c) {
+ rc = scnprintf(&buf[written], len - written,
+ "%lu %lu %u %u %u %u %d %lu %lu\n",
+ s->min_sz_region, s->max_sz_region,
+ s->min_nr_accesses, s->max_nr_accesses,
+ s->min_age_region, s->max_age_region,
+ s->action, s->stat_count, s->stat_sz);
+ if (!rc)
+ return -ENOMEM;
+
+ written += rc;
+ }
+ return written;
+}
+
+static ssize_t debugfs_schemes_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ char *kbuf;
+ ssize_t len;
+
+ kbuf = kmalloc(count, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ mutex_lock(&ctx->kdamond_lock);
+ len = sprint_schemes(ctx, kbuf, count);
+ mutex_unlock(&ctx->kdamond_lock);
+ if (len < 0)
+ goto out;
+ len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
+
+out:
+ kfree(kbuf);
+ return len;
+}
+
+static void free_schemes_arr(struct damos **schemes, ssize_t nr_schemes)
+{
+ ssize_t i;
+
+ for (i = 0; i < nr_schemes; i++)
+ kfree(schemes[i]);
+ kfree(schemes);
+}
+
+static bool damos_action_valid(int action)
+{
+ switch (action) {
+ case DAMOS_WILLNEED:
+ case DAMOS_COLD:
+ case DAMOS_PAGEOUT:
+ case DAMOS_HUGEPAGE:
+ case DAMOS_NOHUGEPAGE:
+ case DAMOS_STAT:
+ return true;
+ default:
+ return false;
+ }
+}
+
+/*
+ * Converts a string into an array of struct damos pointers
+ *
+ * Returns an array of the converted struct damos pointers if the conversion
+ * succeeds, or NULL otherwise.
+ */
+static struct damos **str_to_schemes(const char *str, ssize_t len,
+ ssize_t *nr_schemes)
+{
+ struct damos *scheme, **schemes;
+ const int max_nr_schemes = 256;
+ int pos = 0, parsed, ret;
+ unsigned long min_sz, max_sz;
+ unsigned int min_nr_a, max_nr_a, min_age, max_age;
+ unsigned int action;
+
+ schemes = kmalloc_array(max_nr_schemes, sizeof(scheme),
+ GFP_KERNEL);
+ if (!schemes)
+ return NULL;
+
+ *nr_schemes = 0;
+ while (pos < len && *nr_schemes < max_nr_schemes) {
+ ret = sscanf(&str[pos], "%lu %lu %u %u %u %u %u%n",
+ &min_sz, &max_sz, &min_nr_a, &max_nr_a,
+ &min_age, &max_age, &action, &parsed);
+ if (ret != 7)
+ break;
+ if (!damos_action_valid(action)) {
+ pr_err("wrong action %d\n", action);
+ goto fail;
+ }
+
+ pos += parsed;
+ scheme = damon_new_scheme(min_sz, max_sz, min_nr_a, max_nr_a,
+ min_age, max_age, action);
+ if (!scheme)
+ goto fail;
+
+ schemes[*nr_schemes] = scheme;
+ *nr_schemes += 1;
+ }
+ return schemes;
+fail:
+ free_schemes_arr(schemes, *nr_schemes);
+ return NULL;
+}
+
+static ssize_t debugfs_schemes_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ char *kbuf;
+ struct damos **schemes;
+ ssize_t nr_schemes = 0, ret = count;
+ int err;
+
+ kbuf = user_input_str(buf, count, ppos);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+ schemes = str_to_schemes(kbuf, ret, &nr_schemes);
+ if (!schemes) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ ret = -EBUSY;
+ goto unlock_out;
+ }
+
+ err = damon_set_schemes(ctx, schemes, nr_schemes);
+ if (err)
+ ret = err;
+ else
+ nr_schemes = 0;
+unlock_out:
+ mutex_unlock(&ctx->kdamond_lock);
+ free_schemes_arr(schemes, nr_schemes);
+out:
+ kfree(kbuf);
+ return ret;
+}
+
+#define targetid_is_pid(ctx) \
+ (ctx->target_valid == kdamond_vm_target_valid)
+
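+/*
+ * Print the target ids of the context into the buffer
+ *
+ * If the target ids are 'struct pid' pointers for virtual address space
+ * monitoring, the pid numbers are printed instead of the pointer values.
+ */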
+static ssize_t sprint_target_ids(struct damon_ctx *ctx, char *buf, ssize_t len)
+{
+ struct damon_target *t;
+ unsigned long id;
+ int written = 0;
+ int rc;
+
+ damon_for_each_target(t, ctx) {
+ id = t->id;
+ if (targetid_is_pid(ctx))
+ /* Show pid numbers to debugfs users */
+ id = (unsigned long)pid_vnr((struct pid *)id);
+
+ rc = scnprintf(&buf[written], len - written, "%lu ", id);
+ if (!rc)
+ return -ENOMEM;
+ written += rc;
+ }
+ if (written)
+ written -= 1;
+ written += scnprintf(&buf[written], len - written, "\n");
+ return written;
+}
+
+static ssize_t debugfs_target_ids_read(struct file *file,
+ char __user *buf, size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ ssize_t len;
+ char ids_buf[320];
+
+ mutex_lock(&ctx->kdamond_lock);
+ len = sprint_target_ids(ctx, ids_buf, 320);
+ mutex_unlock(&ctx->kdamond_lock);
+ if (len < 0)
+ return len;
+
+ return simple_read_from_buffer(buf, count, ppos, ids_buf, len);
+}
+
+/*
+ * Converts a string into an array of unsigned long integers
+ *
+ * Returns an array of unsigned long integers if the conversion succeeds, or
+ * NULL otherwise.
+ */
+static unsigned long *str_to_target_ids(const char *str, ssize_t len,
+ ssize_t *nr_ids)
+{
+ unsigned long *ids;
+ const int max_nr_ids = 32;
+ unsigned long id;
+ int pos = 0, parsed, ret;
+
+ *nr_ids = 0;
+ ids = kmalloc_array(max_nr_ids, sizeof(id), GFP_KERNEL);
+ if (!ids)
+ return NULL;
+ while (*nr_ids < max_nr_ids && pos < len) {
+		ret = sscanf(&str[pos], "%lu%n", &id, &parsed);
+		if (ret != 1)
+			break;
+		pos += parsed;
+ ids[*nr_ids] = id;
+ *nr_ids += 1;
+ }
+
+ return ids;
+}
+
+/* Returns pid for the given pidfd if it's valid, or NULL otherwise. */
+static struct pid *damon_get_pidfd_pid(unsigned int pidfd)
+{
+ struct fd f;
+ struct pid *pid;
+
+ f = fdget(pidfd);
+ if (!f.file)
+ return NULL;
+
+ pid = pidfd_pid(f.file);
+ if (!IS_ERR(pid))
+ get_pid(pid);
+ else
+ pid = NULL;
+
+ fdput(f);
+ return pid;
+}
+
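+/*
+ * Set the monitoring targets from the user input
+ *
+ * The input should be 'paddr' for the physical address space monitoring, or
+ * space-separated pid numbers (or 'pidfd ' followed by space-separated pidfd
+ * numbers) for the virtual address spaces monitoring.
+ */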
+static ssize_t debugfs_target_ids_write(struct file *file,
+ const char __user *buf, size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ char *kbuf, *nrs;
+ bool received_pidfds = false;
+ unsigned long *targets;
+ ssize_t nr_targets;
+ ssize_t ret = count;
+ int i;
+ int err;
+
+ kbuf = user_input_str(buf, count, ppos);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+ nrs = kbuf;
+ if (!strncmp(kbuf, "paddr\n", count)) {
+ /* Configure the context for physical memory monitoring */
+ damon_set_paddr_primitives(ctx);
+ /* target id is meaningless here, but we set it just for fun */
+ scnprintf(kbuf, count, "42 ");
+ } else {
+ /* Configure the context for virtual memory monitoring */
+ damon_set_vaddr_primitives(ctx);
+ if (!strncmp(kbuf, "pidfd ", 6)) {
+ received_pidfds = true;
+ nrs = &kbuf[6];
+ }
+ }
+
+ targets = str_to_target_ids(nrs, ret, &nr_targets);
+ if (!targets) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ if (received_pidfds) {
+ for (i = 0; i < nr_targets; i++)
+ targets[i] = (unsigned long)damon_get_pidfd_pid(
+ (unsigned int)targets[i]);
+ } else if (targetid_is_pid(ctx)) {
+ for (i = 0; i < nr_targets; i++)
+ targets[i] = (unsigned long)find_get_pid(
+ (int)targets[i]);
+ }
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ ret = -EINVAL;
+ goto unlock_out;
+ }
+
+ err = damon_set_targets(ctx, targets, nr_targets);
+ if (err)
+ ret = err;
+unlock_out:
+ mutex_unlock(&ctx->kdamond_lock);
+ kfree(targets);
+out:
+ kfree(kbuf);
+ return ret;
+}
+
+static ssize_t debugfs_record_read(struct file *file,
+ char __user *buf, size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ char record_buf[20 + MAX_RFILE_PATH_LEN];
+ int ret;
+
+ mutex_lock(&ctx->kdamond_lock);
+ ret = scnprintf(record_buf, ARRAY_SIZE(record_buf), "%u %s\n",
+ ctx->rbuf_len, ctx->rfile_path);
+ mutex_unlock(&ctx->kdamond_lock);
+ return simple_read_from_buffer(buf, count, ppos, record_buf, ret);
+}
+
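+/*
+ * Set the recording attributes from the user input
+ *
+ * The input should be '<buffer size> <result file path>'.
+ */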
+static ssize_t debugfs_record_write(struct file *file,
+ const char __user *buf, size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ char *kbuf;
+ unsigned int rbuf_len;
+ char rfile_path[MAX_RFILE_PATH_LEN];
+ ssize_t ret = count;
+ int err;
+
+ kbuf = user_input_str(buf, count, ppos);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+ if (sscanf(kbuf, "%u %s",
+ &rbuf_len, rfile_path) != 2) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ ret = -EBUSY;
+ goto unlock_out;
+ }
+
+ err = damon_set_recording(ctx, rbuf_len, rfile_path);
+ if (err)
+ ret = err;
+unlock_out:
+ mutex_unlock(&ctx->kdamond_lock);
+out:
+ kfree(kbuf);
+ return ret;
+}
+
+static ssize_t sprint_init_regions(struct damon_ctx *c, char *buf, ssize_t len)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+ int written = 0;
+ int rc;
+
+ damon_for_each_target(t, c) {
+ damon_for_each_region(r, t) {
+ rc = scnprintf(&buf[written], len - written,
+ "%lu %lu %lu\n",
+ t->id, r->ar.start, r->ar.end);
+ if (!rc)
+ return -ENOMEM;
+ written += rc;
+ }
+ }
+ return written;
+}
+
+static ssize_t debugfs_init_regions_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ char *kbuf;
+ ssize_t len;
+
+ kbuf = kmalloc(count, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ mutex_unlock(&ctx->kdamond_lock);
+ return -EBUSY;
+ }
+
+ len = sprint_init_regions(ctx, kbuf, count);
+ mutex_unlock(&ctx->kdamond_lock);
+ if (len < 0)
+ goto out;
+ len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
+
+out:
+ kfree(kbuf);
+ return len;
+}
+
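+/*
+ * Add an initial monitoring region to the target having the given id
+ *
+ * The new region should not overlap with or precede the previously added
+ * regions of the target.
+ *
+ * Returns 0 on success, negative error code otherwise.
+ */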
+static int add_init_region(struct damon_ctx *c,
+ unsigned long target_id, struct damon_addr_range *ar)
+{
+ struct damon_target *t;
+ struct damon_region *r, *prev;
+ int rc = -EINVAL;
+
+ if (ar->start >= ar->end)
+ return -EINVAL;
+
+ damon_for_each_target(t, c) {
+ if (t->id == target_id) {
+ r = damon_new_region(ar->start, ar->end);
+ if (!r)
+ return -ENOMEM;
+ damon_add_region(r, t);
+ if (damon_nr_regions(t) > 1) {
+ prev = damon_prev_region(r);
+ if (prev->ar.end > r->ar.start) {
+ damon_destroy_region(r);
+ return -EINVAL;
+ }
+ }
+ rc = 0;
+ }
+ }
+ return rc;
+}
+
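+/*
+ * Set the initial monitoring target regions from the user input
+ *
+ * Each line of the input should be '<target id> <start address> <end address>'
+ * and the regions should be sorted by address without overlaps.  If any of the
+ * regions is invalid, every previously constructed region is removed.
+ *
+ * Returns 0 on success, negative error code otherwise.
+ */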
+static int set_init_regions(struct damon_ctx *c, const char *str, ssize_t len)
+{
+ struct damon_target *t;
+ struct damon_region *r, *next;
+ int pos = 0, parsed, ret;
+ unsigned long target_id;
+ struct damon_addr_range ar;
+ int err;
+
+ damon_for_each_target(t, c) {
+ damon_for_each_region_safe(r, next, t)
+ damon_destroy_region(r);
+ }
+
+ while (pos < len) {
+ ret = sscanf(&str[pos], "%lu %lu %lu%n",
+ &target_id, &ar.start, &ar.end, &parsed);
+ if (ret != 3)
+ break;
+ err = add_init_region(c, target_id, &ar);
+ if (err)
+ goto fail;
+ pos += parsed;
+ }
+
+ return 0;
+
+fail:
+ damon_for_each_target(t, c) {
+ damon_for_each_region_safe(r, next, t)
+ damon_destroy_region(r);
+ }
+ return err;
+}
+
+static ssize_t debugfs_init_regions_write(struct file *file,
+ const char __user *buf, size_t count,
+ loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ char *kbuf;
+ ssize_t ret = count;
+ int err;
+
+ kbuf = user_input_str(buf, count, ppos);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ ret = -EBUSY;
+ goto unlock_out;
+ }
+
+ err = set_init_regions(ctx, kbuf, ret);
+ if (err)
+ ret = err;
+
+unlock_out:
+ mutex_unlock(&ctx->kdamond_lock);
+ kfree(kbuf);
+ return ret;
+}
+
+static ssize_t debugfs_attrs_read(struct file *file,
+ char __user *buf, size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ char kbuf[128];
+ int ret;
+
+ mutex_lock(&ctx->kdamond_lock);
+ ret = scnprintf(kbuf, ARRAY_SIZE(kbuf), "%lu %lu %lu %lu %lu\n",
+ ctx->sample_interval, ctx->aggr_interval,
+ ctx->regions_update_interval, ctx->min_nr_regions,
+ ctx->max_nr_regions);
+ mutex_unlock(&ctx->kdamond_lock);
+
+ return simple_read_from_buffer(buf, count, ppos, kbuf, ret);
+}
+
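+/*
+ * Set the monitoring attributes from the user input
+ *
+ * The input should be '<sampling interval> <aggregation interval>
+ * <regions update interval> <min number of regions> <max number of regions>'.
+ */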
+static ssize_t debugfs_attrs_write(struct file *file,
+ const char __user *buf, size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = file->private_data;
+ unsigned long s, a, r, minr, maxr;
+ char *kbuf;
+ ssize_t ret = count;
+ int err;
+
+ kbuf = user_input_str(buf, count, ppos);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+ if (sscanf(kbuf, "%lu %lu %lu %lu %lu",
+ &s, &a, &r, &minr, &maxr) != 5) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ ret = -EBUSY;
+ goto unlock_out;
+ }
+
+ err = damon_set_attrs(ctx, s, a, r, minr, maxr);
+ if (err)
+ ret = err;
+unlock_out:
+ mutex_unlock(&ctx->kdamond_lock);
+out:
+ kfree(kbuf);
+ return ret;
+}
+
+static ssize_t debugfs_nr_contexts_read(struct file *file,
+ char __user *buf, size_t count, loff_t *ppos)
+{
+ char kbuf[32];
+ int ret;
+
+ mutex_lock(&damon_dbgfs_lock);
+ ret = scnprintf(kbuf, ARRAY_SIZE(kbuf), "%d\n", debugfs_nr_ctxs);
+ mutex_unlock(&damon_dbgfs_lock);
+
+ return simple_read_from_buffer(buf, count, ppos, kbuf, ret);
+}
+
+static struct dentry **debugfs_dirs;
+
+static int debugfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx);
+
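+/*
+ * Resize the number of the debugfs monitoring contexts
+ *
+ * Shrinking removes the trailing contexts and their debugfs directories, while
+ * growing creates new 'ctx<N>' directories under the 'damon' debugfs root and
+ * new contexts for them.  The resizing is not allowed while any kdamond is
+ * running.
+ */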
+static ssize_t debugfs_nr_contexts_write(struct file *file,
+ const char __user *buf, size_t count, loff_t *ppos)
+{
+ char *kbuf;
+ ssize_t ret = count;
+ int nr_contexts, i;
+ char dirname[32];
+ struct dentry *root;
+ struct dentry **new_dirs;
+ struct damon_ctx **new_ctxs;
+
+ kbuf = user_input_str(buf, count, ppos);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+ if (sscanf(kbuf, "%d", &nr_contexts) != 1) {
+ ret = -EINVAL;
+ goto out;
+ }
+ if (nr_contexts < 1) {
+ pr_err("nr_contexts should be >=1\n");
+ ret = -EINVAL;
+ goto out;
+ }
+ if (nr_contexts == debugfs_nr_ctxs)
+ goto out;
+
+ mutex_lock(&damon_dbgfs_lock);
+ if (damon_nr_running_ctxs()) {
+ ret = -EBUSY;
+ goto unlock_out;
+ }
+
+ for (i = nr_contexts; i < debugfs_nr_ctxs; i++) {
+ debugfs_remove(debugfs_dirs[i]);
+ damon_destroy_ctx(debugfs_ctxs[i]);
+ }
+
+ new_dirs = kmalloc_array(nr_contexts, sizeof(*new_dirs), GFP_KERNEL);
+ if (!new_dirs) {
+ ret = -ENOMEM;
+ goto unlock_out;
+ }
+
+ new_ctxs = kmalloc_array(nr_contexts, sizeof(*debugfs_ctxs),
+ GFP_KERNEL);
+ if (!new_ctxs) {
+ ret = -ENOMEM;
+ goto unlock_out;
+ }
+
+ for (i = 0; i < debugfs_nr_ctxs && i < nr_contexts; i++) {
+ new_dirs[i] = debugfs_dirs[i];
+ new_ctxs[i] = debugfs_ctxs[i];
+ }
+ kfree(debugfs_dirs);
+ debugfs_dirs = new_dirs;
+ kfree(debugfs_ctxs);
+ debugfs_ctxs = new_ctxs;
+
+ root = debugfs_dirs[0];
+ if (!root) {
+ ret = -ENOENT;
+ goto unlock_out;
+ }
+
+ for (i = debugfs_nr_ctxs; i < nr_contexts; i++) {
+ scnprintf(dirname, sizeof(dirname), "ctx%d", i);
+ debugfs_dirs[i] = debugfs_create_dir(dirname, root);
+ if (!debugfs_dirs[i]) {
+ pr_err("dir %s creation failed\n", dirname);
+ ret = -ENOMEM;
+ break;
+ }
+
+ debugfs_ctxs[i] = damon_new_ctx();
+ if (!debugfs_ctxs[i]) {
+ pr_err("ctx for %s creation failed\n", dirname);
+ ret = -ENOMEM;
+ break;
+ }
+ damon_set_vaddr_primitives(debugfs_ctxs[i]);
+
+ if (debugfs_fill_ctx_dir(debugfs_dirs[i], debugfs_ctxs[i])) {
+ ret = -ENOMEM;
+ break;
+ }
+ }
+
+ debugfs_nr_ctxs = i;
+
+unlock_out:
+ mutex_unlock(&damon_dbgfs_lock);
+
+out:
+ kfree(kbuf);
+ return ret;
+}
+
+static int damon_debugfs_open(struct inode *inode, struct file *file)
+{
+ file->private_data = inode->i_private;
+
+ return nonseekable_open(inode, file);
+}
+
+static const struct file_operations monitor_on_fops = {
+ .owner = THIS_MODULE,
+ .read = debugfs_monitor_on_read,
+ .write = debugfs_monitor_on_write,
+};
+
+static const struct file_operations target_ids_fops = {
+ .owner = THIS_MODULE,
+ .open = damon_debugfs_open,
+ .read = debugfs_target_ids_read,
+ .write = debugfs_target_ids_write,
+};
+
+static const struct file_operations schemes_fops = {
+ .owner = THIS_MODULE,
+ .open = damon_debugfs_open,
+ .read = debugfs_schemes_read,
+ .write = debugfs_schemes_write,
+};
+
+static const struct file_operations record_fops = {
+ .owner = THIS_MODULE,
+ .open = damon_debugfs_open,
+ .read = debugfs_record_read,
+ .write = debugfs_record_write,
+};
+
+static const struct file_operations init_regions_fops = {
+ .owner = THIS_MODULE,
+ .open = damon_debugfs_open,
+ .read = debugfs_init_regions_read,
+ .write = debugfs_init_regions_write,
+};
+
+static const struct file_operations attrs_fops = {
+ .owner = THIS_MODULE,
+ .open = damon_debugfs_open,
+ .read = debugfs_attrs_read,
+ .write = debugfs_attrs_write,
+};
+
+static const struct file_operations nr_contexts_fops = {
+ .owner = THIS_MODULE,
+ .read = debugfs_nr_contexts_read,
+ .write = debugfs_nr_contexts_write,
+};
+
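+/*
+ * Populate the given debugfs directory with the files for the given context
+ */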
+static int debugfs_fill_ctx_dir(struct dentry *dir, struct damon_ctx *ctx)
+{
+ const char * const file_names[] = {"attrs", "init_regions", "record",
+ "schemes", "target_ids"};
+ const struct file_operations *fops[] = {&attrs_fops,
+ &init_regions_fops, &record_fops, &schemes_fops,
+ &target_ids_fops};
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(file_names); i++) {
+ if (!debugfs_create_file(file_names[i], 0600, dir,
+ ctx, fops[i])) {
+ pr_err("failed to create %s file\n", file_names[i]);
+ return -ENOMEM;
+ }
+ }
+
+ return 0;
+}
+
+static int __init damon_debugfs_init(void)
+{
+ struct dentry *debugfs_root;
+ const char * const file_names[] = {"nr_contexts", "monitor_on"};
+ const struct file_operations *fops[] = {&nr_contexts_fops,
+ &monitor_on_fops};
+ int i;
+
+ debugfs_root = debugfs_create_dir("damon", NULL);
+ if (!debugfs_root) {
+ pr_err("failed to create the debugfs dir\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < ARRAY_SIZE(file_names); i++) {
+ if (!debugfs_create_file(file_names[i], 0600, debugfs_root,
+ NULL, fops[i])) {
+ pr_err("failed to create %s file\n", file_names[i]);
+ return -ENOMEM;
+ }
+ }
+ debugfs_fill_ctx_dir(debugfs_root, debugfs_ctxs[0]);
+
+	debugfs_dirs = kmalloc_array(1, sizeof(debugfs_root), GFP_KERNEL);
+	if (!debugfs_dirs)
+		return -ENOMEM;
+	debugfs_dirs[0] = debugfs_root;
+
+ return 0;
+}
+
+/*
+ * Functions for the initialization
+ */
+
+static int __init damon_dbgfs_init(void)
+{
+ int rc;
+
+	debugfs_ctxs = kmalloc(sizeof(*debugfs_ctxs), GFP_KERNEL);
+	if (!debugfs_ctxs)
+		return -ENOMEM;
+	debugfs_ctxs[0] = damon_new_ctx();
+ if (!debugfs_ctxs[0])
+ return -ENOMEM;
+ damon_set_vaddr_primitives(debugfs_ctxs[0]);
+
+ rc = damon_debugfs_init();
+ if (rc)
+ pr_err("%s: debugfs init failed\n", __func__);
+
+ return rc;
+}
+
+module_init(damon_dbgfs_init);
+
+#include "dbgfs-test.h"
diff --git a/mm/damon/primitives-test.h b/mm/damon/primitives-test.h
new file mode 100644
index 000000000000..04de76367e70
--- /dev/null
+++ b/mm/damon/primitives-test.h
@@ -0,0 +1,328 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Data Access Monitor Unit Tests
+ *
+ * Copyright 2019 Amazon.com, Inc. or its affiliates. All rights reserved.
+ *
+ * Author: SeongJae Park <[email protected]>
+ */
+
+#ifdef CONFIG_DAMON_PRIMITIVES_KUNIT_TEST
+
+#ifndef _DAMON_PRIMITIVES_TEST_H
+#define _DAMON_PRIMITIVES_TEST_H
+
+#include <kunit/test.h>
+
+static void __link_vmas(struct vm_area_struct *vmas, ssize_t nr_vmas)
+{
+ int i, j;
+ unsigned long largest_gap, gap;
+
+ if (!nr_vmas)
+ return;
+
+ for (i = 0; i < nr_vmas - 1; i++) {
+ vmas[i].vm_next = &vmas[i + 1];
+
+ vmas[i].vm_rb.rb_left = NULL;
+ vmas[i].vm_rb.rb_right = &vmas[i + 1].vm_rb;
+
+ largest_gap = 0;
+ for (j = i; j < nr_vmas; j++) {
+ if (j == 0)
+ continue;
+ gap = vmas[j].vm_start - vmas[j - 1].vm_end;
+ if (gap > largest_gap)
+ largest_gap = gap;
+ }
+ vmas[i].rb_subtree_gap = largest_gap;
+ }
+ vmas[i].vm_next = NULL;
+ vmas[i].vm_rb.rb_right = NULL;
+ vmas[i].rb_subtree_gap = 0;
+}
+
+/*
+ * Test damon_three_regions_in_vmas() function
+ *
+ * In case of virtual memory address spaces monitoring, DAMON converts the
+ * complex and dynamic memory mappings of each target task to three
+ * discontiguous regions which cover every mapped areas. However, the three
+ * regions should not include the two biggest unmapped areas in the original
+ * mapping, because the two biggest areas are normally the areas between 1)
+ * heap and the mmap()-ed regions, and 2) the mmap()-ed regions and stack.
+ * Because these two unmapped areas are very huge but obviously never accessed,
+ * covering these areas is just a waste.
+ *
+ * 'damon_three_regions_in_vmas()' receives an address space of a process. It
+ * first identifies the start of mappings, end of mappings, and the two biggest
+ * unmapped areas. After that, based on the information, it constructs the
+ * three regions and returns. For more detail, refer to the comment of the
+ * 'damon_init_vm_regions_of()' function definition in 'mm/damon/primitives.c'.
+ *
+ * For example, suppose virtual address ranges of 10-20, 20-25, 200-210,
+ * 210-220, 300-305, and 307-330 (other comments represent these mappings in
+ * the shorter form: 10-20-25, 200-210-220, 300-305, 307-330) of a process are
+ * mapped. To cover every mapping, the three regions should start with 10 and
+ * end with 330. The process also has three unmapped areas, 25-200, 220-300,
+ * and 305-307. Among those, 25-200 and 220-300 are the biggest two unmapped
+ * areas, and thus the mappings should be converted to three regions of 10-25,
+ * 200-220, and 300-330.
+ */
+static void damon_test_three_regions_in_vmas(struct kunit *test)
+{
+ struct damon_addr_range regions[3] = {0,};
+ /* 10-20-25, 200-210-220, 300-305, 307-330 */
+ struct vm_area_struct vmas[] = {
+ (struct vm_area_struct) {.vm_start = 10, .vm_end = 20},
+ (struct vm_area_struct) {.vm_start = 20, .vm_end = 25},
+ (struct vm_area_struct) {.vm_start = 200, .vm_end = 210},
+ (struct vm_area_struct) {.vm_start = 210, .vm_end = 220},
+ (struct vm_area_struct) {.vm_start = 300, .vm_end = 305},
+ (struct vm_area_struct) {.vm_start = 307, .vm_end = 330},
+ };
+
+ __link_vmas(vmas, 6);
+
+ damon_three_regions_in_vmas(&vmas[0], regions);
+
+ KUNIT_EXPECT_EQ(test, 10ul, regions[0].start);
+ KUNIT_EXPECT_EQ(test, 25ul, regions[0].end);
+ KUNIT_EXPECT_EQ(test, 200ul, regions[1].start);
+ KUNIT_EXPECT_EQ(test, 220ul, regions[1].end);
+ KUNIT_EXPECT_EQ(test, 300ul, regions[2].start);
+ KUNIT_EXPECT_EQ(test, 330ul, regions[2].end);
+}
+
+static struct damon_region *__nth_region_of(struct damon_target *t, int idx)
+{
+ struct damon_region *r;
+ unsigned int i = 0;
+
+ damon_for_each_region(r, t) {
+ if (i++ == idx)
+ return r;
+ }
+
+ return NULL;
+}
+
+/*
+ * Test 'damon_apply_three_regions()'
+ *
+ * test kunit object
+ * regions an array containing start/end addresses of current
+ * monitoring target regions
+ * nr_regions the number of the addresses in 'regions'
+ * three_regions The three regions that need to be applied now
+ * expected start/end addresses of monitoring target regions that
+ * 'three_regions' are applied
+ * nr_expected the number of addresses in 'expected'
+ *
+ * The memory mapping of the target processes changes dynamically. To follow
+ * the change, DAMON periodically reads the mappings, simplifies it to the
+ * three regions, and updates the monitoring target regions to fit in the three
+ * regions. The update of current target regions is the role of
+ * 'damon_apply_three_regions()'.
+ *
+ * This test passes the given target regions and the new three regions that
+ * need to be applied to the function and check whether it updates the regions
+ * as expected.
+ */
+static void damon_do_test_apply_three_regions(struct kunit *test,
+ unsigned long *regions, int nr_regions,
+ struct damon_addr_range *three_regions,
+ unsigned long *expected, int nr_expected)
+{
+ struct damon_ctx *ctx = damon_new_ctx();
+ struct damon_target *t;
+ struct damon_region *r;
+ int i;
+
+ t = damon_new_target(42);
+ for (i = 0; i < nr_regions / 2; i++) {
+ r = damon_new_region(regions[i * 2], regions[i * 2 + 1]);
+ damon_add_region(r, t);
+ }
+ damon_add_target(ctx, t);
+
+ damon_apply_three_regions(ctx, t, three_regions);
+
+ for (i = 0; i < nr_expected / 2; i++) {
+ r = __nth_region_of(t, i);
+ KUNIT_EXPECT_EQ(test, r->ar.start, expected[i * 2]);
+ KUNIT_EXPECT_EQ(test, r->ar.end, expected[i * 2 + 1]);
+ }
+
+ damon_destroy_ctx(ctx);
+}
+
+/*
+ * This function tests the most common case, where the three big regions are
+ * only slightly changed. Target regions should adjust their boundaries
+ * (10-20-30, 50-55, 70-80, 90-100) to fit with the new big regions, or remove
+ * target regions (57-79) that are now out of the three regions.
+ */
+static void damon_test_apply_three_regions1(struct kunit *test)
+{
+ /* 10-20-30, 50-55-57-59, 70-80-90-100 */
+ unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
+ 70, 80, 80, 90, 90, 100};
+ /* 5-27, 45-55, 73-104 */
+ struct damon_addr_range new_three_regions[3] = {
+ (struct damon_addr_range){.start = 5, .end = 27},
+ (struct damon_addr_range){.start = 45, .end = 55},
+ (struct damon_addr_range){.start = 73, .end = 104} };
+ /* 5-20-27, 45-55, 73-80-90-104 */
+ unsigned long expected[] = {5, 20, 20, 27, 45, 55,
+ 73, 80, 80, 90, 90, 104};
+
+ damon_do_test_apply_three_regions(test, regions, ARRAY_SIZE(regions),
+ new_three_regions, expected, ARRAY_SIZE(expected));
+}
+
+/*
+ * Test a slightly bigger change. Similar to above, but the second big region
+ * now requires two target regions (50-55, 57-59) to be removed.
+ */
+static void damon_test_apply_three_regions2(struct kunit *test)
+{
+ /* 10-20-30, 50-55-57-59, 70-80-90-100 */
+ unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
+ 70, 80, 80, 90, 90, 100};
+ /* 5-27, 56-57, 65-104 */
+ struct damon_addr_range new_three_regions[3] = {
+ (struct damon_addr_range){.start = 5, .end = 27},
+ (struct damon_addr_range){.start = 56, .end = 57},
+ (struct damon_addr_range){.start = 65, .end = 104} };
+ /* 5-20-27, 56-57, 65-80-90-104 */
+ unsigned long expected[] = {5, 20, 20, 27, 56, 57,
+ 65, 80, 80, 90, 90, 104};
+
+ damon_do_test_apply_three_regions(test, regions, ARRAY_SIZE(regions),
+ new_three_regions, expected, ARRAY_SIZE(expected));
+}
+
+/*
+ * Test a big change. The second big region has been totally freed and mapped
+ * to a different area (50-59 -> 61-63). The target regions which were in the
+ * old second big region (50-55-57-59) should be removed, and a new target
+ * region covering the new second big region (61-63) should be created.
+ */
+static void damon_test_apply_three_regions3(struct kunit *test)
+{
+ /* 10-20-30, 50-55-57-59, 70-80-90-100 */
+ unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
+ 70, 80, 80, 90, 90, 100};
+ /* 5-27, 61-63, 65-104 */
+ struct damon_addr_range new_three_regions[3] = {
+ (struct damon_addr_range){.start = 5, .end = 27},
+ (struct damon_addr_range){.start = 61, .end = 63},
+ (struct damon_addr_range){.start = 65, .end = 104} };
+ /* 5-20-27, 61-63, 65-80-90-104 */
+ unsigned long expected[] = {5, 20, 20, 27, 61, 63,
+ 65, 80, 80, 90, 90, 104};
+
+ damon_do_test_apply_three_regions(test, regions, ARRAY_SIZE(regions),
+ new_three_regions, expected, ARRAY_SIZE(expected));
+}
+
+/*
+ * Test another big change. Both of the second and third big regions (50-59
+ * and 70-100) have been totally freed and mapped to different areas (30-32 and
+ * 65-68). The target regions which were in the old second and third big
+ * regions should now be removed, and new target regions covering the new
+ * second and third big regions should be created.
+ */
+static void damon_test_apply_three_regions4(struct kunit *test)
+{
+ /* 10-20-30, 50-55-57-59, 70-80-90-100 */
+ unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
+ 70, 80, 80, 90, 90, 100};
+ /* 5-7, 30-32, 65-68 */
+ struct damon_addr_range new_three_regions[3] = {
+ (struct damon_addr_range){.start = 5, .end = 7},
+ (struct damon_addr_range){.start = 30, .end = 32},
+ (struct damon_addr_range){.start = 65, .end = 68} };
+ /* expect 5-7, 30-32, 65-68 */
+ unsigned long expected[] = {5, 7, 30, 32, 65, 68};
+
+ damon_do_test_apply_three_regions(test, regions, ARRAY_SIZE(regions),
+ new_three_regions, expected, ARRAY_SIZE(expected));
+}
+
+static void damon_test_split_evenly(struct kunit *test)
+{
+ struct damon_ctx *c = damon_new_ctx();
+ struct damon_target *t;
+ struct damon_region *r;
+ unsigned long i;
+
+ KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, NULL, 5), -EINVAL);
+
+ t = damon_new_target(42);
+ r = damon_new_region(0, 100);
+ KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, r, 0), -EINVAL);
+
+ damon_add_region(r, t);
+ KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, r, 10), 0);
+ KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 10u);
+
+ i = 0;
+ damon_for_each_region(r, t) {
+ KUNIT_EXPECT_EQ(test, r->ar.start, i++ * 10);
+ KUNIT_EXPECT_EQ(test, r->ar.end, i * 10);
+ }
+ damon_free_target(t);
+
+ t = damon_new_target(42);
+ r = damon_new_region(5, 59);
+ damon_add_region(r, t);
+ KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, r, 5), 0);
+ KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 5u);
+
+ i = 0;
+ damon_for_each_region(r, t) {
+ if (i == 4)
+ break;
+ KUNIT_EXPECT_EQ(test, r->ar.start, 5 + 10 * i++);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 5 + 10 * i);
+ }
+ KUNIT_EXPECT_EQ(test, r->ar.start, 5 + 10 * i);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 59ul);
+ damon_free_target(t);
+
+ t = damon_new_target(42);
+ r = damon_new_region(5, 6);
+ damon_add_region(r, t);
+ KUNIT_EXPECT_EQ(test, damon_split_region_evenly(c, r, 2), -EINVAL);
+ KUNIT_EXPECT_EQ(test, damon_nr_regions(t), 1u);
+
+ damon_for_each_region(r, t) {
+ KUNIT_EXPECT_EQ(test, r->ar.start, 5ul);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 6ul);
+ }
+ damon_free_target(t);
+ damon_destroy_ctx(c);
+}
+
+static struct kunit_case damon_test_cases[] = {
+ KUNIT_CASE(damon_test_three_regions_in_vmas),
+ KUNIT_CASE(damon_test_apply_three_regions1),
+ KUNIT_CASE(damon_test_apply_three_regions2),
+ KUNIT_CASE(damon_test_apply_three_regions3),
+ KUNIT_CASE(damon_test_apply_three_regions4),
+ KUNIT_CASE(damon_test_split_evenly),
+ {},
+};
+
+static struct kunit_suite damon_test_suite = {
+ .name = "damon-primitives",
+ .test_cases = damon_test_cases,
+};
+kunit_test_suite(damon_test_suite);
+
+#endif /* _DAMON_PRIMITIVES_TEST_H */
+
+#endif /* CONFIG_DAMON_PRIMITIVES_KUNIT_TEST */
diff --git a/mm/damon/primitives.c b/mm/damon/primitives.c
new file mode 100644
index 000000000000..d7796cbffbd8
--- /dev/null
+++ b/mm/damon/primitives.c
@@ -0,0 +1,811 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Data Access Monitoring Low Level Primitives
+ *
+ * Author: SeongJae Park <[email protected]>
+ *
+ * This file is organized into the below parts:
+ *
+ * - Functions for the initial monitoring target regions construction
+ * - Functions for the dynamic monitoring target regions update
+ * - Functions for the access checking of the regions
+ * - Functions for the target validity check and cleanup
+ */
+
+#define pr_fmt(fmt) "damon: " fmt
+
+#include <asm-generic/mman-common.h>
+#include <linux/damon.h>
+#include <linux/memory_hotplug.h>
+#include <linux/mm.h>
+#include <linux/mmu_notifier.h>
+#include <linux/module.h>
+#include <linux/page_idle.h>
+#include <linux/pagemap.h>
+#include <linux/random.h>
+#include <linux/rmap.h>
+#include <linux/sched/mm.h>
+#include <linux/sched/task.h>
+#include <linux/slab.h>
+
+#include "damon.h"
+
+/* Minimal region size. Every damon_region is aligned by this. */
+#ifndef CONFIG_DAMON_PRIMITIVES_KUNIT_TEST
+#define MIN_REGION PAGE_SIZE
+#else
+#define MIN_REGION 1
+#endif
+
+/*
+ * Functions for the initial monitoring target regions construction
+ */
+
+/*
+ * Get the mm_struct of the given target
+ *
+ * Caller _must_ put the mm_struct after use, unless it is NULL.
+ *
+ * Returns the mm_struct of the target on success, NULL on failure
+ */
+struct mm_struct *damon_get_mm(struct damon_target *t)
+{
+ struct task_struct *task;
+ struct mm_struct *mm;
+
+ task = damon_get_task_struct(t);
+ if (!task)
+ return NULL;
+
+ mm = get_task_mm(task);
+ put_task_struct(task);
+ return mm;
+}
+
+/*
+ * Size-evenly split a region into 'nr_pieces' small regions
+ *
+ * Returns 0 on success, or negative error code otherwise.
+ */
+static int damon_split_region_evenly(struct damon_ctx *ctx,
+ struct damon_region *r, unsigned int nr_pieces)
+{
+ unsigned long sz_orig, sz_piece, orig_end;
+ struct damon_region *n = NULL, *next;
+ unsigned long start;
+
+ if (!r || !nr_pieces)
+ return -EINVAL;
+
+ orig_end = r->ar.end;
+ sz_orig = r->ar.end - r->ar.start;
+ sz_piece = ALIGN_DOWN(sz_orig / nr_pieces, MIN_REGION);
+
+ if (!sz_piece)
+ return -EINVAL;
+
+ r->ar.end = r->ar.start + sz_piece;
+ next = damon_next_region(r);
+ for (start = r->ar.end; start + sz_piece <= orig_end;
+ start += sz_piece) {
+ n = damon_new_region(start, start + sz_piece);
+ if (!n)
+ return -ENOMEM;
+ damon_insert_region(n, r, next);
+ r = n;
+ }
+ /* complement last region for possible rounding error */
+ if (n)
+ n->ar.end = orig_end;
+
+ return 0;
+}
+
+static unsigned long sz_range(struct damon_addr_range *r)
+{
+ return r->end - r->start;
+}
+
+static void swap_ranges(struct damon_addr_range *r1,
+ struct damon_addr_range *r2)
+{
+ struct damon_addr_range tmp;
+
+ tmp = *r1;
+ *r1 = *r2;
+ *r2 = tmp;
+}
+
+/*
+ * Find three regions separated by two biggest unmapped regions
+ *
+ * vma the head vma of the target address space
+ * regions	an array of three address ranges in which the results are saved
+ *
+ * This function receives an address space and finds three regions in it which
+ * are separated by the two biggest unmapped regions in the space. Please refer
+ * to the comments of the 'damon_init_vm_regions_of()' function below to know
+ * why this is necessary.
+ *
+ * Returns 0 on success, or negative error code otherwise.
+ */
+static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
+ struct damon_addr_range regions[3])
+{
+ struct damon_addr_range gap = {0}, first_gap = {0}, second_gap = {0};
+ struct vm_area_struct *last_vma = NULL;
+ unsigned long start = 0;
+ struct rb_root rbroot;
+
+ /* Find two biggest gaps so that first_gap > second_gap > others */
+ for (; vma; vma = vma->vm_next) {
+ if (!last_vma) {
+ start = vma->vm_start;
+ goto next;
+ }
+
+ if (vma->rb_subtree_gap <= sz_range(&second_gap)) {
+ rbroot.rb_node = &vma->vm_rb;
+ vma = rb_entry(rb_last(&rbroot),
+ struct vm_area_struct, vm_rb);
+ goto next;
+ }
+
+ gap.start = last_vma->vm_end;
+ gap.end = vma->vm_start;
+ if (sz_range(&gap) > sz_range(&second_gap)) {
+ swap_ranges(&gap, &second_gap);
+ if (sz_range(&second_gap) > sz_range(&first_gap))
+ swap_ranges(&second_gap, &first_gap);
+ }
+next:
+ last_vma = vma;
+ }
+
+ if (!sz_range(&second_gap) || !sz_range(&first_gap))
+ return -EINVAL;
+
+ /* Sort the two biggest gaps by address */
+ if (first_gap.start > second_gap.start)
+ swap_ranges(&first_gap, &second_gap);
+
+ /* Store the result */
+ regions[0].start = ALIGN(start, MIN_REGION);
+ regions[0].end = ALIGN(first_gap.start, MIN_REGION);
+ regions[1].start = ALIGN(first_gap.end, MIN_REGION);
+ regions[1].end = ALIGN(second_gap.start, MIN_REGION);
+ regions[2].start = ALIGN(second_gap.end, MIN_REGION);
+ regions[2].end = ALIGN(last_vma->vm_end, MIN_REGION);
+
+ return 0;
+}
+
+/*
+ * Get the three regions in the given target (task)
+ *
+ * Returns 0 on success, negative error code otherwise.
+ */
+static int damon_three_regions_of(struct damon_target *t,
+ struct damon_addr_range regions[3])
+{
+ struct mm_struct *mm;
+ int rc;
+
+ mm = damon_get_mm(t);
+ if (!mm)
+ return -EINVAL;
+
+ mmap_read_lock(mm);
+ rc = damon_three_regions_in_vmas(mm->mmap, regions);
+ mmap_read_unlock(mm);
+
+ mmput(mm);
+ return rc;
+}
+
+/*
+ * Initialize the monitoring target regions for the given target (task)
+ *
+ * t the given target
+ *
+ * Because only small portions of the entire address space are actually mapped
+ * to memory and accessed, monitoring the unmapped regions is wasteful.  On
+ * the other hand, because we can tolerate a small amount of noise, tracking
+ * every mapping is not strictly required; it could even incur a high overhead
+ * if the mappings frequently change or the number of mappings is high.  The
+ * adaptive regions adjustment mechanism will further help to deal with the
+ * noise by simply identifying the unmapped areas as regions that have no
+ * access.  Moreover, applying the real mappings, which would have many
+ * unmapped areas inside, would make the adaptive mechanism quite complex.
+ * Nonetheless, too huge unmapped areas inside the monitoring target should be
+ * removed so that the adaptive mechanism does not waste time on them.
+ *
+ * For this reason, we convert the complex mappings to three distinct regions
+ * that cover every mapped area of the address space. Also the two gaps
+ * between the three regions are the two biggest unmapped areas in the given
+ * address space. In detail, this function first identifies the start and the
+ * end of the mappings and the two biggest unmapped areas of the address space.
+ * Then, it constructs the three regions as below:
+ *
+ * [mappings[0]->start, big_two_unmapped_areas[0]->start)
+ * [big_two_unmapped_areas[0]->end, big_two_unmapped_areas[1]->start)
+ * [big_two_unmapped_areas[1]->end, mappings[nr_mappings - 1]->end)
+ *
+ * As the usual memory map of processes is as below, the gap between the heap
+ * and the uppermost mmap()-ed region, and the gap between the lowermost
+ * mmap()-ed region and the stack, will be the two biggest unmapped regions.
+ * Because these gaps are exceptionally huge in usual address spaces,
+ * excluding these two biggest unmapped regions will be sufficient as a
+ * trade-off.
+ *
+ * <heap>
+ * <BIG UNMAPPED REGION 1>
+ * <uppermost mmap()-ed region>
+ * (other mmap()-ed regions and small unmapped regions)
+ * <lowermost mmap()-ed region>
+ * <BIG UNMAPPED REGION 2>
+ * <stack>
+ */
+static void damon_init_vm_regions_of(struct damon_ctx *c,
+ struct damon_target *t)
+{
+ struct damon_region *r;
+ struct damon_addr_range regions[3];
+ unsigned long sz = 0, nr_pieces;
+ int i;
+
+ if (damon_three_regions_of(t, regions)) {
+ pr_err("Failed to get three regions of target %lu\n", t->id);
+ return;
+ }
+
+ for (i = 0; i < 3; i++)
+ sz += regions[i].end - regions[i].start;
+ if (c->min_nr_regions)
+ sz /= c->min_nr_regions;
+ if (sz < MIN_REGION)
+ sz = MIN_REGION;
+
+ /* Set the initial three regions of the target */
+ for (i = 0; i < 3; i++) {
+ r = damon_new_region(regions[i].start, regions[i].end);
+ if (!r) {
+ pr_err("%d'th init region creation failed\n", i);
+ return;
+ }
+ damon_add_region(r, t);
+
+ nr_pieces = (regions[i].end - regions[i].start) / sz;
+ damon_split_region_evenly(c, r, nr_pieces);
+ }
+}
+
+/* Initialize '->regions_list' of every target (task) */
+void kdamond_init_vm_regions(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+
+ damon_for_each_target(t, ctx) {
+ /* the user may set the target regions as they want */
+ if (!damon_nr_regions(t))
+ damon_init_vm_regions_of(ctx, t);
+ }
+}
+
+/*
+ * The initial regions construction function for the physical address space.
+ *
+ * This default version does nothing in practice.  Users should set the
+ * initial regions by themselves before passing their damon_ctx to
+ * 'damon_start()', or implement their own version of this callback and set
+ * '->init_target_regions' of their damon_ctx to point to it.
+ */
+void kdamond_init_phys_regions(struct damon_ctx *ctx)
+{
+}
+
+/*
+ * Functions for the dynamic monitoring target regions update
+ */
+
+/*
+ * Check whether a region is intersecting an address range
+ *
+ * Returns true if it is.
+ */
+static bool damon_intersect(struct damon_region *r, struct damon_addr_range *re)
+{
+ return !(r->ar.end <= re->start || re->end <= r->ar.start);
+}
+
+/*
+ * Update damon regions for the three big regions of the given target
+ *
+ * t the given target
+ * bregions the three big regions of the target
+ */
+static void damon_apply_three_regions(struct damon_ctx *ctx,
+ struct damon_target *t, struct damon_addr_range bregions[3])
+{
+ struct damon_region *r, *next;
+ unsigned int i = 0;
+
+ /* Remove regions which are not in the three big regions now */
+ damon_for_each_region_safe(r, next, t) {
+ for (i = 0; i < 3; i++) {
+ if (damon_intersect(r, &bregions[i]))
+ break;
+ }
+ if (i == 3)
+ damon_destroy_region(r);
+ }
+
+ /* Adjust intersecting regions to fit with the three big regions */
+ for (i = 0; i < 3; i++) {
+ struct damon_region *first = NULL, *last;
+ struct damon_region *newr;
+ struct damon_addr_range *br;
+
+ br = &bregions[i];
+		/* Get the first and last regions which intersect with br */
+ damon_for_each_region(r, t) {
+ if (damon_intersect(r, br)) {
+ if (!first)
+ first = r;
+ last = r;
+ }
+ if (r->ar.start >= br->end)
+ break;
+ }
+ if (!first) {
+ /* no damon_region intersects with this big region */
+ newr = damon_new_region(
+ ALIGN_DOWN(br->start, MIN_REGION),
+ ALIGN(br->end, MIN_REGION));
+ if (!newr)
+ continue;
+ damon_insert_region(newr, damon_prev_region(r), r);
+ } else {
+ first->ar.start = ALIGN_DOWN(br->start, MIN_REGION);
+ last->ar.end = ALIGN(br->end, MIN_REGION);
+ }
+ }
+}
+
+/*
+ * Update regions for current memory mappings
+ */
+void kdamond_update_vm_regions(struct damon_ctx *ctx)
+{
+ struct damon_addr_range three_regions[3];
+ struct damon_target *t;
+
+ damon_for_each_target(t, ctx) {
+ if (damon_three_regions_of(t, three_regions))
+ continue;
+ damon_apply_three_regions(ctx, t, three_regions);
+ }
+}
+
+/*
+ * The dynamic monitoring target regions update function for the physical
+ * address space.
+ *
+ * This default version does nothing in practice.  Users should update the
+ * regions in other callbacks such as '->aggregate_cb', or implement their own
+ * version of this callback and set '->update_target_regions' of their
+ * damon_ctx to point to it.
+ */
+void kdamond_update_phys_regions(struct damon_ctx *ctx)
+{
+}
+
+/*
+ * Functions for the access checking of the regions
+ */
+
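+/*
+ * Clear the Accessed bit of the pte that maps @addr in @mm and mark the
+ * mapped page as idle (PG_idle).  A found reference, including one reported
+ * by the mmu notifiers, is preserved via PG_young.
+ */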
+static void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm,
+ unsigned long addr)
+{
+ bool referenced = false;
+ struct page *page = pte_page(*pte);
+
+ if (pte_young(*pte)) {
+ referenced = true;
+ *pte = pte_mkold(*pte);
+ }
+
+#ifdef CONFIG_MMU_NOTIFIER
+ if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE))
+ referenced = true;
+#endif /* CONFIG_MMU_NOTIFIER */
+
+ if (referenced)
+ set_page_young(page);
+
+ set_page_idle(page);
+}
+
+static void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm,
+ unsigned long addr)
+{
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ bool referenced = false;
+ struct page *page = pmd_page(*pmd);
+
+ if (pmd_young(*pmd)) {
+ referenced = true;
+ *pmd = pmd_mkold(*pmd);
+ }
+
+#ifdef CONFIG_MMU_NOTIFIER
+ if (mmu_notifier_clear_young(mm, addr,
+ addr + ((1UL) << HPAGE_PMD_SHIFT)))
+ referenced = true;
+#endif /* CONFIG_MMU_NOTIFIER */
+
+ if (referenced)
+ set_page_young(page);
+
+ set_page_idle(page);
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+}
+
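+/*
+ * Clear the Accessed bit of the pte or pmd that maps @addr in @mm, so that a
+ * following access to the address can be detected by 'damon_young()'.
+ */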
+static void damon_mkold(struct mm_struct *mm, unsigned long addr)
+{
+ pte_t *pte = NULL;
+ pmd_t *pmd = NULL;
+ spinlock_t *ptl;
+
+ if (follow_pte_pmd(mm, addr, NULL, &pte, &pmd, &ptl))
+ return;
+
+ if (pte) {
+ damon_ptep_mkold(pte, mm, addr);
+ pte_unmap_unlock(pte, ptl);
+ } else {
+ damon_pmdp_mkold(pmd, mm, addr);
+ spin_unlock(ptl);
+ }
+}
+
+static void damon_prepare_vm_access_check(struct damon_ctx *ctx,
+ struct mm_struct *mm, struct damon_region *r)
+{
+ r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
+
+ damon_mkold(mm, r->sampling_addr);
+}
+
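+/*
+ * Prepare the access checks of the virtual address space targets by picking
+ * one sampling address per region and clearing the Accessed bit of the
+ * mapping of the address.
+ */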
+void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ struct mm_struct *mm;
+ struct damon_region *r;
+
+ damon_for_each_target(t, ctx) {
+ mm = damon_get_mm(t);
+ if (!mm)
+ continue;
+ damon_for_each_region(r, t)
+ damon_prepare_vm_access_check(ctx, mm, r);
+ mmput(mm);
+ }
+}
+
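+/*
+ * Check whether the page that maps @addr in @mm has been accessed since the
+ * last call to 'damon_mkold()' for the address.  If the check was made, the
+ * size of the checked page is stored in @page_sz.
+ *
+ * Returns true if an access is found, false otherwise.
+ */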
+static bool damon_young(struct mm_struct *mm, unsigned long addr,
+ unsigned long *page_sz)
+{
+ pte_t *pte = NULL;
+ pmd_t *pmd = NULL;
+ spinlock_t *ptl;
+ bool young = false;
+
+ if (follow_pte_pmd(mm, addr, NULL, &pte, &pmd, &ptl))
+ return false;
+
+ *page_sz = PAGE_SIZE;
+ if (pte) {
+ young = pte_young(*pte);
+ if (!young)
+ young = !page_is_idle(pte_page(*pte));
+ pte_unmap_unlock(pte, ptl);
+ return young;
+ }
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ young = pmd_young(*pmd);
+ if (!young)
+ young = !page_is_idle(pmd_page(*pmd));
+ spin_unlock(ptl);
+ *page_sz = ((1UL) << HPAGE_PMD_SHIFT);
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+ return young;
+}
+
+/*
+ * Check whether the region was accessed after the last preparation
+ *
+ * mm 'mm_struct' for the given virtual address space
+ * r the region to be checked
+ */
+static void damon_check_vm_access(struct damon_ctx *ctx,
+ struct mm_struct *mm, struct damon_region *r)
+{
+ static struct mm_struct *last_mm;
+ static unsigned long last_addr;
+ static unsigned long last_page_sz = PAGE_SIZE;
+ static bool last_accessed;
+
+ /* If the region is in the last checked page, reuse the result */
+ if (mm == last_mm && (ALIGN_DOWN(last_addr, last_page_sz) ==
+ ALIGN_DOWN(r->sampling_addr, last_page_sz))) {
+ if (last_accessed)
+ r->nr_accesses++;
+ return;
+ }
+
+ last_accessed = damon_young(mm, r->sampling_addr, &last_page_sz);
+ if (last_accessed)
+ r->nr_accesses++;
+
+ last_mm = mm;
+ last_addr = r->sampling_addr;
+}
+
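+/*
+ * Check the accesses of the virtual address space targets that were prepared
+ * by 'kdamond_prepare_vm_access_checks()' and update the 'nr_accesses' of the
+ * regions accordingly.
+ *
+ * Returns the maximum 'nr_accesses' among all regions.
+ */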
+unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ struct mm_struct *mm;
+ struct damon_region *r;
+ unsigned int max_nr_accesses = 0;
+
+ damon_for_each_target(t, ctx) {
+ mm = damon_get_mm(t);
+ if (!mm)
+ continue;
+ damon_for_each_region(r, t) {
+ damon_check_vm_access(ctx, mm, r);
+ max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
+ }
+ mmput(mm);
+ }
+
+ return max_nr_accesses;
+}
+
+/* access check functions for physical address based regions */
+
+/*
+ * Get a page by pfn if it is in the LRU list. Otherwise, returns NULL.
+ *
+ * The body of this function is stolen from 'page_idle_get_page()'.  We steal
+ * it rather than reuse it because the code is quite simple.
+ */
+static struct page *damon_phys_get_page(unsigned long pfn)
+{
+ struct page *page = pfn_to_online_page(pfn);
+ pg_data_t *pgdat;
+
+ if (!page || !PageLRU(page) ||
+ !get_page_unless_zero(page))
+ return NULL;
+
+ pgdat = page_pgdat(page);
+ spin_lock_irq(&pgdat->lru_lock);
+ if (unlikely(!PageLRU(page))) {
+ put_page(page);
+ page = NULL;
+ }
+ spin_unlock_irq(&pgdat->lru_lock);
+ return page;
+}
+
+static bool damon_page_mkold(struct page *page, struct vm_area_struct *vma,
+ unsigned long addr, void *arg)
+{
+ damon_mkold(vma->vm_mm, addr);
+ return true;
+}
+
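+/*
+ * Clear the Accessed bits of the page frame of the given physical address by
+ * walking the rmap of the page, so that a following access to the page can be
+ * detected by 'damon_phys_young()'.
+ */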
+static void damon_phys_mkold(unsigned long paddr)
+{
+ struct page *page = damon_phys_get_page(PHYS_PFN(paddr));
+ struct rmap_walk_control rwc = {
+ .rmap_one = damon_page_mkold,
+ .anon_lock = page_lock_anon_vma_read,
+ };
+ bool need_lock;
+
+ if (!page)
+ return;
+
+ if (!page_mapped(page) || !page_rmapping(page)) {
+ set_page_idle(page);
+ put_page(page);
+ return;
+ }
+
+ need_lock = !PageAnon(page) || PageKsm(page);
+ if (need_lock && !trylock_page(page)) {
+ put_page(page);
+ return;
+ }
+
+ rmap_walk(page, &rwc);
+
+ if (need_lock)
+ unlock_page(page);
+ put_page(page);
+}
+
+static void damon_prepare_phys_access_check(struct damon_ctx *ctx,
+ struct damon_region *r)
+{
+ r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
+
+ damon_phys_mkold(r->sampling_addr);
+}
+
+void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+
+ damon_for_each_target(t, ctx) {
+ damon_for_each_region(r, t)
+ damon_prepare_phys_access_check(ctx, r);
+ }
+}
+
+struct damon_phys_access_chk_result {
+ unsigned long page_sz;
+ bool accessed;
+};
+
+static bool damon_page_accessed(struct page *page, struct vm_area_struct *vma,
+ unsigned long addr, void *arg)
+{
+ struct damon_phys_access_chk_result *result = arg;
+
+ result->accessed = damon_young(vma->vm_mm, addr, &result->page_sz);
+
+ /* If accessed, stop walking */
+ return !result->accessed;
+}
+
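+/*
+ * Check whether the page frame of the given physical address has been
+ * accessed since the last call to 'damon_phys_mkold()' for the address, by
+ * walking the rmap of the page.  If the check was made, the size of the
+ * checked page is stored in @page_sz.
+ *
+ * Returns true if an access is found, false otherwise.
+ */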
+static bool damon_phys_young(unsigned long paddr, unsigned long *page_sz)
+{
+ struct page *page = damon_phys_get_page(PHYS_PFN(paddr));
+ struct damon_phys_access_chk_result result = {
+ .page_sz = PAGE_SIZE,
+ .accessed = false,
+ };
+ struct rmap_walk_control rwc = {
+ .arg = &result,
+ .rmap_one = damon_page_accessed,
+ .anon_lock = page_lock_anon_vma_read,
+ };
+ bool need_lock;
+
+ if (!page)
+ return false;
+
+ if (!page_mapped(page) || !page_rmapping(page)) {
+		result.accessed = !page_is_idle(page);
+ put_page(page);
+ goto out;
+ }
+
+ need_lock = !PageAnon(page) || PageKsm(page);
+ if (need_lock && !trylock_page(page)) {
+ put_page(page);
+		return false;
+ }
+
+ rmap_walk(page, &rwc);
+
+ if (need_lock)
+ unlock_page(page);
+ put_page(page);
+
+out:
+ *page_sz = result.page_sz;
+ return result.accessed;
+}
+
+/*
+ * Check whether the region was accessed after the last preparation
+ *
+ * r the region of physical address space that needs to be checked
+ */
+static void damon_check_phys_access(struct damon_ctx *ctx,
+ struct damon_region *r)
+{
+ static unsigned long last_addr;
+ static unsigned long last_page_sz = PAGE_SIZE;
+ static bool last_accessed;
+
+ /* If the region is in the last checked page, reuse the result */
+ if (ALIGN_DOWN(last_addr, last_page_sz) ==
+ ALIGN_DOWN(r->sampling_addr, last_page_sz)) {
+ if (last_accessed)
+ r->nr_accesses++;
+ return;
+ }
+
+ last_accessed = damon_phys_young(r->sampling_addr, &last_page_sz);
+ if (last_accessed)
+ r->nr_accesses++;
+
+ last_addr = r->sampling_addr;
+}
+
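+/*
+ * Check the accesses of the physical address space targets that were prepared
+ * by 'kdamond_prepare_phys_access_checks()' and update the 'nr_accesses' of
+ * the regions accordingly.
+ *
+ * Returns the maximum 'nr_accesses' among all regions.
+ */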
+unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+ unsigned int max_nr_accesses = 0;
+
+ damon_for_each_target(t, ctx) {
+ damon_for_each_region(r, t) {
+ damon_check_phys_access(ctx, r);
+ max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
+ }
+ }
+
+ return max_nr_accesses;
+}
+
+/*
+ * Functions for the target validity check and cleanup
+ */
+
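+/*
+ * Check whether the given virtual address space target is still valid, that
+ * is, whether the task of the target still exists.
+ */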
+bool kdamond_vm_target_valid(struct damon_target *t)
+{
+ struct task_struct *task;
+
+ task = damon_get_task_struct(t);
+ if (task) {
+ put_task_struct(task);
+ return true;
+ }
+
+ return false;
+}
+
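+/*
+ * Clean up the virtual address space targets by putting the pid references
+ * they hold and destroying the targets.
+ */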
+void kdamond_vm_cleanup(struct damon_ctx *ctx)
+{
+ struct damon_target *t, *next;
+
+ damon_for_each_target_safe(t, next, ctx) {
+ put_pid((struct pid *)t->id);
+ damon_destroy_target(t);
+ }
+}
+
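+/*
+ * Set the monitoring primitives of @ctx for the virtual address spaces of the
+ * targets.  Users could call this before starting the monitoring with
+ * 'damon_start()'.
+ */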
+void damon_set_vaddr_primitives(struct damon_ctx *ctx)
+{
+ ctx->init_target_regions = kdamond_init_vm_regions;
+ ctx->update_target_regions = kdamond_update_vm_regions;
+ ctx->prepare_access_checks = kdamond_prepare_vm_access_checks;
+ ctx->check_accesses = kdamond_check_vm_accesses;
+ ctx->target_valid = kdamond_vm_target_valid;
+ ctx->cleanup = kdamond_vm_cleanup;
+}
+
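+/*
+ * Set the monitoring primitives of @ctx for the physical address space.  Note
+ * that '->target_valid' and '->cleanup' are set to NULL, as no default
+ * validation or cleanup is provided for the physical address space.
+ */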
+void damon_set_paddr_primitives(struct damon_ctx *ctx)
+{
+ ctx->init_target_regions = kdamond_init_phys_regions;
+ ctx->update_target_regions = kdamond_update_phys_regions;
+ ctx->prepare_access_checks = kdamond_prepare_phys_access_checks;
+ ctx->check_accesses = kdamond_check_phys_accesses;
+ ctx->target_valid = NULL;
+ ctx->cleanup = NULL;
+}
+
+#include "primitives-test.h"
--
2.17.1