From: SeongJae Park <[email protected]>
Changes from Previous Version
=============================
- paddr: Support nested iomem sections (Du Fan)
- Rebase on v5.8
Introduction
============
Users of the DAMON[1] programming interface can extend DAMON to any address
space by configuring the address-space specific low level primitives with
appropriate ones, including their own implementations. However, because only
the implementation for the virtual address space is available now, those users
have to implement their own primitives for other address spaces. Worse yet,
user space users who rely on the debugfs interface and the user space tool
cannot implement their own at all.
This patchset adds another reference implementation of the low level
primitives, this time for the physical memory address space. With this change,
kernel space users can monitor both the virtual and the physical address spaces
by simply changing the configuration at runtime. Further, this patchset
links the implementation to the debugfs interface and the user space tool for
user space users.
Note that the implementation supports only the user memory, same as the idle
page access tracking feature.
[1] https://lore.kernel.org/linux-mm/[email protected]/
Baseline and Complete Git Trees
===============================
The patches are based on the v5.8 plus DAMON v19 patchset[1] and DAMOS RFC v14
patchset[2]. You can also clone the complete git tree:
$ git clone git://github.com/sjp38/linux -b cdamon/rfc/v6
A web view of the tree is also available:
https://github.com/sjp38/linux/releases/tag/cdamon/rfc/v6
[1] https://lore.kernel.org/linux-mm/[email protected]/
[2] https://lore.kernel.org/linux-mm/[email protected]/
Sequence of Patches
===================
The sequence of patches is as follows.
The first 5 patches allow user space users to manually set the monitoring
regions. The 1st and 2nd patches implement the feature in the debugfs
interface and the user space tool, respectively. The following two patches
implement unittests (the 3rd patch) and selftests (the 4th patch) for the new
feature. Finally, the 5th patch documents this new feature.
The following 6 patches implement the physical memory monitoring. The 6th patch
exports essential rmap functions to GPL modules, as those will be used by
DAMON's implementation of the low level primitives for the physical memory
address space. The 7th patch implements the low level primitives. The 8th and
the 9th patches link the feature to the debugfs and the user space tool,
respectively. The 10th patch further implements a handy NUMA specific memory
monitoring feature on the user space tool. Finally, the 11th patch documents
these new features.
Patch History
=============
Changes from RFC v4
(https://lore.kernel.org/linux-mm/[email protected]/)
- Support NUMA specific physical memory monitoring
Changes from RFC v3
(https://lore.kernel.org/linux-mm/[email protected]/)
- Export rmap functions
- Reorganize for physical memory monitoring support only
- Clean up debugfs code
Changes from RFC v2
(https://lore.kernel.org/linux-mm/[email protected]/)
- Support the physical memory monitoring with the user space tool
- Use 'pfn_to_online_page()' (David Hildenbrand)
- Document more detail on random 'pfn' and its safeness (David Hildenbrand)
Changes from RFC v1
(https://lore.kernel.org/linux-mm/[email protected]/)
- Provide the reference primitive implementations for the physical memory
- Connect the extensions with the debugfs interface
SeongJae Park (10):
mm/damon/debugfs: Allow users to set initial monitoring target regions
tools/damon: Support init target regions specification
mm/damon-test: Add more unit tests for 'init_regions'
selftests/damon/_chk_record: Do not check number of gaps
Docs/admin-guide/mm/damon: Document 'init_regions' feature
mm/damon: Implement callbacks for physical memory monitoring
mm/damon/debugfs: Support physical memory monitoring
tools/damon/record: Support physical memory monitoring
tools/damon/record: Support NUMA specific recording
Docs/DAMON: Document physical memory monitoring support
Documentation/admin-guide/mm/damon/usage.rst | 77 +++-
Documentation/vm/damon/design.rst | 29 +-
Documentation/vm/damon/faq.rst | 5 +-
include/linux/damon.h | 6 +
mm/damon-test.h | 53 +++
mm/damon.c | 380 ++++++++++++++++++-
tools/damon/_damon.py | 41 ++
tools/damon/_paddr_layout.py | 147 +++++++
tools/damon/record.py | 57 ++-
tools/damon/schemes.py | 12 +-
tools/testing/selftests/damon/_chk_record.py | 6 -
11 files changed, 768 insertions(+), 45 deletions(-)
create mode 100644 tools/damon/_paddr_layout.py
--
2.17.1
From: SeongJae Park <[email protected]>
Some users would want to monitor only a part of the entire virtual
memory address space. The '->init_target_regions' callback is therefore
provided, but only the programming interface users can use it.
For this reason, this commit introduces a new debugfs file,
'init_regions'. Users can specify which initial monitoring target
address regions they want by writing special input to the file. The
input should describe one region per line, in the form below:
<pid> <start address> <end address>
This commit also makes the default '->init_target_regions' callback,
'kdamond_init_vm_regions()', do nothing if the user has already set the
initial target regions.
Note that the regions will be updated to cover the entire memory mapped
regions after one 'regions update interval'. If you do not want the regions
to be updated after the initial setting, you could set the interval to a
very long time, say, a few decades.
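For illustration only, the input described above could be generated from user
space as in the Python sketch below. The helper name 'format_init_regions' is
hypothetical and not part of this patchset; it only mirrors the checks that
'add_init_region()' does (start < end, and regions of one target given in
address order without overlap).

```python
def format_init_regions(regions):
    """Build 'init_regions' input from (target_id, start, end) tuples.

    Hypothetical helper, not part of the patchset.  It mirrors the kernel
    side checks: start < end, and the regions of each target must be given
    in address order without overlapping each other.
    """
    last_end = {}   # target id -> end address of its previous region
    lines = []
    for target_id, start, end in regions:
        if start >= end:
            raise ValueError('start >= end: %d %d' % (start, end))
        if last_end.get(target_id, 0) > start:
            raise ValueError('regions overlap or are not sorted')
        last_end[target_id] = end
        lines.append('%s %d %d' % (target_id, start, end))
    # One region per line, each line terminated by a newline.
    return '\n'.join(lines) + '\n' if lines else ''
```

The returned string is what one would write to the 'init_regions' file while
the monitoring is turned off.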
Signed-off-by: SeongJae Park <[email protected]>
---
mm/damon.c | 156 +++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 152 insertions(+), 4 deletions(-)
diff --git a/mm/damon.c b/mm/damon.c
index d25aeccf7939..b2507bae6c57 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -1928,6 +1928,147 @@ static ssize_t debugfs_record_write(struct file *file,
return ret;
}
+static ssize_t sprint_init_regions(struct damon_ctx *c, char *buf, ssize_t len)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+ int written = 0;
+ int rc;
+
+ damon_for_each_target(t, c) {
+ damon_for_each_region(r, t) {
+ rc = snprintf(&buf[written], len - written,
+ "%lu %lu %lu\n",
+ t->id, r->ar.start, r->ar.end);
+ if (!rc)
+ return -ENOMEM;
+ written += rc;
+ }
+ }
+ return written;
+}
+
+static ssize_t debugfs_init_regions_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = &damon_user_ctx;
+ char *kbuf;
+ ssize_t len;
+
+ kbuf = kmalloc(count, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ mutex_lock(&ctx->kdamond_lock);
+	if (ctx->kdamond) {
+		mutex_unlock(&ctx->kdamond_lock);
+		len = -EBUSY;
+		goto out;
+	}
+
+ len = sprint_init_regions(ctx, kbuf, count);
+ mutex_unlock(&ctx->kdamond_lock);
+ if (len < 0)
+ goto out;
+ len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
+
+out:
+ kfree(kbuf);
+ return len;
+}
+
+static int add_init_region(struct damon_ctx *c,
+ unsigned long target_id, struct damon_addr_range *ar)
+{
+ struct damon_target *t;
+ struct damon_region *r, *prev;
+ int rc = -EINVAL;
+
+ if (ar->start >= ar->end)
+ return -EINVAL;
+
+ damon_for_each_target(t, c) {
+ if (t->id == target_id) {
+ r = damon_new_region(ar->start, ar->end);
+ if (!r)
+ return -ENOMEM;
+ damon_add_region(r, t);
+ if (nr_damon_regions(t) > 1) {
+ prev = damon_prev_region(r);
+ if (prev->ar.end > r->ar.start) {
+ damon_destroy_region(r);
+ return -EINVAL;
+ }
+ }
+ rc = 0;
+ }
+ }
+ return rc;
+}
+
+static int set_init_regions(struct damon_ctx *c, const char *str, ssize_t len)
+{
+ struct damon_target *t;
+ struct damon_region *r, *next;
+ int pos = 0, parsed, ret;
+ unsigned long target_id;
+ struct damon_addr_range ar;
+ int err;
+
+ damon_for_each_target(t, c) {
+ damon_for_each_region_safe(r, next, t)
+ damon_destroy_region(r);
+ }
+
+ while (pos < len) {
+ ret = sscanf(&str[pos], "%lu %lu %lu%n",
+ &target_id, &ar.start, &ar.end, &parsed);
+ if (ret != 3)
+ break;
+ err = add_init_region(c, target_id, &ar);
+ if (err)
+ goto fail;
+ pos += parsed;
+ }
+
+ return 0;
+
+fail:
+ damon_for_each_target(t, c) {
+ damon_for_each_region_safe(r, next, t)
+ damon_destroy_region(r);
+ }
+ return err;
+}
+
+static ssize_t debugfs_init_regions_write(struct file *file,
+ const char __user *buf, size_t count,
+ loff_t *ppos)
+{
+ struct damon_ctx *ctx = &damon_user_ctx;
+ char *kbuf;
+ ssize_t ret = count;
+ int err;
+
+ kbuf = user_input_str(buf, count, ppos);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ ret = -EBUSY;
+ goto unlock_out;
+ }
+
+ err = set_init_regions(ctx, kbuf, ret);
+ if (err)
+ ret = err;
+
+unlock_out:
+ mutex_unlock(&ctx->kdamond_lock);
+ kfree(kbuf);
+ return ret;
+}
+
static ssize_t debugfs_attrs_read(struct file *file,
char __user *buf, size_t count, loff_t *ppos)
{
@@ -2004,6 +2145,12 @@ static const struct file_operations record_fops = {
.write = debugfs_record_write,
};
+static const struct file_operations init_regions_fops = {
+ .owner = THIS_MODULE,
+ .read = debugfs_init_regions_read,
+ .write = debugfs_init_regions_write,
+};
+
static const struct file_operations attrs_fops = {
.owner = THIS_MODULE,
.read = debugfs_attrs_read,
@@ -2014,10 +2161,11 @@ static struct dentry *debugfs_root;
static int __init damon_debugfs_init(void)
{
- const char * const file_names[] = {"attrs", "record", "schemes",
- "target_ids", "monitor_on"};
- const struct file_operations *fops[] = {&attrs_fops, &record_fops,
- &schemes_fops, &target_ids_fops, &monitor_on_fops};
+ const char * const file_names[] = {"attrs", "init_regions", "record",
+ "schemes", "target_ids", "monitor_on"};
+ const struct file_operations *fops[] = {&attrs_fops,
+ &init_regions_fops, &record_fops, &schemes_fops,
+ &target_ids_fops, &monitor_on_fops};
int i;
debugfs_root = debugfs_create_dir("damon", NULL);
--
2.17.1
From: SeongJae Park <[email protected]>
This commit updates the damon user space tool to support the initial
monitoring target regions specification.
Signed-off-by: SeongJae Park <[email protected]>
---
tools/damon/_damon.py | 39 +++++++++++++++++++++++++++++++++++++++
tools/damon/record.py | 12 +++++++-----
tools/damon/schemes.py | 12 +++++++-----
3 files changed, 53 insertions(+), 10 deletions(-)
diff --git a/tools/damon/_damon.py b/tools/damon/_damon.py
index a4f6c03c23e4..a22ec3777c16 100644
--- a/tools/damon/_damon.py
+++ b/tools/damon/_damon.py
@@ -12,12 +12,25 @@ debugfs_attrs = None
debugfs_record = None
debugfs_schemes = None
debugfs_target_ids = None
+debugfs_init_regions = None
debugfs_monitor_on = None
def set_target_id(tid):
with open(debugfs_target_ids, 'w') as f:
f.write('%s\n' % tid)
+def set_target(tid, init_regions=[]):
+ rc = set_target_id(tid)
+ if rc:
+ return rc
+
+ if not os.path.exists(debugfs_init_regions):
+ return 0
+
+ string = ' '.join(['%s %d %d' % (tid, r[0], r[1]) for r in init_regions])
+ return subprocess.call('echo "%s" > %s' % (string, debugfs_init_regions),
+ shell=True, executable='/bin/bash')
+
def turn_damon(on_off):
return subprocess.call("echo %s > %s" % (on_off, debugfs_monitor_on),
shell=True, executable="/bin/bash")
@@ -97,6 +110,7 @@ def chk_update_debugfs(debugfs):
global debugfs_record
global debugfs_schemes
global debugfs_target_ids
+ global debugfs_init_regions
global debugfs_monitor_on
debugfs_damon = os.path.join(debugfs, 'damon')
@@ -104,6 +118,7 @@ def chk_update_debugfs(debugfs):
debugfs_record = os.path.join(debugfs_damon, 'record')
debugfs_schemes = os.path.join(debugfs_damon, 'schemes')
debugfs_target_ids = os.path.join(debugfs_damon, 'target_ids')
+ debugfs_init_regions = os.path.join(debugfs_damon, 'init_regions')
debugfs_monitor_on = os.path.join(debugfs_damon, 'monitor_on')
if not os.path.isdir(debugfs_damon):
@@ -131,6 +146,26 @@ def cmd_args_to_attrs(args):
return Attrs(sample_interval, aggr_interval, regions_update_interval,
min_nr_regions, max_nr_regions, rbuf_len, rfile_path, schemes)
+def cmd_args_to_init_regions(args):
+ regions = []
+ for arg in args.regions.split():
+ addrs = arg.split('-')
+ try:
+ if len(addrs) != 2:
+ raise Exception('two addresses not given')
+ start = int(addrs[0])
+ end = int(addrs[1])
+ if start >= end:
+ raise Exception('start >= end')
+ if regions and regions[-1][1] > start:
+ raise Exception('regions overlap')
+ except Exception as e:
+ print('Wrong \'--regions\' argument (%s)' % e)
+ exit(1)
+
+ regions.append([start, end])
+ return regions
+
def set_attrs_argparser(parser):
parser.add_argument('-d', '--debugfs', metavar='<debugfs>', type=str,
default='/sys/kernel/debug', help='debugfs mounted path')
@@ -144,3 +179,7 @@ def set_attrs_argparser(parser):
default=10, help='minimal number of regions')
parser.add_argument('-m', '--maxr', metavar='<# regions>', type=int,
default=1000, help='maximum number of regions')
+
+def set_init_regions_argparser(parser):
+ parser.add_argument('-r', '--regions', metavar='"<start>-<end> ..."',
+ type=str, default='', help='monitoring target address regions')
diff --git a/tools/damon/record.py b/tools/damon/record.py
index 6d1cbe593b94..11fd54001472 100644
--- a/tools/damon/record.py
+++ b/tools/damon/record.py
@@ -24,7 +24,7 @@ def pidfd_open(pid):
return syscall(NR_pidfd_open, pid, 0)
-def do_record(target, is_target_cmd, attrs, old_attrs, pidfd):
+def do_record(target, is_target_cmd, init_regions, attrs, old_attrs, pidfd):
if os.path.isfile(attrs.rfile_path):
os.rename(attrs.rfile_path, attrs.rfile_path + '.old')
@@ -48,8 +48,8 @@ def do_record(target, is_target_cmd, attrs, old_attrs, pidfd):
# only for reference of the pidfd usage.
target = 'pidfd %s' % fd
- if _damon.set_target_id(target):
- print('target id setting (%s) failed' % target)
+ if _damon.set_target(target, init_regions):
+ print('target setting (%s, %s) failed' % (target, init_regions))
cleanup_exit(old_attrs, -2)
if _damon.turn_damon('on'):
print('could not turn on damon' % target)
@@ -91,6 +91,7 @@ def chk_permission():
def set_argparser(parser):
_damon.set_attrs_argparser(parser)
+ _damon.set_init_regions_argparser(parser)
parser.add_argument('target', type=str, metavar='<target>',
help='the target command or the pid to record')
parser.add_argument('--pidfd', action='store_true',
@@ -117,19 +118,20 @@ def main(args=None):
args.schemes = ''
pidfd = args.pidfd
new_attrs = _damon.cmd_args_to_attrs(args)
+ init_regions = _damon.cmd_args_to_init_regions(args)
target = args.target
target_fields = target.split()
if not subprocess.call('which %s &> /dev/null' % target_fields[0],
shell=True, executable='/bin/bash'):
- do_record(target, True, new_attrs, orig_attrs, pidfd)
+ do_record(target, True, init_regions, new_attrs, orig_attrs, pidfd)
else:
try:
pid = int(target)
except:
print('target \'%s\' is neither a command, nor a pid' % target)
exit(1)
- do_record(target, False, new_attrs, orig_attrs, pidfd)
+ do_record(target, False, init_regions, new_attrs, orig_attrs, pidfd)
if __name__ == '__main__':
main()
diff --git a/tools/damon/schemes.py b/tools/damon/schemes.py
index 9095835f6133..cfec89854a08 100644
--- a/tools/damon/schemes.py
+++ b/tools/damon/schemes.py
@@ -14,7 +14,7 @@ import time
import _convert_damos
import _damon
-def run_damon(target, is_target_cmd, attrs, old_attrs):
+def run_damon(target, is_target_cmd, init_regions, attrs, old_attrs):
if os.path.isfile(attrs.rfile_path):
os.rename(attrs.rfile_path, attrs.rfile_path + '.old')
@@ -27,8 +27,8 @@ def run_damon(target, is_target_cmd, attrs, old_attrs):
if is_target_cmd:
p = subprocess.Popen(target, shell=True, executable='/bin/bash')
target = p.pid
- if _damon.set_target_pid(target):
- print('pid setting (%s) failed' % target)
+ if _damon.set_target(target, init_regions):
+ print('target setting (%s, %s) failed' % (target, init_regions))
cleanup_exit(old_attrs, -2)
if _damon.turn_damon('on'):
print('could not turn on damon' % target)
@@ -68,6 +68,7 @@ def chk_permission():
def set_argparser(parser):
_damon.set_attrs_argparser(parser)
+ _damon.set_init_regions_argparser(parser)
parser.add_argument('target', type=str, metavar='<target>',
help='the target command or the pid to record')
parser.add_argument('-c', '--schemes', metavar='<file>', type=str,
@@ -92,19 +93,20 @@ def main(args=None):
args.out = 'null'
args.schemes = _convert_damos.convert(args.schemes, args.sample, args.aggr)
new_attrs = _damon.cmd_args_to_attrs(args)
+ init_regions = _damon.cmd_args_to_init_regions(args)
target = args.target
target_fields = target.split()
if not subprocess.call('which %s &> /dev/null' % target_fields[0],
shell=True, executable='/bin/bash'):
- run_damon(target, True, new_attrs, orig_attrs)
+ run_damon(target, True, init_regions, new_attrs, orig_attrs)
else:
try:
pid = int(target)
except:
print('target \'%s\' is neither a command, nor a pid' % target)
exit(1)
- run_damon(target, False, new_attrs, orig_attrs)
+ run_damon(target, False, init_regions, new_attrs, orig_attrs)
if __name__ == '__main__':
main()
--
2.17.1
From: SeongJae Park <[email protected]>
This commit adds description of the 'init_regions' feature in the DAMON
usage document.
Signed-off-by: SeongJae Park <[email protected]>
---
Documentation/admin-guide/mm/damon/usage.rst | 41 +++++++++++++++++++-
1 file changed, 39 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
index 96278227f925..cf0d44ce0ac9 100644
--- a/Documentation/admin-guide/mm/damon/usage.rst
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -281,8 +281,9 @@ for at least 100 milliseconds using below commands::
debugfs Interface
=================
-DAMON exports five files, ``attrs``, ``target_ids``, ``record``, ``schemes``
-and ``monitor_on`` under its debugfs directory, ``<debugfs>/damon/``.
+DAMON exports six files, ``attrs``, ``target_ids``, ``init_regions``,
+``record``, ``schemes`` and ``monitor_on`` under its debugfs directory,
+``<debugfs>/damon/``.
Attributes
@@ -321,6 +322,42 @@ check it again::
Note that setting the target ids doesn't start the monitoring.
+Initial Monitoring Target Regions
+---------------------------------
+
+In case of the debugfs based monitoring, DAMON automatically sets and updates
+the monitoring target regions so that entire memory mappings of target
+processes can be covered. However, users might want to limit the monitoring
+region to specific address ranges, such as the heap, the stack, or a specific
+file-mapped area.  Or, some users might know the initial access pattern of
+their workloads and therefore want to set optimal initial regions for the
+'adaptive regions adjustment'.
+
+In such cases, users can explicitly set the initial monitoring target regions
+as they want, by writing proper values to the ``init_regions`` file.  Each line
+of the input should represent one region in the form below::
+
+ <target id> <start address> <end address>
+
+The ``target id`` should already be in the ``target_ids`` file, and the regions
+should be passed in address order.  For example, the commands below set a couple
+of address ranges, ``1-100`` and ``100-200``, as the initial monitoring target
+regions of process 42, and another couple of address ranges, ``20-40`` and
+``50-100``, as those of process 4242::
+
+ # cd <debugfs>/damon
+ # echo "42 1 100
+ 42 100 200
+ 4242 20 40
+ 4242 50 100" > init_regions
+
+Note that this sets the initial monitoring target regions only.  In case of
+virtual memory monitoring, DAMON will automatically update the boundaries of
+the regions after one ``regions update interval``.  Therefore, users who don't
+want the update should set the ``regions update interval`` large enough in this
+case.
+
+
Record
------
--
2.17.1
From: SeongJae Park <[email protected]>
This commit makes the debugfs interface support the physical memory
monitoring, in addition to the virtual memory monitoring.
Users can do the physical memory monitoring by writing a special
keyword, 'paddr\n', to the 'target_ids' debugfs file. Then, DAMON will
check the special keyword and configure the callbacks of the monitoring
context of the debugfs user for the physical memory. This will internally
add one fake monitoring target process, whose pid is -1.
Unlike the virtual memory monitoring, DAMON debugfs will not
automatically set the monitoring target region. Therefore, users should
also set the monitoring target address region using the 'init_regions'
debugfs file. While doing this, the 'pid' in the input should be '-1'.
Finally, the physical memory monitoring will not be automatically
terminated because it has a fake monitoring target process. The user
should explicitly turn off the monitoring by writing 'off' to the
'monitor_on' debugfs file.
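As a rough sketch of the sequence described above, a user space wrapper might
look like the Python code below. The function names are hypothetical and not
part of this patchset, the default path assumes that debugfs is mounted on
/sys/kernel/debug, and the keyword is written to the 'target_ids' file that
'debugfs_target_ids_write()' handles.

```python
import os

def _write(damon_dir, name, content):
    # Write 'content' to the DAMON debugfs file called 'name'.
    with open(os.path.join(damon_dir, name), 'w') as f:
        f.write(content)

def start_paddr_monitoring(regions, damon_dir='/sys/kernel/debug/damon'):
    """Configure and start the physical memory monitoring via debugfs.

    Hypothetical helper following the steps in the commit message.
    'regions' is a list of (start, end) physical address pairs.
    """
    # 1. The special keyword selects the physical address space.
    _write(damon_dir, 'target_ids', 'paddr\n')
    # 2. Regions are not set automatically; the fake target id is -1.
    _write(damon_dir, 'init_regions',
           ''.join('-1 %d %d\n' % (start, end) for start, end in regions))
    # 3. Turn the monitoring on.
    _write(damon_dir, 'monitor_on', 'on\n')

def stop_paddr_monitoring(damon_dir='/sys/kernel/debug/damon'):
    # The monitoring never terminates by itself, so stop it explicitly.
    _write(damon_dir, 'monitor_on', 'off\n')
```

The address ranges to pass depend on the system; for example, they could be
taken from the 'System RAM' entries of /proc/iomem.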
Signed-off-by: SeongJae Park <[email protected]>
---
mm/damon.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/mm/damon.c b/mm/damon.c
index c8d834ce188d..0b6960f62ae9 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -2035,9 +2035,27 @@ static ssize_t debugfs_target_ids_write(struct file *file,
return PTR_ERR(kbuf);
nrs = kbuf;
- if (!strncmp(kbuf, "pidfd ", 6)) {
- received_pidfds = true;
- nrs = &kbuf[6];
+ if (!strncmp(kbuf, "paddr\n", count)) {
+ /* Configure the context for physical memory monitoring */
+ ctx->init_target_regions = kdamond_init_phys_regions;
+ ctx->update_target_regions = kdamond_update_phys_regions;
+ ctx->prepare_access_checks = kdamond_prepare_phys_access_checks;
+ ctx->check_accesses = kdamond_check_phys_accesses;
+ ctx->target_valid = NULL;
+
+ /* target id is meaningless here, but we set it just for fun */
+ snprintf(kbuf, count, "-1 ");
+ } else {
+ /* Configure the context for virtual memory monitoring */
+ ctx->init_target_regions = kdamond_init_vm_regions;
+ ctx->update_target_regions = kdamond_update_vm_regions;
+ ctx->prepare_access_checks = kdamond_prepare_vm_access_checks;
+ ctx->check_accesses = kdamond_check_vm_accesses;
+ ctx->target_valid = kdamond_vm_target_valid;
+ if (!strncmp(kbuf, "pidfd ", 6)) {
+ received_pidfds = true;
+ nrs = &kbuf[6];
+ }
}
targets = str_to_target_ids(nrs, ret, &nr_targets);
--
2.17.1
From: SeongJae Park <[email protected]>
This commit adds more test cases for the new feature, 'init_regions'.
Signed-off-by: SeongJae Park <[email protected]>
---
mm/damon-test.h | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)
diff --git a/mm/damon-test.h b/mm/damon-test.h
index 71413ffc1dcb..e67e8fb17eca 100644
--- a/mm/damon-test.h
+++ b/mm/damon-test.h
@@ -161,6 +161,58 @@ static void damon_test_set_recording(struct kunit *test)
KUNIT_EXPECT_STREQ(test, ctx->rfile_path, "foo");
}
+static void damon_test_set_init_regions(struct kunit *test)
+{
+ struct damon_ctx *ctx = &damon_user_ctx;
+ unsigned long ids[] = {1, 2, 3};
+ /* Each line represents one region in ``<target id> <start> <end>`` */
+ char * const valid_inputs[] = {"2 10 20\n 2 20 30\n2 35 45",
+ "2 10 20\n",
+ "2 10 20\n1 39 59\n1 70 134\n 2 20 25\n",
+ ""};
+ /* Reading the file again will show sorted, clean output */
+ char * const valid_expects[] = {"2 10 20\n2 20 30\n2 35 45\n",
+ "2 10 20\n",
+ "1 39 59\n1 70 134\n2 10 20\n2 20 25\n",
+ ""};
+ char * const invalid_inputs[] = {"4 10 20\n", /* target not exists */
+ "2 10 20\n 2 14 26\n", /* regions overlap */
+ "1 10 20\n2 30 40\n 1 5 8"}; /* not sorted by address */
+ char *input, *expect;
+ int i, rc;
+ char buf[256];
+
+ damon_set_targets(ctx, ids, 3);
+
+ /* Put valid inputs and check the results */
+ for (i = 0; i < ARRAY_SIZE(valid_inputs); i++) {
+ input = valid_inputs[i];
+ expect = valid_expects[i];
+
+ rc = set_init_regions(ctx, input, strnlen(input, 256));
+ KUNIT_EXPECT_EQ(test, rc, 0);
+
+ memset(buf, 0, 256);
+ sprint_init_regions(ctx, buf, 256);
+
+ KUNIT_EXPECT_STREQ(test, (char *)buf, expect);
+ }
+	/* Put invalid inputs and check the return error code */
+ for (i = 0; i < ARRAY_SIZE(invalid_inputs); i++) {
+ input = invalid_inputs[i];
+ pr_info("input: %s\n", input);
+ rc = set_init_regions(ctx, input, strnlen(input, 256));
+ KUNIT_EXPECT_EQ(test, rc, -EINVAL);
+
+ memset(buf, 0, 256);
+ sprint_init_regions(ctx, buf, 256);
+
+ KUNIT_EXPECT_STREQ(test, (char *)buf, "");
+ }
+
+ damon_set_targets(ctx, NULL, 0);
+}
+
static void __link_vmas(struct vm_area_struct *vmas, ssize_t nr_vmas)
{
int i, j;
@@ -645,6 +697,7 @@ static struct kunit_case damon_test_cases[] = {
KUNIT_CASE(damon_test_regions),
KUNIT_CASE(damon_test_set_targets),
KUNIT_CASE(damon_test_set_recording),
+ KUNIT_CASE(damon_test_set_init_regions),
KUNIT_CASE(damon_test_three_regions_in_vmas),
KUNIT_CASE(damon_test_aggregate),
KUNIT_CASE(damon_test_write_rbuf),
--
2.17.1
From: SeongJae Park <[email protected]>
This commit updates the DAMON user space tool (damo-record) for NUMA
specific physical memory monitoring. With this change, users can
monitor accesses to the physical memory of a specific NUMA node.
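The region selection can be sketched as below. This is only an illustration
under assumptions: 'Range' is a simplified, hypothetical stand-in for the
tool's 'PaddrRange', and 'choose_regions' is not a function of this patchset.
Note the explicit comparison against None, because node 0 is a valid node id
but evaluates as false in Python.

```python
from collections import namedtuple

# Simplified, hypothetical stand-in for the tool's PaddrRange class.
Range = namedtuple('Range', ['start', 'end', 'nid', 'name'])

def regions_of_node(ranges, numa_node):
    """Return [start, end] of the 'System RAM' ranges on 'numa_node'."""
    return [[r.start, r.end] for r in ranges
            if r.nid == numa_node and r.name == 'System RAM']

def choose_regions(ranges, numa_node, default_region):
    # Compare against None explicitly: node 0 is valid but falsy.
    if numa_node is not None:
        return regions_of_node(ranges, numa_node)
    return [default_region]
```

With this, a request for node 0 is limited to the ranges of node 0, and only
a missing node id falls back to the default region.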
Signed-off-by: SeongJae Park <[email protected]>
---
tools/damon/_paddr_layout.py | 147 +++++++++++++++++++++++++++++++++++
tools/damon/record.py | 18 ++++-
2 files changed, 164 insertions(+), 1 deletion(-)
create mode 100644 tools/damon/_paddr_layout.py
diff --git a/tools/damon/_paddr_layout.py b/tools/damon/_paddr_layout.py
new file mode 100644
index 000000000000..561c2b6729f6
--- /dev/null
+++ b/tools/damon/_paddr_layout.py
@@ -0,0 +1,147 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+import os
+
+class PaddrRange:
+ start = None
+ end = None
+ nid = None
+ state = None
+ name = None
+
+ def __init__(self, start, end, nid, state, name):
+ self.start = start
+ self.end = end
+ self.nid = nid
+ self.state = state
+ self.name = name
+
+ def interleaved(self, prange):
+ if self.end <= prange.start:
+ return None
+ if prange.end <= self.start:
+ return None
+ return [max(self.start, prange.start), min(self.end, prange.end)]
+
+ def __str__(self):
+ return '%x-%x, nid %s, state %s, name %s' % (self.start, self.end,
+ self.nid, self.state, self.name)
+
+class MemBlock:
+ nid = None
+ index = None
+ state = None
+
+ def __init__(self, nid, index, state):
+ self.nid = nid
+ self.index = index
+ self.state = state
+
+ def __str__(self):
+ return '%d (%s)' % (self.index, self.state)
+
+ def __repr__(self):
+ return self.__str__()
+
+def readfile(file_path):
+ with open(file_path, 'r') as f:
+ return f.read()
+
+def collapse_ranges(ranges):
+ ranges = sorted(ranges, key=lambda x: x.start)
+ merged = []
+ for r in ranges:
+ if not merged:
+ merged.append(r)
+ continue
+ last = merged[-1]
+ if last.end != r.start or last.nid != r.nid or last.state != r.state:
+ merged.append(r)
+ else:
+ last.end = r.end
+ return merged
+
+def memblocks_to_ranges(blocks, block_size):
+ ranges = []
+ for b in blocks:
+ ranges.append(PaddrRange(b.index * block_size,
+ (b.index + 1) * block_size, b.nid, b.state, None))
+
+ return collapse_ranges(ranges)
+
+def memblock_ranges():
+ SYSFS='/sys/devices/system/node'
+ sz_block = int(readfile('/sys/devices/system/memory/block_size_bytes'), 16)
+ sys_nodes = [x for x in os.listdir(SYSFS) if x.startswith('node')]
+
+ blocks = []
+ for sys_node in sys_nodes:
+ nid = int(sys_node[4:])
+
+ sys_node_files = os.listdir(os.path.join(SYSFS, sys_node))
+ for f in sys_node_files:
+ if not f.startswith('memory'):
+ continue
+ index = int(f[6:])
+ sys_state = os.path.join(SYSFS, sys_node, f, 'state')
+ state = readfile(sys_state).strip()
+
+ blocks.append(MemBlock(nid, index, state))
+
+ return memblocks_to_ranges(blocks, sz_block)
+
+def iomem_ranges():
+ ranges = []
+
+ with open('/proc/iomem', 'r') as f:
+ # example of the line: '100000000-42b201fff : System RAM'
+ for line in f:
+ fields = line.split(':')
+ if len(fields) < 2:
+ continue
+ name = ':'.join(fields[1:]).strip()
+ addrs = fields[0].split('-')
+ if len(addrs) != 2:
+ continue
+ start = int(addrs[0], 16)
+ end = int(addrs[1], 16) + 1
+ ranges.append(PaddrRange(start, end, None, None, name))
+
+ return ranges
+
+def integrate(memblock_parsed, iomem_parsed):
+ merged = []
+
+ for r in iomem_parsed:
+ for r2 in memblock_parsed:
+ if r2.start <= r.start and r.end <= r2.end:
+ r.nid = r2.nid
+ r.state = r2.state
+ merged.append(r)
+ elif r2.start <= r.start and r.start < r2.end and r2.end < r.end:
+ sub = PaddrRange(r2.end, r.end, None, None, r.name)
+ iomem_parsed.append(sub)
+ r.end = r2.end
+ r.nid = r2.nid
+ r.state = r2.state
+ merged.append(r)
+ merged = sorted(merged, key=lambda x: x.start)
+ return merged
+
+def paddr_ranges():
+ return integrate(memblock_ranges(), iomem_ranges())
+
+def pr_ranges(ranges):
+ print('#%12s %13s\tnode\tstate\tresource\tsize' % ('start', 'end'))
+ for r in ranges:
+ print('%13d %13d\t%s\t%s\t%s\t%d' % (r.start, r.end, r.nid,
+ r.state, r.name, r.end - r.start))
+
+def main():
+ ranges = paddr_ranges()
+
+ pr_ranges(ranges)
+
+if __name__ == '__main__':
+ main()
diff --git a/tools/damon/record.py b/tools/damon/record.py
index 6fd0b59c73e0..e9d6bfc70ead 100644
--- a/tools/damon/record.py
+++ b/tools/damon/record.py
@@ -12,6 +12,7 @@ import subprocess
import time
import _damon
+import _paddr_layout
def pidfd_open(pid):
import ctypes
@@ -98,6 +99,8 @@ def set_argparser(parser):
help='use pidfd type target id')
parser.add_argument('-l', '--rbuf', metavar='<len>', type=int,
default=1024*1024, help='length of record result buffer')
+ parser.add_argument('--numa_node', metavar='<node id>', type=int,
+ help='if target is \'paddr\', limit it to the numa node')
parser.add_argument('-o', '--out', metavar='<file path>', type=str,
default='damon.data', help='output file path')
@@ -124,6 +127,15 @@ def default_paddr_region():
ret = [start, end]
return ret
+def paddr_region_of(numa_node):
+ regions = []
+ paddr_ranges = _paddr_layout.paddr_ranges()
+ for r in paddr_ranges:
+ if r.nid == numa_node and r.name == 'System RAM':
+ regions.append([r.start, r.end])
+
+ return regions
+
def main(args=None):
global orig_attrs
if not args:
@@ -142,12 +154,16 @@ def main(args=None):
pidfd = args.pidfd
new_attrs = _damon.cmd_args_to_attrs(args)
init_regions = _damon.cmd_args_to_init_regions(args)
+ numa_node = args.numa_node
target = args.target
target_fields = target.split()
if target == 'paddr': # physical memory address space
if not init_regions:
- init_regions = [default_paddr_region()]
+            if numa_node is not None:
+ init_regions = paddr_region_of(numa_node)
+ else:
+ init_regions = [default_paddr_region()]
do_record(target, False, init_regions, new_attrs, orig_attrs, pidfd)
elif not subprocess.call('which %s &> /dev/null' % target_fields[0],
shell=True, executable='/bin/bash'):
--
2.17.1
From: SeongJae Park <[email protected]>
Now the regions can be explicitly set as users want. Therefore checking
the number of gaps doesn't make sense. Remove the condition.
Signed-off-by: SeongJae Park <[email protected]>
---
tools/testing/selftests/damon/_chk_record.py | 6 ------
1 file changed, 6 deletions(-)
diff --git a/tools/testing/selftests/damon/_chk_record.py b/tools/testing/selftests/damon/_chk_record.py
index 73e128904319..5f11be64abed 100644
--- a/tools/testing/selftests/damon/_chk_record.py
+++ b/tools/testing/selftests/damon/_chk_record.py
@@ -37,12 +37,9 @@ def chk_task_info(f):
print('too many regions: %d > %d' % (nr_regions, max_nr_regions))
exit(1)
- nr_gaps = 0
eaddr = 0
for r in range(nr_regions):
saddr = struct.unpack('L', f.read(8))[0]
- if eaddr and saddr != eaddr:
- nr_gaps += 1
eaddr = struct.unpack('L', f.read(8))[0]
nr_accesses = struct.unpack('I', f.read(4))[0]
@@ -56,9 +53,6 @@ def chk_task_info(f):
print('too high nr_access: expected %d but %d' %
(max_nr_accesses, nr_accesses))
exit(1)
- if nr_gaps != 2:
- print('number of gaps are not two but %d' % nr_gaps)
- exit(1)
def parse_time_us(bindat):
sec = struct.unpack('l', bindat[0:8])[0]
--
2.17.1
From: SeongJae Park <[email protected]>
This commit updates the DAMON documents for the physical memory
monitoring support.
Signed-off-by: SeongJae Park <[email protected]>
---
Documentation/admin-guide/mm/damon/usage.rst | 42 ++++++++++++++++----
Documentation/vm/damon/design.rst | 29 +++++++++-----
Documentation/vm/damon/faq.rst | 5 +--
3 files changed, 54 insertions(+), 22 deletions(-)
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
index cf0d44ce0ac9..88b8e9254a7e 100644
--- a/Documentation/admin-guide/mm/damon/usage.rst
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -10,15 +10,16 @@ DAMON provides below three interfaces for different users.
This is for privileged people such as system administrators who want a
just-working human-friendly interface. Using this, users can use the DAMON’s
major features in a human-friendly way. It may not be highly tuned for
- special cases, though. It supports only virtual address spaces monitoring.
+ special cases, though. It supports both virtual and physical address spaces
+ monitoring.
- *debugfs interface.*
This is for privileged user space programmers who want more optimized use of
DAMON. Using this, users can use DAMON’s major features by reading
from and writing to special debugfs files. Therefore, you can write and use
your personalized DAMON debugfs wrapper programs that reads/writes the
debugfs files instead of you. The DAMON user space tool is also a reference
- implementation of such programs. It supports only virtual address spaces
- monitoring.
+ implementation of such programs. It supports both virtual and physical
+ address spaces monitoring.
- *Kernel Space Programming Interface.*
This is for kernel space programmers. Using this, users can utilize every
feature of DAMON most flexibly and efficiently by writing kernel space
@@ -49,8 +50,10 @@ Recording Data Access Pattern
The ``record`` subcommand records the data access pattern of target workloads
in a file (``./damon.data`` by default). You can specify the target with 1)
-the command for execution of the monitoring target process, or 2) pid of
-running target process. Below example shows a command target usage::
+the command for execution of the monitoring target process, 2) pid of running
+target process, or 3) the special keyword, 'paddr', if you want to monitor the
+system's physical memory address space. Below example shows a command target
+usage::
# cd <kernel>/tools/damon/
# damo record "sleep 5"
@@ -61,6 +64,15 @@ of the process. Below example shows a pid target usage::
# sleep 5 &
# damo record `pidof sleep`
+Finally, below example shows the use of the special keyword, 'paddr'::
+
+ # damo record paddr
+
+In this case, the monitoring target region defaults to the largest 'System
+RAM' region specified in '/proc/iomem' file. Note that the initial monitoring
+target region is maintained rather than dynamically updated like the virtual
+memory address spaces monitoring case.
+
The location of the recorded file can be explicitly set using ``-o`` option.
You can further tune this by setting the monitoring attributes. To know about
the monitoring attributes in detail, please refer to the
@@ -319,20 +331,34 @@ check it again::
# cat target_ids
42 4242
+Users can also monitor the physical memory address space of the system by
+writing a special keyword, "``paddr\n``" to the file. Because physical address
+space monitoring doesn't support multiple targets, reading the file will show a
+fake value, ``-1``, as below::
+
+ # cd <debugfs>/damon
+ # echo paddr > target_ids
+ # cat target_ids
+ -1
+
Note that setting the target ids doesn't start the monitoring.
Initial Monitoring Target Regions
---------------------------------
-In case of the debugfs based monitoring, DAMON automatically sets and updates
-the monitoring target regions so that entire memory mappings of target
-processes can be covered. However, users might want to limit the monitoring
+In case of the virtual address space monitoring, DAMON automatically sets and
+updates the monitoring target regions so that entire memory mappings of target
+processes can be covered. However, users might want to limit the monitoring
region to specific address ranges, such as the heap, the stack, or specific
file-mapped area. Or, some users might know the initial access pattern of
their workloads and therefore want to set optimal initial regions for the
'adaptive regions adjustment'.
+In contrast, DAMON does not automatically set and update the monitoring target
+regions in case of physical memory monitoring. Therefore, users should set the
+monitoring target regions by themselves.
+
In such cases, users can explicitly set the initial monitoring target regions
as they want, by writing proper values to the ``init_regions`` file. Each line
of the input should represent one region in below form.::
diff --git a/Documentation/vm/damon/design.rst b/Documentation/vm/damon/design.rst
index 727d72093f8f..0666e19018fd 100644
--- a/Documentation/vm/damon/design.rst
+++ b/Documentation/vm/damon/design.rst
@@ -35,27 +35,34 @@ two parts:
1. Identification of the monitoring target address range for the address space.
2. Access check of specific address range in the target space.
-DAMON currently provides the implementation of the primitives for only the
-virtual address spaces. Below two subsections describe how it works.
+DAMON currently provides the implementations of the primitives for the physical
+and virtual address spaces. Below two subsections describe how those work.
PTE Accessed-bit Based Access Check
-----------------------------------
-The implementation for the virtual address space uses PTE Accessed-bit for
-basic access checks. It finds the relevant PTE Accessed bit from the address
-by walking the page table for the target task of the address. In this way, the
-implementation finds and clears the bit for next sampling target address and
-checks whether the bit set again after one sampling period. This could disturb
-other kernel subsystems using the Accessed bits, namely Idle page tracking and
-the reclaim logic. To avoid such disturbances, DAMON makes it mutually
-exclusive with Idle page tracking and uses ``PG_idle`` and ``PG_young`` page
-flags to solve the conflict with the reclaim logic, as Idle page tracking does.
+Both of the implementations for physical and virtual address spaces use PTE
+Accessed-bit for basic access checks. The only difference is the way of
+finding the relevant PTE Accessed bit(s) from the address. While the
+implementation for the virtual address walks the page table for the target task
+of the address, the implementation for the physical address walks every page
+table having a mapping to the address. In this way, the implementations find
+and clear the bit(s) for next sampling target address and check whether the
+bit(s) are set again after one sampling period. This could disturb other kernel
+subsystems using the Accessed bits, namely Idle page tracking and the reclaim
+logic. To avoid such disturbances, DAMON makes it mutually exclusive with Idle
+page tracking and uses ``PG_idle`` and ``PG_young`` page flags to solve the
+conflict with the reclaim logic, as Idle page tracking does.
VMA-based Target Address Range Construction
-------------------------------------------
+This is only for the virtual address space primitives implementation. That for
+the physical address space simply asks users to manually set the monitoring
+target address ranges.
+
Only small parts in the super-huge virtual address space of the processes are
mapped to the physical memory and accessed. Thus, tracking the unmapped
address regions is just wasteful. However, because DAMON can deal with some
diff --git a/Documentation/vm/damon/faq.rst b/Documentation/vm/damon/faq.rst
index 088128bbf22b..6469d54c480f 100644
--- a/Documentation/vm/damon/faq.rst
+++ b/Documentation/vm/damon/faq.rst
@@ -43,10 +43,9 @@ constructions and actual access checks can be implemented and configured on the
DAMON core by the users. In this way, DAMON users can monitor any address
space with any access check technique.
-Nonetheless, DAMON provides vma tracking and PTE Accessed bit check based
+Nonetheless, DAMON provides vma/rmap tracking and PTE Accessed bit check based
implementations of the address space dependent functions for the virtual memory
-by default, for a reference and convenient use. In near future, we will
-provide those for physical memory address space.
+and the physical memory by default, for a reference and convenient use.
Can I simply monitor page granularity?
--
2.17.1
From: SeongJae Park <[email protected]>
This commit implements the four callbacks (->init_target_regions,
->update_target_regions, ->prepare_access_check, and ->check_accesses)
for the basic access monitoring of the physical memory address space.
By setting the callback pointers to point to those, users can easily
monitor the accesses to the physical memory.
Internally, it uses the PTE Accessed bit, similarly to the virtual
memory support. Also, it supports only user memory pages, as
idle page tracking also does, for the same reason. If the monitoring
target physical memory address range contains non-user memory pages,
access check of the pages will do nothing but simply treat the pages as
not accessed.
Users who want to use other access check primitives and/or monitor the
non-user memory regions could implement and use their own callbacks.
Signed-off-by: SeongJae Park <[email protected]>
---
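Note for reviewers: the check path caches the result of the last looked-up
page, so that regions whose sampling addresses fall in the same page do not
repeat the rmap walk. Below is a rough user space model of that reuse logic
only. It is a sketch under assumptions: `PhysAccessChecker`, `align_down`, and
the `young` callable are hypothetical stand-ins, the page size is assumed to
be 4 KiB, and regions are plain dicts.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def align_down(addr, size):
    """Round addr down to a multiple of size."""
    return addr - (addr % size)


class PhysAccessChecker:
    """Model of the 'reuse the last checked page' logic."""

    def __init__(self, young):
        self.young = young            # callable: paddr -> (accessed, page_sz)
        self.last_addr = 0
        self.last_page_sz = PAGE_SIZE
        self.last_accessed = False
        self.nr_lookups = 0           # bookkeeping for this sketch only

    def check(self, region):
        addr = region['sampling_addr']
        # If the address is in the last checked page, reuse the result.
        if (align_down(self.last_addr, self.last_page_sz) ==
                align_down(addr, self.last_page_sz)):
            if self.last_accessed:
                region['nr_accesses'] += 1
            return
        self.last_accessed, self.last_page_sz = self.young(addr)
        self.nr_lookups += 1
        if self.last_accessed:
            region['nr_accesses'] += 1
        self.last_addr = addr


checker = PhysAccessChecker(lambda paddr: (True, PAGE_SIZE))
regions = [{'sampling_addr': a, 'nr_accesses': 0}
           for a in (0x5000, 0x5800, 0x9000)]
for r in regions:
    checker.check(r)
```

In this run the second region shares a page with the first, so only two
lookups are made for three regions, while all three are counted as accessed.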
include/linux/damon.h | 6 ++
mm/damon.c | 200 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 206 insertions(+)
diff --git a/include/linux/damon.h b/include/linux/damon.h
index f02798ac9ec5..bbd748514677 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -237,6 +237,12 @@ void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
bool kdamond_vm_target_valid(struct damon_target *t);
+/* Reference callback implementations for physical memory */
+void kdamond_init_phys_regions(struct damon_ctx *ctx);
+void kdamond_update_phys_regions(struct damon_ctx *ctx);
+void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
+unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
+
int damon_set_targets(struct damon_ctx *ctx,
unsigned long *ids, ssize_t nr_ids);
int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
diff --git a/mm/damon.c b/mm/damon.c
index b2507bae6c57..c8d834ce188d 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -28,11 +28,14 @@
#include <linux/debugfs.h>
#include <linux/delay.h>
#include <linux/kthread.h>
+#include <linux/memory_hotplug.h>
#include <linux/mm.h>
#include <linux/mmu_notifier.h>
#include <linux/module.h>
#include <linux/page_idle.h>
+#include <linux/pagemap.h>
#include <linux/random.h>
+#include <linux/rmap.h>
#include <linux/sched/mm.h>
#include <linux/sched/task.h>
#include <linux/slab.h>
@@ -543,6 +546,18 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
}
}
+/*
+ * The initial regions construction function for the physical address space.
+ *
+ * This default version actually does nothing. Users should set the initial
+ * regions by themselves before passing their damon_ctx to 'start_damon()', or
+ * implement their version of this and set '->init_target_regions' of their
+ * damon_ctx to point to it.
+ */
+void kdamond_init_phys_regions(struct damon_ctx *ctx)
+{
+}
+
/*
* Functions for the dynamic monitoring target regions update
*/
@@ -626,6 +641,19 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx)
}
}
+/*
+ * The dynamic monitoring target regions update function for the physical
+ * address space.
+ *
+ * This default version actually does nothing. Users should update the
+ * regions in other callbacks such as '->aggregate_cb', or implement their
+ * version of this and set the '->update_target_regions' of their damon_ctx to
+ * point to it.
+ */
+void kdamond_update_phys_regions(struct damon_ctx *ctx)
+{
+}
+
/*
* Functions for the access checking of the regions
*/
@@ -801,6 +829,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
return max_nr_accesses;
}
+/* access check functions for physical address based regions */
+
+/*
+ * Get a page by pfn if it is in the LRU list. Otherwise, returns NULL.
+ *
+ * The body of this function is stolen from the 'page_idle_get_page()'. We
+ * steal rather than reuse it because the code is quite simple.
+ */
+static struct page *damon_phys_get_page(unsigned long pfn)
+{
+ struct page *page = pfn_to_online_page(pfn);
+ pg_data_t *pgdat;
+
+ if (!page || !PageLRU(page) ||
+ !get_page_unless_zero(page))
+ return NULL;
+
+ pgdat = page_pgdat(page);
+ spin_lock_irq(&pgdat->lru_lock);
+ if (unlikely(!PageLRU(page))) {
+ put_page(page);
+ page = NULL;
+ }
+ spin_unlock_irq(&pgdat->lru_lock);
+ return page;
+}
+
+static bool damon_page_mkold(struct page *page, struct vm_area_struct *vma,
+ unsigned long addr, void *arg)
+{
+ damon_mkold(vma->vm_mm, addr);
+ return true;
+}
+
+static void damon_phys_mkold(unsigned long paddr)
+{
+ struct page *page = damon_phys_get_page(PHYS_PFN(paddr));
+ struct rmap_walk_control rwc = {
+ .rmap_one = damon_page_mkold,
+ .anon_lock = page_lock_anon_vma_read,
+ };
+ bool need_lock;
+
+ if (!page)
+ return;
+
+	if (!page_mapped(page) || !page_rmapping(page))
+		goto out;
+
+	need_lock = !PageAnon(page) || PageKsm(page);
+	if (need_lock && !trylock_page(page))
+		goto out;
+
+	rmap_walk(page, &rwc);
+	if (need_lock)
+		unlock_page(page);
+out:
+	put_page(page);
+}
+
+static void damon_prepare_phys_access_check(struct damon_ctx *ctx,
+ struct damon_region *r)
+{
+ r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
+
+ damon_phys_mkold(r->sampling_addr);
+}
+
+void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+
+ damon_for_each_target(t, ctx) {
+ damon_for_each_region(r, t)
+ damon_prepare_phys_access_check(ctx, r);
+ }
+}
+
+struct damon_phys_access_chk_result {
+ unsigned long page_sz;
+ bool accessed;
+};
+
+static bool damon_page_accessed(struct page *page, struct vm_area_struct *vma,
+ unsigned long addr, void *arg)
+{
+ struct damon_phys_access_chk_result *result = arg;
+
+ result->accessed = damon_young(vma->vm_mm, addr, &result->page_sz);
+
+ /* If accessed, stop walking */
+ return !result->accessed;
+}
+
+static bool damon_phys_young(unsigned long paddr, unsigned long *page_sz)
+{
+ struct page *page = damon_phys_get_page(PHYS_PFN(paddr));
+ struct damon_phys_access_chk_result result = {
+ .page_sz = PAGE_SIZE,
+ .accessed = false,
+ };
+ struct rmap_walk_control rwc = {
+ .arg = &result,
+ .rmap_one = damon_page_accessed,
+ .anon_lock = page_lock_anon_vma_read,
+ };
+ bool need_lock;
+
+ if (!page)
+ return false;
+
+	if (!page_mapped(page) || !page_rmapping(page))
+		goto out;
+
+	need_lock = !PageAnon(page) || PageKsm(page);
+	if (need_lock && !trylock_page(page))
+		goto out;
+
+	rmap_walk(page, &rwc);
+	if (need_lock)
+		unlock_page(page);
+	*page_sz = result.page_sz;
+
+out:
+	put_page(page);
+	return result.accessed;
+}
+
+/*
+ * Check whether the region was accessed after the last preparation
+ *
+ * ctx	the monitoring context that the region belongs to
+ * r the region of physical address space that needs to be checked
+ */
+static void damon_check_phys_access(struct damon_ctx *ctx,
+ struct damon_region *r)
+{
+ static unsigned long last_addr;
+ static unsigned long last_page_sz = PAGE_SIZE;
+ static bool last_accessed;
+
+ /* If the region is in the last checked page, reuse the result */
+ if (ALIGN_DOWN(last_addr, last_page_sz) ==
+ ALIGN_DOWN(r->sampling_addr, last_page_sz)) {
+ if (last_accessed)
+ r->nr_accesses++;
+ return;
+ }
+
+ last_accessed = damon_phys_young(r->sampling_addr, &last_page_sz);
+ if (last_accessed)
+ r->nr_accesses++;
+
+ last_addr = r->sampling_addr;
+}
+
+unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx)
+{
+ struct damon_target *t;
+ struct damon_region *r;
+ unsigned int max_nr_accesses = 0;
+
+ damon_for_each_target(t, ctx) {
+ damon_for_each_region(r, t) {
+ damon_check_phys_access(ctx, r);
+ max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
+ }
+ }
+
+ return max_nr_accesses;
+}
/*
* Functions for the target validity check
--
2.17.1
From: SeongJae Park <[email protected]>
This commit allows users to record the data accesses on physical memory
address space by passing 'paddr' as target to 'damo-record'. If the
init regions are given, the regions will be monitored. Otherwise, it will
monitor the biggest contiguous 'System RAM' region in '/proc/iomem'.
Signed-off-by: SeongJae Park <[email protected]>
---
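Note for reviewers: the default region selection can be tried without
touching '/proc/iomem'. Below is a small standalone sketch of the same idea,
working on a text sample instead of the real file. `largest_system_ram` and
the sample contents are hypothetical and only for illustration; the address
values are made up apart from the example line quoted in the patch.

```python
def largest_system_ram(iomem_text):
    """Return [start, end] of the largest 'System RAM' line, or None."""
    best = None
    for line in iomem_text.splitlines():
        # example of the line: '100000000-42b201fff : System RAM'
        fields = line.split(':')
        if len(fields) != 2:
            continue
        if fields[1].strip() != 'System RAM':
            continue
        addrs = fields[0].split('-')
        if len(addrs) != 2:
            continue
        start = int(addrs[0], 16)   # int() tolerates surrounding spaces
        end = int(addrs[1], 16)
        if best is None or end - start > best[1] - best[0]:
            best = [start, end]
    return best


SAMPLE_IOMEM = """\
00001000-0009ffff : System RAM
00100000-3fffffff : System RAM
  01000000-01c00000 : Kernel code
100000000-42b201fff : System RAM
"""

region = largest_system_ram(SAMPLE_IOMEM)
```

The indented 'Kernel code' line stands for a nested entry. It is skipped by
the name check, so only the top-level 'System RAM' lines compete.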
tools/damon/_damon.py | 2 ++
tools/damon/record.py | 29 ++++++++++++++++++++++++++++-
2 files changed, 30 insertions(+), 1 deletion(-)
diff --git a/tools/damon/_damon.py b/tools/damon/_damon.py
index a22ec3777c16..cf14a0d59b94 100644
--- a/tools/damon/_damon.py
+++ b/tools/damon/_damon.py
@@ -27,6 +27,8 @@ def set_target(tid, init_regions=[]):
if not os.path.exists(debugfs_init_regions):
return 0
+ if tid == 'paddr':
+ tid = -1
string = ' '.join(['%s %d %d' % (tid, r[0], r[1]) for r in init_regions])
return subprocess.call('echo "%s" > %s' % (string, debugfs_init_regions),
shell=True, executable='/bin/bash')
diff --git a/tools/damon/record.py b/tools/damon/record.py
index 11fd54001472..6fd0b59c73e0 100644
--- a/tools/damon/record.py
+++ b/tools/damon/record.py
@@ -101,6 +101,29 @@ def set_argparser(parser):
parser.add_argument('-o', '--out', metavar='<file path>', type=str,
default='damon.data', help='output file path')
+def default_paddr_region():
+ "Largest System RAM region becomes the default"
+ ret = []
+ with open('/proc/iomem', 'r') as f:
+ # example of the line: '100000000-42b201fff : System RAM'
+ for line in f:
+ fields = line.split(':')
+ if len(fields) != 2:
+ continue
+ name = fields[1].strip()
+ if name != 'System RAM':
+ continue
+ addrs = fields[0].split('-')
+ if len(addrs) != 2:
+ continue
+ start = int(addrs[0], 16)
+ end = int(addrs[1], 16)
+
+ sz_region = end - start
+ if not ret or sz_region > (ret[1] - ret[0]):
+ ret = [start, end]
+ return ret
+
def main(args=None):
global orig_attrs
if not args:
@@ -122,7 +145,11 @@ def main(args=None):
target = args.target
target_fields = target.split()
- if not subprocess.call('which %s &> /dev/null' % target_fields[0],
+ if target == 'paddr': # physical memory address space
+ if not init_regions:
+ init_regions = [default_paddr_region()]
+ do_record(target, False, init_regions, new_attrs, orig_attrs, pidfd)
+ elif not subprocess.call('which %s &> /dev/null' % target_fields[0],
shell=True, executable='/bin/bash'):
do_record(target, True, init_regions, new_attrs, orig_attrs, pidfd)
else:
--
2.17.1
On Wed, 5 Aug 2020 08:59:41 +0200 SeongJae Park <[email protected]> wrote:
> From: SeongJae Park <[email protected]>
>
> Changes from Previous Version
> =============================
>
> - paddr: Support nested iomem sections (Du Fan)
> - Rebase on v5.8
>
> Introduction
> ============
>
> DAMON[1] programming interface users can extend DAMON for any address space by
> configuring the address-space specific low level primitives with appropriate
> ones including their own implementations. However, because the implementation
> for the virtual address space is only available now, the users should implement
> their own for other address spaces. Worse yet, the user space users who rely
> on the debugfs interface and user space tool, cannot implement their own.
>
> This patchset implements another reference implementation of the low level
> primitives for the physical memory address space. With this change, hence, the
> kernel space users can monitor both the virtual and the physical address spaces
> by simply changing the configuration in the runtime. Further, this patchset
> links the implementation to the debugfs interface and the user space tool for
> the user space users.
>
> Note that the implementation supports only the user memory, as same to the idle
> page access tracking feature.
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
This patchset doesn't work for physical address monitoring because I forgot
the patch below. Sorry for missing it. Please apply it before you test this
patchset. Or, you can clone the complete git tree with the patch applied:
$ git clone git://github.com/sjp38/linux -b cdamon/rfc/v6.1
The web is also available:
https://github.com/sjp38/linux/releases/tag/cdamon/rfc/v6.1
The patch will be split and squashed into the appropriate patches in the next spin.
=============================== >8 ===========================================
From edf6b586f4ac3f8f4d61ebde56d644422bd93bee Mon Sep 17 00:00:00 2001
From: SeongJae Park <[email protected]>
Date: Thu, 6 Aug 2020 08:18:49 +0000
Subject: [PATCH] mm/damon: Fix paddr target id problem
The target id for 'paddr' is meaningless, but we set it as '-1' for fun
and smooth interaction with the user space interfaces. However, the
target ids are 'unsigned long' and thus using '-1' makes no sense. This
commit changes the fake number to another funny but unsigned number,
'42'.
Signed-off-by: SeongJae Park <[email protected]>
---
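Note for reviewers: as a quick illustration of why '-1' is a poor fit for an
unsigned id, the sketch below shows the value that the bit pattern of -1 has
when it is read as an unsigned integer. It assumes a 64-bit 'unsigned long';
`as_unsigned_long` is a hypothetical helper, not something in the tree.

```python
ULONG_BITS = 64  # assumption: 'unsigned long' is 64 bits wide


def as_unsigned_long(value, bits=ULONG_BITS):
    """Reinterpret a (possibly negative) integer as an unsigned value."""
    return value & ((1 << bits) - 1)


wrapped = as_unsigned_long(-1)
unchanged = as_unsigned_long(42)
```

Under that assumption `wrapped` equals 2**64 - 1, while `unchanged` stays 42,
so a small positive fake id survives the signed-to-unsigned boundary as-is.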
Documentation/admin-guide/mm/damon/usage.rst | 4 ++--
mm/damon.c | 2 +-
tools/damon/_damon.py | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
index 88b8e9254a7e..3e2f1519c96a 100644
--- a/Documentation/admin-guide/mm/damon/usage.rst
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -334,12 +334,12 @@ check it again::
Users can also monitor the physical memory address space of the system by
writing a special keyword, "``paddr\n``" to the file. Because physical address
space monitoring doesn't support multiple targets, reading the file will show a
-fake value, ``-1``, as below::
+fake value, ``42``, as below::
# cd <debugfs>/damon
# echo paddr > target_ids
# cat target_ids
- -1
+ 42
Note that setting the target ids doesn't start the monitoring.
diff --git a/mm/damon.c b/mm/damon.c
index a9757a0e5cf7..66268cb45b51 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -2047,7 +2047,7 @@ static ssize_t debugfs_target_ids_write(struct file *file,
ctx->target_valid = NULL;
/* target id is meaningless here, but we set it just for fun */
- snprintf(kbuf, count, "-1 ");
+ snprintf(kbuf, count, "42 ");
} else {
/* Configure the context for virtual memory monitoring */
ctx->init_target_regions = kdamond_init_vm_regions;
diff --git a/tools/damon/_damon.py b/tools/damon/_damon.py
index cf14a0d59b94..6ff278117e84 100644
--- a/tools/damon/_damon.py
+++ b/tools/damon/_damon.py
@@ -28,7 +28,7 @@ def set_target(tid, init_regions=[]):
return 0
if tid == 'paddr':
- tid = -1
+ tid = 42
string = ' '.join(['%s %d %d' % (tid, r[0], r[1]) for r in init_regions])
return subprocess.call('echo "%s" > %s' % (string, debugfs_init_regions),
shell=True, executable='/bin/bash')
--
2.17.1