2020-06-03 14:15:04

by SeongJae Park

Subject: [RFC v2 0/9] DAMON: Support Access Monitoring of Any Address Space Including Physical Memory

From: SeongJae Park <[email protected]>

Currently, DAMON[1] supports only virtual memory address spaces, because it
utilizes PTE Accessed bits as its low-level access check primitive and virtual
memory areas (``struct vm_area_struct``) as a way to define the monitoring
target regions. However, the core idea of DAMON, which makes the monitoring
accurate, efficient, and scalable, is in a separate higher layer. Therefore,
DAMON can be extended to various other address spaces by replacing the two
low-level primitives with ones appropriate for the target address space.

This patchset makes DAMON's low level primitives configurable and provides
reference implementations of the primitives for the virtual memory address
spaces and the physical memory address space. Therefore, users can monitor
both of the two address spaces by simply configuring the monitoring context
with the provided low level primitives.

Kernel space users can also implement the primitives by themselves for their
special use cases. Clean/dirty/entire page cache, NUMA nodes, specific files,
or block devices would be examples of such special use cases.
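
As a rough illustration of how such a configuration would look, below is a
hypothetical sketch of a kernel-space user wiring its own primitives into a
monitoring context. The 'struct damon_ctx' callback fields are the ones this
patchset introduces (patches 3 and 6); the 'my_numa_*' functions are only
assumed placeholders, not part of the patches.

/* All 'my_numa_*' functions below are hypothetical placeholders. */
static void my_numa_init_regions(struct damon_ctx *ctx)
{
	/* construct initial regions covering, e.g., one NUMA node */
}

static void my_numa_update_regions(struct damon_ctx *ctx)
{
	/* refresh the regions if the monitored object has changed */
}

static void my_numa_prepare_access_checks(struct damon_ctx *ctx)
{
	/* clear access-tracking state for each region's sampling address */
}

static unsigned int my_numa_check_accesses(struct damon_ctx *ctx)
{
	/* read the state back, update nr_accesses, return the maximum */
	return 0;
}

static void my_numa_setup(struct damon_ctx *ctx)
{
	ctx->init_target_regions = my_numa_init_regions;
	ctx->update_target_regions = my_numa_update_regions;
	ctx->prepare_access_checks = my_numa_prepare_access_checks;
	ctx->check_accesses = my_numa_check_accesses;
}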

[1] https://lore.kernel.org/linux-mm/[email protected]/


Baseline and Complete Git Trees
===============================

The patches are based on v5.7 plus the DAMON v14 patchset[1] and the DAMOS RFC
v10 patchset[2]. You can also clone the complete git tree:

$ git clone git://github.com/sjp38/linux -b cdamon/rfc/v2

The tree is also available on the web:
https://github.com/sjp38/linux/releases/tag/cdamon/rfc/v2

[1] https://lore.kernel.org/linux-mm/[email protected]/
[2] https://lore.kernel.org/linux-mm/[email protected]/


Sequence of Patches
===================

The sequence of patches is as follows. The 1st patch redefines the monitoring
region based on a pure address range abstraction, so that no virtual memory
assumption remains in it. The following 2nd patch cleans up code using the
new abstraction.

The 3rd patch allows users to configure the low level primitives for the
initialization and dynamic update of the target address regions, which were
previously coupled with the virtual memory areas. Then, the 4th patch allows
user space to also set the monitoring target regions, and the 5th patch
documents it.

The 6th patch further makes the access check primitives, which were based on
the PTE Accessed bit, configurable. Now any address space can be supported.
The 7th patch provides the reference implementations of the configurable
primitives for the physical memory monitoring. The 8th patch makes the
debugfs interface able to use the physical memory monitoring, and finally the
9th patch documents it.


Patch History
=============

Changes from RFC v1
(https://lore.kernel.org/linux-mm/[email protected]/)
- Provide the reference primitive implementations for the physical memory
- Connect the extensions with the debugfs interface

SeongJae Park (9):
mm/damon: Use vm-independent address range concept
mm/damon: Clean up code using 'struct damon_addr_range'
mm/damon: Make monitoring target regions init/update configurable
mm/damon/debugfs: Allow users to set initial monitoring target regions
Docs/damon: Document 'initial_regions' feature
mm/damon: Make access check primitive configurable
mm/damon: Implement callbacks for physical memory monitoring
mm/damon/debugfs: Support physical memory monitoring
Docs/damon: Document physical memory monitoring support

Documentation/admin-guide/mm/damon/usage.rst | 56 ++-
include/linux/damon.h | 47 +-
mm/damon-test.h | 78 +--
mm/damon.c | 504 ++++++++++++++++---
4 files changed, 564 insertions(+), 121 deletions(-)

--
2.17.1


2020-06-03 14:15:36

by SeongJae Park

Subject: [RFC v2 1/9] mm/damon: Use vm-independent address range concept

From: SeongJae Park <[email protected]>

DAMON's main idea is not limited to the virtual address space. To
prepare for extending the support to other address spaces including the
physical memory, this commit modifies one of its core structs, 'struct
damon_region', to use a virtual memory independent address range concept.

Signed-off-by: SeongJae Park <[email protected]>
---
include/linux/damon.h | 20 +++++++++++------
mm/damon-test.h | 42 ++++++++++++++++++------------------
mm/damon.c | 50 +++++++++++++++++++++----------------------
3 files changed, 59 insertions(+), 53 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index e77256cf30dd..b4b06ca905a2 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -16,11 +16,18 @@
#include <linux/types.h>

/**
- * struct damon_region - Represents a monitoring target region of
- * [@vm_start, @vm_end).
- *
- * @vm_start: Start address of the region (inclusive).
- * @vm_end: End address of the region (exclusive).
+ * struct damon_addr_range - Represents an address region of [@start, @end).
+ * @start: Start address of the region (inclusive).
+ * @end: End address of the region (exclusive).
+ */
+struct damon_addr_range {
+ unsigned long start;
+ unsigned long end;
+};
+
+/**
+ * struct damon_region - Represents a monitoring target region.
+ * @ar: The address range of the region.
* @sampling_addr: Address of the sample for the next access check.
* @nr_accesses: Access frequency of this region.
* @list: List head for siblings.
@@ -33,8 +40,7 @@
* region are set as region size-weighted average of those of the two regions.
*/
struct damon_region {
- unsigned long vm_start;
- unsigned long vm_end;
+ struct damon_addr_range ar;
unsigned long sampling_addr;
unsigned int nr_accesses;
struct list_head list;
diff --git a/mm/damon-test.h b/mm/damon-test.h
index 5b18619efe72..9dd2061502cb 100644
--- a/mm/damon-test.h
+++ b/mm/damon-test.h
@@ -78,8 +78,8 @@ static void damon_test_regions(struct kunit *test)
struct damon_task *t;

r = damon_new_region(&damon_user_ctx, 1, 2);
- KUNIT_EXPECT_EQ(test, 1ul, r->vm_start);
- KUNIT_EXPECT_EQ(test, 2ul, r->vm_end);
+ KUNIT_EXPECT_EQ(test, 1ul, r->ar.start);
+ KUNIT_EXPECT_EQ(test, 2ul, r->ar.end);
KUNIT_EXPECT_EQ(test, 0u, r->nr_accesses);

t = damon_new_task(42);
@@ -267,7 +267,7 @@ static void damon_test_aggregate(struct kunit *test)
KUNIT_EXPECT_EQ(test, 3, it);

/* The aggregated information should be written in the buffer */
- sr = sizeof(r->vm_start) + sizeof(r->vm_end) + sizeof(r->nr_accesses);
+ sr = sizeof(r->ar.start) + sizeof(r->ar.end) + sizeof(r->nr_accesses);
sp = sizeof(t->pid) + sizeof(unsigned int) + 3 * sr;
sz = sizeof(struct timespec64) + sizeof(unsigned int) + 3 * sp;
KUNIT_EXPECT_EQ(test, (unsigned int)sz, ctx->rbuf_offset);
@@ -350,8 +350,8 @@ static void damon_do_test_apply_three_regions(struct kunit *test,

for (i = 0; i < nr_expected / 2; i++) {
r = __nth_region_of(t, i);
- KUNIT_EXPECT_EQ(test, r->vm_start, expected[i * 2]);
- KUNIT_EXPECT_EQ(test, r->vm_end, expected[i * 2 + 1]);
+ KUNIT_EXPECT_EQ(test, r->ar.start, expected[i * 2]);
+ KUNIT_EXPECT_EQ(test, r->ar.end, expected[i * 2 + 1]);
}

damon_cleanup_global_state();
@@ -470,8 +470,8 @@ static void damon_test_split_evenly(struct kunit *test)

i = 0;
damon_for_each_region(r, t) {
- KUNIT_EXPECT_EQ(test, r->vm_start, i++ * 10);
- KUNIT_EXPECT_EQ(test, r->vm_end, i * 10);
+ KUNIT_EXPECT_EQ(test, r->ar.start, i++ * 10);
+ KUNIT_EXPECT_EQ(test, r->ar.end, i * 10);
}
damon_free_task(t);

@@ -485,11 +485,11 @@ static void damon_test_split_evenly(struct kunit *test)
damon_for_each_region(r, t) {
if (i == 4)
break;
- KUNIT_EXPECT_EQ(test, r->vm_start, 5 + 10 * i++);
- KUNIT_EXPECT_EQ(test, r->vm_end, 5 + 10 * i);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 5 + 10 * i++);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 5 + 10 * i);
}
- KUNIT_EXPECT_EQ(test, r->vm_start, 5 + 10 * i);
- KUNIT_EXPECT_EQ(test, r->vm_end, 59ul);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 5 + 10 * i);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 59ul);
damon_free_task(t);

t = damon_new_task(42);
@@ -499,8 +499,8 @@ static void damon_test_split_evenly(struct kunit *test)
KUNIT_EXPECT_EQ(test, nr_damon_regions(t), 1u);

damon_for_each_region(r, t) {
- KUNIT_EXPECT_EQ(test, r->vm_start, 5ul);
- KUNIT_EXPECT_EQ(test, r->vm_end, 6ul);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 5ul);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 6ul);
}
damon_free_task(t);
}
@@ -514,12 +514,12 @@ static void damon_test_split_at(struct kunit *test)
r = damon_new_region(&damon_user_ctx, 0, 100);
damon_add_region(r, t);
damon_split_region_at(&damon_user_ctx, r, 25);
- KUNIT_EXPECT_EQ(test, r->vm_start, 0ul);
- KUNIT_EXPECT_EQ(test, r->vm_end, 25ul);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 0ul);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 25ul);

r = damon_next_region(r);
- KUNIT_EXPECT_EQ(test, r->vm_start, 25ul);
- KUNIT_EXPECT_EQ(test, r->vm_end, 100ul);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 25ul);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 100ul);

damon_free_task(t);
}
@@ -539,8 +539,8 @@ static void damon_test_merge_two(struct kunit *test)
damon_add_region(r2, t);

damon_merge_two_regions(r, r2);
- KUNIT_EXPECT_EQ(test, r->vm_start, 0ul);
- KUNIT_EXPECT_EQ(test, r->vm_end, 300ul);
+ KUNIT_EXPECT_EQ(test, r->ar.start, 0ul);
+ KUNIT_EXPECT_EQ(test, r->ar.end, 300ul);
KUNIT_EXPECT_EQ(test, r->nr_accesses, 16u);

i = 0;
@@ -577,8 +577,8 @@ static void damon_test_merge_regions_of(struct kunit *test)
KUNIT_EXPECT_EQ(test, nr_damon_regions(t), 5u);
for (i = 0; i < 5; i++) {
r = __nth_region_of(t, i);
- KUNIT_EXPECT_EQ(test, r->vm_start, saddrs[i]);
- KUNIT_EXPECT_EQ(test, r->vm_end, eaddrs[i]);
+ KUNIT_EXPECT_EQ(test, r->ar.start, saddrs[i]);
+ KUNIT_EXPECT_EQ(test, r->ar.end, eaddrs[i]);
}
damon_free_task(t);
}
diff --git a/mm/damon.c b/mm/damon.c
index ea6a8b6886b8..a9676b804b0b 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -80,7 +80,7 @@ static struct damon_ctx damon_user_ctx = {
* Returns the pointer to the new struct if success, or NULL otherwise
*/
static struct damon_region *damon_new_region(struct damon_ctx *ctx,
- unsigned long vm_start, unsigned long vm_end)
+ unsigned long start, unsigned long end)
{
struct damon_region *region;

@@ -88,8 +88,8 @@ static struct damon_region *damon_new_region(struct damon_ctx *ctx,
if (!region)
return NULL;

- region->vm_start = vm_start;
- region->vm_end = vm_end;
+ region->ar.start = start;
+ region->ar.end = end;
region->nr_accesses = 0;
INIT_LIST_HEAD(&region->list);

@@ -277,16 +277,16 @@ static int damon_split_region_evenly(struct damon_ctx *ctx,
if (!r || !nr_pieces)
return -EINVAL;

- orig_end = r->vm_end;
- sz_orig = r->vm_end - r->vm_start;
+ orig_end = r->ar.end;
+ sz_orig = r->ar.end - r->ar.start;
sz_piece = ALIGN_DOWN(sz_orig / nr_pieces, MIN_REGION);

if (!sz_piece)
return -EINVAL;

- r->vm_end = r->vm_start + sz_piece;
+ r->ar.end = r->ar.start + sz_piece;
next = damon_next_region(r);
- for (start = r->vm_end; start + sz_piece <= orig_end;
+ for (start = r->ar.end; start + sz_piece <= orig_end;
start += sz_piece) {
n = damon_new_region(ctx, start, start + sz_piece);
if (!n)
@@ -296,7 +296,7 @@ static int damon_split_region_evenly(struct damon_ctx *ctx,
}
/* complement last region for possible rounding error */
if (n)
- n->vm_end = orig_end;
+ n->ar.end = orig_end;

return 0;
}
@@ -509,7 +509,7 @@ static void damon_mkold(struct mm_struct *mm, unsigned long addr)
static void damon_prepare_access_check(struct damon_ctx *ctx,
struct mm_struct *mm, struct damon_region *r)
{
- r->sampling_addr = damon_rand(r->vm_start, r->vm_end);
+ r->sampling_addr = damon_rand(r->ar.start, r->ar.end);

damon_mkold(mm, r->sampling_addr);
}
@@ -709,12 +709,12 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
nr = nr_damon_regions(t);
damon_write_rbuf(c, &nr, sizeof(nr));
damon_for_each_region(r, t) {
- damon_write_rbuf(c, &r->vm_start, sizeof(r->vm_start));
- damon_write_rbuf(c, &r->vm_end, sizeof(r->vm_end));
+ damon_write_rbuf(c, &r->ar.start, sizeof(r->ar.start));
+ damon_write_rbuf(c, &r->ar.end, sizeof(r->ar.end));
damon_write_rbuf(c, &r->nr_accesses,
sizeof(r->nr_accesses));
trace_damon_aggregated(t->pid, nr,
- r->vm_start, r->vm_end, r->nr_accesses);
+ r->ar.start, r->ar.end, r->nr_accesses);
r->last_nr_accesses = r->nr_accesses;
r->nr_accesses = 0;
}
@@ -742,8 +742,8 @@ static int damos_madvise(struct damon_task *task, struct damon_region *r,
if (!mm)
goto put_task_out;

- ret = do_madvise(t, mm, PAGE_ALIGN(r->vm_start),
- PAGE_ALIGN(r->vm_end - r->vm_start), behavior);
+ ret = do_madvise(t, mm, PAGE_ALIGN(r->ar.start),
+ PAGE_ALIGN(r->ar.end - r->ar.start), behavior);
mmput(mm);
put_task_out:
put_task_struct(t);
@@ -790,7 +790,7 @@ static void damon_do_apply_schemes(struct damon_ctx *c, struct damon_task *t,
unsigned long sz;

damon_for_each_scheme(s, c) {
- sz = r->vm_end - r->vm_start;
+ sz = r->ar.end - r->ar.start;
if ((s->min_sz_region && sz < s->min_sz_region) ||
(s->max_sz_region && s->max_sz_region < sz))
continue;
@@ -821,7 +821,7 @@ static void kdamond_apply_schemes(struct damon_ctx *c)
}
}

-#define sz_damon_region(r) (r->vm_end - r->vm_start)
+#define sz_damon_region(r) (r->ar.end - r->ar.start)

/*
* Merge two adjacent regions into one region
@@ -834,7 +834,7 @@ static void damon_merge_two_regions(struct damon_region *l,
l->nr_accesses = (l->nr_accesses * sz_l + r->nr_accesses * sz_r) /
(sz_l + sz_r);
l->age = (l->age * sz_l + r->age * sz_r) / (sz_l + sz_r);
- l->vm_end = r->vm_end;
+ l->ar.end = r->ar.end;
damon_destroy_region(r);
}

@@ -856,7 +856,7 @@ static void damon_merge_regions_of(struct damon_task *t, unsigned int thres)
else
r->age++;

- if (prev && prev->vm_end == r->vm_start &&
+ if (prev && prev->ar.end == r->ar.start &&
diff_of(prev->nr_accesses, r->nr_accesses) <= thres)
damon_merge_two_regions(prev, r);
else
@@ -893,8 +893,8 @@ static void damon_split_region_at(struct damon_ctx *ctx,
{
struct damon_region *new;

- new = damon_new_region(ctx, r->vm_start + sz_r, r->vm_end);
- r->vm_end = new->vm_start;
+ new = damon_new_region(ctx, r->ar.start + sz_r, r->ar.end);
+ r->ar.end = new->ar.start;

new->age = r->age;
new->last_nr_accesses = r->last_nr_accesses;
@@ -911,7 +911,7 @@ static void damon_split_regions_of(struct damon_ctx *ctx,
int i;

damon_for_each_region_safe(r, next, t) {
- sz_region = r->vm_end - r->vm_start;
+ sz_region = r->ar.end - r->ar.start;

for (i = 0; i < nr_subs - 1 &&
sz_region > 2 * MIN_REGION; i++) {
@@ -985,7 +985,7 @@ static bool kdamond_need_update_regions(struct damon_ctx *ctx)
*/
static bool damon_intersect(struct damon_region *r, struct region *re)
{
- return !(r->vm_end <= re->start || re->end <= r->vm_start);
+ return !(r->ar.end <= re->start || re->end <= r->ar.start);
}

/*
@@ -1024,7 +1024,7 @@ static void damon_apply_three_regions(struct damon_ctx *ctx,
first = r;
last = r;
}
- if (r->vm_start >= br->end)
+ if (r->ar.start >= br->end)
break;
}
if (!first) {
@@ -1036,8 +1036,8 @@ static void damon_apply_three_regions(struct damon_ctx *ctx,
continue;
damon_insert_region(newr, damon_prev_region(r), r);
} else {
- first->vm_start = ALIGN_DOWN(br->start, MIN_REGION);
- last->vm_end = ALIGN(br->end, MIN_REGION);
+ first->ar.start = ALIGN_DOWN(br->start, MIN_REGION);
+ last->ar.end = ALIGN(br->end, MIN_REGION);
}
}
}
--
2.17.1

2020-06-03 14:16:10

by SeongJae Park

Subject: [RFC v2 2/9] mm/damon: Clean up code using 'struct damon_addr_range'

From: SeongJae Park <[email protected]>

There is unnecessarily duplicated code in DAMON that can be eliminated
by using the new struct, 'damon_addr_range'. This commit cleans up the
DAMON code in that way.

Signed-off-by: SeongJae Park <[email protected]>
---
mm/damon-test.h | 36 ++++++++++++++++++------------------
mm/damon.c | 46 ++++++++++++++++++++--------------------------
2 files changed, 38 insertions(+), 44 deletions(-)

diff --git a/mm/damon-test.h b/mm/damon-test.h
index 9dd2061502cb..6d01f0e782d5 100644
--- a/mm/damon-test.h
+++ b/mm/damon-test.h
@@ -177,7 +177,7 @@ static void damon_test_set_recording(struct kunit *test)
*/
static void damon_test_three_regions_in_vmas(struct kunit *test)
{
- struct region regions[3] = {0,};
+ struct damon_addr_range regions[3] = {0,};
/* 10-20-25, 200-210-220, 300-305, 307-330 */
struct vm_area_struct vmas[] = {
(struct vm_area_struct) {.vm_start = 10, .vm_end = 20},
@@ -331,7 +331,7 @@ static struct damon_region *__nth_region_of(struct damon_task *t, int idx)
*/
static void damon_do_test_apply_three_regions(struct kunit *test,
unsigned long *regions, int nr_regions,
- struct region *three_regions,
+ struct damon_addr_range *three_regions,
unsigned long *expected, int nr_expected)
{
struct damon_task *t;
@@ -369,10 +369,10 @@ static void damon_test_apply_three_regions1(struct kunit *test)
unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
70, 80, 80, 90, 90, 100};
/* 5-27, 45-55, 73-104 */
- struct region new_three_regions[3] = {
- (struct region){.start = 5, .end = 27},
- (struct region){.start = 45, .end = 55},
- (struct region){.start = 73, .end = 104} };
+ struct damon_addr_range new_three_regions[3] = {
+ (struct damon_addr_range){.start = 5, .end = 27},
+ (struct damon_addr_range){.start = 45, .end = 55},
+ (struct damon_addr_range){.start = 73, .end = 104} };
/* 5-20-27, 45-55, 73-80-90-104 */
unsigned long expected[] = {5, 20, 20, 27, 45, 55,
73, 80, 80, 90, 90, 104};
@@ -391,10 +391,10 @@ static void damon_test_apply_three_regions2(struct kunit *test)
unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
70, 80, 80, 90, 90, 100};
/* 5-27, 56-57, 65-104 */
- struct region new_three_regions[3] = {
- (struct region){.start = 5, .end = 27},
- (struct region){.start = 56, .end = 57},
- (struct region){.start = 65, .end = 104} };
+ struct damon_addr_range new_three_regions[3] = {
+ (struct damon_addr_range){.start = 5, .end = 27},
+ (struct damon_addr_range){.start = 56, .end = 57},
+ (struct damon_addr_range){.start = 65, .end = 104} };
/* 5-20-27, 56-57, 65-80-90-104 */
unsigned long expected[] = {5, 20, 20, 27, 56, 57,
65, 80, 80, 90, 90, 104};
@@ -415,10 +415,10 @@ static void damon_test_apply_three_regions3(struct kunit *test)
unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
70, 80, 80, 90, 90, 100};
/* 5-27, 61-63, 65-104 */
- struct region new_three_regions[3] = {
- (struct region){.start = 5, .end = 27},
- (struct region){.start = 61, .end = 63},
- (struct region){.start = 65, .end = 104} };
+ struct damon_addr_range new_three_regions[3] = {
+ (struct damon_addr_range){.start = 5, .end = 27},
+ (struct damon_addr_range){.start = 61, .end = 63},
+ (struct damon_addr_range){.start = 65, .end = 104} };
/* 5-20-27, 61-63, 65-80-90-104 */
unsigned long expected[] = {5, 20, 20, 27, 61, 63,
65, 80, 80, 90, 90, 104};
@@ -440,10 +440,10 @@ static void damon_test_apply_three_regions4(struct kunit *test)
unsigned long regions[] = {10, 20, 20, 30, 50, 55, 55, 57, 57, 59,
70, 80, 80, 90, 90, 100};
/* 5-7, 30-32, 65-68 */
- struct region new_three_regions[3] = {
- (struct region){.start = 5, .end = 7},
- (struct region){.start = 30, .end = 32},
- (struct region){.start = 65, .end = 68} };
+ struct damon_addr_range new_three_regions[3] = {
+ (struct damon_addr_range){.start = 5, .end = 7},
+ (struct damon_addr_range){.start = 30, .end = 32},
+ (struct damon_addr_range){.start = 65, .end = 68} };
/* expect 5-7, 30-32, 65-68 */
unsigned long expected[] = {5, 7, 30, 32, 65, 68};

diff --git a/mm/damon.c b/mm/damon.c
index a9676b804b0b..f6dd34425185 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -301,19 +301,15 @@ static int damon_split_region_evenly(struct damon_ctx *ctx,
return 0;
}

-struct region {
- unsigned long start;
- unsigned long end;
-};
-
-static unsigned long sz_region(struct region *r)
+static unsigned long sz_range(struct damon_addr_range *r)
{
return r->end - r->start;
}

-static void swap_regions(struct region *r1, struct region *r2)
+static void swap_ranges(struct damon_addr_range *r1,
+ struct damon_addr_range *r2)
{
- struct region tmp;
+ struct damon_addr_range tmp;

tmp = *r1;
*r1 = *r2;
@@ -324,7 +320,7 @@ static void swap_regions(struct region *r1, struct region *r2)
* Find three regions separated by two biggest unmapped regions
*
* vma the head vma of the target address space
- * regions an array of three 'struct region's that results will be saved
+ * regions an array of three address ranges that results will be saved
*
* This function receives an address space and finds three regions in it which
* separated by the two biggest unmapped regions in the space. Please refer to
@@ -334,9 +330,9 @@ static void swap_regions(struct region *r1, struct region *r2)
* Returns 0 if success, or negative error code otherwise.
*/
static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
- struct region regions[3])
+ struct damon_addr_range regions[3])
{
- struct region gap = {0}, first_gap = {0}, second_gap = {0};
+ struct damon_addr_range gap = {0}, first_gap = {0}, second_gap = {0};
struct vm_area_struct *last_vma = NULL;
unsigned long start = 0;

@@ -349,20 +345,20 @@ static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
}
gap.start = last_vma->vm_end;
gap.end = vma->vm_start;
- if (sz_region(&gap) > sz_region(&second_gap)) {
- swap_regions(&gap, &second_gap);
- if (sz_region(&second_gap) > sz_region(&first_gap))
- swap_regions(&second_gap, &first_gap);
+ if (sz_range(&gap) > sz_range(&second_gap)) {
+ swap_ranges(&gap, &second_gap);
+ if (sz_range(&second_gap) > sz_range(&first_gap))
+ swap_ranges(&second_gap, &first_gap);
}
last_vma = vma;
}

- if (!sz_region(&second_gap) || !sz_region(&first_gap))
+ if (!sz_range(&second_gap) || !sz_range(&first_gap))
return -EINVAL;

/* Sort the two biggest gaps by address */
if (first_gap.start > second_gap.start)
- swap_regions(&first_gap, &second_gap);
+ swap_ranges(&first_gap, &second_gap);

/* Store the result */
regions[0].start = ALIGN(start, MIN_REGION);
@@ -381,7 +377,7 @@ static int damon_three_regions_in_vmas(struct vm_area_struct *vma,
* Returns 0 on success, negative error code otherwise.
*/
static int damon_three_regions_of(struct damon_task *t,
- struct region regions[3])
+ struct damon_addr_range regions[3])
{
struct mm_struct *mm;
int rc;
@@ -443,7 +439,7 @@ static int damon_three_regions_of(struct damon_task *t,
static void damon_init_regions_of(struct damon_ctx *c, struct damon_task *t)
{
struct damon_region *r, *m = NULL;
- struct region regions[3];
+ struct damon_addr_range regions[3];
int i;

if (damon_three_regions_of(t, regions)) {
@@ -977,13 +973,11 @@ static bool kdamond_need_update_regions(struct damon_ctx *ctx)
}

/*
- * Check whether regions are intersecting
- *
- * Note that this function checks 'struct damon_region' and 'struct region'.
+ * Check whether a region is intersecting an address range
*
* Returns true if it is.
*/
-static bool damon_intersect(struct damon_region *r, struct region *re)
+static bool damon_intersect(struct damon_region *r, struct damon_addr_range *re)
{
return !(r->ar.end <= re->start || re->end <= r->ar.start);
}
@@ -995,7 +989,7 @@ static bool damon_intersect(struct damon_region *r, struct region *re)
* bregions the three big regions of the task
*/
static void damon_apply_three_regions(struct damon_ctx *ctx,
- struct damon_task *t, struct region bregions[3])
+ struct damon_task *t, struct damon_addr_range bregions[3])
{
struct damon_region *r, *next;
unsigned int i = 0;
@@ -1014,7 +1008,7 @@ static void damon_apply_three_regions(struct damon_ctx *ctx,
for (i = 0; i < 3; i++) {
struct damon_region *first = NULL, *last;
struct damon_region *newr;
- struct region *br;
+ struct damon_addr_range *br;

br = &bregions[i];
/* Get the first and last regions which intersects with br */
@@ -1047,7 +1041,7 @@ static void damon_apply_three_regions(struct damon_ctx *ctx,
*/
static void kdamond_update_regions(struct damon_ctx *ctx)
{
- struct region three_regions[3];
+ struct damon_addr_range three_regions[3];
struct damon_task *t;

damon_for_each_task(t, ctx) {
--
2.17.1

2020-06-03 14:16:27

by SeongJae Park

Subject: [RFC v2 3/9] mm/damon: Make monitoring target regions init/update configurable

From: SeongJae Park <[email protected]>

This commit allows DAMON users to configure their own monitoring target
regions initializer / updater. Using this, users can confine the
monitoring target address space as they want. For example, they can
track only the stack, the heap, a shared memory area, or a specific
file-backed area.
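
As an illustration only (not part of this patch), an '->init_target_regions'
callback that confines the monitoring to one fixed, assumed address range per
target task could look roughly like below. The helpers it uses
('damon_new_region()' and 'damon_add_region()') are internal to mm/damon.c in
this version, so such a callback is assumed to live next to them.

static void my_init_fixed_regions(struct damon_ctx *ctx)
{
	struct damon_task *t;
	struct damon_region *r;

	damon_for_each_task(t, ctx) {
		/* the range below is an arbitrary example, e.g. a known heap area */
		r = damon_new_region(ctx, 0x7f0000000000, 0x7f0000a00000);
		if (!r)
			continue;
		damon_add_region(r, t);
	}
}

/* then set: ctx->init_target_regions = my_init_fixed_regions; */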

Signed-off-by: SeongJae Park <[email protected]>
---
include/linux/damon.h | 13 +++++++++++++
mm/damon.c | 17 ++++++++++-------
2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index b4b06ca905a2..a1b6810ce0eb 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -158,9 +158,16 @@ struct damos {
* @tasks_list: Head of monitoring target tasks (&damon_task) list.
* @schemes_list: Head of schemes (&damos) list.
*
+ * @init_target_regions: Constructs initial monitoring target regions.
+ * @update_target_regions: Updates monitoring target regions.
* @sample_cb: Called for each sampling interval.
* @aggregate_cb: Called for each aggregation interval.
*
+ * The monitoring thread calls @init_target_regions before starting the
+ * monitoring, @update_target_regions for each @regions_update_interval. By
+ * setting these callbacks to appropriate functions, therefore, users can
+ * monitor specific range of virtual address space.
+ *
* @sample_cb and @aggregate_cb are called from @kdamond for each of the
* sampling intervals and aggregation intervals, respectively. Therefore,
* users can safely access to the monitoring results via @tasks_list without
@@ -190,10 +197,16 @@ struct damon_ctx {
struct list_head schemes_list; /* 'damos' objects */

/* callbacks */
+ void (*init_target_regions)(struct damon_ctx *context);
+ void (*update_target_regions)(struct damon_ctx *context);
void (*sample_cb)(struct damon_ctx *context);
void (*aggregate_cb)(struct damon_ctx *context);
};

+/* Reference callback implementations for virtual memory */
+void kdamond_init_vm_regions(struct damon_ctx *ctx);
+void kdamond_update_vm_regions(struct damon_ctx *ctx);
+
int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
unsigned long aggr_int, unsigned long regions_update_int,
diff --git a/mm/damon.c b/mm/damon.c
index f6dd34425185..2a3c1abb9b47 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -72,6 +72,9 @@ static struct damon_ctx damon_user_ctx = {
.regions_update_interval = 1000 * 1000,
.min_nr_regions = 10,
.max_nr_regions = 1000,
+
+ .init_target_regions = kdamond_init_vm_regions,
+ .update_target_regions = kdamond_update_vm_regions,
};

/*
@@ -324,7 +327,7 @@ static void swap_ranges(struct damon_addr_range *r1,
*
* This function receives an address space and finds three regions in it which
* separated by the two biggest unmapped regions in the space. Please refer to
- * below comments of 'damon_init_regions_of()' function to know why this is
+ * below comments of 'damon_init_vm_regions_of()' function to know why this is
* necessary.
*
* Returns 0 if success, or negative error code otherwise.
@@ -436,7 +439,7 @@ static int damon_three_regions_of(struct damon_task *t,
* <BIG UNMAPPED REGION 2>
* <stack>
*/
-static void damon_init_regions_of(struct damon_ctx *c, struct damon_task *t)
+static void damon_init_vm_regions_of(struct damon_ctx *c, struct damon_task *t)
{
struct damon_region *r, *m = NULL;
struct damon_addr_range regions[3];
@@ -465,12 +468,12 @@ static void damon_init_regions_of(struct damon_ctx *c, struct damon_task *t)
}

/* Initialize '->regions_list' of every task */
-static void kdamond_init_regions(struct damon_ctx *ctx)
+void kdamond_init_vm_regions(struct damon_ctx *ctx)
{
struct damon_task *t;

damon_for_each_task(t, ctx)
- damon_init_regions_of(ctx, t);
+ damon_init_vm_regions_of(ctx, t);
}

static void damon_mkold(struct mm_struct *mm, unsigned long addr)
@@ -1039,7 +1042,7 @@ static void damon_apply_three_regions(struct damon_ctx *ctx,
/*
* Update regions for current memory mappings
*/
-static void kdamond_update_regions(struct damon_ctx *ctx)
+void kdamond_update_vm_regions(struct damon_ctx *ctx)
{
struct damon_addr_range three_regions[3];
struct damon_task *t;
@@ -1101,7 +1104,7 @@ static int kdamond_fn(void *data)
unsigned int max_nr_accesses = 0;

pr_info("kdamond (%d) starts\n", ctx->kdamond->pid);
- kdamond_init_regions(ctx);
+ ctx->init_target_regions(ctx);

kdamond_write_record_header(ctx);

@@ -1124,7 +1127,7 @@ static int kdamond_fn(void *data)
}

if (kdamond_need_update_regions(ctx))
- kdamond_update_regions(ctx);
+ ctx->update_target_regions(ctx);
}
damon_flush_rbuffer(ctx);
damon_for_each_task(t, ctx) {
--
2.17.1

2020-06-03 14:16:34

by SeongJae Park

Subject: [RFC v2 4/9] mm/damon/debugfs: Allow users to set initial monitoring target regions

From: SeongJae Park <[email protected]>

Some users would want to monitor only a part of the entire virtual
memory address space. The '->init_target_regions' callback is therefore
provided, but only kernel-space programming interface users can use it.

For this reason, this commit introduces a new debugfs file,
'init_regions'. Users can specify which initial monitoring target
address regions they want by writing special input to the file. The
input should describe each region on a separate line, in the form below:

<pid> <start address> <end address>

This commit also makes the default '->init_target_regions' callback,
'kdamond_init_vm_regions()', do nothing if the user has already set the
initial target regions.

Note that the regions will be updated to cover the entire memory mapped
regions after one 'regions update interval'. If you do not want the
regions to be updated after the initial setting, you could set the
interval to a very long time, say, a few decades.

Signed-off-by: SeongJae Park <[email protected]>
---
mm/damon.c | 169 +++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 163 insertions(+), 6 deletions(-)

diff --git a/mm/damon.c b/mm/damon.c
index 2a3c1abb9b47..c7806e81248a 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -472,8 +472,10 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
{
struct damon_task *t;

- damon_for_each_task(t, ctx)
- damon_init_vm_regions_of(ctx, t);
+ damon_for_each_task(t, ctx) {
+ if (!nr_damon_regions(t))
+ damon_init_vm_regions_of(ctx, t);
+ }
}

static void damon_mkold(struct mm_struct *mm, unsigned long addr)
@@ -1685,6 +1687,154 @@ static ssize_t debugfs_record_write(struct file *file,
return ret;
}

+/* This function should not be called while the monitoring is ongoing */
+static ssize_t sprint_init_regions(struct damon_ctx *c, char *buf, ssize_t len)
+{
+ struct damon_task *t;
+ struct damon_region *r;
+ int written = 0;
+ int rc;
+
+ damon_for_each_task(t, c) {
+ damon_for_each_region(r, t) {
+ rc = snprintf(&buf[written], len - written,
+ "%d %lu %lu\n",
+ t->pid, r->ar.start, r->ar.end);
+ if (!rc)
+ return -ENOMEM;
+ written += rc;
+ }
+ }
+ return written;
+}
+
+static ssize_t debugfs_init_regions_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = &damon_user_ctx;
+ char *kbuf;
+ ssize_t len;
+
+ kbuf = kmalloc(count, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ mutex_unlock(&ctx->kdamond_lock);
+ return -EBUSY;
+ }
+
+ len = sprint_init_regions(ctx, kbuf, count);
+ mutex_unlock(&ctx->kdamond_lock);
+ if (len < 0)
+ goto out;
+ len = simple_read_from_buffer(buf, count, ppos, kbuf, len);
+
+out:
+ kfree(kbuf);
+ return len;
+}
+
+static int add_init_region(struct damon_ctx *c,
+ int pid, struct damon_addr_range *ar)
+{
+ struct damon_task *t;
+ struct damon_region *r, *prev;
+ int rc = -EINVAL;
+
+ if (ar->start >= ar->end)
+ return -EINVAL;
+
+ damon_for_each_task(t, c) {
+ if (t->pid == pid) {
+ r = damon_new_region(c, ar->start, ar->end);
+ if (!r)
+ return -ENOMEM;
+ damon_add_region(r, t);
+ if (nr_damon_regions(t) > 1) {
+ prev = damon_prev_region(r);
+ if (prev->ar.end > r->ar.start) {
+ damon_destroy_region(r);
+ return -EINVAL;
+ }
+ }
+ rc = 0;
+ }
+ }
+ return rc;
+}
+
+static int set_init_regions(struct damon_ctx *c, const char *str, ssize_t len)
+{
+ struct damon_task *t;
+ struct damon_region *r, *next;
+ int pos = 0, parsed, ret;
+ int pid;
+ struct damon_addr_range ar;
+ int err;
+
+ damon_for_each_task(t, c) {
+ damon_for_each_region_safe(r, next, t)
+ damon_destroy_region(r);
+ }
+
+ while (pos < len) {
+ ret = sscanf(&str[pos], "%d %lu %lu%n",
+ &pid, &ar.start, &ar.end, &parsed);
+ if (ret != 3)
+ break;
+ err = add_init_region(c, pid, &ar);
+ if (err)
+ goto fail;
+ pos += parsed;
+ }
+
+ return 0;
+
+fail:
+ damon_for_each_task(t, c) {
+ damon_for_each_region_safe(r, next, t)
+ damon_destroy_region(r);
+ }
+ return err;
+}
+
+static ssize_t debugfs_init_regions_write(struct file *file, const char __user
+ *buf, size_t count, loff_t *ppos)
+{
+ struct damon_ctx *ctx = &damon_user_ctx;
+ char *kbuf;
+ ssize_t ret;
+ int err;
+
+ if (*ppos)
+ return -EINVAL;
+
+ kbuf = kmalloc(count, GFP_KERNEL);
+ if (!kbuf)
+ return -ENOMEM;
+
+ ret = simple_write_to_buffer(kbuf, count, ppos, buf, count);
+ if (ret < 0)
+ goto out;
+
+ mutex_lock(&ctx->kdamond_lock);
+ if (ctx->kdamond) {
+ ret = -EBUSY;
+ goto unlock_out;
+ }
+
+ err = set_init_regions(ctx, kbuf, ret);
+ if (err)
+ ret = err;
+
+unlock_out:
+ mutex_unlock(&ctx->kdamond_lock);
+out:
+ kfree(kbuf);
+ return ret;
+}

static ssize_t debugfs_attrs_read(struct file *file,
char __user *buf, size_t count, loff_t *ppos)
@@ -1766,6 +1916,12 @@ static const struct file_operations record_fops = {
.write = debugfs_record_write,
};

+static const struct file_operations init_regions_fops = {
+ .owner = THIS_MODULE,
+ .read = debugfs_init_regions_read,
+ .write = debugfs_init_regions_write,
+};
+
static const struct file_operations attrs_fops = {
.owner = THIS_MODULE,
.read = debugfs_attrs_read,
@@ -1776,10 +1932,11 @@ static struct dentry *debugfs_root;

static int __init damon_debugfs_init(void)
{
- const char * const file_names[] = {"attrs", "record", "schemes",
- "pids", "monitor_on"};
- const struct file_operations *fops[] = {&attrs_fops, &record_fops,
- &schemes_fops, &pids_fops, &monitor_on_fops};
+ const char * const file_names[] = {"attrs", "init_regions", "record",
+ "schemes", "pids", "monitor_on"};
+ const struct file_operations *fops[] = {&attrs_fops,
+ &init_regions_fops, &record_fops, &schemes_fops, &pids_fops,
+ &monitor_on_fops};
int i;

debugfs_root = debugfs_create_dir("damon", NULL);
--
2.17.1

2020-06-03 14:16:45

by SeongJae Park

Subject: [RFC v2 5/9] Docs/damon: Document 'initial_regions' feature

From: SeongJae Park <[email protected]>

This commit documents the 'initial_regions' feature.

Signed-off-by: SeongJae Park <[email protected]>
---
Documentation/admin-guide/mm/damon/usage.rst | 34 ++++++++++++++++++++
1 file changed, 34 insertions(+)

diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
index 18a19c35b4f3..137ed770c2d6 100644
--- a/Documentation/admin-guide/mm/damon/usage.rst
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -326,6 +326,40 @@ having pids 42 and 4242 as the processes to be monitored and check it again::
Note that setting the pids doesn't start the monitoring.


+Initial Monitoring Target Regions
+---------------------------------
+
+DAMON automatically sets and updates the monitoring target regions so that
+entire memory mappings of target processes can be covered. However, users
+might want to limit the monitoring region to specific address ranges, such as
+the heap, the stack, or specific file-mapped area. Or, some users might know
+the initial access pattern of their workloads and therefore want to set optimal
+initial regions for the 'adaptive regions adjustment'.
+
+In such cases, users can explicitly set the initial monitoring target regions
+as they want, by writing proper values to the ``init_regions`` file. Each line
+of the input should represent one region in below form.::
+
+ <pid> <start address> <end address>
+
+The ``pid`` should be already in ``pids`` file, and the regions should be
+passed in address order. For example, below commands will set a couple of
+address ranges, ``1-100`` and ``100-200`` as the initial monitoring target
+region of process 42, and another couple of address ranges, ``20-40`` and
+``50-100`` as that of process 4242.::
+
+ # cd <debugfs>/damon
+ # echo "42 1 100
+ 42 100 200
+ 4242 20 40
+ 4242 50 100" > init_regions
+
+Note that this sets the initial monitoring target regions only. DAMON will
+automatically update the boundary of the regions after one ``regions update
+interval``. Therefore, users should set the ``regions update interval`` large
+enough.
+
+
Record
------

--
2.17.1

2020-06-03 14:17:38

by SeongJae Park

Subject: [RFC v2 6/9] mm/damon: Make access check primitive configurable

From: SeongJae Park <[email protected]>

DAMON assumes the target region is in the virtual address space and
therefore uses PTE Accessed bit checking for the access check. However,
as some CPUs provide H/W based memory access check features that are
usually more accurate and lighter-weight than PTE Accessed bit checking,
some users would want to use those in special use cases. Also, some
users might want to use DAMON for different address spaces such as the
physical memory space, which need different ways to check the access.

This commit therefore allows DAMON users to configure the low level
access check primitives as they want.
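
As a minimal sketch of the intended usage (not included in this patch), a
kernel-space user could keep the virtual memory region handling but replace
only the access check primitives, for example with a hypothetical hardware
access-sampling facility. The 'my_hw_*' functions are assumed placeholders.

static void my_hw_prepare_access_checks(struct damon_ctx *ctx)
{
	/* arm the (hypothetical) hardware sampler for each region */
}

static unsigned int my_hw_check_accesses(struct damon_ctx *ctx)
{
	/* read the sampler results, update nr_accesses, return the maximum */
	return 0;
}

static void my_hw_setup(struct damon_ctx *ctx)
{
	/* reuse the virtual memory reference implementations for the regions */
	ctx->init_target_regions = kdamond_init_vm_regions;
	ctx->update_target_regions = kdamond_update_vm_regions;
	/* but do the access checks with the hardware facility */
	ctx->prepare_access_checks = my_hw_prepare_access_checks;
	ctx->check_accesses = my_hw_check_accesses;
}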

Signed-off-by: SeongJae Park <[email protected]>
---
include/linux/damon.h | 13 +++++++++++--
mm/damon.c | 20 +++++++++++---------
2 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index a1b6810ce0eb..1a788bfd1b4e 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -160,13 +160,18 @@ struct damos {
*
* @init_target_regions: Constructs initial monitoring target regions.
* @update_target_regions: Updates monitoring target regions.
+ * @prepare_access_checks: Prepares next access check of target regions.
+ * @check_accesses: Checks the access of target regions.
* @sample_cb: Called for each sampling interval.
* @aggregate_cb: Called for each aggregation interval.
*
* The monitoring thread calls @init_target_regions before starting the
- * monitoring, @update_target_regions for each @regions_update_interval. By
+ * monitoring, @update_target_regions for each @regions_update_interval, and
+ * @prepare_access_checks and @check_accesses for each @sample_interval. By
* setting these callbacks to appropriate functions, therefore, users can
- * monitor specific range of virtual address space.
+ * monitor any address space with special handling. If these are not
+ * explicitly configured, the functions for virtual memory address space
+ * monitoring are used.
*
* @sample_cb and @aggregate_cb are called from @kdamond for each of the
* sampling intervals and aggregation intervals, respectively. Therefore,
@@ -199,6 +204,8 @@ struct damon_ctx {
/* callbacks */
void (*init_target_regions)(struct damon_ctx *context);
void (*update_target_regions)(struct damon_ctx *context);
+ void (*prepare_access_checks)(struct damon_ctx *context);
+ unsigned int (*check_accesses)(struct damon_ctx *context);
void (*sample_cb)(struct damon_ctx *context);
void (*aggregate_cb)(struct damon_ctx *context);
};
@@ -206,6 +213,8 @@ struct damon_ctx {
/* Reference callback implementations for virtual memory */
void kdamond_init_vm_regions(struct damon_ctx *ctx);
void kdamond_update_vm_regions(struct damon_ctx *ctx);
+void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
+unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);

int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
diff --git a/mm/damon.c b/mm/damon.c
index c7806e81248a..f5cbc97a3bbc 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -75,6 +75,8 @@ static struct damon_ctx damon_user_ctx = {

.init_target_regions = kdamond_init_vm_regions,
.update_target_regions = kdamond_update_vm_regions,
+ .prepare_access_checks = kdamond_prepare_vm_access_checks,
+ .check_accesses = kdamond_check_vm_accesses,
};

/*
@@ -507,7 +509,7 @@ static void damon_mkold(struct mm_struct *mm, unsigned long addr)
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
}

-static void damon_prepare_access_check(struct damon_ctx *ctx,
+static void damon_prepare_vm_access_check(struct damon_ctx *ctx,
struct mm_struct *mm, struct damon_region *r)
{
r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
@@ -515,7 +517,7 @@ static void damon_prepare_access_check(struct damon_ctx *ctx,
damon_mkold(mm, r->sampling_addr);
}

-static void kdamond_prepare_access_checks(struct damon_ctx *ctx)
+void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx)
{
struct damon_task *t;
struct mm_struct *mm;
@@ -526,7 +528,7 @@ static void kdamond_prepare_access_checks(struct damon_ctx *ctx)
if (!mm)
continue;
damon_for_each_region(r, t)
- damon_prepare_access_check(ctx, mm, r);
+ damon_prepare_vm_access_check(ctx, mm, r);
mmput(mm);
}
}
@@ -564,7 +566,7 @@ static bool damon_young(struct mm_struct *mm, unsigned long addr,
* mm 'mm_struct' for the given virtual address space
* r the region to be checked
*/
-static void damon_check_access(struct damon_ctx *ctx,
+static void damon_check_vm_access(struct damon_ctx *ctx,
struct mm_struct *mm, struct damon_region *r)
{
static struct mm_struct *last_mm;
@@ -588,7 +590,7 @@ static void damon_check_access(struct damon_ctx *ctx,
last_addr = r->sampling_addr;
}

-static unsigned int kdamond_check_accesses(struct damon_ctx *ctx)
+unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
{
struct damon_task *t;
struct mm_struct *mm;
@@ -600,12 +602,12 @@ static unsigned int kdamond_check_accesses(struct damon_ctx *ctx)
if (!mm)
continue;
damon_for_each_region(r, t) {
- damon_check_access(ctx, mm, r);
+ damon_check_vm_access(ctx, mm, r);
max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
}
-
mmput(mm);
}
+
return max_nr_accesses;
}

@@ -1111,13 +1113,13 @@ static int kdamond_fn(void *data)
kdamond_write_record_header(ctx);

while (!kdamond_need_stop(ctx)) {
- kdamond_prepare_access_checks(ctx);
+ ctx->prepare_access_checks(ctx);
if (ctx->sample_cb)
ctx->sample_cb(ctx);

usleep_range(ctx->sample_interval, ctx->sample_interval + 1);

- max_nr_accesses = kdamond_check_accesses(ctx);
+ max_nr_accesses = ctx->check_accesses(ctx);

if (kdamond_aggregate_interval_passed(ctx)) {
kdamond_merge_regions(ctx, max_nr_accesses / 10);
--
2.17.1

2020-06-03 14:18:34

by SeongJae Park

Subject: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

From: SeongJae Park <[email protected]>

This commit implements the four callbacks (->init_target_regions,
->update_target_regions, ->prepare_access_checks, and ->check_accesses)
for the basic access monitoring of the physical memory address space.
By setting the callback pointers to point to those, users can easily
monitor accesses to the physical memory.

Internally, it uses the PTE Accessed bit, similar to the virtual memory
support. Also, it supports only page frames that are supported by idle
page tracking. Actually, most of the code is stolen from idle page
tracking. Users who want to use other access check primitives, or to
monitor page frames not supported by this implementation, could
implement their own callbacks.
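
For reference, switching a monitoring context to these primitives from
kernel space would look roughly like below; this mirrors what the debugfs
interface does in the next patch. Note that 'kdamond_init_phys_regions()'
intentionally does nothing, so the caller is expected to construct the
target regions by itself.

static void my_use_phys_primitives(struct damon_ctx *ctx)
{
	ctx->init_target_regions = kdamond_init_phys_regions;
	ctx->update_target_regions = kdamond_update_phys_regions;
	ctx->prepare_access_checks = kdamond_prepare_phys_access_checks;
	ctx->check_accesses = kdamond_check_phys_accesses;
}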

Signed-off-by: SeongJae Park <[email protected]>
---
include/linux/damon.h | 5 ++
mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 189 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 1a788bfd1b4e..f96503a532ea 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);

+void kdamond_init_phys_regions(struct damon_ctx *ctx);
+void kdamond_update_phys_regions(struct damon_ctx *ctx);
+void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
+unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
+
int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
unsigned long aggr_int, unsigned long regions_update_int,
diff --git a/mm/damon.c b/mm/damon.c
index f5cbc97a3bbc..6a5c6d540580 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -19,7 +19,9 @@
#include <linux/mm.h>
#include <linux/module.h>
#include <linux/page_idle.h>
+#include <linux/pagemap.h>
#include <linux/random.h>
+#include <linux/rmap.h>
#include <linux/sched/mm.h>
#include <linux/sched/task.h>
#include <linux/slab.h>
@@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
}
}

+/* Do nothing. Users should set the initial regions by themselves */
+void kdamond_init_phys_regions(struct damon_ctx *ctx)
+{
+}
+
static void damon_mkold(struct mm_struct *mm, unsigned long addr)
{
pte_t *pte = NULL;
@@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
return max_nr_accesses;
}

+/* access check functions for physical address based regions */
+
+/* This code is stollen from page_idle.c */
+static struct page *damon_phys_get_page(unsigned long pfn)
+{
+ struct page *page;
+ pg_data_t *pgdat;
+
+ if (!pfn_valid(pfn))
+ return NULL;
+
+ page = pfn_to_page(pfn);
+ if (!page || !PageLRU(page) ||
+ !get_page_unless_zero(page))
+ return NULL;
+
+ pgdat = page_pgdat(page);
+ spin_lock_irq(&pgdat->lru_lock);
+ if (unlikely(!PageLRU(page))) {
+ put_page(page);
+ page = NULL;
+ }
+ spin_unlock_irq(&pgdat->lru_lock);
+ return page;
+}
+
+static bool damon_page_mkold(struct page *page, struct vm_area_struct *vma,
+ unsigned long addr, void *arg)
+{
+ damon_mkold(vma->vm_mm, addr);
+ return true;
+}
+
+static void damon_phys_mkold(unsigned long paddr)
+{
+ struct page *page = damon_phys_get_page(PHYS_PFN(paddr));
+ struct rmap_walk_control rwc = {
+ .rmap_one = damon_page_mkold,
+ .anon_lock = page_lock_anon_vma_read,
+ };
+ bool need_lock;
+
+ if (!page)
+ return;
+
+ if (!page_mapped(page) || !page_rmapping(page))
+ return;
+
+ need_lock = !PageAnon(page) || PageKsm(page);
+ if (need_lock && !trylock_page(page))
+ return;
+
+ rmap_walk(page, &rwc);
+
+ if (need_lock)
+ unlock_page(page);
+ put_page(page);
+}
+
+static void damon_prepare_phys_access_check(struct damon_ctx *ctx,
+ struct damon_region *r)
+{
+ r->sampling_addr = damon_rand(r->ar.start, r->ar.end);
+
+ damon_phys_mkold(r->sampling_addr);
+}
+
+void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx)
+{
+ struct damon_task *t;
+ struct damon_region *r;
+
+ damon_for_each_task(t, ctx) {
+ damon_for_each_region(r, t)
+ damon_prepare_phys_access_check(ctx, r);
+ }
+}
+
+struct damon_phys_access_chk_result {
+ unsigned long page_sz;
+ bool accessed;
+};
+
+static bool damon_page_accessed(struct page *page, struct vm_area_struct *vma,
+ unsigned long addr, void *arg)
+{
+ struct damon_phys_access_chk_result *result = arg;
+
+ result->accessed = damon_young(vma->vm_mm, addr, &result->page_sz);
+
+ /* If accessed, stop walking */
+ return !result->accessed;
+}
+
+static bool damon_phys_young(unsigned long paddr, unsigned long *page_sz)
+{
+ struct page *page = damon_phys_get_page(PHYS_PFN(paddr));
+ struct damon_phys_access_chk_result result = {
+ .page_sz = PAGE_SIZE,
+ .accessed = false,
+ };
+ struct rmap_walk_control rwc = {
+ .arg = &result,
+ .rmap_one = damon_page_accessed,
+ .anon_lock = page_lock_anon_vma_read,
+ };
+ bool need_lock;
+
+ if (!page)
+ return false;
+
+ if (!page_mapped(page) || !page_rmapping(page))
+ return false;
+
+ need_lock = !PageAnon(page) || PageKsm(page);
+ if (need_lock && !trylock_page(page))
+ return false;
+
+ rmap_walk(page, &rwc);
+
+ if (need_lock)
+ unlock_page(page);
+ put_page(page);
+
+ *page_sz = result.page_sz;
+ return result.accessed;
+}
+
+/*
+ * Check whether the region was accessed after the last preparation
+ *
+ * mm 'mm_struct' for the given virtual address space
+ * r the region of physical address space that needs to be checked
+ */
+static void damon_check_phys_access(struct damon_ctx *ctx,
+ struct damon_region *r)
+{
+ static unsigned long last_addr;
+ static unsigned long last_page_sz = PAGE_SIZE;
+ static bool last_accessed;
+
+ /* If the region is in the last checked page, reuse the result */
+ if (ALIGN_DOWN(last_addr, last_page_sz) ==
+ ALIGN_DOWN(r->sampling_addr, last_page_sz)) {
+ if (last_accessed)
+ r->nr_accesses++;
+ return;
+ }
+
+ last_accessed = damon_phys_young(r->sampling_addr, &last_page_sz);
+ if (last_accessed)
+ r->nr_accesses++;
+
+ last_addr = r->sampling_addr;
+}
+
+unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx)
+{
+ struct damon_task *t;
+ struct damon_region *r;
+ unsigned int max_nr_accesses = 0;
+
+ damon_for_each_task(t, ctx) {
+ damon_for_each_region(r, t) {
+ damon_check_phys_access(ctx, r);
+ max_nr_accesses = max(r->nr_accesses, max_nr_accesses);
+ }
+ }
+
+ return max_nr_accesses;
+}
+
/*
* damon_check_reset_time_interval() - Check if a time interval is elapsed.
* @baseline: the time to check whether the interval has elapsed since
@@ -1058,6 +1237,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx)
}
}

+/* Do nothing. If necessary, users should update regions in other callbacks */
+void kdamond_update_phys_regions(struct damon_ctx *ctx)
+{
+}
+
/*
* Check whether current monitoring should be stopped
*
--
2.17.1

2020-06-03 14:18:53

by SeongJae Park

Subject: [RFC v2 8/9] mm/damon/debugfs: Support physical memory monitoring

From: SeongJae Park <[email protected]>

This commit makes the debugfs interface support the physical memory
monitoring, in addition to the virtual memory monitoring.

Users can do the physical memory monitoring by writing a special
keyword, 'paddr\n', to the 'pids' debugfs file. Then, DAMON will check
the special keyword and configure the debugfs user's monitoring context
for the physical memory monitoring. This will internally add one fake
monitoring target process, which has -1 as its pid.

Unlike the virtual memory monitoring, DAMON debugfs will not
automatically set the monitoring target regions. Therefore, users should
also set the monitoring target address regions using the 'init_regions'
debugfs file. While doing this, the 'pid' in the input should be '-1'.

Finally, the physical memory monitoring will not be automatically
terminated, because it has the fake monitoring target process. The user
should explicitly turn off the monitoring by writing 'off' to the
'monitor_on' debugfs file.

Signed-off-by: SeongJae Park <[email protected]>
---
mm/damon.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/mm/damon.c b/mm/damon.c
index 6a5c6d540580..7361d5885118 100644
--- a/mm/damon.c
+++ b/mm/damon.c
@@ -1263,6 +1263,9 @@ static bool kdamond_need_stop(struct damon_ctx *ctx)
return true;

damon_for_each_task(t, ctx) {
+ if (t->pid == -1)
+ return false;
+
task = damon_get_task_struct(t);
if (task) {
put_task_struct(task);
@@ -1796,6 +1799,23 @@ static ssize_t debugfs_pids_write(struct file *file,
if (ret < 0)
goto out;

+ if (!strncmp(kbuf, "paddr\n", count)) {
+ /* Configure the context for physical memory monitoring */
+ ctx->init_target_regions = kdamond_init_phys_regions;
+ ctx->update_target_regions = kdamond_update_phys_regions;
+ ctx->prepare_access_checks = kdamond_prepare_phys_access_checks;
+ ctx->check_accesses = kdamond_check_phys_accesses;
+
+ /* Set the fake target task pid as -1 */
+ snprintf(kbuf, count, "-1");
+ } else {
+ /* Configure the context for virtual memory monitoring */
+ ctx->init_target_regions = kdamond_init_vm_regions;
+ ctx->update_target_regions = kdamond_update_vm_regions;
+ ctx->prepare_access_checks = kdamond_prepare_vm_access_checks;
+ ctx->check_accesses = kdamond_check_vm_accesses;
+ }
+
targets = str_to_pids(kbuf, ret, &nr_targets);
if (!targets) {
ret = -ENOMEM;
--
2.17.1

2020-06-03 14:19:35

by SeongJae Park

Subject: [RFC v2 9/9] Docs/damon: Document physical memory monitoring support

From: SeongJae Park <[email protected]>

This commit adds a description of the physical memory monitoring usage
to the DAMON document.

Signed-off-by: SeongJae Park <[email protected]>
---
Documentation/admin-guide/mm/damon/usage.rst | 42 ++++++++++++++------
1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst
index 137ed770c2d6..359745f0dbfb 100644
--- a/Documentation/admin-guide/mm/damon/usage.rst
+++ b/Documentation/admin-guide/mm/damon/usage.rst
@@ -314,27 +314,42 @@ check it again::
Target PIDs
-----------

-Users can get and set the pids of monitoring target processes by reading from
-and writing to the ``pids`` file. For example, below commands set processes
-having pids 42 and 4242 as the processes to be monitored and check it again::
+To monitor the virtual memory address spaces of specific processes, users can
+get and set the pids of monitoring target processes by reading from and writing
+to the ``pids`` file. For example, below commands set processes having pids 42
+and 4242 as the processes to be monitored and check it again::

# cd <debugfs>/damon
# echo 42 4242 > pids
# cat pids
42 4242

+Users can also monitor the physical memory address space of the system by
+writing a special keyword, "``paddr\n``" to the file. In this case, reading the
+file will show ``-1``, as below::
+
+ # cd <debugfs>/damon
+ # echo paddr > pids
+ # cat pids
+ -1
+
Note that setting the pids doesn't start the monitoring.


Initial Monitoring Target Regions
---------------------------------

-DAMON automatically sets and updates the monitoring target regions so that
-entire memory mappings of target processes can be covered. However, users
-might want to limit the monitoring region to specific address ranges, such as
-the heap, the stack, or specific file-mapped area. Or, some users might know
-the initial access pattern of their workloads and therefore want to set optimal
-initial regions for the 'adaptive regions adjustment'.
+In case of the virtual memory monitoring, DAMON automatically sets and updates
+the monitoring target regions so that entire memory mappings of target
+processes can be covered. However, users might want to limit the monitoring
+region to specific address ranges, such as the heap, the stack, or specific
+file-mapped area. Or, some users might know the initial access pattern of
+their workloads and therefore want to set optimal initial regions for the
+'adaptive regions adjustment'.
+
+In contrast, DAMON does not automatically set and update the monitoring target
+regions in case of the physical memory monitoring. Therefore, users should set
+the monitoring target regions by themselves.

In such cases, users can explicitly set the initial monitoring target regions
as they want, by writing proper values to the ``init_regions`` file. Each line
@@ -354,10 +369,11 @@ region of process 42, and another couple of address ranges, ``20-40`` and
4242 20 40
4242 50 100" > init_regions

-Note that this sets the initial monitoring target regions only. DAMON will
-automatically update the boundary of the regions after one ``regions update
-interval``. Therefore, users should set the ``regions update interval`` large
-enough.
+Note that this sets the initial monitoring target regions only. In case of
+virtual memory monitoring, DAMON will automatically update the boundary of the
+regions after one ``regions update interval``. Therefore, users should set the
+``regions update interval`` large enough in this case, if they don't want the
+update.


Record
--
2.17.1

2020-06-03 16:12:27

by David Hildenbrand

Subject: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

On 03.06.20 16:11, SeongJae Park wrote:
> From: SeongJae Park <[email protected]>
>
> This commit implements the four callbacks (->init_target_regions,
> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
> for the basic access monitoring of the physical memory address space.
> By setting the callback pointers to point those, users can easily
> monitor the accesses to the physical memory.
>
> Internally, it uses the PTE Accessed bit, as similar to that of the
> virtual memory support. Also, it supports only page frames that
> supported by idle page tracking. Acutally, most of the code is stollen
> from idle page tracking. Users who want to use other access check
> primitives and monitor the frames that not supported with this
> implementation could implement their own callbacks on their own.
>
> Signed-off-by: SeongJae Park <[email protected]>
> ---
> include/linux/damon.h | 5 ++
> mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 189 insertions(+)
>
> diff --git a/include/linux/damon.h b/include/linux/damon.h
> index 1a788bfd1b4e..f96503a532ea 100644
> --- a/include/linux/damon.h
> +++ b/include/linux/damon.h
> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
> void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
> unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
>
> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
> +
> int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
> int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> unsigned long aggr_int, unsigned long regions_update_int,
> diff --git a/mm/damon.c b/mm/damon.c
> index f5cbc97a3bbc..6a5c6d540580 100644
> --- a/mm/damon.c
> +++ b/mm/damon.c
> @@ -19,7 +19,9 @@
> #include <linux/mm.h>
> #include <linux/module.h>
> #include <linux/page_idle.h>
> +#include <linux/pagemap.h>
> #include <linux/random.h>
> +#include <linux/rmap.h>
> #include <linux/sched/mm.h>
> #include <linux/sched/task.h>
> #include <linux/slab.h>
> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
> }
> }
>
> +/* Do nothing. Users should set the initial regions by themselves */
> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
> +{
> +}
> +
> static void damon_mkold(struct mm_struct *mm, unsigned long addr)
> {
> pte_t *pte = NULL;
> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
> return max_nr_accesses;
> }
>
> +/* access check functions for physical address based regions */
> +
> +/* This code is stollen from page_idle.c */
> +static struct page *damon_phys_get_page(unsigned long pfn)
> +{
> + struct page *page;
> + pg_data_t *pgdat;
> +
> + if (!pfn_valid(pfn))
> + return NULL;
> +

Who provides these pfns? Can these be random pfns, supplied unchecked by
user space? Or are they at least mapped into some user space process?

IOW, do we need a pfn_to_online_page() to make sure the memmap even was
initialized?

--
Thanks,

David / dhildenb

2020-06-04 07:29:25

by SeongJae Park

[permalink] [raw]
Subject: Re: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <[email protected]> wrote:

> On 03.06.20 16:11, SeongJae Park wrote:
> > From: SeongJae Park <[email protected]>
> >
> > This commit implements the four callbacks (->init_target_regions,
> > ->update_target_regions, ->prepare_access_check, and ->check_accesses)
> > for the basic access monitoring of the physical memory address space.
> > By setting the callback pointers to point those, users can easily
> > monitor the accesses to the physical memory.
> >
> > Internally, it uses the PTE Accessed bit, as similar to that of the
> > virtual memory support. Also, it supports only page frames that
> > supported by idle page tracking. Acutally, most of the code is stollen
> > from idle page tracking. Users who want to use other access check
> > primitives and monitor the frames that not supported with this
> > implementation could implement their own callbacks on their own.
> >
> > Signed-off-by: SeongJae Park <[email protected]>
> > ---
> > include/linux/damon.h | 5 ++
> > mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 189 insertions(+)
> >
> > diff --git a/include/linux/damon.h b/include/linux/damon.h
> > index 1a788bfd1b4e..f96503a532ea 100644
> > --- a/include/linux/damon.h
> > +++ b/include/linux/damon.h
> > @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
> > void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
> > unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
> >
> > +void kdamond_init_phys_regions(struct damon_ctx *ctx);
> > +void kdamond_update_phys_regions(struct damon_ctx *ctx);
> > +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
> > +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
> > +
> > int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
> > int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> > unsigned long aggr_int, unsigned long regions_update_int,
> > diff --git a/mm/damon.c b/mm/damon.c
> > index f5cbc97a3bbc..6a5c6d540580 100644
> > --- a/mm/damon.c
> > +++ b/mm/damon.c
> > @@ -19,7 +19,9 @@
> > #include <linux/mm.h>
> > #include <linux/module.h>
> > #include <linux/page_idle.h>
> > +#include <linux/pagemap.h>
> > #include <linux/random.h>
> > +#include <linux/rmap.h>
> > #include <linux/sched/mm.h>
> > #include <linux/sched/task.h>
> > #include <linux/slab.h>
> > @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
> > }
> > }
> >
> > +/* Do nothing. Users should set the initial regions by themselves */
> > +void kdamond_init_phys_regions(struct damon_ctx *ctx)
> > +{
> > +}
> > +
> > static void damon_mkold(struct mm_struct *mm, unsigned long addr)
> > {
> > pte_t *pte = NULL;
> > @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
> > return max_nr_accesses;
> > }
> >
> > +/* access check functions for physical address based regions */
> > +
> > +/* This code is stollen from page_idle.c */
> > +static struct page *damon_phys_get_page(unsigned long pfn)
> > +{
> > + struct page *page;
> > + pg_data_t *pgdat;
> > +
> > + if (!pfn_valid(pfn))
> > + return NULL;
> > +
>
> Who provides these pfns? Can these be random pfns, supplied unchecked by
> user space? Or are they at least mapped into some user space process?

Your guess is right, users can give random physical address and that will be
translated into pfn.

>
> IOW, do we need a pfn_to_online_page() to make sure the memmap even was
> initialized?

Thank you for pointing out this! I will use it in the next spin. Also, this
code is stollen from page_idle_get_page(). Seems like it should also be
modified to use it. I will send the patch for it, either.
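
For the record, a rough and untested sketch of what the check could look like
with pfn_to_online_page() (basically mirroring page_idle_get_page()):

/*
 * Rough sketch only: same flow as page_idle_get_page(), but look the
 * pfn up with pfn_to_online_page() so that an offline or uninitialized
 * memmap is never touched.
 */
static struct page *damon_phys_get_page(unsigned long pfn)
{
	struct page *page = pfn_to_online_page(pfn);
	pg_data_t *pgdat;

	if (!page || !PageLRU(page) || !get_page_unless_zero(page))
		return NULL;

	pgdat = page_pgdat(page);
	spin_lock_irq(&pgdat->lru_lock);
	if (unlikely(!PageLRU(page))) {
		put_page(page);
		page = NULL;
	}
	spin_unlock_irq(&pgdat->lru_lock);
	return page;
}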


Thanks,
SeongJae Park

>
> --
> Thanks,
>
> David / dhildenb

2020-06-04 15:03:53

by David Hildenbrand

[permalink] [raw]
Subject: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

On 04.06.20 09:26, SeongJae Park wrote:
> On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <[email protected]> wrote:
>
>> On 03.06.20 16:11, SeongJae Park wrote:
>>> From: SeongJae Park <[email protected]>
>>>
>>> This commit implements the four callbacks (->init_target_regions,
>>> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
>>> for the basic access monitoring of the physical memory address space.
>>> By setting the callback pointers to point those, users can easily
>>> monitor the accesses to the physical memory.
>>>
>>> Internally, it uses the PTE Accessed bit, as similar to that of the
>>> virtual memory support. Also, it supports only page frames that
>>> supported by idle page tracking. Acutally, most of the code is stollen
>>> from idle page tracking. Users who want to use other access check
>>> primitives and monitor the frames that not supported with this
>>> implementation could implement their own callbacks on their own.
>>>
>>> Signed-off-by: SeongJae Park <[email protected]>
>>> ---
>>> include/linux/damon.h | 5 ++
>>> mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 189 insertions(+)
>>>
>>> diff --git a/include/linux/damon.h b/include/linux/damon.h
>>> index 1a788bfd1b4e..f96503a532ea 100644
>>> --- a/include/linux/damon.h
>>> +++ b/include/linux/damon.h
>>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
>>> void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
>>> unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
>>>
>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
>>> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
>>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
>>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
>>> +
>>> int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
>>> int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
>>> unsigned long aggr_int, unsigned long regions_update_int,
>>> diff --git a/mm/damon.c b/mm/damon.c
>>> index f5cbc97a3bbc..6a5c6d540580 100644
>>> --- a/mm/damon.c
>>> +++ b/mm/damon.c
>>> @@ -19,7 +19,9 @@
>>> #include <linux/mm.h>
>>> #include <linux/module.h>
>>> #include <linux/page_idle.h>
>>> +#include <linux/pagemap.h>
>>> #include <linux/random.h>
>>> +#include <linux/rmap.h>
>>> #include <linux/sched/mm.h>
>>> #include <linux/sched/task.h>
>>> #include <linux/slab.h>
>>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
>>> }
>>> }
>>>
>>> +/* Do nothing. Users should set the initial regions by themselves */
>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
>>> +{
>>> +}
>>> +
>>> static void damon_mkold(struct mm_struct *mm, unsigned long addr)
>>> {
>>> pte_t *pte = NULL;
>>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
>>> return max_nr_accesses;
>>> }
>>>
>>> +/* access check functions for physical address based regions */
>>> +
>>> +/* This code is stollen from page_idle.c */
>>> +static struct page *damon_phys_get_page(unsigned long pfn)
>>> +{
>>> + struct page *page;
>>> + pg_data_t *pgdat;
>>> +
>>> + if (!pfn_valid(pfn))
>>> + return NULL;
>>> +
>>
>> Who provides these pfns? Can these be random pfns, supplied unchecked by
>> user space? Or are they at least mapped into some user space process?
>
> Your guess is right, users can give random physical address and that will be
> translated into pfn.
>

Note the difference to idle tracking: "Idle page tracking only considers
user memory pages", this is very different to your use case. Note that
this is why there is no pfn_to_online_page() check in page idle code.

>>
>> IOW, do we need a pfn_to_online_page() to make sure the memmap even was
>> initialized?
>
> Thank you for pointing out this! I will use it in the next spin. Also, this
> code is stollen from page_idle_get_page(). Seems like it should also be
> modified to use it. I will send the patch for it, either.

pfn_to_online_page() will only succeed for system RAM pages, not
dax/pmem (ZONE_DEVICE). dax/pmem needs special care.

I can spot that you are taking references to random struct pages. This
looks dangerous to me and might mess in complicated ways with page
migration/isolation/onlining/offlining etc. I am not sure if we want that.

--
Thanks,

David / dhildenb

2020-06-04 15:43:12

by David Hildenbrand

[permalink] [raw]
Subject: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

On 04.06.20 17:23, SeongJae Park wrote:
> On Thu, 4 Jun 2020 16:58:13 +0200 David Hildenbrand <[email protected]> wrote:
>
>> On 04.06.20 09:26, SeongJae Park wrote:
>>> On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <[email protected]> wrote:
>>>
>>>> On 03.06.20 16:11, SeongJae Park wrote:
>>>>> From: SeongJae Park <[email protected]>
>>>>>
>>>>> This commit implements the four callbacks (->init_target_regions,
>>>>> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
>>>>> for the basic access monitoring of the physical memory address space.
>>>>> By setting the callback pointers to point those, users can easily
>>>>> monitor the accesses to the physical memory.
>>>>>
>>>>> Internally, it uses the PTE Accessed bit, as similar to that of the
>>>>> virtual memory support. Also, it supports only page frames that
>>>>> supported by idle page tracking. Acutally, most of the code is stollen
>>>>> from idle page tracking. Users who want to use other access check
>>>>> primitives and monitor the frames that not supported with this
>>>>> implementation could implement their own callbacks on their own.
>>>>>
>>>>> Signed-off-by: SeongJae Park <[email protected]>
>>>>> ---
>>>>> include/linux/damon.h | 5 ++
>>>>> mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
>>>>> 2 files changed, 189 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/damon.h b/include/linux/damon.h
>>>>> index 1a788bfd1b4e..f96503a532ea 100644
>>>>> --- a/include/linux/damon.h
>>>>> +++ b/include/linux/damon.h
>>>>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
>>>>> void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
>>>>> unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
>>>>>
>>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
>>>>> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
>>>>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
>>>>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
>>>>> +
>>>>> int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
>>>>> int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
>>>>> unsigned long aggr_int, unsigned long regions_update_int,
>>>>> diff --git a/mm/damon.c b/mm/damon.c
>>>>> index f5cbc97a3bbc..6a5c6d540580 100644
>>>>> --- a/mm/damon.c
>>>>> +++ b/mm/damon.c
>>>>> @@ -19,7 +19,9 @@
>>>>> #include <linux/mm.h>
>>>>> #include <linux/module.h>
>>>>> #include <linux/page_idle.h>
>>>>> +#include <linux/pagemap.h>
>>>>> #include <linux/random.h>
>>>>> +#include <linux/rmap.h>
>>>>> #include <linux/sched/mm.h>
>>>>> #include <linux/sched/task.h>
>>>>> #include <linux/slab.h>
>>>>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
>>>>> }
>>>>> }
>>>>>
>>>>> +/* Do nothing. Users should set the initial regions by themselves */
>>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
>>>>> +{
>>>>> +}
>>>>> +
>>>>> static void damon_mkold(struct mm_struct *mm, unsigned long addr)
>>>>> {
>>>>> pte_t *pte = NULL;
>>>>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
>>>>> return max_nr_accesses;
>>>>> }
>>>>>
>>>>> +/* access check functions for physical address based regions */
>>>>> +
>>>>> +/* This code is stollen from page_idle.c */
>>>>> +static struct page *damon_phys_get_page(unsigned long pfn)
>>>>> +{
>>>>> + struct page *page;
>>>>> + pg_data_t *pgdat;
>>>>> +
>>>>> + if (!pfn_valid(pfn))
>>>>> + return NULL;
>>>>> +
>>>>
>>>> Who provides these pfns? Can these be random pfns, supplied unchecked by
>>>> user space? Or are they at least mapped into some user space process?
>>>
>>> Your guess is right, users can give random physical address and that will be
>>> translated into pfn.
>>>
>>
>> Note the difference to idle tracking: "Idle page tracking only considers
>> user memory pages", this is very different to your use case. Note that
>> this is why there is no pfn_to_online_page() check in page idle code.
>
> My use case is same to that of idle page. I also ignore non-user pages.
> Actually, this function is for filtering of the non-user pages, which is simply
> stollen from the page_idle.

Okay, that is valuable information, I missed that. The comment in
page_idle.c is actually pretty valuable.

In both cases, user space can provide random physical address but you
will only care about user pages. Understood.

That turns things less dangerous. :)

>>>> IOW, do we need a pfn_to_online_page() to make sure the memmap even was
>>>> initialized?
>>>
>>> Thank you for pointing out this! I will use it in the next spin. Also, this
>>> code is stollen from page_idle_get_page(). Seems like it should also be
>>> modified to use it. I will send the patch for it, either.
>>
>> pfn_to_online_page() will only succeed for system RAM pages, not
>> dax/pmem (ZONE_DEVICE). dax/pmem needs special care.
>>
>> I can spot that you are taking references to random struct pages. This
>> looks dangerous to me and might mess in complicated ways with page
>> migration/isolation/onlining/offlining etc. I am not sure if we want that.
>
> AFAIU, page_idle users can also pass random pfns by randomly accessing the
> bitmap file. Am I missing something?

I am definitely no expert on page idle tracking. If that is the case,
then we'll also need pfn_to_online_page() handling (and might have to
care about ZONE_DEVICE, not hard but needs some extra LOCs).

I am still not sure if grabbing references on theoretically isolated
pageblocks is okay, but that's just complicated stuff and as you state,
is already performed. At least I can read "With such an indicator of
user pages we can skip isolated pages". So isolated pages during page
migration are properly handled.


Instead of stealing, factor out, document, and reuse? That makes it
clearer that you are not inventing the wheel, and if we have to fix
something, we only have to fix at a single point.

--
Thanks,

David / dhildenb

2020-06-04 15:54:56

by SeongJae Park

[permalink] [raw]
Subject: Re: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

On Thu, 4 Jun 2020 17:39:49 +0200 David Hildenbrand <[email protected]> wrote:

> On 04.06.20 17:23, SeongJae Park wrote:
> > On Thu, 4 Jun 2020 16:58:13 +0200 David Hildenbrand <[email protected]> wrote:
> >
> >> On 04.06.20 09:26, SeongJae Park wrote:
> >>> On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <[email protected]> wrote:
> >>>
> >>>> On 03.06.20 16:11, SeongJae Park wrote:
> >>>>> From: SeongJae Park <[email protected]>
> >>>>>
> >>>>> This commit implements the four callbacks (->init_target_regions,
> >>>>> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
> >>>>> for the basic access monitoring of the physical memory address space.
> >>>>> By setting the callback pointers to point those, users can easily
> >>>>> monitor the accesses to the physical memory.
> >>>>>
> >>>>> Internally, it uses the PTE Accessed bit, as similar to that of the
> >>>>> virtual memory support. Also, it supports only page frames that
> >>>>> supported by idle page tracking. Acutally, most of the code is stollen
> >>>>> from idle page tracking. Users who want to use other access check
> >>>>> primitives and monitor the frames that not supported with this
> >>>>> implementation could implement their own callbacks on their own.
> >>>>>
> >>>>> Signed-off-by: SeongJae Park <[email protected]>
> >>>>> ---
> >>>>> include/linux/damon.h | 5 ++
> >>>>> mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
> >>>>> 2 files changed, 189 insertions(+)
> >>>>>
> >>>>> diff --git a/include/linux/damon.h b/include/linux/damon.h
> >>>>> index 1a788bfd1b4e..f96503a532ea 100644
> >>>>> --- a/include/linux/damon.h
> >>>>> +++ b/include/linux/damon.h
> >>>>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
> >>>>> void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
> >>>>> unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
> >>>>>
> >>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
> >>>>> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
> >>>>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
> >>>>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
> >>>>> +
> >>>>> int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
> >>>>> int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> >>>>> unsigned long aggr_int, unsigned long regions_update_int,
> >>>>> diff --git a/mm/damon.c b/mm/damon.c
> >>>>> index f5cbc97a3bbc..6a5c6d540580 100644
> >>>>> --- a/mm/damon.c
> >>>>> +++ b/mm/damon.c
> >>>>> @@ -19,7 +19,9 @@
> >>>>> #include <linux/mm.h>
> >>>>> #include <linux/module.h>
> >>>>> #include <linux/page_idle.h>
> >>>>> +#include <linux/pagemap.h>
> >>>>> #include <linux/random.h>
> >>>>> +#include <linux/rmap.h>
> >>>>> #include <linux/sched/mm.h>
> >>>>> #include <linux/sched/task.h>
> >>>>> #include <linux/slab.h>
> >>>>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
> >>>>> }
> >>>>> }
> >>>>>
> >>>>> +/* Do nothing. Users should set the initial regions by themselves */
> >>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
> >>>>> +{
> >>>>> +}
> >>>>> +
> >>>>> static void damon_mkold(struct mm_struct *mm, unsigned long addr)
> >>>>> {
> >>>>> pte_t *pte = NULL;
> >>>>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
> >>>>> return max_nr_accesses;
> >>>>> }
> >>>>>
> >>>>> +/* access check functions for physical address based regions */
> >>>>> +
> >>>>> +/* This code is stollen from page_idle.c */
> >>>>> +static struct page *damon_phys_get_page(unsigned long pfn)
> >>>>> +{
> >>>>> + struct page *page;
> >>>>> + pg_data_t *pgdat;
> >>>>> +
> >>>>> + if (!pfn_valid(pfn))
> >>>>> + return NULL;
> >>>>> +
> >>>>
> >>>> Who provides these pfns? Can these be random pfns, supplied unchecked by
> >>>> user space? Or are they at least mapped into some user space process?
> >>>
> >>> Your guess is right, users can give random physical address and that will be
> >>> translated into pfn.
> >>>
> >>
> >> Note the difference to idle tracking: "Idle page tracking only considers
> >> user memory pages", this is very different to your use case. Note that
> >> this is why there is no pfn_to_online_page() check in page idle code.
> >
> > My use case is same to that of idle page. I also ignore non-user pages.
> > Actually, this function is for filtering of the non-user pages, which is simply
> > stollen from the page_idle.
>
> Okay, that is valuable information, I missed that. The comment in
> page_idle.c is actually pretty valuable.
>
> In both cases, user space can provide random physical address but you
> will only care about user pages. Understood.
>
> That turns things less dangerous. :)

Glad to hear this. I will refine this point in the next spin! :)

>
> >>>> IOW, do we need a pfn_to_online_page() to make sure the memmap even was
> >>>> initialized?
> >>>
> >>> Thank you for pointing out this! I will use it in the next spin. Also, this
> >>> code is stollen from page_idle_get_page(). Seems like it should also be
> >>> modified to use it. I will send the patch for it, either.
> >>
> >> pfn_to_online_page() will only succeed for system RAM pages, not
> >> dax/pmem (ZONE_DEVICE). dax/pmem needs special care.
> >>
> >> I can spot that you are taking references to random struct pages. This
> >> looks dangerous to me and might mess in complicated ways with page
> >> migration/isolation/onlining/offlining etc. I am not sure if we want that.
> >
> > AFAIU, page_idle users can also pass random pfns by randomly accessing the
> > bitmap file. Am I missing something?
>
> I am definitely no expert on page idle tracking. If that is the case,
> then we'll also need pfn_to_online_page() handling (and might have to
> care about ZONE_DEVICE, not hard but needs some extra LOCs).

Agreed, I will post the patch soon. That said, if you have any doubts, please
don't hesitate to yell.

>
> I am still not sure if grabbing references on theoretically isolated
> pageblocks is okay, but that's just complicated stuff and as you state,
> is already performed. At least I can read "With such an indicator of
> user pages we can skip isolated pages". So isolated pages during page
> migration are properly handled.
>
>
> Instead of stealing, factor out, document, and reuse? That makes it
> clearer that you are not inventing the wheel, and if we have to fix
> something, we only have to fix at a single point.

Good point, I will consider reusing the code instead of stealing in the next
spin.
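
For the idea, the reuse could be as small as the sketch below; the exported
declaration is just a placeholder for whatever we end up factoring out:

/*
 * Rough idea only: export the page_idle helper (the exact name and
 * header below are placeholders, not existing symbols) and make the
 * DAMON callback a thin wrapper around it.
 */

/* e.g., declared in include/linux/page_idle.h */
struct page *page_idle_get_page(unsigned long pfn);

/* mm/damon.c */
static struct page *damon_phys_get_page(unsigned long pfn)
{
	return page_idle_get_page(pfn);
}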


Thanks,
SeongJae Park

>
> --
> Thanks,
>
> David / dhildenb
>

2020-06-04 19:54:37

by SeongJae Park

[permalink] [raw]
Subject: Re: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

On Thu, 4 Jun 2020 16:58:13 +0200 David Hildenbrand <[email protected]> wrote:

> On 04.06.20 09:26, SeongJae Park wrote:
> > On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <[email protected]> wrote:
> >
> >> On 03.06.20 16:11, SeongJae Park wrote:
> >>> From: SeongJae Park <[email protected]>
> >>>
> >>> This commit implements the four callbacks (->init_target_regions,
> >>> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
> >>> for the basic access monitoring of the physical memory address space.
> >>> By setting the callback pointers to point those, users can easily
> >>> monitor the accesses to the physical memory.
> >>>
> >>> Internally, it uses the PTE Accessed bit, as similar to that of the
> >>> virtual memory support. Also, it supports only page frames that
> >>> supported by idle page tracking. Acutally, most of the code is stollen
> >>> from idle page tracking. Users who want to use other access check
> >>> primitives and monitor the frames that not supported with this
> >>> implementation could implement their own callbacks on their own.
> >>>
> >>> Signed-off-by: SeongJae Park <[email protected]>
> >>> ---
> >>> include/linux/damon.h | 5 ++
> >>> mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
> >>> 2 files changed, 189 insertions(+)
> >>>
> >>> diff --git a/include/linux/damon.h b/include/linux/damon.h
> >>> index 1a788bfd1b4e..f96503a532ea 100644
> >>> --- a/include/linux/damon.h
> >>> +++ b/include/linux/damon.h
> >>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
> >>> void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
> >>> unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
> >>>
> >>> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
> >>> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
> >>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
> >>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
> >>> +
> >>> int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
> >>> int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> >>> unsigned long aggr_int, unsigned long regions_update_int,
> >>> diff --git a/mm/damon.c b/mm/damon.c
> >>> index f5cbc97a3bbc..6a5c6d540580 100644
> >>> --- a/mm/damon.c
> >>> +++ b/mm/damon.c
> >>> @@ -19,7 +19,9 @@
> >>> #include <linux/mm.h>
> >>> #include <linux/module.h>
> >>> #include <linux/page_idle.h>
> >>> +#include <linux/pagemap.h>
> >>> #include <linux/random.h>
> >>> +#include <linux/rmap.h>
> >>> #include <linux/sched/mm.h>
> >>> #include <linux/sched/task.h>
> >>> #include <linux/slab.h>
> >>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
> >>> }
> >>> }
> >>>
> >>> +/* Do nothing. Users should set the initial regions by themselves */
> >>> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
> >>> +{
> >>> +}
> >>> +
> >>> static void damon_mkold(struct mm_struct *mm, unsigned long addr)
> >>> {
> >>> pte_t *pte = NULL;
> >>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
> >>> return max_nr_accesses;
> >>> }
> >>>
> >>> +/* access check functions for physical address based regions */
> >>> +
> >>> +/* This code is stollen from page_idle.c */
> >>> +static struct page *damon_phys_get_page(unsigned long pfn)
> >>> +{
> >>> + struct page *page;
> >>> + pg_data_t *pgdat;
> >>> +
> >>> + if (!pfn_valid(pfn))
> >>> + return NULL;
> >>> +
> >>
> >> Who provides these pfns? Can these be random pfns, supplied unchecked by
> >> user space? Or are they at least mapped into some user space process?
> >
> > Your guess is right, users can give random physical address and that will be
> > translated into pfn.
> >
>
> Note the difference to idle tracking: "Idle page tracking only considers
> user memory pages", this is very different to your use case. Note that
> this is why there is no pfn_to_online_page() check in page idle code.

My use case is same to that of idle page. I also ignore non-user pages.
Actually, this function is for filtering of the non-user pages, which is simply
stollen from the page_idle.

>
> >>
> >> IOW, do we need a pfn_to_online_page() to make sure the memmap even was
> >> initialized?
> >
> > Thank you for pointing out this! I will use it in the next spin. Also, this
> > code is stollen from page_idle_get_page(). Seems like it should also be
> > modified to use it. I will send the patch for it, either.
>
> pfn_to_online_page() will only succeed for system RAM pages, not
> dax/pmem (ZONE_DEVICE). dax/pmem needs special care.
>
> I can spot that you are taking references to random struct pages. This
> looks dangerous to me and might mess in complicated ways with page
> migration/isolation/onlining/offlining etc. I am not sure if we want that.

AFAIU, page_idle users can also pass random pfns by randomly accessing the
bitmap file. Am I missing something?


Thanks,
SeongJae Park

>
> --
> Thanks,
>
> David / dhildenb

2020-06-04 21:24:02

by David Hildenbrand

[permalink] [raw]
Subject: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring

On 04.06.20 17:51, SeongJae Park wrote:
> On Thu, 4 Jun 2020 17:39:49 +0200 David Hildenbrand <[email protected]> wrote:
>
>> On 04.06.20 17:23, SeongJae Park wrote:
>>> On Thu, 4 Jun 2020 16:58:13 +0200 David Hildenbrand <[email protected]> wrote:
>>>
>>>> On 04.06.20 09:26, SeongJae Park wrote:
>>>>> On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <[email protected]> wrote:
>>>>>
>>>>>> On 03.06.20 16:11, SeongJae Park wrote:
>>>>>>> From: SeongJae Park <[email protected]>
>>>>>>>
>>>>>>> This commit implements the four callbacks (->init_target_regions,
>>>>>>> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
>>>>>>> for the basic access monitoring of the physical memory address space.
>>>>>>> By setting the callback pointers to point those, users can easily
>>>>>>> monitor the accesses to the physical memory.
>>>>>>>
>>>>>>> Internally, it uses the PTE Accessed bit, as similar to that of the
>>>>>>> virtual memory support. Also, it supports only page frames that
>>>>>>> supported by idle page tracking. Acutally, most of the code is stollen
>>>>>>> from idle page tracking. Users who want to use other access check
>>>>>>> primitives and monitor the frames that not supported with this
>>>>>>> implementation could implement their own callbacks on their own.
>>>>>>>
>>>>>>> Signed-off-by: SeongJae Park <[email protected]>
>>>>>>> ---
>>>>>>> include/linux/damon.h | 5 ++
>>>>>>> mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>> 2 files changed, 189 insertions(+)
>>>>>>>
>>>>>>> diff --git a/include/linux/damon.h b/include/linux/damon.h
>>>>>>> index 1a788bfd1b4e..f96503a532ea 100644
>>>>>>> --- a/include/linux/damon.h
>>>>>>> +++ b/include/linux/damon.h
>>>>>>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
>>>>>>> void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
>>>>>>> unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
>>>>>>>
>>>>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
>>>>>>> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
>>>>>>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
>>>>>>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
>>>>>>> +
>>>>>>> int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
>>>>>>> int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
>>>>>>> unsigned long aggr_int, unsigned long regions_update_int,
>>>>>>> diff --git a/mm/damon.c b/mm/damon.c
>>>>>>> index f5cbc97a3bbc..6a5c6d540580 100644
>>>>>>> --- a/mm/damon.c
>>>>>>> +++ b/mm/damon.c
>>>>>>> @@ -19,7 +19,9 @@
>>>>>>> #include <linux/mm.h>
>>>>>>> #include <linux/module.h>
>>>>>>> #include <linux/page_idle.h>
>>>>>>> +#include <linux/pagemap.h>
>>>>>>> #include <linux/random.h>
>>>>>>> +#include <linux/rmap.h>
>>>>>>> #include <linux/sched/mm.h>
>>>>>>> #include <linux/sched/task.h>
>>>>>>> #include <linux/slab.h>
>>>>>>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> +/* Do nothing. Users should set the initial regions by themselves */
>>>>>>> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
>>>>>>> +{
>>>>>>> +}
>>>>>>> +
>>>>>>> static void damon_mkold(struct mm_struct *mm, unsigned long addr)
>>>>>>> {
>>>>>>> pte_t *pte = NULL;
>>>>>>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
>>>>>>> return max_nr_accesses;
>>>>>>> }
>>>>>>>
>>>>>>> +/* access check functions for physical address based regions */
>>>>>>> +
>>>>>>> +/* This code is stollen from page_idle.c */
>>>>>>> +static struct page *damon_phys_get_page(unsigned long pfn)
>>>>>>> +{
>>>>>>> + struct page *page;
>>>>>>> + pg_data_t *pgdat;
>>>>>>> +
>>>>>>> + if (!pfn_valid(pfn))
>>>>>>> + return NULL;
>>>>>>> +
>>>>>>
>>>>>> Who provides these pfns? Can these be random pfns, supplied unchecked by
>>>>>> user space? Or are they at least mapped into some user space process?
>>>>>
>>>>> Your guess is right, users can give random physical address and that will be
>>>>> translated into pfn.
>>>>>
>>>>
>>>> Note the difference to idle tracking: "Idle page tracking only considers
>>>> user memory pages", this is very different to your use case. Note that
>>>> this is why there is no pfn_to_online_page() check in page idle code.
>>>
>>> My use case is same to that of idle page. I also ignore non-user pages.
>>> Actually, this function is for filtering of the non-user pages, which is simply
>>> stollen from the page_idle.
>>
>> Okay, that is valuable information, I missed that. The comment in
>> page_idle.c is actually pretty valuable.
>>
>> In both cases, user space can provide random physical address but you
>> will only care about user pages. Understood.
>>
>> That turns things less dangerous. :)
>
> Glad to hear this. I will refine this point in the next spin! :)

Perfect, when I read "physical address space", I was assuming you would
magically track access to e.g., memmap, page tables and stuff like that.
And I wondered how you would do that :D

BTW, now that I realize we only care about LRU pages, pfn_to_online_page()
is sufficient; no need to worry about dax/pmem.

--
Thanks,

David / dhildenb