Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp363734pxa; Tue, 4 Aug 2020 07:27:31 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwTnAvgopILOL9iAj9c0xma8bHLkear/gNzbwDUpvw0ecZzH9gSujYvIsslRfFLlVQbAb/R X-Received: by 2002:a17:907:385:: with SMTP id ss5mr21095106ejb.496.1596551250886; Tue, 04 Aug 2020 07:27:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1596551250; cv=none; d=google.com; s=arc-20160816; b=ERrXE1jyLlR/WxxLkDuEHEBtRzC1rRWyjtCQDy42OvlSA1azVjoMBSm7Ju3Rup2kcd H+IGumbRqFWMNjmwq/lTy6B6EeQNxmo51Vfna323FS5tVxHWtZgnFDHLqhJCM2QdsHXF NCoV10SSdEyvWKyxgco1B4jQjy5v9Xuf9ItJ3Am7LHHdfJ8s+aCkksQzgyxRfmykDsrX h6BTkHhWUNq+7nEqa8NdddcAkglmh7rM25rIMkYMglKL4ztjmRejg0dhEyoEOVGcKqSz 2NCL2QIvt7L1IeIz5I19xy27KWF4gdbRJ3elWGdppzBsxWIql+Cs/JP/vIbnBejjHYLA LztQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:ironport-sdr:dkim-signature; bh=40IHp0jiwlzmdCD6RxFAobNJs6ywURYkZuvUXiy1U3o=; b=0dFBxGlVWPH3mkHjg07PArTEHeKL0hbqrVUtkY3znMfh6QkoPmvCmQbVUQIWygpsAh RC7jJSQZfPMtXUX9JHYDaZtXjYVEp4FOIjzy32S37BQdO+77V7sFoLGrZ7edAIiWxxFt 0Pve8TsTfW6Vh/4cMZksNq8Q9pow77UHfu73AfzcoV6VVcyFyvMhq5dZrnUgNhuX99HO h5OfVo2EvuoXC6PCBsflbv0vstXRCZZ3jO5qkM6yMSmuJPA9/P9rP3bOjEqnh9CCa1Yg ggDvJPgrHO/91zxhWVSyp9DcOv/L4j3En6bZyX+KTzUtVBgQ7PJuntmIaJQrIOdTq2yy JP2Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amazon.com header.s=amazon201209 header.b=DIJiVqo7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h14si12231926eds.339.2020.08.04.07.27.02; Tue, 04 Aug 2020 07:27:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@amazon.com header.s=amazon201209 header.b=DIJiVqo7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728842AbgHDO0G (ORCPT + 99 others); Tue, 4 Aug 2020 10:26:06 -0400 Received: from smtp-fw-33001.amazon.com ([207.171.190.10]:43313 "EHLO smtp-fw-33001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729093AbgHDO0D (ORCPT ); Tue, 4 Aug 2020 10:26:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1596551161; x=1628087161; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=40IHp0jiwlzmdCD6RxFAobNJs6ywURYkZuvUXiy1U3o=; b=DIJiVqo7nQltPxnvqprXLCdk90nrit28qnckkkj5pMlxKl43jzPGxaU7 AD6fug7dd0XiTw3xsdYGZ/ExehbTx9VGY3bZjKVn0H2rgnV03Sios9taH lbiDVxA5e/I+ZW5IOI/Yyem0b1LDw+/whamAcb5yK73UXT8FqxLi7/lan c=; IronPort-SDR: Cjw/qoV7otIc6+CqSDUQtRorzPG01UDQCSwRl0cKoVZ9IdFtwfHP1EtQ6s7WUmxpc0OSFvzZHJ yNXC5a57BQAw== X-IronPort-AV: E=Sophos;i="5.75,434,1589241600"; d="scan'208";a="64245406" Received: from sea32-co-svc-lb4-vlan3.sea.corp.amazon.com (HELO email-inbound-relay-2b-859fe132.us-west-2.amazon.com) ([10.47.23.38]) by smtp-border-fw-out-33001.sea14.amazon.com with ESMTP; 04 Aug 2020 14:25:47 +0000 Received: from EX13MTAUEA002.ant.amazon.com (pdx4-ws-svc-p6-lb7-vlan2.pdx.amazon.com [10.170.41.162]) by email-inbound-relay-2b-859fe132.us-west-2.amazon.com (Postfix) with ESMTPS id 552F62232CB; Tue, 4 Aug 2020 14:25:44 +0000 (UTC) Received: from EX13D31EUA001.ant.amazon.com (10.43.165.15) by EX13MTAUEA002.ant.amazon.com (10.43.61.77) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Tue, 4 Aug 2020 14:25:43 +0000 Received: from u886c93fd17d25d.ant.amazon.com (10.43.161.203) by EX13D31EUA001.ant.amazon.com (10.43.165.15) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Tue, 4 Aug 2020 14:25:26 +0000 From: SeongJae Park To: CC: SeongJae Park , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [RFC v14 2/7] mm/damon: Implement data access monitoring-based operation schemes Date: Tue, 4 Aug 2020 16:24:25 +0200 Message-ID: <20200804142430.15384-3-sjpark@amazon.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200804142430.15384-1-sjpark@amazon.com> References: <20200804142430.15384-1-sjpark@amazon.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.43.161.203] X-ClientProxiedBy: EX13D17UWC004.ant.amazon.com (10.43.162.195) To EX13D31EUA001.ant.amazon.com (10.43.165.15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: SeongJae Park In many cases, users might use DAMON for simple data access aware memory management optimizations such as applying an operation scheme to a memory region of a specific size having a specific access frequency for a specific time. For example, "page out a memory region larger than 100 MiB but having a low access frequency more than 10 minutes", or "Use THP for a memory region larger than 2 MiB having a high access frequency for more than 2 seconds". Most simple form of the solution would be doing offline data access pattern profiling using DAMON and modifying the application source code or system configuration based on the profiling results. Or, developing a daemon constructed with two modules (one for access monitoring and the other for applying memory management actions via mlock(), madvise(), sysctl, etc) is imaginable. To minimize users from spending their time for implementation of such simple data access monitoring-based operation schemes, this commit makes DAMON to handle such schemes directly. With this commit, users can simply specify their desired schemes to DAMON. Then, DAMON will automatically apply the schemes to the the user-specified target processes. Each of the schemes is composed with conditions for filtering of the target memory regions and desired memory management action for the target. Specifically, the format is:: The filtering conditions are size of memory region, number of accesses to the region monitored by DAMON, and the age of the region. The age of region is incremented periodically but reset when its addresses or access frequency has significantly changed or the action of a scheme was applied. For the action, current implementation supports only a few of madvise() hints, ``MADV_WILLNEED``, ``MADV_COLD``, ``MADV_PAGEOUT``, ``MADV_HUGEPAGE``, and ``MADV_NOHUGEPAGE``. Signed-off-by: SeongJae Park --- include/linux/damon.h | 48 ++++++++++++++ mm/damon.c | 146 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 194 insertions(+) diff --git a/include/linux/damon.h b/include/linux/damon.h index ca5bd8de4aa4..df10bd00443e 100644 --- a/include/linux/damon.h +++ b/include/linux/damon.h @@ -67,6 +67,50 @@ struct damon_target { struct list_head list; }; +/** + * enum damos_action - Represents an action of a Data Access Monitoring-based + * Operation Scheme. + * + * @DAMOS_WILLNEED: Call ``madvise()`` for the region with MADV_WILLNEED. + * @DAMOS_COLD: Call ``madvise()`` for the region with MADV_COLD. + * @DAMOS_PAGEOUT: Call ``madvise()`` for the region with MADV_PAGEOUT. + * @DAMOS_HUGEPAGE: Call ``madvise()`` for the region with MADV_HUGEPAGE. + * @DAMOS_NOHUGEPAGE: Call ``madvise()`` for the region with MADV_NOHUGEPAGE. + */ +enum damos_action { + DAMOS_WILLNEED, + DAMOS_COLD, + DAMOS_PAGEOUT, + DAMOS_HUGEPAGE, + DAMOS_NOHUGEPAGE, +}; + +/** + * struct damos - Represents a Data Access Monitoring-based Operation Scheme. + * @min_sz_region: Minimum size of target regions. + * @max_sz_region: Maximum size of target regions. + * @min_nr_accesses: Minimum ``->nr_accesses`` of target regions. + * @max_nr_accesses: Maximum ``->nr_accesses`` of target regions. + * @min_age_region: Minimum age of target regions. + * @max_age_region: Maximum age of target regions. + * @action: &damo_action to be applied to the target regions. + * @list: List head for siblings. + * + * For each aggregation interval, DAMON applies @action to monitoring target + * regions fit in the condition and updates the statistics. Note that both + * the minimums and the maximums are inclusive. + */ +struct damos { + unsigned long min_sz_region; + unsigned long max_sz_region; + unsigned int min_nr_accesses; + unsigned int max_nr_accesses; + unsigned int min_age_region; + unsigned int max_age_region; + enum damos_action action; + struct list_head list; +}; + /** * struct damon_ctx - Represents a context for each monitoring. This is the * main interface that allows users to set the attributes and get the results @@ -113,6 +157,7 @@ struct damon_target { * @kdamond_lock. Accesses to other fields must be protected by themselves. * * @targets_list: Head of monitoring targets (&damon_target) list. + * @schemes_list: Head of schemes (&damos) list. * * @init_target_regions: Constructs initial monitoring target regions. * @update_target_regions: Updates monitoring target regions. @@ -167,6 +212,7 @@ struct damon_ctx { struct mutex kdamond_lock; struct list_head targets_list; /* 'damon_target' objects */ + struct list_head schemes_list; /* 'damos' objects */ /* callbacks */ void (*init_target_regions)(struct damon_ctx *context); @@ -190,6 +236,8 @@ int damon_set_targets(struct damon_ctx *ctx, int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int, unsigned long aggr_int, unsigned long regions_update_int, unsigned long min_nr_reg, unsigned long max_nr_reg); +int damon_set_schemes(struct damon_ctx *ctx, + struct damos **schemes, ssize_t nr_schemes); int damon_set_recording(struct damon_ctx *ctx, unsigned int rbuf_len, char *rfile_path); int damon_start(struct damon_ctx *ctx); diff --git a/mm/damon.c b/mm/damon.c index 69090aa22207..e402717a2c0e 100644 --- a/mm/damon.c +++ b/mm/damon.c @@ -23,6 +23,7 @@ #define CREATE_TRACE_POINTS +#include #include #include #include @@ -66,6 +67,12 @@ #define damon_for_each_target_safe(t, next, ctx) \ list_for_each_entry_safe(t, next, &(ctx)->targets_list, list) +#define damon_for_each_scheme(s, ctx) \ + list_for_each_entry(s, &(ctx)->schemes_list, list) + +#define damon_for_each_scheme_safe(s, next, ctx) \ + list_for_each_entry_safe(s, next, &(ctx)->schemes_list, list) + #define MIN_RECORD_BUFFER_LEN 1024 #define MAX_RECORD_BUFFER_LEN (4 * 1024 * 1024) #define MAX_RFILE_PATH_LEN 256 @@ -187,6 +194,27 @@ static void damon_destroy_target(struct damon_target *t) damon_free_target(t); } +static void damon_add_scheme(struct damon_ctx *ctx, struct damos *s) +{ + list_add_tail(&s->list, &ctx->schemes_list); +} + +static void damon_del_scheme(struct damos *s) +{ + list_del(&s->list); +} + +static void damon_free_scheme(struct damos *s) +{ + kfree(s); +} + +static void damon_destroy_scheme(struct damos *s) +{ + damon_del_scheme(s); + damon_free_scheme(s); +} + static unsigned int nr_damon_targets(struct damon_ctx *ctx) { struct damon_target *t; @@ -890,6 +918,98 @@ static void kdamond_reset_aggregated(struct damon_ctx *c) } } +#ifndef CONFIG_ADVISE_SYSCALLS +static int damos_madvise(struct damon_target *target, struct damon_region *r, + int behavior) +{ + return -EINVAL; +} +#else +static int damos_madvise(struct damon_target *target, struct damon_region *r, + int behavior) +{ + struct task_struct *t; + struct mm_struct *mm; + int ret = -ENOMEM; + + t = damon_get_task_struct(target); + if (!t) + goto out; + mm = damon_get_mm(target); + if (!mm) + goto put_task_out; + + ret = do_madvise(t, mm, PAGE_ALIGN(r->ar.start), + PAGE_ALIGN(r->ar.end - r->ar.start), behavior); + mmput(mm); +put_task_out: + put_task_struct(t); +out: + return ret; +} +#endif /* CONFIG_ADVISE_SYSCALLS */ + +static int damos_do_action(struct damon_target *target, struct damon_region *r, + enum damos_action action) +{ + int madv_action; + + switch (action) { + case DAMOS_WILLNEED: + madv_action = MADV_WILLNEED; + break; + case DAMOS_COLD: + madv_action = MADV_COLD; + break; + case DAMOS_PAGEOUT: + madv_action = MADV_PAGEOUT; + break; + case DAMOS_HUGEPAGE: + madv_action = MADV_HUGEPAGE; + break; + case DAMOS_NOHUGEPAGE: + madv_action = MADV_NOHUGEPAGE; + break; + default: + pr_warn("Wrong action %d\n", action); + return -EINVAL; + } + + return damos_madvise(target, r, madv_action); +} + +static void damon_do_apply_schemes(struct damon_ctx *c, + struct damon_target *t, + struct damon_region *r) +{ + struct damos *s; + unsigned long sz; + + damon_for_each_scheme(s, c) { + sz = r->ar.end - r->ar.start; + if (sz < s->min_sz_region || s->max_sz_region < sz) + continue; + if (r->nr_accesses < s->min_nr_accesses || + s->max_nr_accesses < r->nr_accesses) + continue; + if (r->age < s->min_age_region || s->max_age_region < r->age) + continue; + damos_do_action(t, r, s->action); + r->age = 0; + } +} + +static void kdamond_apply_schemes(struct damon_ctx *c) +{ + struct damon_target *t; + struct damon_region *r; + + damon_for_each_target(t, c) { + damon_for_each_region(r, t) + damon_do_apply_schemes(c, t, r); + } +} + #define sz_damon_region(r) (r->ar.end - r->ar.start) /* @@ -1122,6 +1242,7 @@ static int kdamond_fn(void *data) ctx->aggregate_cb(ctx); kdamond_merge_regions(ctx, max_nr_accesses / 10, sz_limit); + kdamond_apply_schemes(ctx); kdamond_reset_aggregated(ctx); kdamond_split_regions(ctx); } @@ -1207,6 +1328,30 @@ int damon_stop(struct damon_ctx *ctx) return -EPERM; } +/** + * damon_set_schemes() - Set data access monitoring based operation schemes. + * @ctx: monitoring context + * @schemes: array of the schemes + * @nr_schemes: number of entries in @schemes + * + * This function should not be called while the kdamond of the context is + * running. + * + * Return: 0 if success, or negative error code otherwise. + */ +int damon_set_schemes(struct damon_ctx *ctx, struct damos **schemes, + ssize_t nr_schemes) +{ + struct damos *s, *next; + ssize_t i; + + damon_for_each_scheme_safe(s, next, ctx) + damon_destroy_scheme(s); + for (i = 0; i < nr_schemes; i++) + damon_add_scheme(ctx, schemes[i]); + return 0; +} + /** * damon_set_targets() - Set monitoring targets. * @ctx: monitoring context @@ -1718,6 +1863,7 @@ static int __init damon_init_user_ctx(void) mutex_init(&ctx->kdamond_lock); INIT_LIST_HEAD(&ctx->targets_list); + INIT_LIST_HEAD(&ctx->schemes_list); return 0; } -- 2.17.1