2013-04-24 01:41:38

by Minchan Kim

Subject: [PATCH v2 0/6] Per process reclaim

These days, many platforms are available in the embedded market, and
they are smarter than the kernel, which has very limited information
about the working set. They therefore want to be more heavily involved
in memory management, as with Android's lowmemory killer and ashmem, or
the many recent low-memory notifiers (there have been several attempts
from various companies: Nokia, Samsung, Linaro, Google ChromeOS, Red Hat).

One simple scenario for userspace's intelligence is that the platform
manages tasks as foreground and background, so it would be better to
reclaim a background task's pages for the end user's *responsiveness*,
even though those pages are frequently referenced.

Patch [1] adds a new knob, "reclaim", under /proc/<pid>/ so a task
manager can reclaim pages from any target process at any time. It gives
the platform another way to use memory efficiently.

It can avoid killing processes to get free memory, which was a really
terrible experience: I lost my best-ever score in a game when I
switched to a phone call while I was playing.

Reclaim file-backed pages only.
echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
echo anon > /proc/PID/reclaim
Reclaim all pages
echo all > /proc/PID/reclaim
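
For a task manager written in C rather than shell, the three writes
above could be issued as in the following minimal sketch. It assumes a
kernel built with CONFIG_PROCESS_RECLAIM from this series; the helper
names reclaim_path(), reclaim_mode_valid() and reclaim_request() are
hypothetical and are not part of the patches.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Build "/proc/<pid>/reclaim" in buf; 0 on success, -1 if buf is too small. */
static int reclaim_path(pid_t pid, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/proc/%ld/reclaim", (long)pid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Accept only the three keywords that patch 1 understands. */
static int reclaim_mode_valid(const char *mode)
{
	return !strcmp(mode, "file") || !strcmp(mode, "anon") ||
	       !strcmp(mode, "all");
}

/* Write one keyword to the knob at path; 0 on success, -1 on any error. */
static int reclaim_request(const char *path, const char *mode)
{
	FILE *f;
	int ok;

	assert(path && mode);
	if (!reclaim_mode_valid(mode))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fputs(mode, f) >= 0;
	if (fclose(f) != 0)	/* the write may only fail at flush time */
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would build the path with reclaim_path(pid, path, sizeof(path))
and then call reclaim_request(path, "anon") for each background task it
wants to shrink. The knob is created with S_IWUSR, so the caller needs
write permission on the target's /proc entry.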

Some pages may be shared by several processes (e.g., libc).
In that case, it is too bad to reclaim them right from the beginning.
Patch [4] makes the VM keep them in memory until the last task
tries to reclaim them, so shared pages are reclaimed only once
every task mapping them has asked for them to be swapped out.
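
The intended behaviour for shared pages can be illustrated with a toy
model. This is only a sketch of the policy described above, not kernel
code: struct toy_page and toy_reclaim_from_task() are made-up names,
and a plain counter stands in for the kernel's mapcount.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page: only tracks how many tasks still map it. */
struct toy_page {
	int mapcount;
	bool resident;
};

/*
 * Model one task's reclaim request against a page it maps. That task's
 * mapping is dropped; the page leaves memory only when no mapping
 * remains. Returns true if this call actually freed the page.
 */
static bool toy_reclaim_from_task(struct toy_page *page)
{
	assert(page && page->mapcount > 0);

	page->mapcount--;
	if (page->mapcount > 0)
		return false;	/* still shared: stays in memory */

	page->resident = false;
	return true;
}
```

With a page mapped by three tasks, the first two requests only drop a
mapping each, and the third one is the request that frees the page.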

Another requirement is per-address-space reclaim (by Michael Kerrisk).
WebKit1, for example, uses a single address space to handle multiple
tabs. IOW, it uses a *one*-process model, so all tabs share the address
space of the process. In such a scenario, per-process reclaim is rather
coarse-grained, so patch [5] supports finer-grained reclaim of a
target address range of the process.
To reclaim a target range, use the following format:

echo [addr] [size-byte] > /proc/pid/reclaim
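
As posted, patch [5] parses both numbers with kstrtoul() in base 10 and
rejects an address that is not page-aligned, so a hex string such as
0x100000 would not be accepted. A caller might therefore build the
request string as in this sketch. format_range_request() is a
hypothetical helper, and the page size is passed in by the caller
(for example from sysconf(_SC_PAGESIZE)) and assumed to be a power
of two.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format "<addr> <size>" in decimal for /proc/<pid>/reclaim.
 * Returns 0 on success, -1 if addr is not page-aligned or buf is
 * too small. page_size must be a power of two.
 */
static int format_range_request(unsigned long addr, unsigned long size,
				unsigned long page_size,
				char *buf, size_t len)
{
	int n;

	assert(buf && page_size);
	if (addr & (page_size - 1))	/* the kernel side would return -EINVAL */
		return -1;

	n = snprintf(buf, len, "%lu %lu", addr, size);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For the example used later in the series (two 4 KiB pages at 1 MiB) the
resulting string is "1048576 8192", which is what the shell expands
$((1<<20)) 8192 to.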

* Changelog from v1
* Change reclaim knob interface - Dave Hansen
* proc.txt document change - Rob Landley

Minchan Kim (6):
[1] mm: Per process reclaim
[2] mm: make shrink_page_list with pages work from multiple zones
[3] mm: Remove shrink_page
[4] mm: Enhance per process reclaim to consider shared pages
[5] mm: Support address range reclaim
[6] add documentation on proc.txt

Documentation/filesystems/proc.txt | 22 +++++
fs/proc/base.c | 3 +
fs/proc/internal.h | 1 +
fs/proc/task_mmu.c | 179 +++++++++++++++++++++++++++++++++++++
include/linux/ksm.h | 6 +-
include/linux/rmap.h | 10 ++-
mm/Kconfig | 8 ++
mm/internal.h | 4 +-
mm/ksm.c | 9 +-
mm/memory-failure.c | 2 +-
mm/migrate.c | 6 +-
mm/rmap.c | 57 ++++++++----
mm/vmscan.c | 57 +++++++++++-
13 files changed, 334 insertions(+), 30 deletions(-)

--
1.8.2


2013-04-24 01:41:45

by Minchan Kim

Subject: [PATCH v2 1/6] mm: Per process reclaim

These days, many platforms are available in the embedded market, and
they are smarter than the kernel, which has very limited information
about the working set. They therefore want to be more heavily involved
in memory management, as with Android's lowmemory killer and ashmem, or
the many recent low-memory notifiers (there have been several attempts
from various companies: Nokia, Samsung, Linaro, Google ChromeOS, Red Hat).

One simple scenario for userspace's intelligence is that the platform
manages tasks as foreground and background, so it would be better to
reclaim a background task's pages for the end user's *responsiveness*,
even though those pages are frequently referenced.

This patch adds a new knob, "reclaim", under /proc/<pid>/ so a task
manager can reclaim pages from any target process at any time. It gives
the platform another way to use memory efficiently.

It can avoid killing processes to get free memory, which was a really
terrible experience: I lost my best-ever score in a game when I
switched to a phone call while I was playing.

Reclaim file-backed pages only.
echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
echo anon > /proc/PID/reclaim
Reclaim all pages
echo all > /proc/PID/reclaim

Signed-off-by: Minchan Kim <[email protected]>
---
fs/proc/base.c | 3 ++
fs/proc/internal.h | 1 +
fs/proc/task_mmu.c | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/rmap.h | 4 ++
mm/Kconfig | 13 ++++++
mm/internal.h | 7 +--
mm/vmscan.c | 59 +++++++++++++++++++++++++
7 files changed, 202 insertions(+), 6 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index df0dfdf..f0c8806 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2532,6 +2532,9 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("mounts", S_IRUGO, proc_mounts_operations),
REG("mountinfo", S_IRUGO, proc_mountinfo_operations),
REG("mountstats", S_IRUSR, proc_mountstats_operations),
+#ifdef CONFIG_PROCESS_RECLAIM
+ REG("reclaim", S_IWUSR, proc_reclaim_operations),
+#endif
#ifdef CONFIG_PROC_PAGE_MONITOR
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
REG("smaps", S_IRUGO, proc_pid_smaps_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 55f3418..6ffa7cc 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -52,6 +52,7 @@ extern const struct file_operations proc_pagemap2_operations;
extern const struct file_operations proc_net_operations;
extern const struct inode_operations proc_net_inode_operations;
extern const struct inode_operations proc_pid_link_inode_operations;
+extern const struct file_operations proc_reclaim_operations;

struct proc_maps_private {
struct pid *pid;
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 1ec2553..fcc1c32 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -11,6 +11,7 @@
#include <linux/rmap.h>
#include <linux/swap.h>
#include <linux/swapops.h>
+#include <linux/mm_inline.h>

#include <asm/elf.h>
#include <asm/uaccess.h>
@@ -1182,6 +1183,126 @@ const struct file_operations proc_pagemap2_operations = {
};
#endif /* CONFIG_PROC_PAGE_MONITOR */

+#ifdef CONFIG_PROCESS_RECLAIM
+static int reclaim_pte_range(pmd_t *pmd, unsigned long addr,
+ unsigned long end, struct mm_walk *walk)
+{
+ struct vm_area_struct *vma = walk->private;
+ pte_t *pte, ptent;
+ spinlock_t *ptl;
+ struct page *page;
+ LIST_HEAD(page_list);
+ int isolated;
+
+ split_huge_page_pmd(vma, addr, pmd);
+ if (pmd_trans_unstable(pmd))
+ return 0;
+cont:
+ isolated = 0;
+ pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+ for (; addr != end; pte++, addr += PAGE_SIZE) {
+ ptent = *pte;
+ if (!pte_present(ptent))
+ continue;
+
+ page = vm_normal_page(vma, addr, ptent);
+ if (!page)
+ continue;
+
+ if (isolate_lru_page(page))
+ continue;
+
+ list_add(&page->lru, &page_list);
+ inc_zone_page_state(page, NR_ISOLATED_ANON +
+ page_is_file_cache(page));
+ isolated++;
+ if (isolated >= SWAP_CLUSTER_MAX)
+ break;
+ }
+ pte_unmap_unlock(pte - 1, ptl);
+ reclaim_pages_from_list(&page_list);
+ if (addr != end)
+ goto cont;
+
+ cond_resched();
+ return 0;
+}
+
+enum reclaim_type {
+ RECLAIM_FILE,
+ RECLAIM_ANON,
+ RECLAIM_ALL,
+ RECLAIM_RANGE,
+};
+
+static ssize_t reclaim_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct task_struct *task;
+ char buffer[PROC_NUMBUF];
+ struct mm_struct *mm;
+ struct vm_area_struct *vma;
+ enum reclaim_type type;
+ char *type_buf;
+
+ memset(buffer, 0, sizeof(buffer));
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+
+ if (copy_from_user(buffer, buf, count))
+ return -EFAULT;
+
+ type_buf = strstrip(buffer);
+ if (!strcmp(type_buf, "file"))
+ type = RECLAIM_FILE;
+ else if (!strcmp(type_buf, "anon"))
+ type = RECLAIM_ANON;
+ else if (!strcmp(type_buf, "all"))
+ type = RECLAIM_ALL;
+ else
+ return -EINVAL;
+
+ task = get_proc_task(file->f_path.dentry->d_inode);
+ if (!task)
+ return -ESRCH;
+
+ mm = get_task_mm(task);
+ if (mm) {
+ struct mm_walk reclaim_walk = {
+ .pmd_entry = reclaim_pte_range,
+ .mm = mm,
+ };
+
+ down_read(&mm->mmap_sem);
+ for (vma = mm->mmap; vma; vma = vma->vm_next) {
+ reclaim_walk.private = vma;
+
+ if (is_vm_hugetlb_page(vma))
+ continue;
+
+ if (type == RECLAIM_ANON && vma->vm_file)
+ continue;
+ if (type == RECLAIM_FILE && !vma->vm_file)
+ continue;
+
+ walk_page_range(vma->vm_start, vma->vm_end,
+ &reclaim_walk);
+ }
+ flush_tlb_mm(mm);
+ up_read(&mm->mmap_sem);
+ mmput(mm);
+ }
+ put_task_struct(task);
+
+ return count;
+}
+
+const struct file_operations proc_reclaim_operations = {
+ .write = reclaim_write,
+ .llseek = noop_llseek,
+};
+#endif
+
#ifdef CONFIG_NUMA

struct numa_maps {
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 6dacb93..a24e34e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -10,6 +10,10 @@
#include <linux/rwsem.h>
#include <linux/memcontrol.h>

+extern int isolate_lru_page(struct page *page);
+extern void putback_lru_page(struct page *page);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+
/*
* The anon_vma heads a list of private "related" vmas, to scan if
* an anonymous page pointing to this anon_vma needs to be unmapped:
diff --git a/mm/Kconfig b/mm/Kconfig
index ef93df1..314bf49 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -479,3 +479,16 @@ config MEM_SOFT_DIRTY
it can be cleared by hands.

See Documentation/vm/soft-dirty.txt for more details.
+
+config PROCESS_RECLAIM
+ bool "Enable process reclaim"
+ depends on PROC_FS
+ default n
+ help
+ It allows to reclaim pages of the process by /proc/pid/reclaim.
+
+ (echo file > /proc/PID/reclaim) reclaims file-backed pages only.
+ (echo anon > /proc/PID/reclaim) reclaims anonymous pages only.
+ (echo all > /proc/PID/reclaim) reclaims all pages.
+
+ Any other vaule is ignored.
diff --git a/mm/internal.h b/mm/internal.h
index 8562de0..589a29b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -86,12 +86,6 @@ static inline void get_page_foll(struct page *page)
extern unsigned long highest_memmap_pfn;

/*
- * in mm/vmscan.c:
- */
-extern int isolate_lru_page(struct page *page);
-extern void putback_lru_page(struct page *page);
-
-/*
* in mm/rmap.c:
*/
extern pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
@@ -360,6 +354,7 @@ extern unsigned long vm_mmap_pgoff(struct file *, unsigned long,
extern void set_pageblock_order(void);
unsigned long reclaim_clean_pages_from_list(struct zone *zone,
struct list_head *page_list);
+
/* The ALLOC_WMARK bits are used as an index to zone->watermark */
#define ALLOC_WMARK_MIN WMARK_MIN
#define ALLOC_WMARK_LOW WMARK_LOW
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 5e12c60..6934f5b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -992,6 +992,65 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
return ret;
}

+#ifdef CONFIG_PROCESS_RECLAIM
+static unsigned long shrink_page(struct page *page,
+ struct zone *zone,
+ struct scan_control *sc,
+ enum ttu_flags ttu_flags,
+ unsigned long *ret_nr_dirty,
+ unsigned long *ret_nr_writeback,
+ bool force_reclaim,
+ struct list_head *ret_pages)
+{
+ int reclaimed;
+ LIST_HEAD(page_list);
+ list_add(&page->lru, &page_list);
+
+ reclaimed = shrink_page_list(&page_list, zone, sc, ttu_flags,
+ ret_nr_dirty, ret_nr_writeback,
+ force_reclaim);
+ if (!reclaimed)
+ list_splice(&page_list, ret_pages);
+
+ return reclaimed;
+}
+
+unsigned long reclaim_pages_from_list(struct list_head *page_list)
+{
+ struct scan_control sc = {
+ .gfp_mask = GFP_KERNEL,
+ .priority = DEF_PRIORITY,
+ .may_unmap = 1,
+ .may_swap = 1,
+ };
+
+ LIST_HEAD(ret_pages);
+ struct page *page;
+ unsigned long dummy1, dummy2;
+ unsigned long nr_reclaimed = 0;
+
+ while (!list_empty(page_list)) {
+ page = lru_to_page(page_list);
+ list_del(&page->lru);
+
+ ClearPageActive(page);
+ nr_reclaimed += shrink_page(page, page_zone(page), &sc,
+ TTU_UNMAP|TTU_IGNORE_ACCESS,
+ &dummy1, &dummy2, true, &ret_pages);
+ }
+
+ while (!list_empty(&ret_pages)) {
+ page = lru_to_page(&ret_pages);
+ list_del(&page->lru);
+ dec_zone_page_state(page, NR_ISOLATED_ANON +
+ page_is_file_cache(page));
+ putback_lru_page(page);
+ }
+
+ return nr_reclaimed;
+}
+#endif
+
/*
* Attempt to remove the specified page from its LRU. Only take this page
* if it is of the appropriate PageActive status. Pages which are being
--
1.8.2

2013-04-24 01:42:15

by Minchan Kim

Subject: [PATCH v2 6/6] add documentation on proc.txt

This patch documents the new reclaim field in proc.txt.

Cc: Rob Landley <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
---

Rob, I didn't add your Acked-by because the interface was slightly changed.
I hope you can give your Acked-by again after reviewing it.
Thanks.

Documentation/filesystems/proc.txt | 22 ++++++++++++++++++++++
mm/Kconfig | 7 +------
2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 488c094..1411ad0 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -136,6 +136,7 @@ Table 1-1: Process specific entries in /proc
maps Memory maps to executables and library files (2.4)
mem Memory held by this process
root Link to the root directory of this process
+ reclaim Reclaim pages in this process
stat Process status
statm Process memory status information
status Process status in human readable form
@@ -489,6 +490,27 @@ To clear the soft-dirty bit

Any other value written to /proc/PID/clear_refs will have no effect.

+The file /proc/PID/reclaim is used to reclaim pages in this process.
+To reclaim file-backed pages,
+ > echo file > /proc/PID/reclaim
+
+To reclaim anonymous pages,
+ > echo anon > /proc/PID/reclaim
+
+To reclaim all pages,
+ > echo all > /proc/PID/reclaim
+
+You can also specify an address range of the process so that only part
+of its address space is reclaimed. The format is as follows:
+ > echo addr size-byte > /proc/PID/reclaim
+
+NOTE: addr should be page-aligned.
+
+Below is an example that tries to reclaim 2 pages starting at 0x100000.
+
+To reclaim both pages in the address range,
+ > echo $((1<<20)) 8192 > /proc/PID/reclaim
+
The /proc/pid/pagemap gives the PFN, which can be used to find the pageflags
using /proc/kpageflags and number of times a page is mapped using
/proc/kpagecount. For detailed explanation, see Documentation/vm/pagemap.txt.
diff --git a/mm/Kconfig b/mm/Kconfig
index 314bf49..9d6b306 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -486,9 +486,4 @@ config PROCESS_RECLAIM
default n
help
It allows to reclaim pages of the process by /proc/pid/reclaim.
-
- (echo file > /proc/PID/reclaim) reclaims file-backed pages only.
- (echo anon > /proc/PID/reclaim) reclaims anonymous pages only.
- (echo all > /proc/PID/reclaim) reclaims all pages.
-
- Any other vaule is ignored.
+ See Documentation/filesystems/proc.txt for more details.
--
1.8.2

2013-04-24 01:42:14

by Minchan Kim

Subject: [PATCH v2 2/6] mm: make shrink_page_list with pages work from multiple zones

shrink_page_list() expects all pages to come from the same zone,
which is too restrictive for other users.

This patch removes that dependency so the next patch can use
shrink_page_list() with pages from multiple zones.

Signed-off-by: Minchan Kim <[email protected]>
---
mm/vmscan.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6934f5b..82f4d6c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -706,7 +706,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
goto keep;

VM_BUG_ON(PageActive(page));
- VM_BUG_ON(page_zone(page) != zone);
+ if (zone)
+ VM_BUG_ON(page_zone(page) != zone);

sc->nr_scanned++;

@@ -952,7 +953,7 @@ keep:
* back off and wait for congestion to clear because further reclaim
* will encounter the same problem
*/
- if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc))
+ if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc) && zone)
zone_set_flag(zone, ZONE_CONGESTED);

free_hot_cold_page_list(&free_pages, 1);
--
1.8.2

2013-04-24 01:42:12

by Minchan Kim

Subject: [PATCH v2 5/6] mm: Support address range reclaim

This patch adds address range reclaim of a process.
The requirement is as follows.

WebKit1, for example, uses a single address space to handle multiple
tabs. IOW, it uses a *one*-process model, so all tabs share the address
space of the process. In such a scenario, per-process reclaim is rather
coarse-grained, so this patch supports finer-grained reclaim of a
target address range of the process.
To reclaim a target range, use the following format:

echo [addr] [size-byte] > /proc/pid/reclaim

addr should be page-aligned.

So the reclaim knob's interface is now as follows.

echo file > /proc/pid/reclaim
reclaim file-backed pages only

echo anon > /proc/pid/reclaim
reclaim anonymous pages only

echo all > /proc/pid/reclaim
reclaim all pages

echo $((1<<20)) 8192 > /proc/pid/reclaim
reclaim pages in (0x100000 - 0x102000)

Signed-off-by: Minchan Kim <[email protected]>
---
fs/proc/task_mmu.c | 88 ++++++++++++++++++++++++++++++++++++++++++++----------
mm/internal.h | 3 ++
2 files changed, 76 insertions(+), 15 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 79b674e..dff9756 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -12,6 +12,7 @@
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/mm_inline.h>
+#include <linux/ctype.h>

#include <asm/elf.h>
#include <asm/uaccess.h>
@@ -1239,11 +1240,14 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
struct task_struct *task;
- char buffer[PROC_NUMBUF];
+ char buffer[200];
struct mm_struct *mm;
struct vm_area_struct *vma;
enum reclaim_type type;
char *type_buf;
+ struct mm_walk reclaim_walk = {};
+ unsigned long start = 0;
+ unsigned long end = 0;

memset(buffer, 0, sizeof(buffer));
if (count > sizeof(buffer) - 1)
@@ -1259,42 +1263,96 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
type = RECLAIM_ANON;
else if (!strcmp(type_buf, "all"))
type = RECLAIM_ALL;
+ else if (isdigit(*type_buf))
+ type = RECLAIM_RANGE;
else
- return -EINVAL;
+ goto out_err;
+
+ if (type == RECLAIM_RANGE) {
+ int ret;
+ size_t len;
+ unsigned long len_in;
+ char *token;
+
+ token = strsep(&type_buf, " ");
+ if (!token)
+ goto out_err;
+ ret = kstrtoul(token, 10, &start);
+ if (ret < 0 || (start & ~PAGE_MASK))
+ goto out_err;
+
+ token = strsep(&type_buf, " ");
+ if (!token)
+ goto out_err;
+ ret = kstrtoul(token, 10, &len_in);
+ if (ret < 0)
+ goto out_err;
+
+ len = (len_in + ~PAGE_MASK) & PAGE_MASK;
+ /*
+ * Check to see whether len was rounded up from small -ve
+ * to zero.
+ */
+ if (len_in && !len)
+ goto out_err;
+
+ end = start + len;
+ if (end < start)
+ goto out_err;
+ }

task = get_proc_task(file->f_path.dentry->d_inode);
if (!task)
return -ESRCH;

mm = get_task_mm(task);
- if (mm) {
- struct mm_walk reclaim_walk = {
- .pmd_entry = reclaim_pte_range,
- .mm = mm,
- };
+ if (!mm)
+ goto out;

- down_read(&mm->mmap_sem);
- for (vma = mm->mmap; vma; vma = vma->vm_next) {
- reclaim_walk.private = vma;
+ reclaim_walk.mm = mm;
+ reclaim_walk.pmd_entry = reclaim_pte_range;

+ down_read(&mm->mmap_sem);
+ if (type == RECLAIM_RANGE) {
+ /* Step in the for header so that "continue" still advances. */
+ for (vma = find_vma(mm, start); vma;
+ vma = vma->vm_next) {
+ if (vma->vm_start > end)
+ break;
+ if (is_vm_hugetlb_page(vma))
+ continue;
+
+ reclaim_walk.private = vma;
+ walk_page_range(max(vma->vm_start, start),
+ min(vma->vm_end, end),
+ &reclaim_walk);
+ }
+ } else {
+ for (vma = mm->mmap; vma; vma = vma->vm_next) {
if (is_vm_hugetlb_page(vma))
continue;

if (type == RECLAIM_ANON && vma->vm_file)
continue;
+
if (type == RECLAIM_FILE && !vma->vm_file)
continue;

+ reclaim_walk.private = vma;
walk_page_range(vma->vm_start, vma->vm_end,
- &reclaim_walk);
+ &reclaim_walk);
}
- flush_tlb_mm(mm);
- up_read(&mm->mmap_sem);
- mmput(mm);
}
- put_task_struct(task);

+ flush_tlb_mm(mm);
+ up_read(&mm->mmap_sem);
+ mmput(mm);
+out:
+ put_task_struct(task);
return count;
+
+out_err:
+ return -EINVAL;
}

const struct file_operations proc_reclaim_operations = {
diff --git a/mm/internal.h b/mm/internal.h
index 589a29b..1f7ce8f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -85,6 +85,9 @@ static inline void get_page_foll(struct page *page)

extern unsigned long highest_memmap_pfn;

+extern int isolate_lru_page(struct page *page);
+extern void putback_lru_page(struct page *page);
+
/*
* in mm/rmap.c:
*/
--
1.8.2
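
The range validation that reclaim_write() performs in the patch above
can be tried out in userspace. The sketch below mirrors that arithmetic
under the assumption of 4 KiB pages; SKETCH_PAGE_SIZE, SKETCH_PAGE_MASK
and range_to_end() are illustrative names that do not appear in the
patch, and -1 stands in for the kernel's -EINVAL.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL			/* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * Returns 0 and stores the exclusive end address in *end, or -1 in the
 * cases where the patch would reject the request.
 */
static int range_to_end(unsigned long start, unsigned long len_in,
			unsigned long *end)
{
	unsigned long len;

	assert(end);
	if (start & ~SKETCH_PAGE_MASK)		/* start must be page-aligned */
		return -1;

	/* Round the length up to a whole number of pages. */
	len = (len_in + ~SKETCH_PAGE_MASK) & SKETCH_PAGE_MASK;
	if (len_in && !len)			/* rounding wrapped to zero */
		return -1;

	if (start + len < start)		/* start + len overflowed */
		return -1;

	*end = start + len;
	return 0;
}
```

For start = 1 MiB and a length of 8192 this yields an end of 0x102000,
matching the (0x100000 - 0x102000) range in the commit message. A length
of 1 is rounded up to one full page.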

2013-04-24 01:42:09

by Minchan Kim

Subject: [PATCH v2 4/6] mm: Enhance per process reclaim to consider shared pages

Some pages may be shared by several processes (e.g., libc).
In that case, it is too bad to reclaim them right from the beginning.

This patch makes the VM keep them in memory until the last task
tries to reclaim them, so shared pages are reclaimed only once
every task mapping them has asked for them to be swapped out.

This feature doesn't handle non-linear mappings on ramfs because
handling them is very time-consuming, does not guarantee that anything
is reclaimed, and is not a common case.

Signed-off-by: Sangseok Lee <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
---
fs/proc/task_mmu.c | 2 +-
include/linux/ksm.h | 6 ++++--
include/linux/rmap.h | 8 +++++---
mm/ksm.c | 9 ++++++++-
mm/memory-failure.c | 2 +-
mm/migrate.c | 6 ++++--
mm/rmap.c | 57 +++++++++++++++++++++++++++++++++++++---------------
mm/vmscan.c | 14 +++++++++++--
8 files changed, 76 insertions(+), 28 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index fcc1c32..79b674e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1220,7 +1220,7 @@ cont:
break;
}
pte_unmap_unlock(pte - 1, ptl);
- reclaim_pages_from_list(&page_list);
+ reclaim_pages_from_list(&page_list, vma);
if (addr != end)
goto cont;

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 45c9b6a..d8e556b 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,

int page_referenced_ksm(struct page *page,
struct mem_cgroup *memcg, unsigned long *vm_flags);
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
+int try_to_unmap_ksm(struct page *page,
+ enum ttu_flags flags, struct vm_area_struct *vma);
int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
struct vm_area_struct *, unsigned long, void *), void *arg);
void ksm_migrate_page(struct page *newpage, struct page *oldpage);
@@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
return 0;
}

-static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+static inline int try_to_unmap_ksm(struct page *page,
+ enum ttu_flags flags, struct vm_area_struct *target_vma)
{
return 0;
}
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a24e34e..6c7d030 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -12,7 +12,8 @@

extern int isolate_lru_page(struct page *page);
extern void putback_lru_page(struct page *page);
-extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
+ struct vm_area_struct *vma);

/*
* The anon_vma heads a list of private "related" vmas, to scan if
@@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,

#define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)

-int try_to_unmap(struct page *, enum ttu_flags flags);
+int try_to_unmap(struct page *, enum ttu_flags flags,
+ struct vm_area_struct *vma);
int try_to_unmap_one(struct page *, struct vm_area_struct *,
unsigned long address, enum ttu_flags flags);

@@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
return 0;
}

-#define try_to_unmap(page, refs) SWAP_FAIL
+#define try_to_unmap(page, refs, vma) SWAP_FAIL

static inline int page_mkclean(struct page *page)
{
diff --git a/mm/ksm.c b/mm/ksm.c
index 7f629e4..44de936 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1949,7 +1949,8 @@ out:
return referenced;
}

-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
+ struct vm_area_struct *target_vma)
{
struct stable_node *stable_node;
struct hlist_node *hlist;
@@ -1963,6 +1964,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
stable_node = page_stable_node(page);
if (!stable_node)
return SWAP_FAIL;
+
+ if (target_vma) {
+ unsigned long address = vma_address(page, target_vma);
+ ret = try_to_unmap_one(page, target_vma, address, flags);
+ goto out;
+ }
again:
hlist_for_each_entry(rmap_item, hlist, &stable_node->hlist, hlist) {
struct anon_vma *anon_vma = rmap_item->anon_vma;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ceb0c7f..f3928e4 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
if (hpage != ppage)
lock_page(ppage);

- ret = try_to_unmap(ppage, ttu);
+ ret = try_to_unmap(ppage, ttu, NULL);
if (ret != SWAP_SUCCESS)
printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
pfn, page_mapcount(ppage));
diff --git a/mm/migrate.c b/mm/migrate.c
index c9c5eee..aef29a0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
}

/* Establish migration ptes or remove ptes */
- try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+ try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+ NULL);

skip_unmap:
if (!page_mapped(page))
@@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
if (PageAnon(hpage))
anon_vma = page_get_anon_vma(hpage);

- try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+ try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+ NULL);

if (!page_mapped(hpage))
rc = move_to_new_page(new_hpage, hpage, 1, mode);
diff --git a/mm/rmap.c b/mm/rmap.c
index 6280da8..43718fc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)

/**
* try_to_unmap_anon - unmap or unlock anonymous page using the object-based
- * rmap method
+ * rmap method if @vma is NULL
* @page: the page to unmap/unlock
* @flags: action and flags
+ * @target_vma: vma for unmapping a @page
*
* Find all the mappings of a page using the mapping pointer and the vma chains
* contained in the anon_vma struct it points to.
*
+ * If @target_vma isn't NULL, this function unmaps @page from that vma only.
+ *
* This function is only called from try_to_unmap/try_to_munlock for
* anonymous pages.
* When called from try_to_munlock(), the mmap_sem of the mm containing the vma
@@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
* vm_flags for that VMA. That should be OK, because that vma shouldn't be
* 'LOCKED.
*/
-static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
+ struct vm_area_struct *target_vma)
{
+ int ret = SWAP_AGAIN;
+ unsigned long address;
struct anon_vma *anon_vma;
pgoff_t pgoff;
struct anon_vma_chain *avc;
- int ret = SWAP_AGAIN;
+
+ if (target_vma) {
+ address = vma_address(page, target_vma);
+ return try_to_unmap_one(page, target_vma, address, flags);
+ }

anon_vma = page_lock_anon_vma_read(page);
if (!anon_vma)
@@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
struct vm_area_struct *vma = avc->vma;
- unsigned long address;

/*
* During exec, a temporary VMA is setup and later moved.
@@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
* try_to_unmap_file - unmap/unlock file page using the object-based rmap method
* @page: the page to unmap/unlock
* @flags: action and flags
+ * @target_vma: vma for unmapping @page
*
* Find all the mappings of a page using the mapping pointer and the vma chains
* contained in the address_space struct it points to.
@@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
* vm_flags for that VMA. That should be OK, because that vma shouldn't be
* 'LOCKED.
*/
-static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
+ struct vm_area_struct *target_vma)
{
struct address_space *mapping = page->mapping;
pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -1512,16 +1523,26 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
unsigned long max_nl_cursor = 0;
unsigned long max_nl_size = 0;
unsigned int mapcount;
+ unsigned long address;

if (PageHuge(page))
pgoff = page->index << compound_order(page);

mutex_lock(&mapping->i_mmap_mutex);
- vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
- unsigned long address = vma_address(page, vma);
- ret = try_to_unmap_one(page, vma, address, flags);
- if (ret != SWAP_AGAIN || !page_mapped(page))
+ if (target_vma) {
+ /* We don't handle non-linear vma on ramfs */
+ if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
goto out;
+ address = vma_address(page, target_vma);
+ ret = try_to_unmap_one(page, target_vma, address, flags);
+ goto out;
+ } else {
+ vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
+ address = vma_address(page, vma);
+ ret = try_to_unmap_one(page, vma, address, flags);
+ if (ret != SWAP_AGAIN || !page_mapped(page))
+ goto out;
+ }
}

if (list_empty(&mapping->i_mmap_nonlinear))
@@ -1602,9 +1623,12 @@ out:
* try_to_unmap - try to remove all page table mappings to a page
* @page: the page to get unmapped
* @flags: action and flags
+ * @vma : target vma for reclaim
*
* Tries to remove all the page table entries which are mapping this
* page, used in the pageout path. Caller must hold the page lock.
+ * If @vma is not NULL, this function tries to remove @page from only @vma
+ * without walking all the vmas that map @page.
* Return values are:
*
* SWAP_SUCCESS - we succeeded in removing all mappings
@@ -1612,7 +1636,8 @@ out:
* SWAP_FAIL - the page is unswappable
* SWAP_MLOCK - page is mlocked.
*/
-int try_to_unmap(struct page *page, enum ttu_flags flags)
+int try_to_unmap(struct page *page, enum ttu_flags flags,
+ struct vm_area_struct *vma)
{
int ret;

@@ -1620,11 +1645,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));

if (unlikely(PageKsm(page)))
- ret = try_to_unmap_ksm(page, flags);
+ ret = try_to_unmap_ksm(page, flags, vma);
else if (PageAnon(page))
- ret = try_to_unmap_anon(page, flags);
+ ret = try_to_unmap_anon(page, flags, vma);
else
- ret = try_to_unmap_file(page, flags);
+ ret = try_to_unmap_file(page, flags, vma);
if (ret != SWAP_MLOCK && !page_mapped(page))
ret = SWAP_SUCCESS;
return ret;
@@ -1650,11 +1675,11 @@ int try_to_munlock(struct page *page)
VM_BUG_ON(!PageLocked(page) || PageLRU(page));

if (unlikely(PageKsm(page)))
- return try_to_unmap_ksm(page, TTU_MUNLOCK);
+ return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
else if (PageAnon(page))
- return try_to_unmap_anon(page, TTU_MUNLOCK);
+ return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
else
- return try_to_unmap_file(page, TTU_MUNLOCK);
+ return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
}

void __put_anon_vma(struct anon_vma *anon_vma)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 42af075..79d6959 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -93,6 +93,13 @@ struct scan_control {
* are scanned.
*/
nodemask_t *nodemask;
+
+ /*
+ * Reclaim pages from a vma. If the page is shared by other tasks,
+ * it is zapped from the vma without being reclaimed, so it remains
+ * in memory until the last task zaps it.
+ */
+ struct vm_area_struct *target_vma;
};

#define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
@@ -794,7 +801,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
* processes. Try to unmap it here.
*/
if (page_mapped(page) && mapping) {
- switch (try_to_unmap(page, ttu_flags)) {
+ switch (try_to_unmap(page,
+ ttu_flags, sc->target_vma)) {
case SWAP_FAIL:
goto activate_locked;
case SWAP_AGAIN:
@@ -1001,13 +1009,15 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
}

#ifdef CONFIG_PROCESS_RECLAIM
-unsigned long reclaim_pages_from_list(struct list_head *page_list)
+unsigned long reclaim_pages_from_list(struct list_head *page_list,
+ struct vm_area_struct *vma)
{
struct scan_control sc = {
.gfp_mask = GFP_KERNEL,
.priority = DEF_PRIORITY,
.may_unmap = 1,
.may_swap = 1,
+ .target_vma = vma,
};

unsigned long nr_reclaimed;
--
1.8.2

2013-04-24 01:42:07

by Minchan Kim

Subject: [PATCH v2 3/6] mm: Remove shrink_page

With the previous patch, shrink_page_list can handle pages from
multiple zones, so let's remove shrink_page.

Signed-off-by: Minchan Kim <[email protected]>
---
mm/vmscan.c | 47 ++++++++++++++---------------------------------
1 file changed, 14 insertions(+), 33 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 82f4d6c..42af075 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -924,6 +924,13 @@ free_it:
* appear not as the counts should be low
*/
list_add(&page->lru, &free_pages);
+ /*
+ * If the page list spans multiple zones, we must decrease
+ * NR_ISOLATED_ANON + x for the freed pages here.
+ */
+ if (!zone)
+ dec_zone_page_state(page, NR_ISOLATED_ANON +
+ page_is_file_cache(page));
continue;

cull_mlocked:
@@ -994,28 +1001,6 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
}

#ifdef CONFIG_PROCESS_RECLAIM
-static unsigned long shrink_page(struct page *page,
- struct zone *zone,
- struct scan_control *sc,
- enum ttu_flags ttu_flags,
- unsigned long *ret_nr_dirty,
- unsigned long *ret_nr_writeback,
- bool force_reclaim,
- struct list_head *ret_pages)
-{
- int reclaimed;
- LIST_HEAD(page_list);
- list_add(&page->lru, &page_list);
-
- reclaimed = shrink_page_list(&page_list, zone, sc, ttu_flags,
- ret_nr_dirty, ret_nr_writeback,
- force_reclaim);
- if (!reclaimed)
- list_splice(&page_list, ret_pages);
-
- return reclaimed;
-}
-
unsigned long reclaim_pages_from_list(struct list_head *page_list)
{
struct scan_control sc = {
@@ -1025,23 +1010,19 @@ unsigned long reclaim_pages_from_list(struct list_head *page_list)
.may_swap = 1,
};

- LIST_HEAD(ret_pages);
+ unsigned long nr_reclaimed;
struct page *page;
unsigned long dummy1, dummy2;
- unsigned long nr_reclaimed = 0;
-
- while (!list_empty(page_list)) {
- page = lru_to_page(page_list);
- list_del(&page->lru);

+ list_for_each_entry(page, page_list, lru)
ClearPageActive(page);
- nr_reclaimed += shrink_page(page, page_zone(page), &sc,
+
+ nr_reclaimed = shrink_page_list(page_list, NULL, &sc,
TTU_UNMAP|TTU_IGNORE_ACCESS,
- &dummy1, &dummy2, true, &ret_pages);
- }
+ &dummy1, &dummy2, true);

- while (!list_empty(&ret_pages)) {
- page = lru_to_page(&ret_pages);
+ while (!list_empty(page_list)) {
+ page = lru_to_page(page_list);
list_del(&page->lru);
dec_zone_page_state(page, NR_ISOLATED_ANON +
page_is_file_cache(page));
--
1.8.2

2013-04-24 06:49:50

by Rob Landley

Subject: Re: [PATCH v2 6/6] add documentation on proc.txt

On 04/23/2013 08:41:04 PM, Minchan Kim wrote:
> This patch adds stuff about new reclaim field in proc.txt
>
> Cc: Rob Landley <[email protected]>
> Signed-off-by: Minchan Kim <[email protected]>
> ---
>
> Rob, I didn't add your Acked-by because the interface was slightly changed.
> I hope you will give your Acked-by after reviewing it again.
> Thanks.
>
> Documentation/filesystems/proc.txt | 22 ++++++++++++++++++++++
> mm/Kconfig | 7 +------
> 2 files changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.txt
> b/Documentation/filesystems/proc.txt
> index 488c094..1411ad0 100644
> --- a/Documentation/filesystems/proc.txt
> +++ b/Documentation/filesystems/proc.txt
> @@ -136,6 +136,7 @@ Table 1-1: Process specific entries in /proc
> maps Memory maps to executables and library files
> (2.4)
> mem Memory held by this process
> root Link to the root directory of this process
> + reclaim Reclaim pages in this process
> stat Process status
> statm Process memory status information
> status Process status in human readable form
> @@ -489,6 +490,27 @@ To clear the soft-dirty bit
>
> Any other value written to /proc/PID/clear_refs will have no effect.
>
> +The file /proc/PID/reclaim is used to reclaim pages in this process.
> +To reclaim file-backed pages,
> + > echo file > /proc/PID/reclaim
> +
> +To reclaim anonymous pages,
> + > echo anon > /proc/PID/reclaim
> +
> +To reclaim all pages,
> + > echo all > /proc/PID/reclaim
> +
> +Also, you can specify address range of process so part of address
> space
> +will be reclaimed. The format is as follows:
> + > echo addr size-byte > /proc/PID/reclaim
> +
> +NOTE: addr should be page-aligned.

And size in bytes should be a multiple of page size?

> +
> +Below is an example which tries to reclaim 2 pages from 0x100000.
> +
> +To reclaim both pages in address range,
> + > echo $((1<<20) 8192 > /proc/PID/reclaim

Would you like to balance your parentheses?

Acked-by: Rob Landley <[email protected]>

Rob-

2013-04-24 08:18:19

by Minchan Kim

Subject: Re: [PATCH v2 6/6] add documentation on proc.txt

Hello Rob,

On Wed, Apr 24, 2013 at 01:49:45AM -0500, Rob Landley wrote:
> On 04/23/2013 08:41:04 PM, Minchan Kim wrote:
> >This patch adds stuff about new reclaim field in proc.txt
> >
> >Cc: Rob Landley <[email protected]>
> >Signed-off-by: Minchan Kim <[email protected]>
> >---
> >
> >Rob, I didn't add your Acked-by because the interface was slightly changed.
> >I hope you will give your Acked-by after reviewing it again.
> >Thanks.
> >
> > Documentation/filesystems/proc.txt | 22 ++++++++++++++++++++++
> > mm/Kconfig | 7 +------
> > 2 files changed, 23 insertions(+), 6 deletions(-)
> >
> >diff --git a/Documentation/filesystems/proc.txt
> >b/Documentation/filesystems/proc.txt
> >index 488c094..1411ad0 100644
> >--- a/Documentation/filesystems/proc.txt
> >+++ b/Documentation/filesystems/proc.txt
> >@@ -136,6 +136,7 @@ Table 1-1: Process specific entries in /proc
> > maps Memory maps to executables and library files (2.4)
> > mem Memory held by this process
> > root Link to the root directory of this process
> >+ reclaim Reclaim pages in this process
> > stat Process status
> > statm Process memory status information
> > status Process status in human readable form
> >@@ -489,6 +490,27 @@ To clear the soft-dirty bit
> >
> > Any other value written to /proc/PID/clear_refs will have no effect.
> >
> >+The file /proc/PID/reclaim is used to reclaim pages in this process.
> >+To reclaim file-backed pages,
> >+ > echo file > /proc/PID/reclaim
> >+
> >+To reclaim anonymous pages,
> >+ > echo anon > /proc/PID/reclaim
> >+
> >+To reclaim all pages,
> >+ > echo all > /proc/PID/reclaim
> >+
> >+Also, you can specify address range of process so part of address
> >space
> >+will be reclaimed. The format is as follows:
> >+ > echo addr size-byte > /proc/PID/reclaim
> >+
> >+NOTE: addr should be page-aligned.
>
> And size in bytes should be a multiple of page size?

Not necessarily. It's the same as madvise: the VM handles every page
that contains a byte of the range.

>
> >+
> >+Below is an example which tries to reclaim 2 pages from 0x100000.
> >+
> >+To reclaim both pages in address range,
> >+ > echo $((1<<20) 8192 > /proc/PID/reclaim
>
> Would you like to balance your parentheses?

Fixed. I will include your Acked-by in the next spin.
Thanks!

>
> Acked-by: Rob Landley <[email protected]>
>
> Rob

--
Kind regards,
Minchan Kim
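
The reply above says the size need not be a multiple of the page size
because, as with madvise, the VM handles every page that contains a byte
of the range. A minimal userspace sketch of that rounding follows. The
helper name covered_pages, the struct page_span, and the explicit
page_size parameter are assumptions made for illustration; none of them
appear in the patch, and the sketch ignores address overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type, not part of the patch: a page-aligned [start, end) span. */
struct page_span {
	uint64_t start;   /* first byte of the first page touched */
	uint64_t end;     /* one past the last byte of the last page touched */
	uint64_t npages;  /* number of pages in [start, end) */
};

/*
 * Compute which pages a byte range [addr, addr + len) touches.
 * page_size must be a power of two. The documented interface expects
 * addr to be page-aligned already, so rounding start down is a no-op
 * in that case. Overflow of addr + len is not handled in this sketch.
 */
static struct page_span covered_pages(uint64_t addr, uint64_t len,
				      uint64_t page_size)
{
	struct page_span span = { 0, 0, 0 };

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	if (len == 0)
		return span;

	/* Round the start down and the end up to page boundaries. */
	span.start = addr & ~(page_size - 1);
	span.end = (addr + len + page_size - 1) & ~(page_size - 1);
	span.npages = (span.end - span.start) / page_size;
	return span;
}
```

With 4096-byte pages, covered_pages(0x100000, 8192, 4096) spans
0x100000 to 0x102000, i.e. two pages, matching the example in the
documentation patch. A size of 4097 bytes covers the same two pages.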

2013-04-24 11:01:51

by Namhyung Kim

Subject: Re: [PATCH v2 5/6] mm: Support address range reclaim

Hi Minchan,

On Wed, 24 Apr 2013 10:41:03 +0900, Minchan Kim wrote:
> This patch adds address range reclaim of a process.
> The requirement is following as,
>
> Like webkit1, it uses a address space for handling multi tabs.
> IOW, it uses *one* process model so all tabs shares address space
> of the process. In such scenario, per-process reclaim is rather
> coarse-grained so this patch supports more fine-grained reclaim
> for being able to reclaim target address range of the process.
> For reclaim target range, you should use following format.
>
> echo [addr] [size-byte] > /proc/pid/reclaim
>
> addr should be page-aligned.
>
> So now the reclaim knob's interface is as follows.
>
> echo file > /proc/pid/reclaim
> reclaim file-backed pages only
>
> echo anon > /proc/pid/reclaim
> reclaim anonymous pages only
>
> echo all > /proc/pid/reclaim
> reclaim all pages
>
> echo $((1<<20)) 8192 > /proc/pid/reclaim
> reclaim pages in (0x100000 - 0x102000)
>
> Signed-off-by: Minchan Kim <[email protected]>
> ---
> fs/proc/task_mmu.c | 88 ++++++++++++++++++++++++++++++++++++++++++++----------
> mm/internal.h | 3 ++
> 2 files changed, 76 insertions(+), 15 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 79b674e..dff9756 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -12,6 +12,7 @@
> #include <linux/swap.h>
> #include <linux/swapops.h>
> #include <linux/mm_inline.h>
> +#include <linux/ctype.h>
>
> #include <asm/elf.h>
> #include <asm/uaccess.h>
> @@ -1239,11 +1240,14 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
> size_t count, loff_t *ppos)
> {
> struct task_struct *task;
> - char buffer[PROC_NUMBUF];
> + char buffer[200];
> struct mm_struct *mm;
> struct vm_area_struct *vma;
> enum reclaim_type type;
> char *type_buf;
> + struct mm_walk reclaim_walk = {};
> + unsigned long start = 0;
> + unsigned long end = 0;
>
> memset(buffer, 0, sizeof(buffer));
> if (count > sizeof(buffer) - 1)
> @@ -1259,42 +1263,96 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
> type = RECLAIM_ANON;
> else if (!strcmp(type_buf, "all"))
> type = RECLAIM_ALL;
> + else if (isdigit(*type_buf))
> + type = RECLAIM_RANGE;
> else
> - return -EINVAL;
> + goto out_err;
> +
> + if (type == RECLAIM_RANGE) {
> + int ret;
> + size_t len;
> + unsigned long len_in;
> + char *token;
> +
> + token = strsep(&type_buf, " ");
> + if (!token)
> + goto out_err;
> + ret = kstrtoul(token, 10, &start);

Why not use

start = memparse(token, NULL);

to support something like:

# echo 0x100000 8K > /proc/pid/reclaim


Thanks,
Namhyung
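
The suggestion above relies on memparse() accepting a base prefix and a
size suffix. A rough userspace approximation, assuming only the K, M and
G suffixes and using strtoull() with base 0 in place of the kernel's own
parser, might look like this. The name parse_mem_size is hypothetical;
the real memparse() lives in the kernel and may handle more suffixes.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Userspace approximation of the kernel's memparse(); the function
 * name is hypothetical. Parses a number in decimal, hex (0x...) or
 * octal (0...), then applies an optional K, M or G suffix.
 * If end is not NULL, *end is set to the first unparsed character.
 */
static unsigned long long parse_mem_size(const char *s, char **end)
{
	char *p;
	unsigned long long val;

	assert(s != NULL);
	val = strtoull(s, &p, 0);   /* base 0 auto-detects the radix */

	switch (*p) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		p++;                /* consume the suffix character */
		break;
	default:
		break;
	}

	if (end)
		*end = p;
	return val;
}
```

With such a parser, both "0x100000" and "8K" from the example command
would be accepted, yielding 1048576 and 8192 respectively, whereas
kstrtoul() with base 10 rejects both.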

2013-04-25 00:50:54

by Minchan Kim

Subject: Re: [PATCH v2 5/6] mm: Support address range reclaim

Hey Namhyung,

On Wed, Apr 24, 2013 at 08:01:48PM +0900, Namhyung Kim wrote:
> Hi Minchan,
>
> On Wed, 24 Apr 2013 10:41:03 +0900, Minchan Kim wrote:
> > This patch adds address range reclaim of a process.
> > The requirement is following as,
> >
> > Like webkit1, it uses a address space for handling multi tabs.
> > IOW, it uses *one* process model so all tabs shares address space
> > of the process. In such scenario, per-process reclaim is rather
> > coarse-grained so this patch supports more fine-grained reclaim
> > for being able to reclaim target address range of the process.
> > For reclaim target range, you should use following format.
> >
> > echo [addr] [size-byte] > /proc/pid/reclaim
> >
> > addr should be page-aligned.
> >
> > So now the reclaim knob's interface is as follows.
> >
> > echo file > /proc/pid/reclaim
> > reclaim file-backed pages only
> >
> > echo anon > /proc/pid/reclaim
> > reclaim anonymous pages only
> >
> > echo all > /proc/pid/reclaim
> > reclaim all pages
> >
> > echo $((1<<20)) 8192 > /proc/pid/reclaim
> > reclaim pages in (0x100000 - 0x102000)
> >
> > Signed-off-by: Minchan Kim <[email protected]>
> > ---
> > fs/proc/task_mmu.c | 88 ++++++++++++++++++++++++++++++++++++++++++++----------
> > mm/internal.h | 3 ++
> > 2 files changed, 76 insertions(+), 15 deletions(-)
> >
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index 79b674e..dff9756 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -12,6 +12,7 @@
> > #include <linux/swap.h>
> > #include <linux/swapops.h>
> > #include <linux/mm_inline.h>
> > +#include <linux/ctype.h>
> >
> > #include <asm/elf.h>
> > #include <asm/uaccess.h>
> > @@ -1239,11 +1240,14 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
> > size_t count, loff_t *ppos)
> > {
> > struct task_struct *task;
> > - char buffer[PROC_NUMBUF];
> > + char buffer[200];
> > struct mm_struct *mm;
> > struct vm_area_struct *vma;
> > enum reclaim_type type;
> > char *type_buf;
> > + struct mm_walk reclaim_walk = {};
> > + unsigned long start = 0;
> > + unsigned long end = 0;
> >
> > memset(buffer, 0, sizeof(buffer));
> > if (count > sizeof(buffer) - 1)
> > @@ -1259,42 +1263,96 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
> > type = RECLAIM_ANON;
> > else if (!strcmp(type_buf, "all"))
> > type = RECLAIM_ALL;
> > + else if (isdigit(*type_buf))
> > + type = RECLAIM_RANGE;
> > else
> > - return -EINVAL;
> > + goto out_err;
> > +
> > + if (type == RECLAIM_RANGE) {
> > + int ret;
> > + size_t len;
> > + unsigned long len_in;
> > + char *token;
> > +
> > + token = strsep(&type_buf, " ");
> > + if (!token)
> > + goto out_err;
> > + ret = kstrtoul(token, 10, &start);
>
> Why not use
>
> start = memparse(token, NULL);
>
> to support something like:
>
> # echo 0x100000 8K > /proc/pid/reclaim
>

Because I'm brain-damaged. :(
Thanks for pointing out a useful function.

>
> Thanks,
> Namhyung

--
Kind regards,
Minchan Kim