From: Pintu Kumar
To: corbet@lwn.net, akpm@linux-foundation.org, vbabka@suse.cz,
	gorcunov@openvz.org, pintu.k@samsung.com, mhocko@suse.cz,
	emunson@akamai.com, kirill.shutemov@linux.intel.com,
	standby24x7@gmail.com, hannes@cmpxchg.org, vdavydov@parallels.com,
	hughd@google.com, minchan@kernel.org, tj@kernel.org,
	rientjes@google.com, xypron.glpk@gmx.de, dzickus@redhat.com,
	prarit@redhat.com, ebiederm@xmission.com, rostedt@goodmis.org,
	uobergfe@redhat.com, paulmck@linux.vnet.ibm.com,
	iamjoonsoo.kim@lge.com, ddstreet@ieee.org, sasha.levin@oracle.com,
	koct9i@gmail.com, mgorman@suse.de, cj@linux.com,
	opensource.ganesh@gmail.com, vinmenon@codeaurora.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-pm@vger.kernel.org
Cc: cpgs@samsung.com, pintu_agarwal@yahoo.com, vishnu.ps@samsung.com,
	rohit.kr@samsung.com, iqbal.ams@samsung.com
Subject: [PATCH 1/1] kernel/sysctl.c: Add /proc/sys/vm/shrink_memory feature
Date: Fri, 03 Jul 2015 18:50:07 +0530
Message-id: <1435929607-3435-1-git-send-email-pintu.k@samsung.com>
X-Mailer: git-send-email 1.7.9.5

This patch provides 2 things:

1. Add a new control called shrink_memory in /proc/sys/vm/.
This control can be used to aggressively reclaim memory system-wide
in one shot from user space. A value of 1 will instruct the
kernel to reclaim as much as totalram_pages in the system.
Example: echo 1 > /proc/sys/vm/shrink_memory

2. Enable the shrink_all_memory API in the kernel with the new
CONFIG_SHRINK_MEMORY. Currently, the shrink_all_memory function is used
only during hibernation. With the new config we can make use of this API
for the non-hibernation case as well, without disturbing the hibernation
case.

The detailed paper was presented at the Embedded Linux Conference, Mar-2015:
http://events.linuxfoundation.org/sites/events/files/slides/%5BELC-2015%5D-System-wide-Memory-Defragmenter.pdf

Scenarios where this can be used and helpful are:
1) Can be invoked just after system boot-up is finished.
2) Can be invoked just before entering entire system suspend.
3) Can be invoked from the kernel when order-4 page allocations start
   failing.
4) Can be helpful to completely avoid or delay the kernel OOM condition.
5) Can be developed as a system tool to quickly defragment the entire
   system from user space, without the need to kill any application.

Signed-off-by: Pintu Kumar
---
 Documentation/sysctl/vm.txt |   16 ++++++++++++++++
 include/linux/swap.h        |    7 +++++++
 kernel/sysctl.c             |    9 +++++++++
 mm/Kconfig                  |    8 ++++++++
 mm/vmscan.c                 |   23 +++++++++++++++++++++--
 5 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 9832ec5..a959ad1 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -54,6 +54,7 @@ Currently, these files are in /proc/sys/vm:
 - page-cluster
 - panic_on_oom
 - percpu_pagelist_fraction
+- shrink_memory
 - stat_interval
 - swappiness
 - user_reserve_kbytes
@@ -718,6 +719,21 @@ sysctl, it will revert to this default behavior.
 
 ==============================================================
 
+shrink_memory
+
+This control is available only when CONFIG_SHRINK_MEMORY is set. This control
+can be used to aggressively reclaim memory system-wide in one shot. A value of
+1 will instruct the kernel to reclaim as much as totalram_pages in the system.
+For example, to reclaim all memory system-wide we can do:
+# echo 1 > /proc/sys/vm/shrink_memory
+
+For more information about this control, please visit the following
+presentation in embedded linux conference, 2015.
+http://events.linuxfoundation.org/sites/events/files/slides/
+%5BELC-2015%5D-System-wide-Memory-Defragmenter.pdf
+
+==============================================================
+
 stat_interval
 
 The time interval between which vm statistics are updated.  The default
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 9a7adfb..6505b0b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -333,6 +333,13 @@ extern int vm_swappiness;
 extern int remove_mapping(struct address_space *mapping, struct page *page);
 extern unsigned long vm_total_pages;
 
+#ifdef CONFIG_SHRINK_MEMORY
+extern int sysctl_shrink_memory;
+extern int sysctl_shrinkmem_handler(struct ctl_table *table, int write,
+		void __user *buffer, size_t *length, loff_t *ppos);
+#endif
+
+
 #ifdef CONFIG_NUMA
 extern int zone_reclaim_mode;
 extern int sysctl_min_unmapped_ratio;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c566b56..2895099 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1351,6 +1351,15 @@ static struct ctl_table vm_table[] = {
 	},
 
 #endif /* CONFIG_COMPACTION */
+#ifdef CONFIG_SHRINK_MEMORY
+	{
+		.procname	= "shrink_memory",
+		.data		= &sysctl_shrink_memory,
+		.maxlen		= sizeof(int),
+		.mode		= 0200,
+		.proc_handler	= sysctl_shrinkmem_handler,
+	},
+#endif
 	{
 		.procname	= "min_free_kbytes",
 		.data		= &min_free_kbytes,
diff --git a/mm/Kconfig b/mm/Kconfig
index b3a60ee..8e04bd9 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -657,3 +657,11 @@ config DEFERRED_STRUCT_PAGE_INIT
 	  when kswapd starts. This has a potential performance impact on
 	  processes running early in the lifetime of the systemm until kswapd
 	  finishes the initialisation.
+
+config SHRINK_MEMORY
+	bool "Allow for system-wide shrinking of memory"
+	default n
+	depends on MMU
+	help
+	  It enables support for system-wide memory reclaim in one shot using
+	  echo 1 > /proc/sys/vm/shrink_memory.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c8d8282..837b88d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3557,7 +3557,7 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
 	wake_up_interruptible(&pgdat->kswapd_wait);
 }
 
-#ifdef CONFIG_HIBERNATION
+#if defined CONFIG_HIBERNATION || CONFIG_SHRINK_MEMORY
 /*
  * Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
  * freed pages.
@@ -3571,12 +3571,17 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
 	struct reclaim_state reclaim_state;
 	struct scan_control sc = {
 		.nr_to_reclaim = nr_to_reclaim,
+#ifdef CONFIG_SHRINK_MEMORY
+		.gfp_mask = (GFP_HIGHUSER_MOVABLE | GFP_RECLAIM_MASK),
+		.hibernation_mode = 0,
+#else
 		.gfp_mask = GFP_HIGHUSER_MOVABLE,
+		.hibernation_mode = 1,
+#endif
 		.priority = DEF_PRIORITY,
 		.may_writepage = 1,
 		.may_unmap = 1,
 		.may_swap = 1,
-		.hibernation_mode = 1,
 	};
 	struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
 	struct task_struct *p = current;
@@ -3597,6 +3602,20 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
 }
 #endif /* CONFIG_HIBERNATION */
 
+#ifdef CONFIG_SHRINK_MEMORY
+int sysctl_shrink_memory;
+/* This is the entry point for system-wide shrink memory
++via /proc/sys/vm/shrink_memory */
+int sysctl_shrinkmem_handler(struct ctl_table *table, int write,
+	void __user *buffer, size_t *length, loff_t *ppos)
+{
+	if (write)
+		shrink_all_memory(totalram_pages);
+
+	return 0;
+}
+#endif
+
 /* It's optimal to keep kswapds on the same CPUs as their memory, but
    not required for correctness.  So if the last cpu in a node goes
    away, we get changed to run anywhere: as the first one comes back,
-- 
1.7.9.5
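
For testing the control from user space, a small helper along the following
lines could trigger the reclaim and compare MemFree before and after. This is
only a sketch and is not part of the patch: the function names meminfo_kb(),
meminfo_file_kb() and trigger_shrink() are made up for illustration, and it
assumes a kernel built with CONFIG_SHRINK_MEMORY=y so that
/proc/sys/vm/shrink_memory exists. Since the entry is created with mode 0200,
the write is expected to need root.

```c
/* Hypothetical user-space helpers; names are illustrative, not from the patch. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Look up a "Key:   12345 kB" line in /proc/meminfo-style text.
 * Returns the numeric value in kB, or -1 if the key is not found.
 */
long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *p = text;

	while (p && *p) {
		if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
			long val;

			if (sscanf(p + klen + 1, "%ld", &val) == 1)
				return val;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return -1;
}

/* Read a meminfo-style file and return the value for key, or -1 on error. */
long meminfo_file_kb(const char *path, const char *key)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return meminfo_kb(buf, key);
}

/*
 * Write "1" to the sysctl file at path (assumed to be
 * /proc/sys/vm/shrink_memory on a kernel carrying this patch).
 * Returns 0 on success or a negative errno value.
 */
int trigger_shrink(const char *path)
{
	ssize_t n;
	int ret = 0;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -errno;
	n = write(fd, "1\n", 2);
	if (n < 0)
		ret = -errno;
	else if (n != 2)
		ret = -EIO;
	close(fd);
	return ret;
}
```

A caller would read meminfo_file_kb("/proc/meminfo", "MemFree"), call
trigger_shrink("/proc/sys/vm/shrink_memory"), and read MemFree again. On a
kernel without the option, trigger_shrink() should fail with -ENOENT because
the file does not exist.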