From: PINTU KUMAR
To: "'Heinrich Schuchardt'", corbet@lwn.net, akpm@linux-foundation.org,
 vbabka@suse.cz, gorcunov@openvz.org, mhocko@suse.cz, emunson@akamai.com,
 kirill.shutemov@linux.intel.com, standby24x7@gmail.com, hannes@cmpxchg.org,
 vdavydov@parallels.com, hughd@google.com, minchan@kernel.org, tj@kernel.org,
 rientjes@google.com, dzickus@redhat.com, prarit@redhat.com,
 ebiederm@xmission.com, rostedt@goodmis.org, uobergfe@redhat.com,
 paulmck@linux.vnet.ibm.com, iamjoonsoo.kim@lge.com, ddstreet@ieee.org,
 sasha.levin@oracle.com, koct9i@gmail.com, mgorman@suse.de, cj@linux.com,
 opensource.ganesh@gmail.com, vinmenon@codeaurora.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-pm@vger.kernel.org
Cc: cpgs@samsung.com, pintu_agarwal@yahoo.com, vishnu.ps@samsung.com,
 rohit.kr@samsung.com, iqbal.ams@samsung.com, pintu.agarwal@gmail.com
References: <1435929607-3435-1-git-send-email-pintu.k@samsung.com>
 <5596C48F.8050800@gmx.de>
In-reply-to: <5596C48F.8050800@gmx.de>
Subject: RE: [PATCH 1/1] kernel/sysctl.c: Add /proc/sys/vm/shrink_memory feature
Date: Mon, 06 Jul 2015 19:22:47 +0530
Message-id: <0ffd01d0b7f3$6bf5a120$43e0e360$@samsung.com>
> -----Original Message-----
> From: Heinrich Schuchardt [mailto:xypron.glpk@gmx.de]
> Sent: Friday, July 03, 2015 10:51 PM
> To: Pintu Kumar; corbet@lwn.net; akpm@linux-foundation.org; vbabka@suse.cz;
> gorcunov@openvz.org; mhocko@suse.cz; emunson@akamai.com;
> kirill.shutemov@linux.intel.com; standby24x7@gmail.com;
> hannes@cmpxchg.org; vdavydov@parallels.com; hughd@google.com;
> minchan@kernel.org; tj@kernel.org; rientjes@google.com;
> dzickus@redhat.com; prarit@redhat.com; ebiederm@xmission.com;
> rostedt@goodmis.org; uobergfe@redhat.com; paulmck@linux.vnet.ibm.com;
> iamjoonsoo.kim@lge.com; ddstreet@ieee.org; sasha.levin@oracle.com;
> koct9i@gmail.com; mgorman@suse.de; cj@linux.com;
> opensource.ganesh@gmail.com; vinmenon@codeaurora.org;
> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> linux-pm@vger.kernel.org
> Cc: cpgs@samsung.com; pintu_agarwal@yahoo.com; vishnu.ps@samsung.com;
> rohit.kr@samsung.com; iqbal.ams@samsung.com
> Subject: Re: [PATCH 1/1] kernel/sysctl.c: Add /proc/sys/vm/shrink_memory
> feature
>
> On 03.07.2015 15:20, Pintu Kumar wrote:
> > This patch provides 2 things:
> > 1. Add new control called shrink_memory in /proc/sys/vm/.
> > This control can be used to aggressively reclaim memory system-wide in
> > one shot from the user space. A value of 1 will instruct the kernel to
> > reclaim as much as totalram_pages in the system.
> > Example: echo 1 > /proc/sys/vm/shrink_memory
> >
> > 2. Enable shrink_all_memory API in kernel with new CONFIG_SHRINK_MEMORY.
> > Currently, shrink_all_memory function is used only during hibernation.
> > With the new config we can make use of this API for non-hibernation
> > case also without disturbing the hibernation case.
> >
> > The detailed paper was presented in Embedded Linux Conference,
> > Mar-2015 http://events.linuxfoundation.org/sites/events/files/slides/
> > %5BELC-2015%5D-System-wide-Memory-Defragmenter.pdf
> >
> > Scenarios where this can be used and helpful are:
> > 1) Can be invoked just after system boot-up is finished.
> > 2) Can be invoked just before entering entire system suspend.
> > 3) Can be invoked from kernel when order-4 pages start failing.
> > 4) Can be helpful to completely avoid or delay the kernel OOM condition.
> > 5) Can be developed as a system-tool to quickly defragment entire system
> > from user space, without the need to kill any application.
> >
> > Signed-off-by: Pintu Kumar
> > ---
> >  Documentation/sysctl/vm.txt | 16 ++++++++++++++++
> >  include/linux/swap.h        |  7 +++++++
> >  kernel/sysctl.c             |  9 +++++++++
> >  mm/Kconfig                  |  8 ++++++++
> >  mm/vmscan.c                 | 23 +++++++++++++++++++++--
> >  5 files changed, 61 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> > index 9832ec5..a959ad1 100644
> > --- a/Documentation/sysctl/vm.txt
> > +++ b/Documentation/sysctl/vm.txt
> > @@ -54,6 +54,7 @@ Currently, these files are in /proc/sys/vm:
> >  - page-cluster
> >  - panic_on_oom
> >  - percpu_pagelist_fraction
> > +- shrink_memory
> >  - stat_interval
> >  - swappiness
> >  - user_reserve_kbytes
> > @@ -718,6 +719,21 @@ sysctl, it will revert to this default behavior.
> >
> >  ==============================================================
> >
> > +shrink_memory
> > +
> > +This control is available only when CONFIG_SHRINK_MEMORY is set. This
> > +control can be used to aggressively reclaim memory system-wide in one
> > +shot. A value of
> > +1 will instruct the kernel to reclaim as much as totalram_pages in the system.
> > +For example, to reclaim all memory system-wide we can do:
> > +# echo 1 > /proc/sys/vm/shrink_memory
>
> The API should be as restrictive as possible to allow for extensibility.
>
> You describe "1" as the only used value. So, please add here:
>
> "If any other value than 1 is written to shrink_memory an error EINVAL occurs."
>

OK, I will handle this error case in the next patch set.
Actually, I modelled this control on compact_memory, which is why I did it
this way.

> > +
> > +For more information about this control, please visit the following
> > +presentation in embedded linux conference, 2015.
> > +http://events.linuxfoundation.org/sites/events/files/slides/
> > +%5BELC-2015%5D-System-wide-Memory-Defragmenter.pdf
> > +
> > +==============================================================
> > +
> >  stat_interval
> >
> >  The time interval between which vm statistics are updated. The default
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index 9a7adfb..6505b0b 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -333,6 +333,13 @@ extern int vm_swappiness;
> >  extern int remove_mapping(struct address_space *mapping, struct page *page);
> >  extern unsigned long vm_total_pages;
> >
> > +#ifdef CONFIG_SHRINK_MEMORY
> > +extern int sysctl_shrink_memory;
> > +extern int sysctl_shrinkmem_handler(struct ctl_table *table, int write,
> > +		void __user *buffer, size_t *length, loff_t *ppos);
> > +#endif
> > +
> > +
> >  #ifdef CONFIG_NUMA
> >  extern int zone_reclaim_mode;
> >  extern int sysctl_min_unmapped_ratio;
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index c566b56..2895099 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -1351,6 +1351,15 @@ static struct ctl_table vm_table[] = {
> >  	},
> >
> >  #endif /* CONFIG_COMPACTION */
> > +#ifdef CONFIG_SHRINK_MEMORY
> > +	{
> > +		.procname = "shrink_memory",
> > +		.data = &sysctl_shrink_memory,
> > +		.maxlen = sizeof(int),
> > +		.mode = 0200,
> > +		.proc_handler = sysctl_shrinkmem_handler,
>
> Supply the value limits.
>
> int min_shrink_memory = 1;
> int max_shrink_memory = 1;
>
> .extra1 = &min_shrink_memory,
> .extra2 = &max_shrink_memory,
>

OK, I will include these limits as well in the new patch set.

> > +	},
> > +#endif
> >  	{
> >  		.procname = "min_free_kbytes",
> >  		.data = &min_free_kbytes,
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index b3a60ee..8e04bd9 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -657,3 +657,11 @@ config DEFERRED_STRUCT_PAGE_INIT
> >  	  when kswapd starts. This has a potential performance impact on
> >  	  processes running early in the lifetime of the systemm until kswapd
> >  	  finishes the initialisation.
> > +
> > +config SHRINK_MEMORY
> > +	bool "Allow for system-wide shrinking of memory"
> > +	default n
> > +	depends on MMU
> > +	help
> > +	  It enables support for system-wide memory reclaim in one shot using
> > +	  echo 1 > /proc/sys/vm/shrink_memory.
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index c8d8282..837b88d 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3557,7 +3557,7 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
> >  	wake_up_interruptible(&pgdat->kswapd_wait);
> >  }
> >
> > -#ifdef CONFIG_HIBERNATION
> > +#if defined CONFIG_HIBERNATION || CONFIG_SHRINK_MEMORY
> >  /*
> >   * Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
> >   * freed pages.
> > @@ -3571,12 +3571,17 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
> >  	struct reclaim_state reclaim_state;
> >  	struct scan_control sc = {
> >  		.nr_to_reclaim = nr_to_reclaim,
> > +#ifdef CONFIG_SHRINK_MEMORY
> > +		.gfp_mask = (GFP_HIGHUSER_MOVABLE | GFP_RECLAIM_MASK),
> > +		.hibernation_mode = 0,
> > +#else
> >  		.gfp_mask = GFP_HIGHUSER_MOVABLE,
> > +		.hibernation_mode = 1,
> > +#endif
> >  		.priority = DEF_PRIORITY,
> >  		.may_writepage = 1,
> >  		.may_unmap = 1,
> >  		.may_swap = 1,
> > -		.hibernation_mode = 1,
> >  	};
> >  	struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
> >  	struct task_struct *p = current;
> > @@ -3597,6 +3602,20 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
> >  }
> >  #endif /* CONFIG_HIBERNATION */
> >
> > +#ifdef CONFIG_SHRINK_MEMORY
> > +int sysctl_shrink_memory;
> > +/* This is the entry point for system-wide shrink memory
> > ++via /proc/sys/vm/shrink_memory */
> > +int sysctl_shrinkmem_handler(struct ctl_table *table, int write,
> > +		void __user *buffer, size_t *length, loff_t *ppos)
> > +{
>
> Check if *buffer contains "1". If the value is not "1" return -EINVAL.
>
> The check can be done using function proc_dointvec_minmax().
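
For illustration, the range check requested in the review can be modelled in
plain userspace C. This is only a rough sketch, not kernel code and not the
code of the follow-up patch: parse_shrink_value() is a hypothetical helper,
and min_shrink_memory / max_shrink_memory simply reuse the names suggested
above for .extra1 and .extra2. It shows the intended behaviour: "1" is
accepted and anything else is rejected with -EINVAL.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical limits, named after the reviewer's extra1/extra2 suggestion. */
static const int min_shrink_memory = 1;
static const int max_shrink_memory = 1;

/*
 * Userspace model only (assumption: not the kernel's implementation).
 * Parse the text written to the control and accept it only if it is a
 * decimal integer within [min, max]. Returns 0 and stores the value in
 * *out on success, or -EINVAL otherwise.
 */
static int parse_shrink_value(const char *buf, int min, int max, int *out)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf || errno == ERANGE)
		return -EINVAL;		/* no digits, or value overflowed */

	/* Tolerate the trailing newline produced by "echo 1 > file". */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;		/* unexpected trailing characters */

	if (val < min || val > max)
		return -EINVAL;		/* outside the permitted range */

	*out = (int)val;
	return 0;
}
```

With both limits set to 1, only a write of "1" would go on to trigger the
reclaim; "0", "2" or non-numeric input would fail before any reclaim starts.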
>

OK, I will also handle this case in the new patch set.
Thanks for the review and suggestions.

> Best regards
>
> Heinrich Schuchardt
>
> > +	if (write)
> > +		shrink_all_memory(totalram_pages);
> > +
> > +	return 0;
> > +}
> > +#endif
> > +
> >  /* It's optimal to keep kswapds on the same CPUs as their memory, but
> >     not required for correctness. So if the last cpu in a node goes
> >     away, we get changed to run anywhere: as the first one comes back,
> > --

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/