2013-07-04 16:07:27

by Michal Hocko

Subject: [RFC] mm: Honor min_free_kbytes set by user

min_free_kbytes is currently recalculated during memory hotplug (by
init_per_zone_wmark_min). This is the right thing to do in most cases,
but it can be unexpected if the admin has increased the value to prevent
allocation failures and memory hotadd then lowers min_free_kbytes
again.

This patch saves the user-defined value and updates min_free_kbytes
only if the newly computed value is higher than the saved one.

A warning is printed when the new value is ignored.

Signed-off-by: Michal Hocko <[email protected]>
---
mm/page_alloc.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 22c528e..a785fad 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -204,6 +204,7 @@ static char * const zone_names[MAX_NR_ZONES] = {
};

int min_free_kbytes = 1024;
+int user_min_free_kbytes;

static unsigned long __meminitdata nr_kernel_pages;
static unsigned long __meminitdata nr_all_pages;
@@ -5592,14 +5593,22 @@ static void __meminit setup_per_zone_inactive_ratio(void)
int __meminit init_per_zone_wmark_min(void)
{
unsigned long lowmem_kbytes;
+ int new_min_free_kbytes;

lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
-
- min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
- if (min_free_kbytes < 128)
- min_free_kbytes = 128;
- if (min_free_kbytes > 65536)
- min_free_kbytes = 65536;
+ new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
+
+ if (new_min_free_kbytes > user_min_free_kbytes) {
+ min_free_kbytes = new_min_free_kbytes;
+ if (min_free_kbytes < 128)
+ min_free_kbytes = 128;
+ if (min_free_kbytes > 65536)
+ min_free_kbytes = 65536;
+ } else {
+ printk(KERN_WARNING "min_free_kbytes is not updated to %d"
+ "because user defined value %d is preferred\n",
+ new_min_free_kbytes, user_min_free_kbytes);
+ }
setup_per_zone_wmarks();
refresh_zone_stat_thresholds();
setup_per_zone_lowmem_reserve();
@@ -5617,8 +5626,10 @@ int min_free_kbytes_sysctl_handler(ctl_table *table, int write,
void __user *buffer, size_t *length, loff_t *ppos)
{
proc_dointvec(table, write, buffer, length, ppos);
- if (write)
+ if (write) {
+ user_min_free_kbytes = min_free_kbytes;
setup_per_zone_wmarks();
+ }
return 0;
}

--
1.8.3.1


2013-07-04 16:10:41

by Joe Perches

Subject: Re: [RFC] mm: Honor min_free_kbytes set by user

On Thu, 2013-07-04 at 18:07 +0200, Michal Hocko wrote:
> A warning is printed when the new value is ignored.

[]

> + printk(KERN_WARNING "min_free_kbytes is not updated to %d"
> + "because user defined value %d is preferred\n",
> + new_min_free_kbytes, user_min_free_kbytes);

Please use pr_warn and coalesce the format.
You'd've noticed a missing space between %d and because.
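
To illustrate the point about the split format string, here is a minimal userspace sketch. C joins adjacent string literals with nothing in between, so the two-line format from the patch prints the number directly against "because". The helper names format_split() and format_coalesced() are made up for this illustration, and snprintf() stands in for printk()/pr_warn().

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats the warning as the first version of the
 * patch did, with the format split across two adjacent string literals.
 * The compiler concatenates them as-is, so no space follows the first %d.
 */
static int format_split(char *buf, size_t len, int new_val, int user_val)
{
	return snprintf(buf, len,
			"min_free_kbytes is not updated to %d"
			"because user defined value %d is preferred\n",
			new_val, user_val);
}

/*
 * Hypothetical helper: the same message as a single literal, which makes
 * the spacing visible in the source and keeps the message greppable.
 */
static int format_coalesced(char *buf, size_t len, int new_val, int user_val)
{
	return snprintf(buf, len,
			"min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
			new_val, user_val);
}
```

Calling both helpers with the same arguments shows the difference: the split version yields "...updated to 8192because user...", while the coalesced one yields "...updated to 8192 because user...".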

2013-07-04 16:16:44

by Michal Hocko

Subject: Re: [RFC] mm: Honor min_free_kbytes set by user

On Thu 04-07-13 09:10:39, Joe Perches wrote:
> On Thu, 2013-07-04 at 18:07 +0200, Michal Hocko wrote:
> > A warning is printed when the new value is ignored.
>
> []
>
> > + printk(KERN_WARNING "min_free_kbytes is not updated to %d"
> > + "because user defined value %d is preferred\n",
> > + new_min_free_kbytes, user_min_free_kbytes);
>
> Please use pr_warn and coalesce the format.

Sure, can do that. mm/page_alloc.c doesn't seem to be unified in that
regard (44 printks and only 4 pr_<foo>), so I used printk.

> You'd've noticed a missing space between %d and because.

True

Thanks
--
Michal Hocko
SUSE Labs

2013-07-04 16:20:08

by Michal Hocko

Subject: Re: [RFC] mm: Honor min_free_kbytes set by user

On Thu 04-07-13 18:16:41, Michal Hocko wrote:
> On Thu 04-07-13 09:10:39, Joe Perches wrote:
> > On Thu, 2013-07-04 at 18:07 +0200, Michal Hocko wrote:
> > > A warning is printed when the new value is ignored.
> >
> > []
> >
> > > + printk(KERN_WARNING "min_free_kbytes is not updated to %d"
> > > + "because user defined value %d is preferred\n",
> > > + new_min_free_kbytes, user_min_free_kbytes);
> >
> > Please use pr_warn and coalesce the format.
>
> Sure can do that. mm/page_alloc.c doesn't seem to be unified in that
> regards (44 printks and only 4 pr_<foo>) so I used printk.
>
> > You'd've noticed a missing space between %d and because.
>
> True
>

Checkpatch fixes
---
From 5f089c0b2a57ff6c08710ac9698d65aede06079f Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Thu, 4 Jul 2013 17:15:54 +0200
Subject: [PATCH] mm: Honor min_free_kbytes set by user

min_free_kbytes is currently recalculated during memory hotplug (by
init_per_zone_wmark_min). This is the right thing to do in most cases,
but it can be unexpected if the admin has increased the value to prevent
allocation failures and memory hotadd then lowers min_free_kbytes
again.

This patch saves the user-defined value and updates min_free_kbytes
only if the newly computed value is higher than the saved one.

A warning is printed when the new value is ignored.

Signed-off-by: Michal Hocko <[email protected]>
---
mm/page_alloc.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 22c528e..9c011fc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -204,6 +204,7 @@ static char * const zone_names[MAX_NR_ZONES] = {
};

int min_free_kbytes = 1024;
+int user_min_free_kbytes;

static unsigned long __meminitdata nr_kernel_pages;
static unsigned long __meminitdata nr_all_pages;
@@ -5592,14 +5593,21 @@ static void __meminit setup_per_zone_inactive_ratio(void)
int __meminit init_per_zone_wmark_min(void)
{
unsigned long lowmem_kbytes;
+ int new_min_free_kbytes;

lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
-
- min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
- if (min_free_kbytes < 128)
- min_free_kbytes = 128;
- if (min_free_kbytes > 65536)
- min_free_kbytes = 65536;
+ new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
+
+ if (new_min_free_kbytes > user_min_free_kbytes) {
+ min_free_kbytes = new_min_free_kbytes;
+ if (min_free_kbytes < 128)
+ min_free_kbytes = 128;
+ if (min_free_kbytes > 65536)
+ min_free_kbytes = 65536;
+ } else {
+ pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
+ new_min_free_kbytes, user_min_free_kbytes);
+ }
setup_per_zone_wmarks();
refresh_zone_stat_thresholds();
setup_per_zone_lowmem_reserve();
@@ -5617,8 +5625,10 @@ int min_free_kbytes_sysctl_handler(ctl_table *table, int write,
void __user *buffer, size_t *length, loff_t *ppos)
{
proc_dointvec(table, write, buffer, length, ppos);
- if (write)
+ if (write) {
+ user_min_free_kbytes = min_free_kbytes;
setup_per_zone_wmarks();
+ }
return 0;
}

--
1.8.3.1

--
Michal Hocko
SUSE Labs
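
The behaviour of the patched code can be tried out in userspace. The sketch below is an approximation only, assuming plain globals in place of the kernel variables and a naive isqrt_ull() in place of int_sqrt(). The names isqrt_ull(), recompute_min_free_kbytes() and admin_sets_min_free_kbytes() are invented for this illustration and are not kernel functions.

```c
#include <stdio.h>

/* Userspace stand-ins for the kernel globals touched by the patch. */
static int min_free_kbytes = 1024;
static int user_min_free_kbytes;

/* Naive integer square root; a stand-in for the kernel's int_sqrt(). */
static unsigned long long isqrt_ull(unsigned long long x)
{
	unsigned long long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/*
 * Mirrors the logic of the patched init_per_zone_wmark_min(): the value
 * derived from low memory is applied only if it exceeds the value the
 * admin wrote. Returns 1 if min_free_kbytes was updated, 0 otherwise.
 */
static int recompute_min_free_kbytes(unsigned long long lowmem_kbytes)
{
	int new_min_free_kbytes = (int)isqrt_ull(lowmem_kbytes * 16);

	if (new_min_free_kbytes > user_min_free_kbytes) {
		min_free_kbytes = new_min_free_kbytes;
		if (min_free_kbytes < 128)
			min_free_kbytes = 128;
		if (min_free_kbytes > 65536)
			min_free_kbytes = 65536;
		return 1;
	}

	fprintf(stderr,
		"min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
		new_min_free_kbytes, user_min_free_kbytes);
	return 0;
}

/* Mirrors the write path of min_free_kbytes_sysctl_handler(). */
static void admin_sets_min_free_kbytes(int value)
{
	min_free_kbytes = value;
	user_min_free_kbytes = min_free_kbytes;
}
```

With 4 GiB of low memory (4194304 kB) the computed value is sqrt(4194304 * 16) = 8192. If the admin then writes 65536 and memory is hot-added up to 16 GiB (16777216 kB), the computed value is 16384, which is below the saved 65536, so min_free_kbytes is left alone and the warning is printed.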

2013-07-04 16:35:49

by Zhang Yanfei

Subject: Re: [RFC] mm: Honor min_free_kbytes set by user

On 07/05/2013 12:20 AM, Michal Hocko wrote:

[snip]

> ---
> From 5f089c0b2a57ff6c08710ac9698d65aede06079f Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Thu, 4 Jul 2013 17:15:54 +0200
> Subject: [PATCH] mm: Honor min_free_kbytes set by user
>
> min_free_kbytes is updated during memory hotplug (by init_per_zone_wmark_min)
> currently which is right thing to do in most cases but this could be
> unexpected if admin increased the value to prevent from allocation
> failures and the new min_free_kbytes would be decreased as a result of
> memory hotadd.
>
> This patch saves the user defined value and allows updating
> min_free_kbytes only if it is higher than the saved one.
>
> A warning is printed when the new value is ignored.

Looks reasonable.

Acked-by: Zhang Yanfei <[email protected]>

>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> mm/page_alloc.c | 24 +++++++++++++++++-------
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 22c528e..9c011fc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -204,6 +204,7 @@ static char * const zone_names[MAX_NR_ZONES] = {
> };
>
> int min_free_kbytes = 1024;
> +int user_min_free_kbytes;
>
> static unsigned long __meminitdata nr_kernel_pages;
> static unsigned long __meminitdata nr_all_pages;
> @@ -5592,14 +5593,21 @@ static void __meminit setup_per_zone_inactive_ratio(void)
> int __meminit init_per_zone_wmark_min(void)
> {
> unsigned long lowmem_kbytes;
> + int new_min_free_kbytes;
>
> lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
> -
> - min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
> - if (min_free_kbytes < 128)
> - min_free_kbytes = 128;
> - if (min_free_kbytes > 65536)
> - min_free_kbytes = 65536;
> + new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
> +
> + if (new_min_free_kbytes > user_min_free_kbytes) {
> + min_free_kbytes = new_min_free_kbytes;
> + if (min_free_kbytes < 128)
> + min_free_kbytes = 128;
> + if (min_free_kbytes > 65536)
> + min_free_kbytes = 65536;
> + } else {
> + pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
> + new_min_free_kbytes, user_min_free_kbytes);
> + }
> setup_per_zone_wmarks();
> refresh_zone_stat_thresholds();
> setup_per_zone_lowmem_reserve();
> @@ -5617,8 +5625,10 @@ int min_free_kbytes_sysctl_handler(ctl_table *table, int write,
> void __user *buffer, size_t *length, loff_t *ppos)
> {
> proc_dointvec(table, write, buffer, length, ppos);
> - if (write)
> + if (write) {
> + user_min_free_kbytes = min_free_kbytes;
> setup_per_zone_wmarks();
> + }
> return 0;
> }
>


--
Thanks.
Zhang Yanfei

2013-07-08 18:48:07

by KOSAKI Motohiro

Subject: Re: [RFC] mm: Honor min_free_kbytes set by user

> Checkpatch fixes
> ---
> From 5f089c0b2a57ff6c08710ac9698d65aede06079f Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Thu, 4 Jul 2013 17:15:54 +0200
> Subject: [PATCH] mm: Honor min_free_kbytes set by user
>
> min_free_kbytes is updated during memory hotplug (by init_per_zone_wmark_min)
> currently which is right thing to do in most cases but this could be
> unexpected if admin increased the value to prevent from allocation
> failures and the new min_free_kbytes would be decreased as a result of
> memory hotadd.
>
> This patch saves the user defined value and allows updating
> min_free_kbytes only if it is higher than the saved one.
>
> A warning is printed when the new value is ignored.
>
> Signed-off-by: Michal Hocko <[email protected]>

Thank you. I have a similar patch, but for a long time I have not gotten
around to refining and posting it.
Yes, the current logic is not memory-hotplug aware and could be dangerous.

Acked-by: KOSAKI Motohiro <[email protected]>

2013-07-09 23:41:36

by Jiri Kosina

Subject: Re: [RFC] mm: Honor min_free_kbytes set by user

On Thu, 4 Jul 2013, Michal Hocko wrote:

> On Thu 04-07-13 18:16:41, Michal Hocko wrote:
> > On Thu 04-07-13 09:10:39, Joe Perches wrote:
> > > On Thu, 2013-07-04 at 18:07 +0200, Michal Hocko wrote:
> > > > A warning is printed when the new value is ignored.
> > >
> > > []
> > >
> > > > + printk(KERN_WARNING "min_free_kbytes is not updated to %d"
> > > > + "because user defined value %d is preferred\n",
> > > > + new_min_free_kbytes, user_min_free_kbytes);
> > >
> > > Please use pr_warn and coalesce the format.
> >
> > Sure can do that. mm/page_alloc.c doesn't seem to be unified in that
> > regards (44 printks and only 4 pr_<foo>) so I used printk.
> >
> > > You'd've noticed a missing space between %d and because.
> >
> > True
> >
>
> Checkpatch fixes
> ---
> From 5f089c0b2a57ff6c08710ac9698d65aede06079f Mon Sep 17 00:00:00 2001
> From: Michal Hocko <[email protected]>
> Date: Thu, 4 Jul 2013 17:15:54 +0200
> Subject: [PATCH] mm: Honor min_free_kbytes set by user
>
> min_free_kbytes is updated during memory hotplug (by init_per_zone_wmark_min)
> currently which is right thing to do in most cases but this could be
> unexpected if admin increased the value to prevent from allocation
> failures and the new min_free_kbytes would be decreased as a result of
> memory hotadd.
>
> This patch saves the user defined value and allows updating
> min_free_kbytes only if it is higher than the saved one.
>
> A warning is printed when the new value is ignored.
>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> mm/page_alloc.c | 24 +++++++++++++++++-------
> 1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 22c528e..9c011fc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -204,6 +204,7 @@ static char * const zone_names[MAX_NR_ZONES] = {
> };
>
> int min_free_kbytes = 1024;
> +int user_min_free_kbytes;

Minor nit: any reason this can't be static?

>
> static unsigned long __meminitdata nr_kernel_pages;
> static unsigned long __meminitdata nr_all_pages;
> @@ -5592,14 +5593,21 @@ static void __meminit setup_per_zone_inactive_ratio(void)
> int __meminit init_per_zone_wmark_min(void)
> {
> unsigned long lowmem_kbytes;
> + int new_min_free_kbytes;
>
> lowmem_kbytes = nr_free_buffer_pages() * (PAGE_SIZE >> 10);
> -
> - min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
> - if (min_free_kbytes < 128)
> - min_free_kbytes = 128;
> - if (min_free_kbytes > 65536)
> - min_free_kbytes = 65536;
> + new_min_free_kbytes = int_sqrt(lowmem_kbytes * 16);
> +
> + if (new_min_free_kbytes > user_min_free_kbytes) {
> + min_free_kbytes = new_min_free_kbytes;
> + if (min_free_kbytes < 128)
> + min_free_kbytes = 128;
> + if (min_free_kbytes > 65536)
> + min_free_kbytes = 65536;
> + } else {
> + pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
> + new_min_free_kbytes, user_min_free_kbytes);
> + }
> setup_per_zone_wmarks();
> refresh_zone_stat_thresholds();
> setup_per_zone_lowmem_reserve();
> @@ -5617,8 +5625,10 @@ int min_free_kbytes_sysctl_handler(ctl_table *table, int write,
> void __user *buffer, size_t *length, loff_t *ppos)
> {
> proc_dointvec(table, write, buffer, length, ppos);
> - if (write)
> + if (write) {
> + user_min_free_kbytes = min_free_kbytes;
> setup_per_zone_wmarks();
> + }
> return 0;
> }
>
>

--
Jiri Kosina
SUSE Labs

2013-07-10 06:57:53

by Michal Hocko

Subject: Re: [RFC] mm: Honor min_free_kbytes set by user

On Wed 10-07-13 01:40:06, Jiri Kosina wrote:
> On Thu, 4 Jul 2013, Michal Hocko wrote:
[...]
> > From 5f089c0b2a57ff6c08710ac9698d65aede06079f Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <[email protected]>
> > Date: Thu, 4 Jul 2013 17:15:54 +0200
> > Subject: [PATCH] mm: Honor min_free_kbytes set by user
> >
> > min_free_kbytes is updated during memory hotplug (by init_per_zone_wmark_min)
> > currently which is right thing to do in most cases but this could be
> > unexpected if admin increased the value to prevent from allocation
> > failures and the new min_free_kbytes would be decreased as a result of
> > memory hotadd.
> >
> > This patch saves the user defined value and allows updating
> > min_free_kbytes only if it is higher than the saved one.
> >
> > A warning is printed when the new value is ignored.
> >
> > Signed-off-by: Michal Hocko <[email protected]>
> > ---
> > mm/page_alloc.c | 24 +++++++++++++++++-------
> > 1 file changed, 17 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 22c528e..9c011fc 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -204,6 +204,7 @@ static char * const zone_names[MAX_NR_ZONES] = {
> > };
> >
> > int min_free_kbytes = 1024;
> > +int user_min_free_kbytes;
>
> Minor nit: any reason this can't be static?

Yes, it can and should be static. Care to queue a fix in your trivial
tree? I can post a fix if you want.

Thanks
--
Michal Hocko
SUSE Labs