2014-01-07 00:09:18

by Andi Kleen

[permalink] [raw]
Subject: [PATCH] Add a sysctl for numa_balancing v2

From: Andi Kleen <[email protected]>

[It turns out the documentation patch was already merged
earlier. So just resending without documentation.]

As discussed earlier, this adds a working sysctl to enable/disable
automatic numa memory balancing at runtime.

This allows to track down performance problems with this
feature and is generally a good idea.

This was possible earlier through debugfs, but only with special
debugging options set. Also fix the boot message.

v2: Remove documentation as the documentation for this
sysctl was already merged earlier.
Acked-by: Mel Gorman <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
---
include/linux/sched/sysctl.h | 4 ++++
kernel/sched/core.c | 24 +++++++++++++++++++++++-
kernel/sysctl.c | 9 +++++++++
mm/mempolicy.c | 2 +-
4 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 41467f8..e134535 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -100,4 +100,8 @@ extern int sched_rt_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp,
loff_t *ppos);

+extern int sched_numa_balancing(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos);
+
#endif /* _SCHED_SYSCTL_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a88f4a4..4dc22da 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1763,7 +1763,29 @@ void set_numabalancing_state(bool enabled)
numabalancing_enabled = enabled;
}
#endif /* CONFIG_SCHED_DEBUG */
-#endif /* CONFIG_NUMA_BALANCING */
+
+#ifdef CONFIG_PROC_SYSCTL
+int sched_numa_balancing(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table t;
+ int err;
+ int state = numabalancing_enabled;
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ t = *table;
+ t.data = &state;
+ err = proc_dointvec_minmax(&t, write, buffer, lenp, ppos);
+ if (err < 0)
+ return err;
+ if (write)
+ set_numabalancing_state(state);
+ return err;
+}
+#endif
+#endif

/*
* fork()/clone()-time setup:
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 34a6047..9e0e790 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -398,6 +398,15 @@ static struct ctl_table kern_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec,
},
+ {
+ .procname = "numa_balancing",
+ .data = NULL, /* filled in by handler */
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = sched_numa_balancing,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
#endif /* CONFIG_NUMA_BALANCING */
#endif /* CONFIG_SCHED_DEBUG */
{
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 0cd2c4d..947293e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2668,7 +2668,7 @@ static void __init check_numabalancing_enable(void)

if (nr_node_ids > 1 && !numabalancing_override) {
printk(KERN_INFO "Enabling automatic NUMA balancing. "
- "Configure with numa_balancing= or sysctl");
+ "Configure with numa_balancing= or the kernel.numa_balancing sysctl");
set_numabalancing_state(numabalancing_default);
}
}
--
1.8.3.1


2014-01-07 21:58:28

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] Add a sysctl for numa_balancing v2

On Mon, 6 Jan 2014 16:08:46 -0800 Andi Kleen <[email protected]> wrote:

> From: Andi Kleen <[email protected]>
>
> [It turns out the documentation patch was already merged
> earlier. So just resending without documentation.]

Confused. How could we have merged the documentation for this feature
but not the feature itself?

> As discussed earlier, this adds a working sysctl to enable/disable
> automatic numa memory balancing at runtime.
>
> This allows to track down performance problems with this
> feature and is generally a good idea.
>
> This was possible earlier through debugfs, but only with special
> debugging options set. Also fix the boot message.
>
> ...
>
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -398,6 +398,15 @@ static struct ctl_table kern_table[] = {
> .mode = 0644,
> .proc_handler = proc_dointvec,
> },
> + {
> + .procname = "numa_balancing",
> + .data = NULL, /* filled in by handler */
> + .maxlen = sizeof(unsigned int),
> + .mode = 0644,
> + .proc_handler = sched_numa_balancing,
> + .extra1 = &zero,
> + .extra2 = &one,
> + },

The name "sched_numa_balancing" is wrong. All the other entries use
"sysctl_numa_balancing_foo", and so should this one.

--- a/include/linux/sched/sysctl.h~numa-add-a-sysctl-for-numa_balancing-fix
+++ a/include/linux/sched/sysctl.h
@@ -100,7 +100,7 @@ extern int sched_rt_handler(struct ctl_t
void __user *buffer, size_t *lenp,
loff_t *ppos);

-extern int sched_numa_balancing(struct ctl_table *table, int write,
+extern int sysctl_numa_balancing(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp,
loff_t *ppos);

--- a/kernel/sched/core.c~numa-add-a-sysctl-for-numa_balancing-fix
+++ a/kernel/sched/core.c
@@ -1766,7 +1766,7 @@ void set_numabalancing_state(bool enable
#endif /* CONFIG_SCHED_DEBUG */

#ifdef CONFIG_PROC_SYSCTL
-int sched_numa_balancing(struct ctl_table *table, int write,
+int sysctl_numa_balancing(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
struct ctl_table t;
--- a/kernel/sysctl.c~numa-add-a-sysctl-for-numa_balancing-fix
+++ a/kernel/sysctl.c
@@ -401,7 +401,7 @@ static struct ctl_table kern_table[] = {
.data = NULL, /* filled in by handler */
.maxlen = sizeof(unsigned int),
.mode = 0644,
- .proc_handler = sched_numa_balancing,
+ .proc_handler = sysctl_numa_balancing,
.extra1 = &zero,
.extra2 = &one,
},
_

2014-01-07 22:13:24

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] Add a sysctl for numa_balancing v2

On Tue, Jan 07, 2014 at 01:58:17PM -0800, Andrew Morton wrote:
> On Mon, 6 Jan 2014 16:08:46 -0800 Andi Kleen <[email protected]> wrote:
>
> > From: Andi Kleen <[email protected]>
> >
> > [It turns out the documentation patch was already merged
> > earlier. So just resending without documentation.]
>
> Confused. How could we have merged the documentation for this feature
> but not the feature itself?

Originally all numa balancing sysctl were undocumented. I had
a TBD for this in my original patch. Mel then documented them,
but also included the documentation for the new sysctl in that patch.
Mel's documentation patch was then merged
(as part of his regular merges I suppose), but the global sysctl
patch wasn't.

-Andi

2014-01-07 23:02:09

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH] Add a sysctl for numa_balancing v2

On Tue, Jan 07, 2014 at 11:13:16PM +0100, Andi Kleen wrote:
> On Tue, Jan 07, 2014 at 01:58:17PM -0800, Andrew Morton wrote:
> > On Mon, 6 Jan 2014 16:08:46 -0800 Andi Kleen <[email protected]> wrote:
> >
> > > From: Andi Kleen <[email protected]>
> > >
> > > [It turns out the documentation patch was already merged
> > > earlier. So just resending without documentation.]
> >
> > Confused. How could we have merged the documentation for this feature
> > but not the feature itself?
>
> Originally all numa balancing sysctl were undocumented. I had
> a TBD for this in my original patch. Mel then documented them,
> but also included the documentation for the new sysctl in that patch.
> Mel's documentation patch was then merged
> (as part of his regular merges I suppose), but the global sysctl
> patch wasn't.
>

Yeah, it was an oversight on my part. I can't remember if I accidentally
dropped Andi's patch or thought it had been merged already. Sorry.

--
Mel Gorman
SUSE Labs