Subject: [PATCH] sched: Print sched_group::__cpu_power in sched_domain_debug.

A simple patch to print the value of __cpu_power in sched_domain_debug.

---

Gautham R Shenoy (1):
sched: Print sched_group::__cpu_power in sched_domain_debug.


kernel/sched.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

--
Thanks and Regards
gautham.


Subject: [PATCH] sched: Print sched_group::__cpu_power in sched_domain_debug.

If the user changes the value of the sched_mc/smt_power_savings sysfs tunable,
it'll trigger a rebuilding of the whole sched_domain tree, with the
SD_POWERSAVINGS_BALANCE flag set at certain levels.

As a result, there would be a change in the __cpu_power of sched_groups
in the sched_domain hierarchy.

Print the __cpu_power values for each sched_group in sched_domain_debug
to help verify this change and correlate it with the change in the
load-balancing behavior.

Signed-off-by: Gautham R Shenoy <[email protected]>
---

kernel/sched.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 706517c..fbac83b 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7363,7 +7363,8 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		cpumask_or(groupmask, groupmask, sched_group_cpus(group));

 		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
-		printk(KERN_CONT " %s", str);
+		printk(KERN_CONT " %s (__cpu_power = %d)", str,
+						group->__cpu_power);

 		group = group->next;
 	} while (group != sd->groups);

Subject: [tip:sched/urgent] sched: Print sched_group::__cpu_power in sched_domain_debug

Commit-ID: 46e0bb9c12f4bab539736f1714cbf16600f681ec
Gitweb: http://git.kernel.org/tip/46e0bb9c12f4bab539736f1714cbf16600f681ec
Author: Gautham R Shenoy <[email protected]>
AuthorDate: Mon, 30 Mar 2009 10:25:20 +0530
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 1 Apr 2009 17:58:03 +0200

sched: Print sched_group::__cpu_power in sched_domain_debug

Impact: extend debug info /proc/sched_debug

If the user changes the value of the sched_mc/smt_power_savings sysfs
tunable, it'll trigger a rebuilding of the whole sched_domain tree,
with the SD_POWERSAVINGS_BALANCE flag set at certain levels.

As a result, there would be a change in the __cpu_power of sched_groups
in the sched_domain hierarchy.

Print the __cpu_power values for each sched_group in sched_domain_debug
to help verify this change and correlate it with the change in the
load-balancing behavior.

Signed-off-by: Gautham R Shenoy <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/sched.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 8d1bdbe..6234d10 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6963,7 +6963,8 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		cpumask_or(groupmask, groupmask, sched_group_cpus(group));

 		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
-		printk(KERN_CONT " %s", str);
+		printk(KERN_CONT " %s (__cpu_power = %d)", str,
+						group->__cpu_power);

 		group = group->next;
 	} while (group != sd->groups);

2009-04-13 18:53:24

by Tony Luck

Subject: Re: [tip:sched/urgent] sched: Print sched_group::__cpu_power in sched_domain_debug

> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -6963,7 +6963,8 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
>  		cpumask_or(groupmask, groupmask, sched_group_cpus(group));
>
>  		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
> -		printk(KERN_CONT " %s", str);
> +		printk(KERN_CONT " %s (__cpu_power = %d)", str,
> +						group->__cpu_power);

This has added a lot of clutter to the console log during boot
(especially on large systems).

Here is the start of the diff output comparing old and new console
messages on a 16 cpu machine:

77c77
< groups: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
---
> groups: 0 (__cpu_power = 1024) 1 (__cpu_power = 1024) 2 (__cpu_power = 1024) 3 (__cpu_power = 1024) 4 (__cpu_power = 1024) 5 (__cpu_power = 1024) 6 (__cpu_power = 1024) 7 (__cpu_power = 1024) 8 (__cpu_power = 1024) 9 (__cpu_power = 1024) 10 (__cpu_power = 1024) 11 (__cpu_power = 1024) 12 (__cpu_power = 1024) 13 (__cpu_power = 1024) 14 (__cpu_power = 1024) 15 (__cpu_power = 1024)
80c80
< groups: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0
---
> groups: 1 (__cpu_power = 1024) 2 (__cpu_power = 1024) 3 (__cpu_power = 1024) 4 (__cpu_power = 1024) 5 (__cpu_power = 1024) 6 (__cpu_power = 1024) 7 (__cpu_power = 1024) 8 (__cpu_power = 1024) 9 (__cpu_power = 1024) 10 (__cpu_power = 1024) 11 (__cpu_power = 1024) 12 (__cpu_power = 1024) 13 (__cpu_power = 1024) 14 (__cpu_power = 1024) 15 (__cpu_power = 1024) 0 (__cpu_power = 1024)

continues as each group is reported.

-Tony
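
To get a feel for how much longer each line becomes, the two formats can be
reproduced in user space. The following is only a rough sketch, not kernel
code: format_groups_line() and DEFAULT_POWER are hypothetical names, and
every group is assumed to hold a single CPU with the default power of 1024,
as in the log above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the default power value seen in the log. */
#define DEFAULT_POWER 1024

/*
 * Build a "groups:" line for ncpus single-CPU groups in buf.
 * If annotate is non-zero, append the " (__cpu_power = N)" suffix
 * to every group, the way the patch under discussion does.
 * Returns the length of the resulting string.
 */
static size_t format_groups_line(char *buf, size_t size, int ncpus, int annotate)
{
	size_t len;
	int n;

	assert(size > 0);
	n = snprintf(buf, size, "groups:");
	len = (size_t)n;

	/* Stop early if the buffer is already full. */
	for (int cpu = 0; cpu < ncpus && len < size; cpu++) {
		if (annotate)
			n = snprintf(buf + len, size - len,
				     " %d (__cpu_power = %d)", cpu, DEFAULT_POWER);
		else
			n = snprintf(buf + len, size - len, " %d", cpu);
		len += (size_t)n;
	}
	return strlen(buf);
}
```

With ncpus set to 16 and a buffer of a few hundred bytes, the plain line is
45 characters long and the annotated one is 381, which is why it wraps
several times on a typical console.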

2009-04-14 00:10:14

by Ingo Molnar

Subject: Re: [tip:sched/urgent] sched: Print sched_group::__cpu_power in sched_domain_debug


* Tony Luck <[email protected]> wrote:

> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -6963,7 +6963,8 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
> >  		cpumask_or(groupmask, groupmask, sched_group_cpus(group));
> >
> >  		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
> > -		printk(KERN_CONT " %s", str);
> > +		printk(KERN_CONT " %s (__cpu_power = %d)", str,
> > +						group->__cpu_power);
>
> This has added a lot of clutter to the console log during boot
> (especially on large systems).
>
> Here is the start of the diff output comparing old and new console
> messages on a 16 cpu machine:
>
> 77c77
> < groups: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> ---
> > groups: 0 (__cpu_power = 1024) 1 (__cpu_power = 1024) 2 (__cpu_power = 1024) 3 (__cpu_power = 1024) 4 (__cpu_power = 1024) 5 (__cpu_power = 1024) 6 (__cpu_power = 1024) 7 (__cpu_power = 1024) 8 (__cpu_power = 1024) 9 (__cpu_power = 1024) 10 (__cpu_power = 1024) 11 (__cpu_power = 1024) 12 (__cpu_power = 1024) 13 (__cpu_power = 1024) 14 (__cpu_power = 1024) 15 (__cpu_power = 1024)
> 80c80

indeed ...

I think we should skip the printout in the default (power==1024)
case. Gautham?

Ingo

Subject: Re: [tip:sched/urgent] sched: Print sched_group::__cpu_power in sched_domain_debug

On Tue, Apr 14, 2009 at 02:09:34AM +0200, Ingo Molnar wrote:
>
> * Tony Luck <[email protected]> wrote:
>
> > > --- a/kernel/sched.c
> > > +++ b/kernel/sched.c
> > > @@ -6963,7 +6963,8 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
> > >  		cpumask_or(groupmask, groupmask, sched_group_cpus(group));
> > >
> > >  		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
> > > -		printk(KERN_CONT " %s", str);
> > > +		printk(KERN_CONT " %s (__cpu_power = %d)", str,
> > > +						group->__cpu_power);
> >
> > This has added a lot of clutter to the console log during boot
> > (especially on large systems).
> >
> > Here is the start of the diff output comparing old and new console
> > messages on a 16 cpu machine:
> >
> > 77c77
> > < groups: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> > ---
> > > groups: 0 (__cpu_power = 1024) 1 (__cpu_power = 1024) 2 (__cpu_power = 1024) 3 (__cpu_power = 1024) 4 (__cpu_power = 1024) 5 (__cpu_power = 1024) 6 (__cpu_power = 1024) 7 (__cpu_power = 1024) 8 (__cpu_power = 1024) 9 (__cpu_power = 1024) 10 (__cpu_power = 1024) 11 (__cpu_power = 1024) 12 (__cpu_power = 1024) 13 (__cpu_power = 1024) 14 (__cpu_power = 1024) 15 (__cpu_power = 1024)
> > 80c80
>
> indeed ...
>
> I think we should skip the printout in the default (power==1024)
> case. Gautham?

Makes sense. Patch appended.
>
> Ingo
-->

sched: Avoid printing sched_group::__cpu_power for default case.

From: Gautham R Shenoy <[email protected]>

The following commit produces a messy dmesg output while attempting to print
the sched_group::__cpu_power for each group in the sched_domain hierarchy.

commit 46e0bb9c12f4bab539736f1714cbf16600f681ec
Author: Gautham R Shenoy <[email protected]>
Date: Mon Mar 30 10:25:20 2009 +0530
sched: Print sched_group::__cpu_power in sched_domain_debug

Fix this by not printing __cpu_power in the default case
(i.e., __cpu_power == SCHED_LOAD_SCALE).

Reported-by: Tony Luck <[email protected]>
Signed-off-by: Gautham R Shenoy <[email protected]>
---

kernel/sched.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)


diff --git a/kernel/sched.c b/kernel/sched.c
index 681d4ae..0584e04 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7467,8 +7467,9 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		cpumask_or(groupmask, groupmask, sched_group_cpus(group));

 		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
-		printk(KERN_CONT " %s (__cpu_power = %d)", str,
-						group->__cpu_power);
+		if (group->__cpu_power != SCHED_LOAD_SCALE)
+			printk(KERN_CONT " %s (__cpu_power = %d)", str,
+						group->__cpu_power);

 		group = group->next;
 	} while (group != sd->groups);
--
Thanks and Regards
gautham

Subject: [tip:sched/urgent] sched: Avoid printing sched_group::__cpu_power for default case

Commit-ID: fa0dacb0eec2f531e457058c05e3fe8b7606ed08
Gitweb: http://git.kernel.org/tip/fa0dacb0eec2f531e457058c05e3fe8b7606ed08
Author: Gautham R Shenoy <[email protected]>
AuthorDate: Tue, 14 Apr 2009 09:09:36 +0530
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 14 Apr 2009 14:01:00 +0200

sched: Avoid printing sched_group::__cpu_power for default case

Impact: reduce syslog clutter

Commit 46e0bb9c12f4 ("sched: Print sched_group::__cpu_power
in sched_domain_debug") produces a messy dmesg output while
attempting to print the sched_group::__cpu_power for each
group in the sched_domain hierarchy.

Fix this by not printing __cpu_power in the default case
(i.e., __cpu_power == SCHED_LOAD_SCALE).

Reported-by: Tony Luck <[email protected]>
Signed-off-by: Gautham R Shenoy <[email protected]>
Cc: [email protected]
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/sched.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index e90e70e..ebd574c 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7367,8 +7367,9 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		cpumask_or(groupmask, groupmask, sched_group_cpus(group));

 		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
-		printk(KERN_CONT " %s (__cpu_power = %d)", str,
-						group->__cpu_power);
+		if (group->__cpu_power != SCHED_LOAD_SCALE)
+			printk(KERN_CONT " %s (__cpu_power = %d)", str,
+						group->__cpu_power);

 		group = group->next;
 	} while (group != sd->groups);

2009-04-14 16:30:26

by Tony Luck

Subject: RE: [tip:sched/urgent] sched: Print sched_group::__cpu_power in sched_domain_debug

-		printk(KERN_CONT " %s (__cpu_power = %d)", str,
-						group->__cpu_power);
+		if (group->__cpu_power != SCHED_LOAD_SCALE)
+			printk(KERN_CONT " %s (__cpu_power = %d)", str,
+						group->__cpu_power);

Much quieter ... but perhaps a little too quiet?
Is this what you want? Now the console output looks like this:

CPU0 attaching sched-domain:
domain 0: span 0-15 level CPU
groups:
CPU1 attaching sched-domain:
domain 0: span 0-15 level CPU
groups:
CPU2 attaching sched-domain:
domain 0: span 0-15 level CPU
groups:
...

instead of the original:

CPU0 attaching sched-domain:
domain 0: span 0-15 level CPU
groups: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
CPU1 attaching sched-domain:
domain 0: span 0-15 level CPU
groups: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0
CPU2 attaching sched-domain:
domain 0: span 0-15 level CPU
groups: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1
...

Either we don't need the empty "groups:" line, or we should
still list the cpus in the group? I'm not really sure what
information you are trying to convey here.

-Tony

Subject: Re: [tip:sched/urgent] sched: Print sched_group::__cpu_power in sched_domain_debug

On Tue, Apr 14, 2009 at 09:29:53AM -0700, Luck, Tony wrote:
> -		printk(KERN_CONT " %s (__cpu_power = %d)", str,
> -						group->__cpu_power);
> +		if (group->__cpu_power != SCHED_LOAD_SCALE)
> +			printk(KERN_CONT " %s (__cpu_power = %d)", str,
> +						group->__cpu_power);
>
> Much quieter ... but perhaps a little too quiet?
> Is this what you want? Now the console output looks like this:
>
> CPU0 attaching sched-domain:
> domain 0: span 0-15 level CPU
> groups:
> CPU1 attaching sched-domain:
> domain 0: span 0-15 level CPU
> groups:
> CPU2 attaching sched-domain:
> domain 0: span 0-15 level CPU
> groups:
> ...
>
> instead of the original:
>
> CPU0 attaching sched-domain:
> domain 0: span 0-15 level CPU
> groups: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> CPU1 attaching sched-domain:
> domain 0: span 0-15 level CPU
> groups: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0
> CPU2 attaching sched-domain:
> domain 0: span 0-15 level CPU
> groups: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1
> ...
>
> Either we don't need the empty "groups:" line, or we should
> still list the cpus in the group? I'm not really sure what
> information you are trying to convey here.

We should list the cpus in the group in every case. What we should not
list is the __cpu_power of the group when it has the default value. In
the patch that I sent this morning, I mistakenly made the printing of
both the cpus and the __cpu_power depend on the if condition that checks
whether __cpu_power is the default:

		if (group->__cpu_power != SCHED_LOAD_SCALE)
			printk(KERN_CONT " %s (__cpu_power = %d)", str,
						group->__cpu_power);
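
The difference between the two variants can be illustrated with a small
user-space sketch. This is not the kernel code: struct group and the two
format_group_*() helpers are hypothetical stand-ins, snprintf() replaces
printk(KERN_CONT ...), and SCHED_LOAD_SCALE is assumed to be the 1024
default mentioned earlier in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed to match the default power of 1024 discussed in this thread. */
#define SCHED_LOAD_SCALE 1024

/* Hypothetical stand-in for the fields of sched_group used here. */
struct group {
	const char *cpus;	/* CPU list as text, e.g. "0" or "0-1" */
	unsigned int power;	/* plays the role of __cpu_power */
};

/* First attempt: the whole entry is guarded, so default groups vanish. */
static void format_group_buggy(char *buf, size_t size, const struct group *g)
{
	assert(size > 0);
	buf[0] = '\0';
	if (g->power != SCHED_LOAD_SCALE)
		snprintf(buf, size, " %s (__cpu_power = %u)", g->cpus, g->power);
}

/* Corrected: always emit the CPU list, guard only the annotation. */
static void format_group_fixed(char *buf, size_t size, const struct group *g)
{
	int n;

	assert(size > 0);
	n = snprintf(buf, size, " %s", g->cpus);
	if (g->power != SCHED_LOAD_SCALE && n >= 0 && (size_t)n < size)
		snprintf(buf + n, size - (size_t)n,
			 " (__cpu_power = %u)", g->power);
}
```

For a group with the default power, the first helper leaves the buffer
empty (hence the bare "groups:" lines), while the second still yields the
CPU list. For a non-default power both produce the annotated entry.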

Unfortunately for me, it's not the first goof up I've been involved in
today. Please find the updated patch below.

---->
sched: Avoid printing sched_group::__cpu_power for default case.

From: Gautham R Shenoy <[email protected]>

The following commit produces a messy dmesg output while attempting to print
the sched_group::__cpu_power for each group in the sched_domain hierarchy.

commit 46e0bb9c12f4bab539736f1714cbf16600f681ec
Author: Gautham R Shenoy <[email protected]>
Date: Mon Mar 30 10:25:20 2009 +0530
sched: Print sched_group::__cpu_power in sched_domain_debug

Fix this by not printing __cpu_power in the default case
(i.e., __cpu_power == SCHED_LOAD_SCALE).

Reported-by: Tony Luck <[email protected]>
Signed-off-by: Gautham R Shenoy <[email protected]>
---

kernel/sched.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)


diff --git a/kernel/sched.c b/kernel/sched.c
index 681d4ae..db2df70 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7467,7 +7467,9 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		cpumask_or(groupmask, groupmask, sched_group_cpus(group));

 		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
-		printk(KERN_CONT " %s (__cpu_power = %d)", str,
+		printk(KERN_CONT " %s", str);
+		if (group->__cpu_power != SCHED_LOAD_SCALE)
+			printk(KERN_CONT " (__cpu_power = %d)",
 						group->__cpu_power);

 		group = group->next;

--
Thanks and Regards
gautham

2009-04-14 17:19:12

by Tony Luck

Subject: RE: [tip:sched/urgent] sched: Print sched_group::__cpu_power in sched_domain_debug

-		printk(KERN_CONT " %s (__cpu_power = %d)", str,
+		printk(KERN_CONT " %s", str);
+		if (group->__cpu_power != SCHED_LOAD_SCALE)
+			printk(KERN_CONT " (__cpu_power = %d)",
 						group->__cpu_power);

That looks better.

Acked-by: Tony Luck <[email protected]>

-Tony

2009-04-14 18:40:57

by Ingo Molnar

Subject: Re: [tip:sched/urgent] sched: Print sched_group::__cpu_power in sched_domain_debug


* Luck, Tony <[email protected]> wrote:

> -		printk(KERN_CONT " %s (__cpu_power = %d)", str,
> +		printk(KERN_CONT " %s", str);
> +		if (group->__cpu_power != SCHED_LOAD_SCALE)
> +			printk(KERN_CONT " (__cpu_power = %d)",
>  						group->__cpu_power);
>
> That looks better.
>
> Acked-by: Tony Luck <[email protected]>

Thanks, picked up this variant!

Ingo

Subject: [tip:sched/urgent] sched: Avoid printing sched_group::__cpu_power for default case

Commit-ID: 728fc565bdfc98dc7086fdcccb3f1e1edf195b20
Gitweb: http://git.kernel.org/tip/728fc565bdfc98dc7086fdcccb3f1e1edf195b20
Author: Gautham R Shenoy <[email protected]>
AuthorDate: Tue, 14 Apr 2009 09:09:36 +0530
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 14 Apr 2009 20:39:40 +0200

sched: Avoid printing sched_group::__cpu_power for default case

Impact: reduce syslog clutter

Commit 46e0bb9c12f4 ("sched: Print sched_group::__cpu_power
in sched_domain_debug") produces a messy dmesg output while
attempting to print the sched_group::__cpu_power for each
group in the sched_domain hierarchy.

Fix this by not printing __cpu_power in the default case
(i.e., __cpu_power == SCHED_LOAD_SCALE).

Reported-by: Tony Luck <[email protected]>
Signed-off-by: Gautham R Shenoy <[email protected]>
Fixed-by: Tony Luck <[email protected]>
Cc: [email protected]
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/sched.c | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index e90e70e..b902e58 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7367,8 +7367,12 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		cpumask_or(groupmask, groupmask, sched_group_cpus(group));

 		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
-		printk(KERN_CONT " %s (__cpu_power = %d)", str,
-						group->__cpu_power);
+
+		printk(KERN_CONT " %s", str);
+		if (group->__cpu_power != SCHED_LOAD_SCALE) {
+			printk(KERN_CONT " (__cpu_power = %d)",
+				group->__cpu_power);
+		}

 		group = group->next;
 	} while (group != sd->groups);

Subject: [tip:sched/urgent] sched: Avoid printing sched_group::__cpu_power for default case

Commit-ID: 381512cf3d27f63f7a45b1bbe7d2d609c2ea3b74
Gitweb: http://git.kernel.org/tip/381512cf3d27f63f7a45b1bbe7d2d609c2ea3b74
Author: Gautham R Shenoy <[email protected]>
AuthorDate: Tue, 14 Apr 2009 09:09:36 +0530
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 17 Apr 2009 00:46:05 +0200

sched: Avoid printing sched_group::__cpu_power for default case

Commit 46e0bb9c12f4 ("sched: Print sched_group::__cpu_power
in sched_domain_debug") produces a messy dmesg output while
attempting to print the sched_group::__cpu_power for each
group in the sched_domain hierarchy.

Fix this by not printing __cpu_power in the default case
(i.e., __cpu_power == SCHED_LOAD_SCALE).

[ Impact: reduce syslog clutter ]

Reported-by: Tony Luck <[email protected]>
Signed-off-by: Gautham R Shenoy <[email protected]>
Fixed-by: Tony Luck <[email protected]>
Cc: [email protected]
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
kernel/sched.c | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index e90e70e..b902e58 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7367,8 +7367,12 @@ static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level,
 		cpumask_or(groupmask, groupmask, sched_group_cpus(group));

 		cpulist_scnprintf(str, sizeof(str), sched_group_cpus(group));
-		printk(KERN_CONT " %s (__cpu_power = %d)", str,
-						group->__cpu_power);
+
+		printk(KERN_CONT " %s", str);
+		if (group->__cpu_power != SCHED_LOAD_SCALE) {
+			printk(KERN_CONT " (__cpu_power = %d)",
+				group->__cpu_power);
+		}

 		group = group->next;
 	} while (group != sd->groups);