2011-02-18 12:43:30

by Ciju Rajan K

Subject: [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat

Hi All,

Here is v3.0 of the patch set, which updates the
/proc/schedstat statistics. Please review the patches
and consider them for inclusion.

Changes from v2.0:
* Re-based to linux-2.6-tip
* Added Tested-by tag

Changes from v1.0:
* Fixed a couple of typos
* Rewrote the documentation for sched-domain statistics
* Re-based to 2.6.38-rc2

Previous versions of the patches were posted here:
(v1.0) https://lkml.org/lkml/2011/1/17/87
(v2.0) https://lkml.org/lkml/2011/1/25/456

-Ciju


Documentation/scheduler/sched-stats.txt | 144 ++++++++++++--------------------
include/linux/sched.h | 11 --
kernel/sched_debug.c | 1
kernel/sched_stats.h | 13 +-
4 files changed, 60 insertions(+), 109 deletions(-)


2011-02-18 12:46:47

by Ciju Rajan K

Subject: [PATCH 1/2 v3.0]sched: Removing unused fields from /proc/schedstat


From: Ciju Rajan K <[email protected]>
Date: Fri, 18 Feb 2011 16:31:12 +0530
Subject: [PATCH 1/2 v3.0] sched: Updating the fields of /proc/schedstat

This patch removes the unused statistics from /proc/schedstat.
It also updates the runqueue fields printed there.

Signed-off-by: Ciju Rajan K <[email protected]>
---
include/linux/sched.h | 11 -----------
kernel/sched_debug.c | 1 -
kernel/sched_stats.h | 13 +++++--------
3 files changed, 5 insertions(+), 20 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 23e9c27..a1691c7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -954,20 +954,9 @@ struct sched_domain {
unsigned int alb_failed;
unsigned int alb_pushed;

- /* SD_BALANCE_EXEC stats */
- unsigned int sbe_count;
- unsigned int sbe_balanced;
- unsigned int sbe_pushed;
-
- /* SD_BALANCE_FORK stats */
- unsigned int sbf_count;
- unsigned int sbf_balanced;
- unsigned int sbf_pushed;
-
/* try_to_wake_up() stats */
unsigned int ttwu_wake_remote;
unsigned int ttwu_move_affine;
- unsigned int ttwu_move_balance;
#endif
#ifdef CONFIG_SCHED_DEBUG
char *name;
diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c
index 7bacd83..726b306 100644
--- a/kernel/sched_debug.c
+++ b/kernel/sched_debug.c
@@ -286,7 +286,6 @@ static void print_cpu(struct seq_file *m, int cpu)

P(yld_count);

- P(sched_switch);
P(sched_count);
P(sched_goidle);
#ifdef CONFIG_SMP
diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
index 48ddf43..8869ed9 100644
--- a/kernel/sched_stats.h
+++ b/kernel/sched_stats.h
@@ -4,7 +4,7 @@
* bump this up when changing the output format or the meaning of an existing
* format, so that tools can adapt (or abort)
*/
-#define SCHEDSTAT_VERSION 15
+#define SCHEDSTAT_VERSION 16

static int show_schedstat(struct seq_file *seq, void *v)
{
@@ -26,9 +26,9 @@ static int show_schedstat(struct seq_file *seq, void *v)

/* runqueue-specific stats */
seq_printf(seq,
- "cpu%d %u %u %u %u %u %u %llu %llu %lu",
+ "cpu%d %u %u %u %u %u %llu %llu %lu",
cpu, rq->yld_count,
- rq->sched_switch, rq->sched_count, rq->sched_goidle,
+ rq->sched_count, rq->sched_goidle,
rq->ttwu_count, rq->ttwu_local,
rq->rq_cpu_time,
rq->rq_sched_info.run_delay, rq->rq_sched_info.pcount);
@@ -57,12 +57,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
sd->lb_nobusyg[itype]);
}
seq_printf(seq,
- " %u %u %u %u %u %u %u %u %u %u %u %u\n",
+ " %u %u %u %u %u\n",
sd->alb_count, sd->alb_failed, sd->alb_pushed,
- sd->sbe_count, sd->sbe_balanced, sd->sbe_pushed,
- sd->sbf_count, sd->sbf_balanced, sd->sbf_pushed,
- sd->ttwu_wake_remote, sd->ttwu_move_affine,
- sd->ttwu_move_balance);
+ sd->ttwu_wake_remote, sd->ttwu_move_affine);
}
preempt_enable();
#endif
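
For tools that consume the new per-cpu line, here is a minimal userspace sketch (not part of the patch) of a parser that mirrors the version 16 seq_printf() format above. The struct and function names are hypothetical and chosen only for illustration; the field order is taken from the argument list in show_schedstat().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the eight fields of a version 16 cpu line. */
struct cpu_schedstat {
    int cpu;
    unsigned int yld_count;        /* 1) sched_yield() calls */
    unsigned int sched_count;      /* 2) schedule() calls */
    unsigned int sched_goidle;     /* 3) schedule() picked the idle task */
    unsigned int ttwu_count;       /* 4) try_to_wake_up() calls */
    unsigned int ttwu_local;       /* 5) local wakeups */
    unsigned long long rq_cpu_time;/* 6) time tasks have run */
    unsigned long long run_delay;  /* 7) time tasks have waited */
    unsigned long pcount;          /* 8) rq_sched_info.pcount */
};

/* Returns 0 on success, -1 if the line is not a well-formed v16 cpu line. */
int parse_cpu_line_v16(const char *line, struct cpu_schedstat *out)
{
    char extra;
    int n;

    assert(line != NULL && out != NULL);

    /*
     * The trailing " %c" only matches if something follows the eighth
     * field, e.g. the ninth field of a version 15 line.
     */
    n = sscanf(line, "cpu%d %u %u %u %u %u %llu %llu %lu %c",
               &out->cpu, &out->yld_count,
               &out->sched_count, &out->sched_goidle,
               &out->ttwu_count, &out->ttwu_local,
               &out->rq_cpu_time, &out->run_delay, &out->pcount,
               &extra);

    return n == 9 ? 0 : -1;
}
```

Rejecting lines with a ninth field means a version 15 line (which still carries sched_switch) fails loudly instead of being read with every field shifted by one.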

2011-02-18 12:47:17

by Ciju Rajan K

Subject: [PATCH 2/2 v3.0]sched: Updating the sched-stat documentation

From: Ciju Rajan K <[email protected]>
Date: Fri, 18 Feb 2011 16:29:14 +0530
Subject: [PATCH 2/2 v3.0] sched: Updating the sched-stat documentation

Some unused fields have been removed from /proc/schedstat.
This patch updates the documentation to match.

Signed-off-by: Ciju Rajan K <[email protected]>
---
Documentation/scheduler/sched-stats.txt | 144 ++++++++++++-------------------
1 files changed, 55 insertions(+), 89 deletions(-)

diff --git a/Documentation/scheduler/sched-stats.txt b/Documentation/scheduler/sched-stats.txt
index 1cd5d51..de47562 100644
--- a/Documentation/scheduler/sched-stats.txt
+++ b/Documentation/scheduler/sched-stats.txt
@@ -1,3 +1,4 @@
+Version 16 of schedstats removed some of the unused fields.
Version 15 of schedstats dropped counters for some sched_yield:
yld_exp_empty, yld_act_empty and yld_both_empty. Otherwise, it is
identical to version 14.
@@ -30,112 +31,77 @@ Note that any such script will necessarily be version-specific, as the main
reason to change versions is changes in the output format. For those wishing
to write their own scripts, the fields are described here.

+The first two lines of /proc/schedstat give the version (currently
+16) and the jiffies timestamp. The remaining lines hold the cpu and
+domain statistics.
+
CPU statistics
--------------
-cpu<N> 1 2 3 4 5 6 7 8 9
-
-First field is a sched_yield() statistic:
- 1) # of times sched_yield() was called
-
-Next three are schedule() statistics:
- 2) # of times we switched to the expired queue and reused it
- 3) # of times schedule() was called
- 4) # of times schedule() left the processor idle
-
-Next two are try_to_wake_up() statistics:
- 5) # of times try_to_wake_up() was called
- 6) # of times try_to_wake_up() was called to wake up the local cpu
-
-Next three are statistics describing scheduling latency:
- 7) sum of all time spent running by tasks on this processor (in jiffies)
- 8) sum of all time spent waiting to run by tasks on this processor (in
- jiffies)
- 9) # of timeslices run on this cpu
-
+The format is like this:
+
+cpu<N> 1 2 3 4 5 6 7 8
+
+ 1) # of times sched_yield() was called on this CPU
+ 2) # of times scheduler runs on this CPU
+ 3) # of times scheduler picks idle task as next task on this CPU
+ 4) # of times try_to_wake_up() is run on this CPU
+ (Number of times task wakeup is attempted from this CPU)
+ 5) # of times try_to_wake_up() wakes up a task on the same CPU
+ (local wakeup)
+ 6) Time(ns) for which tasks have run on this CPU
+ 7) Time(ns) for which tasks on this CPU's runqueue have waited
+ before getting to run on the CPU
+ 8) # of tasks that have run on this CPU

Domain statistics
-----------------
-One of these is produced per domain for each cpu described. (Note that if
-CONFIG_SMP is not defined, *no* domains are utilized and these lines
-will not appear in the output.)
+One of these is produced per domain for each cpu described.
+(Note that if CONFIG_SMP is not defined, *no* domains are utilized
+and these lines will not appear in the output.)

-domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
+domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

The first field is a bit mask indicating what cpus this domain operates over.

The next 24 are a variety of load_balance() statistics in grouped into types
of idleness (idle, busy, and newly idle):

- 1) # of times in this domain load_balance() was called when the
- cpu was idle
- 2) # of times in this domain load_balance() checked but found
- the load did not require balancing when the cpu was idle
- 3) # of times in this domain load_balance() tried to move one or
- more tasks and failed, when the cpu was idle
- 4) sum of imbalances discovered (if any) with each call to
- load_balance() in this domain when the cpu was idle
- 5) # of times in this domain pull_task() was called when the cpu
- was idle
- 6) # of times in this domain pull_task() was called even though
- the target task was cache-hot when idle
- 7) # of times in this domain load_balance() was called but did
- not find a busier queue while the cpu was idle
- 8) # of times in this domain a busier queue was found while the
- cpu was idle but no busier group was found
-
- 9) # of times in this domain load_balance() was called when the
- cpu was busy
- 10) # of times in this domain load_balance() checked but found the
- load did not require balancing when busy
- 11) # of times in this domain load_balance() tried to move one or
- more tasks and failed, when the cpu was busy
- 12) sum of imbalances discovered (if any) with each call to
- load_balance() in this domain when the cpu was busy
- 13) # of times in this domain pull_task() was called when busy
- 14) # of times in this domain pull_task() was called even though the
- target task was cache-hot when busy
- 15) # of times in this domain load_balance() was called but did not
- find a busier queue while the cpu was busy
- 16) # of times in this domain a busier queue was found while the cpu
- was busy but no busier group was found
-
- 17) # of times in this domain load_balance() was called when the
- cpu was just becoming idle
- 18) # of times in this domain load_balance() checked but found the
- load did not require balancing when the cpu was just becoming idle
- 19) # of times in this domain load_balance() tried to move one or more
- tasks and failed, when the cpu was just becoming idle
- 20) sum of imbalances discovered (if any) with each call to
- load_balance() in this domain when the cpu was just becoming idle
- 21) # of times in this domain pull_task() was called when newly idle
- 22) # of times in this domain pull_task() was called even though the
- target task was cache-hot when just becoming idle
- 23) # of times in this domain load_balance() was called but did not
- find a busier queue while the cpu was just becoming idle
- 24) # of times in this domain a busier queue was found while the cpu
- was just becoming idle but no busier group was found
-
+CPU_NOT_IDLE: Load balancer is being run on a CPU when it is
+ not in IDLE state (busy times)
+CPU_NEWLY_IDLE: Load balancer is being run on a CPU which is
+ about to enter IDLE state
+
+There are eight stats available for each of the above three states:
+ - # of times in this domain load_balance() was called
+ - # of times in this domain load_balance() checked but found
+ the load did not require balancing
+ - # of times in this domain load_balance() tried to move one or
+ more tasks and failed
+ - sum of imbalances discovered (if any) with each call to
+ load_balance() in this domain
+ - # of times in this domain pull_task() was called
+ - # of times in this domain pull_task() was called even though
+ the target task was cache-hot
+ - # of times in this domain load_balance() was called but did
+ not find a busier queue
+ - # of times in this domain a busier queue was found but no
+ busier group was found
+
+ Fields 1-8 are the stats when the cpu was idle (CPU_IDLE),
+ fields 9-16 are the stats when the cpu was busy (CPU_NOT_IDLE),
+ and fields 17-24 are the stats when the cpu was just
+ becoming idle (CPU_NEWLY_IDLE)
+
Next three are active_load_balance() statistics:
25) # of times active_load_balance() was called
26) # of times active_load_balance() tried to move a task and failed
27) # of times active_load_balance() successfully moved a task

- Next three are sched_balance_exec() statistics:
- 28) sbe_cnt is not used
- 29) sbe_balanced is not used
- 30) sbe_pushed is not used
-
- Next three are sched_balance_fork() statistics:
- 31) sbf_cnt is not used
- 32) sbf_balanced is not used
- 33) sbf_pushed is not used
-
- Next three are try_to_wake_up() statistics:
- 34) # of times in this domain try_to_wake_up() awoke a task that
- last ran on a different cpu in this domain
- 35) # of times in this domain try_to_wake_up() moved a task to the
- waking cpu because it was cache-cold on its own cpu anyway
- 36) # of times in this domain try_to_wake_up() started passive balancing
+ Next two are try_to_wake_up() statistics:
+ 28) # of times in this domain try_to_wake_up() awoke a task that
+ last ran on a different cpu in this domain
+ 29) # of times in this domain try_to_wake_up() moved a task to the
+ waking cpu because it was cache-cold on its own cpu anyway

/proc/<pid>/schedstat
----------------
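
The version 16 domain line documented in the patch above carries 29 counters after the cpumask: 3 idle types times 8 load_balance() stats, then 3 active_load_balance() stats and 2 try_to_wake_up() stats. A minimal userspace sketch of reading such a line follows. The struct, the function name and the fixed-size cpumask buffer are assumptions made for illustration, not anything defined by the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DOMAIN_FIELDS_V16 29 /* 3 * 8 load_balance + 3 alb + 2 ttwu */

/* Hypothetical container; field[i] holds documented field number i + 1. */
struct domain_schedstat {
    int domain;
    char cpumask[64]; /* assumed large enough; big systems may need more */
    unsigned long field[DOMAIN_FIELDS_V16];
};

/* Returns 0 on success, -1 if the line is not a v16 domain line. */
int parse_domain_line_v16(const char *line, struct domain_schedstat *out)
{
    int consumed = 0;
    const char *p;
    int i;

    assert(line != NULL && out != NULL);

    /* %n records how many characters the prefix and cpumask used. */
    if (sscanf(line, "domain%d %63s%n",
               &out->domain, out->cpumask, &consumed) != 2)
        return -1;

    p = line + consumed;
    for (i = 0; i < DOMAIN_FIELDS_V16; i++) {
        char *end;
        unsigned long v = strtoul(p, &end, 10);

        if (end == p) /* ran out of numbers early */
            return -1;
        out->field[i] = v;
        p = end;
    }

    /* Only whitespace may remain; more digits mean an older format. */
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
    return *p == '\0' ? 0 : -1;
}
```

With this layout, field[0..7] are the CPU_IDLE stats, field[8..15] the CPU_NOT_IDLE stats, field[16..23] the CPU_NEWLY_IDLE stats, field[24..26] the active_load_balance() stats and field[27..28] the try_to_wake_up() stats. A version 15 line with 36 counters is rejected rather than truncated.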

2011-02-22 08:37:48

by Bharata B Rao

Subject: Re: [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat

On Fri, Feb 18, 2011 at 06:13:28PM +0530, Ciju Rajan K wrote:
> Hi All,
>
> Here is v3.0 of the patch set, which updates the
> /proc/schedstat statistics. Please review the patches
> and consider them for inclusion.

I believe this documentation cleanup is good to go in. Hope you
have ensured that userspace can work smoothly with the bumped up
version.

Regards,
Bharata.
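
On the userspace concern raised above: one way a tool can cope with the bump is to check the version before parsing anything else. The sketch below is only an illustration with hypothetical function names, and it assumes the first line of /proc/schedstat has the form "version N".

```c
#include <assert.h>
#include <stdio.h>

#define EXPECTED_SCHEDSTAT_VERSION 16

/* Returns the version from a "version N" header line, or -1 if malformed. */
int schedstat_version(const char *first_line)
{
    int version;

    assert(first_line != NULL);

    if (sscanf(first_line, "version %d", &version) != 1)
        return -1;
    return version;
}

/* Returns 1 if a tool written for version 16 can parse this file, else 0. */
int schedstat_version_supported(const char *first_line)
{
    return schedstat_version(first_line) == EXPECTED_SCHEDSTAT_VERSION;
}
```

A tool would read the first line with fgets(), call schedstat_version_supported() on it, and abort or switch to another parser when it returns 0. This matches the comment next to SCHEDSTAT_VERSION, which asks that tools adapt or abort when the format changes.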