2004-09-04 05:07:40

by Nick Piggin

[permalink] [raw]
Subject: [PATCH] schedstats additions

Hi,
I have a patch here to provide more useful statistics for me. Basically
it moves a lot more of the balancing information into the domains instead
of the runqueue, where it is nearly useless on multi-domain setups (eg.
SMT+SMP, SMP+NUMA).

It requires a version number bump, but that isn't much of an issue because
I think we're about the only two using it at the moment. But your tools
will need a little bit of work.

What do you think?


Attachments:
sched-stat.patch (10.28 kB)

2004-09-04 08:26:59

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] schedstats additions

Dnia Saturday 04 of September 2004 07:07, Nick Piggin napisa?:
> Hi,
> I have a patch here to provide more useful statistics for me. Basically
> it moves a lot more of the balancing information into the domains instead
> of the runqueue, where it is nearly useless on multi-domain setups (eg.
> SMT+SMP, SMP+NUMA).

Which kernel version it is against?

RJW

--
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.
-- Richard P. Feynman

2004-09-04 09:04:48

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] schedstats additions

Rafael J. Wysocki wrote:
> Dnia Saturday 04 of September 2004 07:07, Nick Piggin napisa?:
>
>>Hi,
>>I have a patch here to provide more useful statistics for me. Basically
>>it moves a lot more of the balancing information into the domains instead
>>of the runqueue, where it is nearly useless on multi-domain setups (eg.
>>SMT+SMP, SMP+NUMA).
>
>
> Which kernel version it is against?
>

-mm3 ... oh yeah that has nicksched in it, sorry that would put a spanner
in the works.

I'll redo it to suit 2.6 if Rick acks it - the main info he needs is still
valid, that is the output format.

Thanks
Nick

2004-09-08 08:09:52

by Rick Lindsley

[permalink] [raw]
Subject: Re: [PATCH] schedstats additions

I have a patch here to provide more useful statistics for me. Basically
it moves a lot more of the balancing information into the domains instead
of the runqueue, where it is nearly useless on multi-domain setups (eg.
SMT+SMP, SMP+NUMA).

It requires a version number bump, but that isn't much of an issue because
I think we're about the only two using it at the moment. But your tools
will need a little bit of work.

What do you think?

The idea of moving some counters from runqueues to domains is fine in
general, but I've some questions about a couple of specific changes in
your patch.

It looks to me like there are some changes in try_to_wake_up() that
aren't schedstats related, although schedstats code is among some
that is moved around. Is there some code there that should be
broken out separately?

alb_cnt
by moving this, we won't get an accurate look at the number of
times we called active_load_balance and returned immediately
because nr_running had slipped to 0 or 1. how about we add
another counter to count that too, and/or change the name of
this one?

lb_balanced
are you sure lb_balanced[idle] can't be deduced from lb_cnt[idle]
and lb_failed[idle]?

ttwu_attempts
ttwu_moved
removing these makes it harder to determine how successful
try_to_wake_up() was at moving a process. What counters would
I use to get this information if these were removed?

ttwu_remote
ttwu_wake_remote
so what's the one line description of what these count now?

smt_cnt
sbe_cnt
how might I see how often sched_migrate_task() and sched_exec()
were called if these were deleted?

lb_pulled
Rather than add another counter here, would it be as effective
to make pt_gained a domain counter? Looks like you're collecting
the same information. pt_lost would have to remain a runqueue
counter, though, since losing a task has nothing to do with a
particular domain.

Rick

2004-09-09 00:03:07

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] schedstats additions

Rick Lindsley wrote:
> I have a patch here to provide more useful statistics for me. Basically
> it moves a lot more of the balancing information into the domains instead
> of the runqueue, where it is nearly useless on multi-domain setups (eg.
> SMT+SMP, SMP+NUMA).
>
> It requires a version number bump, but that isn't much of an issue because
> I think we're about the only two using it at the moment. But your tools
> will need a little bit of work.
>
> What do you think?
>
> The idea of moving some counters from runqueues to domains is fine in
> general, but I've some questions about a couple of specific changes in
> your patch.
>
> It looks to me like there are some changes in try_to_wake_up() that
> aren't schedstats related, although schedstats code is among some
> that is moved around. Is there some code there that should be
> broken out separately?
>

There is, yes. I'll be sure to seperate it.

> alb_cnt
> by moving this, we won't get an accurate look at the number of
> times we called active_load_balance and returned immediately
> because nr_running had slipped to 0 or 1. how about we add
> another counter to count that too, and/or change the name of
> this one?
>

OK.

> lb_balanced
> are you sure lb_balanced[idle] can't be deduced from lb_cnt[idle]
> and lb_failed[idle]?
>

I don't think so, because you also have the success case, which is
!balanced && !failed.

> ttwu_attempts
> ttwu_moved
> removing these makes it harder to determine how successful
> try_to_wake_up() was at moving a process. What counters would
> I use to get this information if these were removed?
>

ttwu_cnt in the rq stats, and ttwu_wake_affine / ttwu_wake_balance
in the domain stats.

> ttwu_remote
> ttwu_wake_remote
> so what's the one line description of what these count now?
>

ttwu_remote/ttwu_wake_remote are the number of times a runqueue has
woken a remote task / a remote task within that domain, respectively.
Regardless of whether or not it gets pulled onto the local CPU.

> smt_cnt
> sbe_cnt
> how might I see how often sched_migrate_task() and sched_exec()
> were called if these were deleted?
>

sbe_pushed should basically be the same as smt_cnt, barring rare
races with the cpus_allowed mask. I guess sbe_cnt doesn't have to
go.

> lb_pulled
> Rather than add another counter here, would it be as effective
> to make pt_gained a domain counter? Looks like you're collecting

Yeah removing the runqueue counters for these would be good.

> the same information. pt_lost would have to remain a runqueue
> counter, though, since losing a task has nothing to do with a
> particular domain.


Whatever domain that the pulling CPU was in, is also a fair candidate
for pt_lost. Remember, all the domains are per-CPU so any information
you can get from a per-runqueue counter you can also get from a domain
counter.

I'll make a few changes and give you another look. Thanks for the comments.