2004-03-30 16:45:37

by Richard B. Johnson

[permalink] [raw]
Subject: sched_yield() version 2.4.24


Anybody know why a task that does:

for(;;)
sched_yield();

shows 100% CPU utilization when there are other tasks that
are actually getting the CPU? It seems that a caller of
sched_yield() is never shown as sleeping for any portion of
the time it gives up the CPU. On the other hand, if usleep(0)
is substituted, the task is shown as sleeping.

This suggests that the accounting for sched_yield() is mucked
up. The call itself works fine; it gives up the CPU to other
tasks. However, `top` shows the caller as a CPU hog, which it
isn't.

Simple code to check it out:

#include <sched.h>   /* sched_yield() */
#include <unistd.h>  /* usleep() */

int main(void)
{
#if BAD
	for (;;)
		sched_yield();  /* shown as 100% CPU by top */
#else
	for (;;)
		usleep(0);      /* shown as sleeping by top */
#endif
	return 0;
}
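
Build both variants with something like (the file name here is
just whatever you saved it as):

	cc -DBAD=1 -o yield test.c   # the sched_yield() loop
	cc -o sleep test.c           # the usleep(0) loop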

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
Note 96.31% of all statistics are fiction.



2004-03-30 16:58:41

by Chris Friesen

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

Richard B. Johnson wrote:
> Anybody know why a task that does:
>
> for(;;)
> sched_yield();
>
> shows 100% CPU utilization when there are other tasks that
> are actually getting the CPU?

What do the other tasks show for cpu in top?

Maybe it's an artifact of the timer-based process sampling for cpu
utilization, and it just happens to be running when the timer interrupt
fires, so it keeps getting billed?
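
If so, the mechanism would look something like this
self-contained sketch (an illustration of tick-based sampling
only; the field names are simplified assumptions, not the
literal kernel/timer.c code):

/* On every timer interrupt, whoever is on the CPU is billed
 * one whole tick. */
struct task {
        unsigned long utime;    /* ticks charged in user mode   */
        unsigned long stime;    /* ticks charged in kernel mode */
};

/* called HZ times per second from the timer interrupt */
void account_tick(struct task *current_task, int user_mode)
{
        if (user_mode)
                current_task->utime++;
        else
                current_task->stime++;
        /* A task that yields between ticks but happens to be
         * running when the interrupt fires still pays for the
         * whole tick; time given away between ticks is never
         * seen by the sampler. */
}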

Chris

2004-03-30 17:08:51

by Richard B. Johnson

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

On Tue, 30 Mar 2004, Chris Friesen wrote:

> Richard B. Johnson wrote:
> > Anybody know why a task that does:
> >
> > for(;;)
> > sched_yield();
> >
> > shows 100% CPU utilization when there are other tasks that
> > are actually getting the CPU?
>
> What do the other tasks show for cpu in top?
>

Well in excess of 100% on a single-CPU system.

12:02pm up 1 day, 53 min, 4 users, load average: 2.54, 1.25, 0.90
34 processes: 31 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 65.8% user, 134.6% system, 0.0% nice, 0.0% idle
Mem: 322352K av, 101772K used, 220580K free, 0K shrd, 9836K buff
Swap: 1044208K av, 1044208K used, 0K free 20240K cached

PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND
7144 root 19 0 5564 5564 1444 R 0 82.5 1.7 2:27 client
7143 root 15 0 980 976 428 S 0 59.9 0.3 1:57 server
7142 root 18 0 1464 1464 1444 R 0 56.0 0.4 1:39 client
7163 root 11 0 568 564 432 R 0 1.9 0.1 0:00 top
[SNIPPED...sleeping tasks]

Here, one of the 'client' tasks is yielding its CPU time while
it waits on a semaphore from the first one.
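
The wait loop is essentially the following (a hypothetical
reconstruction; the actual test code isn't posted here):

#include <sched.h>
#include <semaphore.h>

/* Spin on the semaphore, yielding the CPU on each failed try
 * instead of blocking in the kernel. */
void wait_for_server(sem_t *sem)
{
        while (sem_trywait(sem) != 0)   /* not posted yet */
                sched_yield();          /* hand the CPU away */
}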

> Maybe it's an artifact of the timer-based process sampling for cpu
> utilization, and it just happens to be running when the timer interrupt
> fires, so it keeps getting billed?
>
> Chris
>

I think somebody forgot to put something into the 'current' structure
when sys_sched_yield() gets called.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
Note 96.31% of all statistics are fiction.


2004-03-30 17:31:40

by Chris Friesen

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

Richard B. Johnson wrote:

> Well in excess of 100% on a single-CPU system.

Very odd.

> 12:02pm up 1 day, 53 min, 4 users, load average: 2.54, 1.25, 0.90
> 34 processes: 31 sleeping, 3 running, 0 zombie, 0 stopped
> CPU states: 65.8% user, 134.6% system, 0.0% nice, 0.0% idle
> Mem: 322352K av, 101772K used, 220580K free, 0K shrd, 9836K buff
> Swap: 1044208K av, 1044208K used, 0K free 20240K cached
>
> PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND
> 7144 root 19 0 5564 5564 1444 R 0 82.5 1.7 2:27 client
> 7143 root 15 0 980 976 428 S 0 59.9 0.3 1:57 server
> 7142 root 18 0 1464 1464 1444 R 0 56.0 0.4 1:39 client
> 7163 root 11 0 568 564 432 R 0 1.9 0.1 0:00 top
> [SNIPPED...sleeping tasks]


The cpu util accounting code in kernel/timer.c hasn't changed in 2.4
since 2002. Must be somewhere else.

Anyone else have any ideas?


Chris

2004-03-30 17:54:53

by Ben Greear

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

Chris Friesen wrote:

> The cpu util accounting code in kernel/timer.c hasn't changed in 2.4
> since 2002. Must be somewhere else.
>
> Anyone else have any ideas?

As another sample point, I have fired up about 100 processes with
each process having 10+ threads. On my dual-xeon, I see maybe 15
processes shown as 99% CPU in 'top'. System load was near 25
when I was looking, but the machine was still quite responsive.

I'm guessing this is just an artifact of having lots of processes
runnable very often, and top just isn't able to measure at a fine
enough granularity?

This is on 2.4.25 kernel.

Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2004-03-30 19:40:33

by Denis Vlasenko

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

On Tuesday 30 March 2004 19:52, Ben Greear wrote:
> Chris Friesen wrote:
> > The cpu util accounting code in kernel/timer.c hasn't changed in 2.4
> > since 2002. Must be somewhere else.
> >
> > Anyone else have any ideas?
>
> As another sample point, I have fired up about 100 processes with
> each process having 10+ threads. On my dual-xeon, I see maybe 15
> processes shown as 99% CPU in 'top'. System load was near 25
> when I was looking, but the machine was still quite responsive.

There was a top bug with exactly this symptom. Fixed.
I use procps-2.0.18.

> I'm guessing this is just an artifact of having lots of processes
> runnable very often, and top just isn't able to measure at a fine
> enough granularity?
>
> This is on 2.4.25 kernel.
>
> Ben
--
vda

2004-03-30 20:27:25

by Richard B. Johnson

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

On Tue, 30 Mar 2004, Denis Vlasenko wrote:

> On Tuesday 30 March 2004 19:52, Ben Greear wrote:
> > Chris Friesen wrote:
> > > The cpu util accounting code in kernel/timer.c hasn't changed in 2.4
> > > since 2002. Must be somewhere else.
> > >
> > > Anyone else have any ideas?
> >
> > As another sample point, I have fired up about 100 processes with
> > each process having 10+ threads. On my dual-xeon, I see maybe 15
> > processes shown as 99% CPU in 'top'. System load was near 25
> > when I was looking, but the machine was still quite responsive.
>
> There was a top bug with exactly this symptom. Fixed.
> I use procps-2.0.18.
>
Wonderful! Now, where do I find the sources now that RedHat has
gone "commercial" and is keeping everything secret?

I followed the http://sources.redhat.com/procps/ instructions
__exactly__ and get this:

Script started on Tue Mar 30 15:27:02 2004
quark:/home/johnson/foo[1] cvs -d :pserver:[email protected]:/procps login anoncvs
Logging in to :pserver:[email protected]:2401/procps
CVS password:
/procps: no such repository
quark:/home/johnson/foo[2] exit
Script done on Tue Mar 30 15:28:32 2004


Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
Note 96.31% of all statistics are fiction.


2004-03-30 23:17:48

by Diego Calleja

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

On Tue, 30 Mar 2004 15:29:29 -0500 (EST), "Richard B. Johnson" <[email protected]> wrote:

> Wonderful! Now, where do I find the sources now that RedHat has
> gone "commercial" and is keeping everything secret?

Exactly *why* are you trying to spread FUD? Use another distro if
you don't like this one, instead of using stupid arguments.


> I followed the http://sources.redhat.com/procps/ instructions
> __exactly__ and get this:

Me too, and it works here; note the repository path is /cvs/procps,
not /procps:

diego@estel:/tmp$ cvs -d :pserver:[email protected]:/cvs/procps login anoncvs
Logging in to :pserver:[email protected]:2401/cvs/procps
CVS password:
diego@estel:/tmp$ cvs -d :pserver:[email protected]:/cvs/procps co procps
cvs server: Updating procps
U procps/.cvsignore
U procps/BUGS
U procps/COPYING
U procps/COPYING.LIB
U procps/INSTALL
U procps/Makefile
U procps/NEWS
U procps/TODO
U procps/free.1
U procps/free.c
U procps/pgrep.1
U procps/pgrep.c
U procps/pkill.1
U procps/pmap.1
U procps/pmap.c
U procps/procps.spec
U procps/skill.1
U procps/skill.c
U procps/slabtop.1
U procps/slabtop.c
U procps/snice.1
U procps/sysctl.8
U procps/sysctl.c
U procps/sysctl.conf.5
U procps/tload.1
U procps/tload.c
U procps/top.1
U procps/top.c
U procps/uptime.1
U procps/uptime.c
U procps/vmstat.8
U procps/vmstat.c
U procps/w.1
U procps/w.c
U procps/watch.1
U procps/watch.c
cvs server: Updating procps/proc
U procps/proc/.cvsignore
U procps/proc/Makefile
U procps/proc/compare.c
U procps/proc/devname.c
U procps/proc/ksym.c
U procps/proc/procps.h
U procps/proc/pwcache.c
U procps/proc/readproc.c
U procps/proc/readproc.h
U procps/proc/signals.c
U procps/proc/slab.c
U procps/proc/slab.h
U procps/proc/status.c
U procps/proc/sysinfo.c
U procps/proc/sysinfo.h
U procps/proc/version.c
U procps/proc/version.h
U procps/proc/vmstat.c
U procps/proc/vmstat.h
U procps/proc/whattime.c
cvs server: Updating procps/ps
U procps/ps/.cvsignore
U procps/ps/HACKING
U procps/ps/Makefile
U procps/ps/common.h
U procps/ps/display.c
U procps/ps/escape.c
U procps/ps/global.c
U procps/ps/help.c
U procps/ps/output.c
U procps/ps/parser.c
U procps/ps/ps.1
U procps/ps/regression
U procps/ps/select.c
U procps/ps/sortformat.c
U procps/ps/stacktrace.c
cvs server: Updating procps/xproc
diego@estel:/tmp$

2004-03-31 13:54:33

by Richard B. Johnson

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

On Tue, 30 Mar 2004, Richard B. Johnson wrote:

> On Tue, 30 Mar 2004, Denis Vlasenko wrote:
>
> > On Tuesday 30 March 2004 19:52, Ben Greear wrote:
> > > Chris Friesen wrote:
> > > > The cpu util accounting code in kernel/timer.c hasn't changed in 2.4
> > > > since 2002. Must be somewhere else.
> > > >
> > > > Anyone else have any ideas?
> > >
> > > As another sample point, I have fired up about 100 processes with
> > > each process having 10+ threads. On my dual-xeon, I see maybe 15
> > > processes shown as 99% CPU in 'top'. System load was near 25
> > > when I was looking, but the machine was still quite responsive.
> >
> > There was a top bug with exactly this symptom. Fixed.
> > I use procps-2.0.18.
> >
> Wonderful! Now, where do I find the sources now that RedHat has
> gone "commercial" and is keeping everything secret?
>
> I followed the http://sources.redhat.com/procps/ instructions
> __exactly__ and get this:
>
> Script started on Tue Mar 30 15:27:02 2004
> quark:/home/johnson/foo[1] cvs -d :pserver:[email protected]:/procps login anoncvs
> Logging in to :pserver:[email protected]:2401/procps
> CVS password:
> /procps: no such repository
> quark:/home/johnson/foo[2] exit
> Script done on Tue Mar 30 15:28:32 2004
>

The RedHat server was apparently broken yesterday; many people
tried to get the source. Eventually Burton Windle, who had
acquired a copy earlier, sent it to me after he too failed to
reach the server.

I compiled the source and the problem persists. Any task that
executes sched_yield() gets "charged" for the time that it has
given away. This is not correct. Maybe it is not correctable,
but it is still not correct. Besides being "unfair", it messes
up the totals, because the tasks that use the given-up CPU time
also get charged for it.
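
A quick way to see the charging, assuming procfs is mounted
(fields 14 and 15 of /proc/self/stat are utime and stime, in
ticks):

#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
        time_t end = time(NULL) + 5;    /* yield for ~5 seconds */
        unsigned long utime, stime;
        FILE *f;

        while (time(NULL) < end)
                sched_yield();

        f = fopen("/proc/self/stat", "r");
        if (!f)
                return 1;
        /* skip the first 13 fields, then read utime and stime */
        fscanf(f, "%*d %*s %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u "
                  "%*u %lu %lu", &utime, &stime);
        fclose(f);
        printf("charged: utime=%lu stime=%lu ticks\n", utime, stime);
        return 0;
}

If the yielded-away time were accounted as sleep, as it is with
usleep(0), those counts would stay near zero instead of growing
with wall-clock time.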

Cheers,
Dick Johnson
Penguin : Linux version 2.4.24 on an i686 machine (797.90 BogoMips).
Note 96.31% of all statistics are fiction.


2004-04-02 19:43:42

by Eric W. Biederman

[permalink] [raw]
Subject: Re: sched_yield() version 2.4.24

"Richard B. Johnson" <[email protected]> writes:

> On Tue, 30 Mar 2004, Richard B. Johnson wrote:
>
> > On Tue, 30 Mar 2004, Denis Vlasenko wrote:
> >
> > > On Tuesday 30 March 2004 19:52, Ben Greear wrote:
> > > > Chris Friesen wrote:
> > > > > The cpu util accounting code in kernel/timer.c hasn't changed in 2.4
> > > > > since 2002. Must be somewhere else.
> > > > >
> > > > > Anyone else have any ideas?
> > > >
> > > > As another sample point, I have fired up about 100 processes with
> > > > each process having 10+ threads. On my dual-xeon, I see maybe 15
> > > > processes shown as 99% CPU in 'top'. System load was near 25
> > > > when I was looking, but the machine was still quite responsive.
> > >
> > > There was a top bug with exactly this symptom. Fixed.
> > > I use procps-2.0.18.
> > >
> > Wonderful! Now, where do I find the sources now that RedHat has
> > gone "commercial" and is keeping everything secret?
> >
> > I followed the http://sources.redhat.com/procps/ instructions
> > __exactly__ and get this:
> >
> > Script started on Tue Mar 30 15:27:02 2004
> > quark:/home/johnson/foo[1] cvs -d :pserver:[email protected]:/procps login anoncvs
> > Logging in to :pserver:[email protected]:2401/procps
> > CVS password:
> > /procps: no such repository
> > quark:/home/johnson/foo[2] exit
> > Script done on Tue Mar 30 15:28:32 2004
> >
>
> The RedHat server was apparently broken yesterday; many people
> tried to get the source. Eventually Burton Windle, who had
> acquired a copy earlier, sent it to me after he too failed to
> reach the server.
>
> I compiled the source and the problem persists. Any task that
> executes sched_yield() gets "charged" for the time that it has
> given away. This is not correct. Maybe it is not correctable,
> but it is still not correct. Besides being "unfair", it messes
> up the totals, because the tasks that use the given-up CPU time
> also get charged for it.

Could it be that there are no other processes with equal or greater
priority, so that the process calling sched_yield gets run again
immediately?

Eric