2006-08-24 12:18:30

by Martin Schwidefsky

Subject: [patch] dubious process system time.

From: Martin Schwidefsky <[email protected]>

[patch] dubious process system time.

The system time that is accounted to a process includes the time spent
in three different contexts: normal system time, hardirq time and
softirq time. To account hardirq time and softirq time to a process
seems wrong, because the process could just happen to run when the
interrupt arrives that was caused by an i/o for a completely different
process. And the sum over stime and cstime of all processes won't
match cpustat->system either.
The following patch changes the accounting of system time so that
hardirq and softirq time are not accounted to a process anymore.

Signed-off-by: Martin Schwidefsky <[email protected]>
---

kernel/sched.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)

diff -urpN linux-2.6/kernel/sched.c linux-2.6-patched/kernel/sched.c
--- linux-2.6/kernel/sched.c 2006-08-01 10:09:55.000000000 +0200
+++ linux-2.6-patched/kernel/sched.c 2006-08-24 13:42:40.000000000 +0200
@@ -2939,17 +2939,16 @@ void account_system_time(struct task_str
 	struct rq *rq = this_rq();
 	cputime64_t tmp;
 
-	p->stime = cputime_add(p->stime, cputime);
-
 	/* Add system time to cpustat. */
 	tmp = cputime_to_cputime64(cputime);
 	if (hardirq_count() - hardirq_offset)
 		cpustat->irq = cputime64_add(cpustat->irq, tmp);
 	else if (softirq_count())
 		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
-	else if (p != rq->idle)
+	else if (p != rq->idle) {
+		p->stime = cputime_add(p->stime, cputime);
 		cpustat->system = cputime64_add(cpustat->system, tmp);
-	else if (atomic_read(&rq->nr_iowait) > 0)
+	} else if (atomic_read(&rq->nr_iowait) > 0)
 		cpustat->iowait = cputime64_add(cpustat->iowait, tmp);
 	else
 		cpustat->idle = cputime64_add(cpustat->idle, tmp);


2006-08-24 12:32:58

by Andi Kleen

Subject: Re: [patch] dubious process system time.

Martin Schwidefsky <[email protected]> writes:

> From: Martin Schwidefsky <[email protected]>
>
> [patch] dubious process system time.
>
> The system time that is accounted to a process includes the time spent
> in three different contexts: normal system time, hardirq time and
> softirq time. To account hardirq time and softirq time to a process
> seems wrong, because the process could just happen to run when the
> interrupt arrives that was caused by an i/o for a completely different
> process. And the sum over stime and cstime of all processes won't
> match cpustat->system either.
> The following patch changes the accounting of system time so that
> hardirq and softirq time are not accounted to a process anymore.

So where does it get accounted then? It has to be accounted somewhere.
Sounds like a quite radical change to me, might break a lot of
existing assumptions.

-Andi

2006-08-24 13:28:27

by Martin Schwidefsky

Subject: Re: [patch] dubious process system time.

On Thu, 2006-08-24 at 14:32 +0200, Andi Kleen wrote:
> > The system time that is accounted to a process includes the time spent
> > in three different contexts: normal system time, hardirq time and
> > softirq time. To account hardirq time and softirq time to a process
> > seems wrong, because the process could just happen to run when the
> > interrupt arrives that was caused by an i/o for a completely different
> > process. And the sum over stime and cstime of all processes won't
> > match cpustat->system either.
> > The following patch changes the accounting of system time so that
> > hardirq and softirq time are not accounted to a process anymore.
>
> So where does it get accounted then? It has to be accounted somewhere.
> Sounds like a quite radical change to me, might break a lot of
> existing assumptions.

At the moment hardirq+softirq is just added to a random process, in
general this is completely wrong. You just need a system with a cpu hog
and an i/o bound process and you get queer results.
To add hardirq+softirq to a single process is wrong to begin with, for
that you would need to be able to identify the process that caused the
i/o. And if two processes require a single file page then what? Split
the time required to load the page to two processes? Not really. The
conclusion is that hardirq+softirq time should not be accounted to any
process. It is accounted globally in cpustat->irq and
cpustat->softirq.

There is one assumption that would break by the change: that the sum of
the hardirq and softirq time is contained in the sum of the stime and
cstime fields of all processes. I don't think that this is relevant.

--
blue skies,
Martin.

Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH

"Reality continues to ruin my life." - Calvin.


2006-08-24 15:18:53

by Andi Kleen

Subject: Re: [patch] dubious process system time.


> At the moment hardirq+softirq is just added to a random process, in
> general this is completely wrong.

It's better than not accounting it at all.

> You just need a system with a cpu hog
> and an i/o bound process and you get queer results.

Yes, but system load that is invisible to standard monitoring
tools is even worse.

If you stop accounting it to random processes you have to
account it somewhere else. Preferably somewhere that standard tools
automatically pick up.

-Andi

2006-08-24 16:02:47

by Martin Schwidefsky

Subject: Re: [patch] dubious process system time.

On Thu, 2006-08-24 at 17:18 +0200, Andi Kleen wrote:
> > At the moment hardirq+softirq is just added to a random process, in
> > general this is completely wrong.
>
> It's better than not accounting it at all.

I think it is worse than not accounting it. You are "charging" a process
of some user for something that the user has nothing to do with.

> > You just need a system with a cpu hog
> > and an i/o bound process and you get queer results.
>
> Yes, but system load that is invisible to standard monitoring
> tools is even worse.

But it isn't invisible. cpustat->irq and cpustat->softirq will be
increased. /proc/stat will show the system time spent in these two
contexts.

> If you stop accounting it to random processes you have to
> account it somewhere else. Preferably somewhere that standard tools
> automatically pick up.

Again, why do I have to account non-process related time to a process?
Imho that is completely wrong.

--
blue skies,
Martin.

Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH

"Reality continues to ruin my life." - Calvin.


2006-08-24 23:42:57

by Paul Mackerras

Subject: Re: [patch] dubious process system time.

Martin Schwidefsky writes:

> The system time that is accounted to a process includes the time spent
> in three different contexts: normal system time, hardirq time and
> softirq time.

Is that true (at the moment) with CONFIG_VIRT_CPU_ACCOUNTING=y? I
thought it wasn't.

Paul.

2006-08-25 08:18:51

by Martin Schwidefsky

Subject: Re: [patch] dubious process system time.

On Fri, 2006-08-25 at 09:40 +1000, Paul Mackerras wrote:
> > The system time that is accounted to a process includes the time spent
> > in three different contexts: normal system time, hardirq time and
> > softirq time.
>
> Is that true (at the moment) with CONFIG_VIRT_CPU_ACCOUNTING=y? I
> thought it wasn't.

CONFIG_VIRT_CPU_ACCOUNTING improves the precision of the numbers that
get accounted with account_[user,system,steal]_time. Which bucket the
time goes into is decided in the three functions.

--
blue skies,
Martin.

Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH

"Reality continues to ruin my life." - Calvin.


2006-08-25 10:15:25

by Helge Hafting

Subject: Re: [patch] dubious process system time.

Martin Schwidefsky wrote:
> On Thu, 2006-08-24 at 17:18 +0200, Andi Kleen wrote:
>
>>> At the moment hardirq+softirq is just added to a random process, in
>>> general this is completely wrong.
>>>
>> It's better than not accounting it at all.
>>
>
> I think it is worse than not accounting it. You are "charging" a process
> of some user for something that the user has nothing to do with.
>
>
>>> You just need a system with a cpu hog
>>> and an i/o bound process and you get queer results.
>>>
>> Yes, but system load that is invisible to standard monitoring
>> tools is even worse.
>>
>
> But it isn't invisible. cpustat->irq and cpustat->softirq will be
> increased. /proc/stat will show the system time spent in these two
> contexts.
>
>
>> If you stop accounting it to random processes you have to
>> account it somewhere else. Preferably somewhere that standard tools
>> automatically pick up.
>>
>
> Again, why do I have to account non-process related time to a process?
> Imho that is completely wrong.
>
If softirq time has to be accounted to a process (so as to not
get lost), how about accounting it to the softirqd process? Much
more reasonable than random processes.

Helge Hafting

2006-08-25 10:29:32

by Martin Schwidefsky

Subject: Re: [patch] dubious process system time.

On Fri, 2006-08-25 at 12:12 +0200, Helge Hafting wrote:
> > Again, why do I have to account non-process related time to a process?
> > Imho that is completely wrong.
> >
> If softirq time has to be accounted to a process (so as to not
> get lost), how about accounting it to the softirqd process? Much
> more reasonable than random processes.

The main question still is if it is correct to add softirq/hardirq time
to the system time of a process. If the answer turns out to be yes, then
it might be a clever idea to account softirq time to the softirqd. That
still leaves the question what to do with hardirq time ..
My take still is that softirq/hardirq time does not belong to the system
time of any process.

--
blue skies,
Martin.

Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH

"Reality continues to ruin my life." - Calvin.


2006-08-25 12:58:21

by Paul Mackerras

Subject: Re: [patch] dubious process system time.

Martin Schwidefsky writes:

> The main question still is if it is correct to add softirq/hardirq time
> to the system time of a process. If the answer turns out to be yes, then
> it might be a clever idea to account softirq time to the softirqd. That
> still leaves the question what to do with hardirq time ..
> My take still is that softirq/hardirq time does not belong to the system
> time of any process.

I agree.

Paul.