2012-05-23 21:54:19

by Jonathan Nieder

Subject: Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

Hi Anders,

Anders Boström wrote[1]:

> Starting with 3.2.17-1, the CPU load accounting is broken when the
> computer is idle. The CPU load is reported as >0.50 when
> idle. 3.2.16-1 don't suffer from this problem.
>
> Suspected patch is the upstream patch
> "sched: Fix nohz load accounting -- again!"
> commit 5e2d50da11f0e6ec3ce8fe658d7c83b0b4346c68 to 3.2 and
> originating from c308b56b5398779cd3da0f62ab26b0453494c3d4 .
>
> See also:
>
> https://bugs.launchpad.net/unity/+bug/991370
> https://lkml.org/lkml/2012/5/22/310
> https://bugzilla.redhat.com/show_bug.cgi?id=822877
> https://bbs.archlinux.org/viewtopic.php?id=141289

Thanks for writing.

If I understand correctly, the load average calculation both before
and after that commit is broken, in different ways.

I'm cc-ing Lesław Kopeć, Aman Gupta, and Doug Smythies who worked
on the above change[2]. I recommend pulling Thomas Gleixner
<[email protected]> into the conversation once you have a better
idea of what's going on or a new change to recommend. If you'd like
to also track this on a bugtracker, http://bugzilla.kernel.org/,
product Process Management, component Scheduler might be a good place.

Aside from that, I can't really offer much to help you, but others
on linux-kernel might.

Hope that helps,
Jonathan

[1] http://bugs.debian.org/674153
[2] http://thread.gmane.org/gmane.linux.kernel/1249223/focus=1262319
[3] http://thread.gmane.org/gmane.linux.kernel/1291870/focus=1292058


2012-05-24 21:45:25

by Jonathan Nieder

Subject: Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

(cc-ing Peter and Thomas because there is a nice graph)
> Anders Boström wrote[1]:

>> Starting with 3.2.17-1, the CPU load accounting is broken when the
>> computer is idle. The CPU load is reported as >0.50 when
>> idle. 3.2.16-1 don't suffer from this problem.
>>
>> Suspected patch is the upstream patch
>> "sched: Fix nohz load accounting -- again!"
>> commit 5e2d50da11f0e6ec3ce8fe658d7c83b0b4346c68 to 3.2 and
>> originating from c308b56b5398779cd3da0f62ab26b0453494c3d4 .
>>
>> See also:
>>
>> https://bugs.launchpad.net/unity/+bug/991370
>> https://lkml.org/lkml/2012/5/22/310
>> https://bugzilla.redhat.com/show_bug.cgi?id=822877
>> https://bbs.archlinux.org/viewtopic.php?id=141289

I just found [1] from [2] which seems to describe the symptoms pretty
well. Peter, Thomas, advice?

Anders et al: does 556061b00c9f ("sched/nohz: Fix rq->cpu_load[]
calculations", 2012-05-11) change anything?

Thanks,
Jonathan

[1] https://launchpadlibrarian.net/105809696/commit_low_load_rev2.png
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/838811

2012-05-30 14:32:23

by Doug Smythies

Subject: RE: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

Hi,

The referenced PNG file was sent to everyone on the address list on 2012.05.22, and the previous version was sent on 2012.05.09.
The only reason the PNG file was made was for the e-mail, because I was instructed not to refer to external sources.
The web page version of the PNG file, which is kept up to date, is at [3].

"does 556061b00c9f ("sched/nohz: Fix rq->cpu_load[] calculations", 2012-05-11) change anything?"
I back edited those changes into my test environment yesterday. It made no difference with respect to this issue (minimally tested).

This statement: "Starting with 3.2.17-1, the CPU load accounting is broken when the computer is idle. The CPU load is reported as >0.50 when idle. 3.2.16-1 don't suffer from this problem."
in my opinion has the following mistakes:
. The computer is not actually idle. If it were actually idle, the reported load average would be 0.
. Yes, the load average reported by the new kernel is high, as detailed in the PNG file or the web notes.
. The older kernel suffers from a different problem: all other conditions being the same, its reported load average would have been too low.

[3] http://www.smythies.com/~doug/network/load_average/new.html

Doug Smythies

2012-05-30 15:04:35

by Anders Boström

Subject: Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

>>>>> "DS" == Doug Smythies <[email protected]> writes:

DS> This statement: "Starting with 3.2.17-1, the CPU load accounting is broken when the computer is idle. The CPU load is reported as >0.50 when idle. 3.2.16-1 don't suffer from this problem."
DS> In my opinion has the following mistakes:
DS> . The computer is not actually idle. If it was actually idle the reported load average would be 0.

Well, I tested in single user mode, with very few processes running,
mostly init, getty, bash and top (+ a lot of kernel threads). And
3.2.17 reported a load of >0.5 . Under the same conditions 3.2.16
typically reports 0.01 or 0.00 .

DS> . Yes, the new kernel reported load average is high, as detailed in the PNG file or the web notes.
DS> . The older kernel suffers from a different problem, under all other conditions being the same, the reported load average would have been too low.

I don't know if 0.01 is *too* low, but it should be much closer to the
truth than >0.5.

/ Anders

2012-06-05 15:35:48

by Lesław Kopeć

Subject: Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

On 05/30/2012 04:54 PM, Anders Boström wrote:

> DS> This statement: "Starting with 3.2.17-1, the CPU load accounting is broken when the computer is idle. The CPU load is reported as >0.50 when idle. 3.2.16-1 don't suffer from this problem."
> DS> In my opinion has the following mistakes:
> DS> . The computer is not actually idle. If it was actually idle the reported load average would be 0.
>
> Well, I tested in single user mode, with very few processes running,
> mostly init, getty, bash and top (+ a lot of kernel threads). And
> 3.2.17 reported a load of >0.5 . Under the same conditions 3.2.16
> typically reports 0.01 or 0.00 .

I've tried to reproduce the problem, but haven't had much luck. I've
tested vanilla and Debian kernels versions 3.2.16 and 3.2.17. Load on an
idle or slightly busy system is the same across all versions.

vanilla 3.2.16     0.15 0.07 0.06
vanilla 3.2.17     0.17 0.11 0.13
Debian  3.2.16-1   0.13 0.07 0.05
Debian  3.2.17-1   0.10 0.09 0.11

When the system is completely idle load drops to 0. I've also tried
3.2.17 with 556061b00c9f, but it makes no difference and in comparison
to plain 3.2.17 load is the same even on a busy system.
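
For anyone collecting comparable numbers, here is a minimal sketch of reading the reported load averages. It assumes the three columns in the table above are the 1, 5 and 15 minute values from /proc/loadavg, which the message does not state. The function names are hypothetical.

```python
def parse_loadavg(text):
    """Parse one line in /proc/loadavg format, e.g. '0.15 0.07 0.06 1/123 4567'."""
    fields = text.split()
    one, five, fifteen = (float(x) for x in fields[:3])
    # Fourth field is "<currently runnable entities>/<total entities>".
    runnable, total = (int(x) for x in fields[3].split("/"))
    return {
        "1min": one,
        "5min": five,
        "15min": fifteen,
        "runnable": runnable,
        "total": total,
        "last_pid": int(fields[4]),
    }


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the current load averages (Linux only)."""
    with open(path) as f:
        return parse_loadavg(f.read())


if __name__ == "__main__":
    print(parse_loadavg("0.15 0.07 0.06 1/123 4567"))
```

Logging read_loadavg() at a fixed interval on each kernel, together with the list of running processes, would make reports from different machines easier to compare.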

I can't explain why we're getting different results on the same kernels.
If you'd like more details just ask.

--
Lesław Kopeć


2012-06-08 17:02:22

by Doug Smythies

Subject: RE: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

>> On 2012.05.30 07:54, Anders Boström wrote:
> On 2012.06.05 08:35, Lesław Kopeć wrote:

>> Well, I tested in single user mode, with very few processes running,
>> mostly init, getty, bash and top (+ a lot of kernel threads). And
>> 3.2.17 reported a load of >0.5 . Under the same conditions 3.2.16
>> typically reports 0.01 or 0.00 .

>> I don't know if 0.01 is *too* low, but it should be much closer to the
>> truth than >0.5.

I agree. However, the non-"idle" case also needs to be considered. For
example, for a real load of 5.70, a reported load average of 0 is much
further from the truth than the 5.6 being reported now.

> When the system is completely idle load drops to 0. I've also tried
> 3.2.17 with 556061b00c9f, but it makes no difference and in comparison
> to plain 3.2.17 load is the same even on a busy system.

> I can't explain why we're getting different results on the same kernels.

The different results are due to differences in the processes that are
running on those same kernels, and in particular the frequency at which
those processes do stuff and sleep. Where enough detail has been
available on various problem reports, I have always found much more CPU
activity than on my server system with no GUI. These have typically been
GUI based "desktop" linux systems. Where I have been able to figure
it out, the real "idle" load has been between 0.1 and 0.2 and reported
as about 0.8 to 1.2.
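
A toy model (not kernel code, and not the nohz accounting under dispute) of why the timing of wake-ups can matter so much. It assumes the load average is built from instantaneous samples of the runnable count taken about every 5 seconds, and it ignores the extra tick the kernel adds to that interval. All names and numbers are hypothetical.

```python
LOAD_SAMPLE_MS = 5000  # assumed sampling interval: roughly 5 seconds


def true_utilization(period_ms, run_ms):
    """Fraction of time a periodic task is actually runnable."""
    return run_ms / period_ms


def sampled_utilization(period_ms, run_ms, samples=7000, offset_ms=0):
    """Fraction of periodic samples that catch the task while runnable.

    The task is modelled as runnable during the first run_ms of every
    period_ms window, and asleep for the rest of the window.
    """
    hits = 0
    for k in range(samples):
        t = offset_ms + k * LOAD_SAMPLE_MS
        if t % period_ms < run_ms:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    # A task that runs 10% of the time, woken in step with the sampler:
    print(true_utilization(5000, 500))                     # 0.1
    print(sampled_utilization(5000, 500))                  # 1.0
    # Same task, but waking halfway between samples:
    print(sampled_utilization(5000, 500, offset_ms=2500))  # 0.0
```

In this model the same 10% task is seen as a load of 1.0 or 0.0 depending only on its phase relative to the sampler, so two machines with the same kernel but different background processes can report very different numbers.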

All of my analysis work on these reported load averages has been
based on the assumption that the background load is close enough to 0 to
ignore. Obviously that assumption needed to be checked; see [1]. Also see
the attached PNG file (also posted at [2]). (Summary: my results are the
same as Lesław's.)

By the way, I found and tested 5aaa0b7a2ed5b12692c9ffb5222182bd558d3146.
It is similar (minimally tested).

I am certainly not an expert, and I find the load average area of the
code extremely difficult to follow and understand. That being said, I
think the root issue here is the 10 tick grace period. I think that
CPU idle enter/exit transitions cannot be ignored during this period,
and somehow need to be accumulated towards the next sample time. So far,
I have been unsuccessful trying to help with a suggested solution. I will
continue to try.

Disclaimers:

My web pages and notes often refer to reported load averages to two
decimal places. I agree that is ridiculous. One should only expect
+- 0.1 to 0.15 at best, and that only for the 15 minute average after
it has settled. It is worse for the shorter time constants.

It is hoped that readers understand that the 15 minute reported load
average never goes below 0.05 (after it has gone above that value once).
That is simply a consequence of integer math with a finite number of bits.
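
A minimal sketch of that integer math. It assumes the kernel's fixed-point constants are FSHIFT = 11 and EXP_15 = 2037, that calc_load() rounds to nearest, and that /proc/loadavg adds FIXED_1/200 before printing, as recalled from the 3.2-era source. None of these values are stated in this thread, so treat them as assumptions.

```python
FSHIFT = 11               # assumed: bits of fractional precision
FIXED_1 = 1 << FSHIFT     # fixed-point representation of 1.0
EXP_15 = 2037             # assumed: 15 minute decay factor in fixed point


def calc_load(load, exp, active):
    """One update step of the exponentially decaying average (fixed point)."""
    load = load * exp + active * (FIXED_1 - exp)
    load += 1 << (FSHIFT - 1)          # round to nearest (assumed)
    return load >> FSHIFT


def format_load(load):
    """Format a fixed-point load the way /proc/loadavg is assumed to."""
    load += FIXED_1 // 200             # display offset of about 0.005
    frac = ((load & (FIXED_1 - 1)) * 100) >> FSHIFT
    return "%d.%02d" % (load >> FSHIFT, frac)


def decay_to_floor(start, exp):
    """Apply calc_load with nothing runnable until the value stops changing."""
    load = start
    while True:
        nxt = calc_load(load, exp, 0)
        if nxt == load:
            return load
        load = nxt


if __name__ == "__main__":
    floor = decay_to_floor(FIXED_1, EXP_15)   # start from a load of 1.00
    print(floor, format_load(floor))          # 93 0.05
    print(calc_load(0, EXP_15, 0))            # 0: a value that never rose stays at 0
```

Under these assumptions, once the stored value is at or below 93/2048 the rounding step cancels the decay exactly, which is why the printed 15 minute figure sticks at 0.05 on an otherwise idle machine.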

[1] http://www.smythies.com/~doug/network/load_average/background.html
[2] http://www.smythies.com/~doug/network/load_average/background_histograms.png

See also general related web notes at: http://www.smythies.com/~doug/network/load_average/index.html

Doug Smythies


Attachments:
background_histograms.png (42.46 kB)

2012-06-10 17:49:49

by Jonathan Nieder

Subject: Re: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

Hi Doug et al,

Doug Smythies wrote:

> "does 556061b00c9f ("sched/nohz: Fix rq->cpu_load[] calculations",
> 2012-05-11) change anything?"
>
> I back edited those changes into my test environment yesterday. It
> made no difference with respect to this issue. (minimally tested.)
[...]
> By the way, I found and tested 5aaa0b7a2ed5b12692c9ffb5222182bd558d3146
> It is similar (minimally tested).
>
> I am certainly not an expert, and I find the load average area of the
> code extremely difficult to follow and understand. That being said, I
> think the root issue here is the 10 tick grace period. I think that
> cpu idle enter exit transitions can not be ignored during this period,
> and somehow needs to be accumulated towards the next sample time. So far,
> I have been unsuccessful trying to help with a suggested solution. I will
> continue to try.

Another load average related patch is being discussed (not meant
particularly to address the too-low load case, just mentioning it
FYI):

sched: Folding nohz load accounting more accurate

After patch 453494c3d4 (sched: Fix nohz load accounting -- again!), we can fold
the idle into calc_load_tasks_idle between the last cpu load calculating and
calc_global_load calling. However problem still exits between the first cpu
load calculating and the last cpu load calculating. Every time when we do load
calculating, calc_load_tasks_idle will be added into calc_load_tasks, even if
the idle load is caused by calculated cpus. This problem is also described in
the following link:

https://lkml.org/lkml/2012/5/24/419

This bug can be found in our work load. The average running processes number
is about 15, but the load only shows about 4.

From [*].

Hope that helps,
Jonathan

[*] http://thread.gmane.org/gmane.linux.kernel/1310462

2012-06-12 06:12:38

by Doug Smythies

Subject: RE: [3.2.16 -> 3.2.17 regression] High reported CPU load when idle

>> On 2012.06.08 10:01 Doug Smythies wrote:
> On 2012.06.10 10:50 Jonathan Nieder wrote:

>> By the way, I found and tested
>> 5aaa0b7a2ed5b12692c9ffb5222182bd558d3146
>> It is similar (minimally tested).

That commit was included a day later in kernel 3.5 RC2, which I also
tested, for low load conditions only (in case I made some mistake with
my manual back edit).
Herein, the abbreviation "5aaa" means Kernel 3.5 RC2 with
5aaa0b7a2ed5b12692c9ffb5222182bd558d3146 and its predecessors.

> Another load average related patch is being discussed (not meant
> particularly to address the too-low load case, just mentioning it FYI):

> sched: Folding nohz load accounting more accurate
> [...]
> From [*].
> [*] http://thread.gmane.org/gmane.linux.kernel/1310462

Jonathan: Thanks for the reference.

I also back edited that patch (by Charles Wang) into my working Kernel.
Herein, the abbreviation "Wang" means my working Kernel (3.2.0-24.39
(Ubuntu reference)) with these back edits: The above referenced patch by
Charles Wang; 5aaa0b7a2ed5b12692c9ffb5222182bd558d3146;
556061b00c9f2fd6a5524b6bde823ef12f299ecf;
and c308b56b5398779cd3da0f62ab26b0453494c3d4.

The abbreviation "c308" means my working kernel with only
c308b56b5398779cd3da0f62ab26b0453494c3d4.

The abbreviation "Control" means a tick based kernel compiled with
CONFIG_NO_HZ=no.

See the attached PNG file (and/or [1]) for relatively low load test
results.

Summary:

"c308" and "5aaa" are the same, with reported load averages higher than
actual.
"Wang" is worse, with even larger errors in the reported load averages.
"Control" tends to track, but sometimes reported load averages are
somewhat low.

[1] http://www.smythies.com/~doug/network/load_average/wang_compare.png

Doug Smythies


Attachments:
wang_compare.png (76.30 kB)