2003-07-31 13:07:25

by Christian Vogel

Subject: linux-2.6.0-test2: Never using pm_idle (CPU wasting power)

Hi,

on a Thinkpad 600X I noticed the CPU getting very hot. It turned
out that pm_idle was never called (which invokes the ACPI pm_idle
call in this case) and default_idle was used instead.

/* arch/i386/kernel/process.c, line 723 */
void cpu_idle (void)
{
	/* endless idle loop with no priority at all */
	while (1) {
		void (*idle)(void) = pm_idle;
		if (!idle)
			idle = default_idle; /* once on bootup */
		irq_stat[smp_processor_id()].idle_timestamp = jiffies;
		while (!need_resched())
			idle();
		schedule(); /* never reached */
	}
}

The schedule() is never reached (need_resched() is never 0) and
so the idle-variable is not updated. pm_idle is NULL on the
first call to cpu_idle on this thinkpad, and so I stay idling
in the default_idle()-function.

By moving the "void *idle = pm_idle; if(!idle)..." into the inner
while()-loop the notebook calls pm_idle (as it gets updated by ACPI)
and stays cool.
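
To make the effect concrete, here is a small userspace model, not kernel
code. It only mirrors the names pm_idle and default_idle; acpi_idle,
idle_snapshot_once, idle_reread and reset_model are made up for
illustration, and the bounded for-loop stands in for need_resched()
staying false for that many passes.

```c
#include <stddef.h>

/* Userspace model only: names mirror the kernel's, nothing here is kernel code. */
typedef void (*idle_fn)(void);

int default_calls, pm_calls;
idle_fn pm_idle;	/* NULL until the (modelled) ACPI hook is installed */

void default_idle(void) { default_calls++; }
void acpi_idle(void) { pm_calls++; }	/* hypothetical stand-in for the ACPI idle routine */

/* Old shape: the pointer is sampled once, before the inner loop. */
void idle_snapshot_once(int iterations, int install_at)
{
	idle_fn idle = pm_idle ? pm_idle : default_idle;
	int i;

	for (i = 0; i < iterations; i++) {
		if (i == install_at)
			pm_idle = acpi_idle;	/* hook shows up while we are idling */
		idle();				/* still the stale snapshot */
	}
}

/* New shape: the pointer is re-read on every pass of the inner loop. */
void idle_reread(int iterations, int install_at)
{
	int i;

	for (i = 0; i < iterations; i++) {
		idle_fn idle;

		if (i == install_at)
			pm_idle = acpi_idle;
		idle = pm_idle ? pm_idle : default_idle;
		idle();
	}
}

/* Put the model back into its boot-time state. */
void reset_model(void)
{
	pm_idle = NULL;
	default_calls = 0;
	pm_calls = 0;
}
```

In this model, after reset_model() and idle_snapshot_once(10, 3), acpi_idle
is never called even though the hook was installed on the fourth pass;
with idle_reread(10, 3) it takes over from that pass on.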

Chris

--
Programming today is a race between software engineers striving to build
bigger and better idiot-proof programs, and the Universe trying to
produce bigger and better idiots. So far, the Universe is winning.
-- Rich Cook


2003-07-31 14:12:52

by Charles Lepple

Subject: Re: linux-2.6.0-test2: Never using pm_idle (CPU wasting power)

Christian Vogel said:
> on a Thinkpad 600X I noticed the CPU getting very hot. It turned
> out that pm_idle was never called (which invokes the ACPI pm_idle
> call in this case) and default_idle was used instead.

The amd76x_pm patch[1] also hooks pm_idle, and while I'm not 100% sure
about this (haven't instrumented it all the way), it also appears that
pm_idle is not being called (for up to an hour, in some cases). Sometimes
it takes only a few minutes, and other times, it appears to kick in after
heavy CPU usage (kernel compiles, cpuburn, etc.). After pm_idle is called
once, things seem normal-- after the system quiesces, pm_idle is always
called when nothing is ready to run.

Since the amd76x_pm patch appears to work pretty well under 2.4.x, I get
the feeling that the problem lies in the cpu_idle() function. (But I
realize there could be SMP interactions in this case-- in the amd76x_pm
patch, C2 isn't entered unless both CPUs are idle; hence the need for
further instrumentation.)

[1] http://marc.theaimsgroup.com/?l=linux-kernel&m=105665646000758&w=3

--
Charles Lepple <[email protected]>
http://www.ghz.cc/charles/

2003-07-31 14:45:02

by Christian Vogel

Subject: Re: linux-2.6.0-test2: Never using pm_idle (CPU wasting power)

Hi,

On Thu, Jul 31, 2003 at 10:12:51AM -0400, Charles Lepple wrote:
> about this (haven't instrumented it all the way), it also appears that
> pm_idle is not being called (for up to an hour, in some cases). Sometimes
> it takes only a few minutes, and other times, it appears to kick in after
> heavy CPU usage (kernel compiles, cpuburn, etc.).

Yes, exactly. It lasts until the first time need_resched() returns true and
the outer while(1){ } loop comes around again to update the idle-variable.
Unfortunately, on my system that did not happen for a long time...

Probably the idle loop uses this local variable to be more cache-friendly,
but then any module updating pm_idle should probably set need_resched
to force an update of the idle function pointer?
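
A rough sketch of that idea as a userspace model, not kernel code. The
setter set_pm_idle, the flag resched_flag, hook_idle and idle_outer_pass
are all hypothetical names; the model leaves the snapshot-once loop as
it is and says nothing about whether the flag would actually be seen by
the idle loop on a real kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these are real kernel interfaces. */
typedef void (*idle_fn)(void);

idle_fn pm_idle;	/* mirrors the kernel global of the same name */
bool resched_flag;	/* stands in for need_resched() of the idle task */
int default_calls, hook_calls;

void default_idle(void) { default_calls++; }
void hook_idle(void) { hook_calls++; }	/* hypothetical power-saving routine */

/* Hypothetical setter: install the handler and kick the idle loop. */
void set_pm_idle(idle_fn fn)
{
	pm_idle = fn;
	resched_flag = true;	/* forces the outer loop to come around */
}

/*
 * One pass of the outer loop in its unpatched shape. The hook is
 * installed after install_at inner iterations (-1: never), and
 * max_idle bounds the pass so the model terminates.
 * Returns the number of inner iterations executed.
 */
int idle_outer_pass(int max_idle, int install_at)
{
	idle_fn idle = pm_idle ? pm_idle : default_idle;
	int n = 0;

	while (!resched_flag && n < max_idle) {
		idle();
		n++;
		if (n == install_at)
			set_pm_idle(hook_idle);	/* e.g. a module loading */
	}
	resched_flag = false;	/* stands in for schedule() */
	return n;
}
```

Here the first pass ends as soon as the hook is installed, and the next
pass picks up the new pointer when it takes its snapshot.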

Chris

--
Read the OSI protocol specifications? I can't even *lift* them!

2003-07-31 22:44:37

by Roger Larsson

Subject: Re: linux-2.6.0-test2: Never using pm_idle (CPU wasting power)

On Thursday 31 July 2003 15.07, Christian Vogel wrote:
> Hi,
>
> on a Thinkpad 600X I noticed the CPU getting very hot. It turned
> out that pm_idle was never called (which invokes the ACPI pm_idle
> call in this case) and default_idle was used instead.
>
> /* arch/i386/kernel/process.c, line 723 */
> void cpu_idle (void)
> {
> 	/* endless idle loop with no priority at all */
> 	while (1) {
> 		void (*idle)(void) = pm_idle;
> 		if (!idle)
> 			idle = default_idle; /* once on bootup */
> 		irq_stat[smp_processor_id()].idle_timestamp = jiffies;
> 		while (!need_resched())
> 			idle();
> 		schedule(); /* never reached */
> 	}
> }
>
> The schedule() is never reached (need_resched() is never 0) and
> so the idle-variable is not updated. pm_idle is NULL on the
> first call to cpu_idle on this thinkpad, and so I stay idling
> in the default_idle()-function.
>

This smells preemptive kernel, correct?

> By moving the "void *idle = pm_idle; if(!idle)..." into the inner
> while()-loop the notebook calls pm_idle (as it gets updated by ACPI)
> and stays cool.
>

/RogerL

--
Roger Larsson
Skellefteå
Sweden

2003-07-31 22:52:20

by Robert Love

Subject: Re: linux-2.6.0-test2: Never using pm_idle (CPU wasting power)

On Thu, 2003-07-31 at 15:45, Roger Larsson wrote:

> This smells preemptive kernel, correct?

Doesn't look like anything specific to kernel preemption to me.

But if need_resched() is never zero, the while loop fails, and
schedule() _should_ be reached. So something is fishy here. Looking at
it the other way, however, if schedule() is never called, then
need_resched() will remain non-zero forever. Maybe you mean
need_resched() is never non-zero, i.e., it is always zero?

I think the problem is that there is a race between whatever you are
saying ACPI is doing with pm_idle and the setting of idle. Actually,
that is exactly what you are saying :)

Your fix smells exactly of that. Maybe have ACPI set need_resched, so
the loop flips over again?

Robert Love


2003-07-31 23:12:08

by Andrew Morton

Subject: Re: linux-2.6.0-test2: Never using pm_idle (CPU wasting power)

Christian Vogel <[email protected]> wrote:
>
> Hi,
>
> on a Thinkpad 600X I noticed the CPU getting very hot. It turned
> out that pm_idle was never called (which invokes the ACPI pm_idle
> call in this case) and default_idle was used instead.
>
> /* arch/i386/kernel/process.c, line 723 */
> void cpu_idle (void)
> {
> 	/* endless idle loop with no priority at all */
> 	while (1) {
> 		void (*idle)(void) = pm_idle;
> 		if (!idle)
> 			idle = default_idle; /* once on bootup */
> 		irq_stat[smp_processor_id()].idle_timestamp = jiffies;
> 		while (!need_resched())
> 			idle();
> 		schedule(); /* never reached */
> 	}
> }
>
> The schedule() is never reached (need_resched() is never 0) and
> so the idle-variable is not updated. pm_idle is NULL on the
> first call to cpu_idle on this thinkpad, and so I stay idling
> in the default_idle()-function.
>
> By moving the "void *idle = pm_idle; if(!idle)..." into the inner
> while()-loop the notebook calls pm_idle (as it gets updated by ACPI)
> and stays cool.

Yes, I assume that need_resched() is always false because kernel preemption
cuts in first. Can you please confirm that you're using CONFIG_PREEMPT,
and that the problem goes away if CONFIG_PREEMPT is disabled?

The problem which you identify will also invalidate the idle_timestamp
instrumentation so I think we should move that inside as well.


diff -puN arch/i386/kernel/process.c~acpi-idle-fix arch/i386/kernel/process.c
--- 25/arch/i386/kernel/process.c~acpi-idle-fix Thu Jul 31 15:51:16 2003
+++ 25-akpm/arch/i386/kernel/process.c Thu Jul 31 15:57:27 2003
@@ -139,12 +139,15 @@ void cpu_idle (void)
 {
 	/* endless idle loop with no priority at all */
 	while (1) {
-		void (*idle)(void) = pm_idle;
-		if (!idle)
-			idle = default_idle;
-		irq_stat[smp_processor_id()].idle_timestamp = jiffies;
-		while (!need_resched())
+		while (!need_resched()) {
+			void (*idle)(void) = pm_idle;
+
+			if (!idle)
+				idle = default_idle;
+
+			irq_stat[smp_processor_id()].idle_timestamp = jiffies;
 			idle();
+		}
 		schedule();
 	}
 }

_


2003-07-31 23:21:35

by Robert Love

Subject: Re: linux-2.6.0-test2: Never using pm_idle (CPU wasting power)

On Thu, 2003-07-31 at 15:58, Robert Love wrote:
> On Thu, 2003-07-31 at 15:45, Roger Larsson wrote:
>
> > This smells preemptive kernel, correct?
>
> Doesn't look like anything specific to kernel preemption to me.

Oh I really misgrok'ed this.

Yah, kernel preemption catches the reschedule off of the interrupt and
thus this is never true (always zero). The "never zero" thing confused
me, sorry.

Moving the stuff into the while loop is one option.

Robert Love


2003-08-01 09:18:21

by Christian Vogel

Subject: Re: linux-2.6.0-test2: Never using pm_idle (CPU wasting power)

Hi Andrew,

On Thu, Jul 31, 2003 at 03:59:48PM -0700, Andrew Morton wrote:
> Yes, I assume that need_resched() is always false because kernel preemption
> cuts in first. Can you please confirm that you're using CONFIG_PREEMPT,
> and that the problem goes away if CONFIG_PREEMPT is disabled?

Yes, I was using PREEMPT. Unfortunately the machine is not here right
now, so I can't test without it. Your diff is also exactly what I did, and
it helped. (I already wrote that.)

Chris

--
Smith & Wesson: The original point-and-click interface