I first noticed this while (cross-)compiling several 2.6.30 kernels on my
Core Duo HP 2510p notebook. I run the kernel builds with 'nice -n 10' and
noticed that both cores stayed at 800MHz instead of going up to 1333MHz.

It does not seem to be a cpufreq problem as the frequency does go up if I
run the process without nice.

I can simply reproduce it by running an empty loop:
$ sh -c "while :; do :; done" => one core immediately goes to 1333MHz
$ nice -n 10 sh -c "while :; do :; done" => both cores stay at 800MHz

In both cases top shows 99/100% CPU usage for one core.

The problem does not occur immediately after a new boot: the cpu frequency
does get raised to 1333MHz even for niced processes. I've also checked
that a single suspend to RAM + resume cycle does not trigger it.

It is possible that it is triggered by undocking the notebook (I have not
verified that yet), but I do know that the problem remains after the
notebook is docked again.

I'm certain that the problem did not occur with earlier kernels (even when
undocked), but am not sure when it first started happening. As I'm not
yet certain how to trigger it, I cannot currently check that.

System is running x86_64 kernel with Debian stable ("Lenny") userland.

Any suggestions?

Cheers,
FJP

# grep . /sys/devices/system/cpu/*/cpufreq/*
.../cpu0/cpufreq/affected_cpus:0
.../cpu0/cpufreq/cpuinfo_cur_freq:800000
.../cpu0/cpufreq/cpuinfo_max_freq:1333000
.../cpu0/cpufreq/cpuinfo_min_freq:800000
.../cpu0/cpufreq/cpuinfo_transition_latency:10000
.../cpu0/cpufreq/related_cpus:0 1
.../cpu0/cpufreq/scaling_available_frequencies:1333000 1200000 1067000 933000 800000
.../cpu0/cpufreq/scaling_available_governors:ondemand performance
.../cpu0/cpufreq/scaling_cur_freq:800000
.../cpu0/cpufreq/scaling_driver:acpi-cpufreq
.../cpu0/cpufreq/scaling_governor:ondemand
.../cpu0/cpufreq/scaling_max_freq:1333000
.../cpu0/cpufreq/scaling_min_freq:800000
.../cpu0/cpufreq/scaling_setspeed:<unsupported>
.../cpu1/cpufreq/affected_cpus:1
.../cpu1/cpufreq/cpuinfo_cur_freq:800000
.../cpu1/cpufreq/cpuinfo_max_freq:1333000
.../cpu1/cpufreq/cpuinfo_min_freq:800000
.../cpu1/cpufreq/cpuinfo_transition_latency:10000
.../cpu1/cpufreq/related_cpus:0 1
.../cpu1/cpufreq/scaling_available_frequencies:1333000 1200000 1067000 933000 800000
.../cpu1/cpufreq/scaling_available_governors:ondemand performance
.../cpu1/cpufreq/scaling_cur_freq:800000
.../cpu1/cpufreq/scaling_driver:acpi-cpufreq
.../cpu1/cpufreq/scaling_governor:ondemand
.../cpu1/cpufreq/scaling_max_freq:1333000
.../cpu1/cpufreq/scaling_min_freq:800000
.../cpu1/cpufreq/scaling_setspeed:<unsupported>
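The reproduction described in the message above could be scripted roughly as follows. This is only a sketch: the helper names show_freqs and busy_test and the CPU_ROOT override are made up for illustration, and it assumes the per-CPU cpufreq sysfs layout shown in the dump.

```shell
#!/bin/sh
# Root of the cpufreq sysfs tree. CPU_ROOT is a hypothetical override so
# the helpers can also be pointed at a copy of the tree.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# Print "cpuN <kHz>" for every CPU that exposes scaling_cur_freq.
show_freqs() {
    for f in "$CPU_ROOT"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        cpu=${f#"$CPU_ROOT"/}   # strip the root prefix
        cpu=${cpu%%/*}          # keep only the "cpuN" component
        printf '%s %s\n' "$cpu" "$(cat "$f")"
    done
}

# Run the empty loop at the given nice level for a few seconds, print the
# frequencies while it is still running, then stop it.
# usage: busy_test NICE_LEVEL [SECONDS]
busy_test() {
    nice -n "$1" sh -c 'while :; do :; done' &
    pid=$!
    sleep "${2:-3}"
    show_freqs
    kill "$pid"
    wait "$pid" 2>/dev/null || true
}
```

Running "busy_test 0" and then "busy_test 10" should show the difference reported above on an affected system: one core at 1333000 in the first case, both at 800000 in the second.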
On Fri, 2009-06-12 at 09:44 -0700, Frans Pop wrote:
> I first noticed this while (cross-)compiling several 2.6.30 kernels on my
> Core Duo HP 2510p notebook. I run the kernel builds with 'nice -n 10' and
> noticed that both cores stayed at 800MHz instead of going up to 1333MHz.
What does ignore_nice under cpufreq/ondemand say?
# grep . /sys/devices/system/cpu/*/cpufreq/ondemand/*
Thanks,
Venki
Thanks for the quick reply Venki.
On Friday 12 June 2009, Pallipadi, Venkatesh wrote:
> What does ignore_nice under cpufreq/ondemand say?
Right, that's 1 (was not aware that existed :-P)
And changing it to 0 solves the problem.
Next question is: how and why does it get set?
As userland has not changed (AFAIK), my first suspect remains the kernel.
# grep . /sys/devices/system/cpu/cpu*/cpufreq/ondemand/*
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice_load:1
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate:80000
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate_max:40000000
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/sampling_rate_min:40000
/sys/devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold:90
/sys/devices/system/cpu/cpu1/cpufreq/ondemand/ignore_nice_load:1
/sys/devices/system/cpu/cpu1/cpufreq/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpu1/cpufreq/ondemand/sampling_rate:80000
/sys/devices/system/cpu/cpu1/cpufreq/ondemand/sampling_rate_max:40000000
/sys/devices/system/cpu/cpu1/cpufreq/ondemand/sampling_rate_min:40000
/sys/devices/system/cpu/cpu1/cpufreq/ondemand/up_threshold:90
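The check and the workaround from the message above (read ignore_nice_load, then write 0 to it) might be wrapped like this. A sketch only: show_ignore_nice, set_ignore_nice and CPU_ROOT are hypothetical names, and the per-CPU ondemand directory is assumed to be where the dump above shows it. Writing to the real files needs root.

```shell
#!/bin/sh
# Root of the cpufreq sysfs tree; CPU_ROOT is a hypothetical override.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# Print every ignore_nice_load file together with its current value.
show_ignore_nice() {
    for f in "$CPU_ROOT"/cpu[0-9]*/cpufreq/ondemand/ignore_nice_load; do
        [ -r "$f" ] || continue
        printf '%s:%s\n' "$f" "$(cat "$f")"
    done
}

# Write the given value (0 or 1) to every writable ignore_nice_load file.
# usage: set_ignore_nice 0
set_ignore_nice() {
    for f in "$CPU_ROOT"/cpu[0-9]*/cpufreq/ondemand/ignore_nice_load; do
        [ -w "$f" ] || continue
        echo "$1" > "$f"
    done
}
```

"set_ignore_nice 0" followed by "show_ignore_nice" corresponds to the manual fix described above. It does not stop whatever userland component set the value from setting it again.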
On Fri, 2009-06-12 at 10:25 -0700, Frans Pop wrote:
> Thanks for the quick reply Venki.
>
> On Friday 12 June 2009, Pallipadi, Venkatesh wrote:
> > What does ignore_nice under cpufreq/ondemand say?
>
> Right, that's 1 (was not aware that existed :-P)
> And changing it to 0 solves the problem.
OK. Good to know that there are no kernel bugs with honoring
ignore_nice_load setting. :)
> Next question is: how and why does it get set?
> As userland has not changed (AFAIK), my first suspect remains the kernel.
>
Kernel never sets this. It is initialized to 0 and provides a /sys
interface to user. I think it is set by some user app
(gnome-power-manager or some other app like that). That explains why it
is 0 initially after boot and gets changed later.
The support for ignore_nice_load=1 was broken in kernel for a short
while (around 2.6.28, IIRC). That may be the reason why this behavior
was not noticed earlier.
Thanks,
Venki
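Why a niced loop at 99% CPU leaves the frequency at minimum can be illustrated with a little arithmetic. The function below is a simplified model of how the ondemand governor is understood to compute load, not a copy of the kernel code: with ignore_nice_load=1, time spent in niced processes is counted as idle time before the load percentage is compared against up_threshold. The name ondemand_load and the sample numbers are made up for illustration; 80000 and 90 are the sampling_rate and up_threshold values from the dump earlier in the thread.

```shell
#!/bin/sh
# Simplified load calculation for one sampling interval.
# usage: ondemand_load WALL IDLE NICE IGNORE_NICE
# WALL, IDLE and NICE are time deltas in the same unit (e.g. microseconds).
ondemand_load() {
    wall=$1
    idle=$2
    nice_t=$3
    ignore=$4
    if [ "$ignore" -eq 1 ]; then
        # Niced time is treated as if the CPU had been idle.
        idle=$((idle + nice_t))
    fi
    # Integer percentage of the interval that counts as busy.
    echo $(( 100 * (wall - idle) / wall ))
}
```

For an 80000 us interval in which a niced loop used 79000 us and the CPU was idle for 1000 us, "ondemand_load 80000 1000 79000 0" prints 98, which is above the up_threshold of 90, while "ondemand_load 80000 1000 79000 1" prints 0, so the governor sees no reason to raise the frequency.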
On Friday 12 June 2009, Pallipadi, Venkatesh wrote:
> On Fri, 2009-06-12 at 10:25 -0700, Frans Pop wrote:
> > On Friday 12 June 2009, Pallipadi, Venkatesh wrote:
> > > What does ignore_nice under cpufreq/ondemand say?
> >
> > Right, that's 1 (was not aware that existed :-P)
> > And changing it to 0 solves the problem.
>
> OK. Good to know that there are no kernel bugs with honoring
> ignore_nice_load setting. :)
>
> > Next question is: how and why does it get set?
> > As userland has not changed (AFAIK), my first suspect remains the
> > kernel.
>
> Kernel never sets this. It is initialized to 0 and provides a /sys
> interface to user. I think it is set by some user app
> (gnome-power-manager or some other app like that). That explains why it
> is 0 initially after boot and gets changed later.
>
> The support for ignore_nice_load=1 was broken in kernel for a short
> while (around 2.6.28, IIRC). That may be the reason why this behavior
> was not noticed earlier.
Thanks for the info. I'll see if I can figure out what's responsible.
At least I know where to look now.
Cheers,
FJP
On Friday 12 June 2009, Frans Pop wrote:
> On Friday 12 June 2009, Pallipadi, Venkatesh wrote:
> > On Fri, 2009-06-12 at 10:25 -0700, Frans Pop wrote:
> > > On Friday 12 June 2009, Pallipadi, Venkatesh wrote:
> > > > What does ignore_nice under cpufreq/ondemand say?
> > >
> > > Right, that's 1 (was not aware that existed :-P)
> > > And changing it to 0 solves the problem.
> >
> > > Next question is: how and why does it get set?
> > > As userland has not changed (AFAIK), my first suspect remains the
> > > kernel.
> >
> > Kernel never sets this. It is initialized to 0 and provides a /sys
> > interface to user. I think it is set by some user app
> > (gnome-power-manager or some other app like that). That explains why
> > it is 0 initially after boot and gets changed later.
I think I have it figured out.
HAL has a method 'SetCPUFreqConsiderNice' which writes the file.
I use KDE's kpowersave, which has some code that calls that method through
dbus and sets the value to the return value of a function getAcAdapter().
I.e., the intention seems to be to ignore niced processes when not on AC
(if I understand Matthew Garrett's blog posts correctly that is probably
not even correct policy, but let's ignore that for now).
But it also looks like the whole implementation, either in kpowersave or
in hal/dbus (or quite likely all three), is so crap that ignore_nice only
actually does get set if the moon is in phase with Saturn or something
like that. At least, I've tried undocking my notebook (removed AC) a few
times without seeing any change in ignore_nice.
I've got an inotify on the file now, so I should get some info next time
the setting does get changed. Possibly that will confirm this theory.
After that I'll probably change the kpowersave source and remove the code
that changes the setting.
Thanks again for the pointers!
Cheers,
FJP
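The inotify watch mentioned in the last message could look something like the sketch below. It assumes the inotifywait tool from the inotify-tools package is installed and that writes to the sysfs file from userland are reported as modify events; log_value and watch_file are hypothetical helper names. It records when the value changes, not which process changed it.

```shell
#!/bin/sh
# Print a timestamped line with the file's current value.
# usage: log_value FILE
log_value() {
    printf '%s %s=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$(cat "$1")"
}

# Log the current value, then block on inotify and log again after
# every modify event on the file.
# usage: watch_file FILE
watch_file() {
    log_value "$1"
    inotifywait -q -m -e modify "$1" | while read -r event; do
        log_value "$1"
    done
}
```

For example, "watch_file /sys/devices/system/cpu/cpu0/cpufreq/ondemand/ignore_nice_load" can be left running in a terminal, and the timestamps compared afterwards with dock, undock and AC events.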