2015-07-15 16:27:28

by Ken Moffat

Subject: CONFIG_NO_HZ_FULL restricts cpu usage to the equivalent of one in 4.2

New title: I originally posted this last night, but I've now made
a little progress in identifying what changed. The previous thread
was labelled for AMD Phenom, but the problem is more general. CC'ing
Jeff because he replied to the original, though I guess he probably
won't be interested after this.

Yesterday was the first day I had done any real compiling with
4.2-rc. I started out using -rc1, and make -j4 on a recent LFS/BLFS
system (gcc-5.1.0, make-4.1, etc). I wanted to start trying to
build kde5, and I expected a lot of issues. What I did not expect
was that qt5 would build as if I was using -j1.

Examination eventually identified that with 4.2-rc1 and 4.2-rc2,
make ran the number of jobs I had specified, but the total of the
CPU percentages ('top' from procps-ng-3.3.10) maxed out at 100%. On
4.1 kernels the percentage with -j4 maxes out at 400% (my machine
has 4 cores). I suspected either an unfortunate choice in 'make
oldconfig', or something specific to an AMD Phenom / gcc-5.1.

Today I have tried make -j4 on two other machines with 4.2-rc
kernels [ building the current git stable release ]. On my i3
SandyBridge everything was fine, CPU usage approached 400%. On my
AMD A10-7850K I have the same problem as on the phenom. Not
surprising, I began by using the Phenom config when I got the A10,
then adapted it to suit, whereas the i3 has much less memory so I
haven't made many changes since I got it.

Comparing the configs for the i3 and the A10, the first thing which
looked as if it might be relevant was the _CPU_ACCOUNTING choices.
Those on the A10 seem to be driven from CONFIG_NO_HZ_FULL so I began
by changing that to CONFIG_NO_HZ_IDLE. Payday ;-) make -j4 now
approaches 400% CPU usage.

The config differences follow. Perhaps it is actually one of the
subsequent choices that is the problem. And I guess it could still
be a gcc-5.1 issue.

--- config-4.2-initial 2015-07-15 16:25:12.548005751 +0100
+++ config-4.2-speed-ok 2015-07-15 17:00:50.919998703 +0100
@@ -104,11 +104,8 @@
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
-# CONFIG_NO_HZ_IDLE is not set
-CONFIG_NO_HZ_FULL=y
-CONFIG_NO_HZ_FULL_ALL=y
-CONFIG_NO_HZ_FULL_SYSIDLE=y
-CONFIG_NO_HZ_FULL_SYSIDLE_SMALL=4
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y

@@ -116,7 +113,9 @@
# CPU/Task time and stats accounting
#
CONFIG_VIRT_CPU_ACCOUNTING=y
+# CONFIG_TICK_CPU_ACCOUNTING is not set
CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
+# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
@@ -131,7 +130,6 @@
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
CONFIG_CONTEXT_TRACKING=y
-CONFIG_RCU_USER_QS=y
CONFIG_CONTEXT_TRACKING_FORCE=y
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_RCU_NOCB_CPU=y
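
A comparison like the one above can also be done option by option
rather than line by line. The following is a minimal sketch, not
part of the original report: the helper names parse_config and
diff_configs are hypothetical, and it assumes the usual .config
syntax ("CONFIG_X=value" and "# CONFIG_X is not set").

```python
import re

# Matches "CONFIG_FOO=y", "CONFIG_FOO=4", "CONFIG_FOO="string"", etc.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
# Matches the comment form the kernel uses for disabled options.
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option in a kernel .config to its value ("n" if not set)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_configs(old, new):
    """Return {option: (old_value, new_value)} for options that differ.

    A value of None means the option is absent from that config.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes
```

Feeding it the text of the two config files and printing the result
of diff_configs() lists every option whose value changed, which
makes dependent options (such as those under CONFIG_NO_HZ_FULL)
easier to spot than in a raw diff.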

Anyway, I'll start a bisection. But it might take me a few days,
this is not a convenient time (somehow, kernel issues which need
bisection always come at a bad time for me).

ĸen
--
This one goes up to eleven!


2015-07-15 19:05:12

by Andy Lutomirski

Subject: Re: CONFIG_NO_HZ_FULL restricts cpu usage to the equivalent of one in 4.2

On 07/15/2015 09:27 AM, Ken Moffat wrote:
> New title, I originally posted this last night but I've now made
> a little progress in identifying what changed. Previous thread was
> labelled for AMD Phenom, but it is more general. CC'ing Jeff
> because he replied to the original, I guess he probably won't be
> interested after this.
>
> Yesterday was the first day I had done any real compiling with
> 4.2-rc. I started out using -rc1, and make -j4 on a recent LFS/BLFS
> system (gcc-5.1.0, make-4.1, etc). I wanted to start trying to
> build kde5, and I expected a lot of issues. What I did not expect
> was that qt5 would build as if I was using -j1.
>
> Examination eventually identified that with 4.2-rc1 and 4.2-rc2,
> make ran the number of jobs I had specified, but the total of the
> CPU percentages ('top' from procps-ng-3.3.10) maxed out at 100%. On
> 4.1 kernels the percentage with -j4 maxes out at 400% (my machine
> has 4 cores). I suspected either an unfortunate choice in 'make
> oldconfig', or something specific to an AMD Phenom / gcc-5.1.
>
> Today I have tried make -j4 on two other machines with 4.2-rc
> kernels [ building the current git stable release ]. On my i3
> SandyBridge everything was fine, CPU usage approached 400%. On my
> AMD A10-7850K I have the same problem as on the phenom. Not
> surprising, I began by using the Phenom config when I got the A10,
> then adapted it to suit, whereas the i3 has much less memory so I
> haven't made many changes since I got it.
>
> Comparing the configs for the i3 and the A10, the first thing which
> looked as if it might be relevant was the _CPU_ACCOUNTING choices.
> Those on the A10 seem to be driven from CONFIG_NO_HZ_FULL so I began
> by changing that to CONFIG_NO_HZ_IDLE. Payday ;-) make -j4 now
> approaches 400% CPU usage.
>
> The config differences follow. Perhaps it is actually one of the
> subsequent choices that is the problem. And I guess it could still
> be a gcc-5.1 issue.
>

Before going nuts bisecting, it could be worth running perf record -a -g
-e cycles (or perhaps -e task-clock instead of -e cycles). It could
also be worth manually sampling /proc/PID/stack a few times for a
process that isn't making as much progress as it should.

--Andy
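
The manual sampling of /proc/PID/stack suggested above could be
scripted roughly as follows. This is a minimal sketch, not something
from the thread: the helper names top_frame and sample_stack are
made up for illustration, reading /proc/PID/stack normally requires
root, and the file only exists on kernels built with
CONFIG_STACKTRACE.

```python
import time
from collections import Counter
from pathlib import Path


def top_frame(stack_text):
    """Return the function name of the first frame in /proc/PID/stack text.

    Frames look like "[<ffffffff8109c2a5>] futex_wait+0x182/0x290".
    Returns None if no frame line is found.
    """
    for line in stack_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("[<"):
            # Drop the "+offset/size" suffix, keep the symbol name.
            return parts[1].split("+")[0]
    return None


def sample_stack(pid, samples=10, interval=0.2):
    """Read /proc/<pid>/stack several times and count the top frames."""
    counts = Counter()
    path = Path(f"/proc/{pid}/stack")
    for _ in range(samples):
        frame = top_frame(path.read_text())
        counts[frame or "<no kernel frame>"] += 1
        time.sleep(interval)
    return counts
```

Calling sample_stack() on the PID of a slow compiler process and
printing the most common entries gives a rough idea of where that
process spends its time in the kernel, if anywhere.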

2015-07-15 19:11:48

by Frederic Weisbecker

Subject: Re: CONFIG_NO_HZ_FULL restricts cpu usage to the equivalent of one in 4.2

2015-07-15 18:27 GMT+02:00 Ken Moffat:
> New title, I originally posted this last night but I've now made
> a little progress in identifying what changed. Previous thread was
> labelled for AMD Phenom, but it is more general. CC'ing Jeff
> because he replied to the original, I guess he probably won't be
> interested after this.
>
> Yesterday was the first day I had done any real compiling with
> 4.2-rc. I started out using -rc1, and make -j4 on a recent LFS/BLFS
> system (gcc-5.1.0, make-4.1, etc). I wanted to start trying to
> build kde5, and I expected a lot of issues. What I did not expect
> was that qt5 would build as if I was using -j1.
>
> Examination eventually identified that with 4.2-rc1 and 4.2-rc2,
> make ran the number of jobs I had specified, but the total of the
> CPU percentages ('top' from procps-ng-3.3.10) maxed out at 100%. On
> 4.1 kernels the percentage with -j4 maxes out at 400% (my machine
> has 4 cores). I suspected either an unfortunate choice in 'make
> oldconfig', or something specific to an AMD Phenom / gcc-5.1.
>
> Today I have tried make -j4 on two other machines with 4.2-rc
> kernels [ building the current git stable release ]. On my i3
> SandyBridge everything was fine, CPU usage approached 400%. On my
> AMD A10-7850K I have the same problem as on the phenom. Not
> surprising, I began by using the Phenom config when I got the A10,
> then adapted it to suit, whereas the i3 has much less memory so I
> haven't made many changes since I got it.
>
> Comparing the configs for the i3 and the A10, the first thing which
> looked as if it might be relevant was the _CPU_ACCOUNTING choices.
> Those on the A10 seem to be driven from CONFIG_NO_HZ_FULL so I began
> by changing that to CONFIG_NO_HZ_IDLE. Payday ;-) make -j4 now
> approaches 400% CPU usage.
>
> The config differences follow. Perhaps it is actually one of the
> subsequent choices that is the problem. And I guess it could still
> be a gcc-5.1 issue.
>
> --- config-4.2-initial 2015-07-15 16:25:12.548005751 +0100
> +++ config-4.2-speed-ok 2015-07-15 17:00:50.919998703 +0100
> @@ -104,11 +104,8 @@
> CONFIG_TICK_ONESHOT=y
> CONFIG_NO_HZ_COMMON=y
> # CONFIG_HZ_PERIODIC is not set
> -# CONFIG_NO_HZ_IDLE is not set
> -CONFIG_NO_HZ_FULL=y
> -CONFIG_NO_HZ_FULL_ALL=y

You had CONFIG_NO_HZ_FULL_ALL enabled? That would indeed produce
this effect, since it isolates all CPUs except CPU 0 from the sched
domains.

Which means that basically only CPU 0 runs user tasks unless you
force them elsewhere.
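
One way to check whether a running system is in this state is to
compare the online CPUs with the nohz_full set. The sketch below is
an illustration only: it assumes the kernel exposes
/sys/devices/system/cpu/online and /sys/devices/system/cpu/nohz_full
(the latter may be missing or unparseable on some kernels, in which
case an empty set is returned), and the helper names are
hypothetical.

```python
from pathlib import Path


def parse_cpulist(text):
    """Parse a kernel cpulist string such as "0,2-3" into a set of ints."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        if "-" in chunk:
            lo, hi = chunk.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(chunk))
    return cpus


def read_cpulist(name):
    """Read /sys/devices/system/cpu/<name>; empty set if missing or odd."""
    try:
        return parse_cpulist(Path("/sys/devices/system/cpu", name).read_text())
    except (OSError, ValueError):
        return set()


def housekeeping_cpus(online, nohz_full):
    """CPUs left over for ordinary tasks once nohz_full CPUs are set aside."""
    return online - nohz_full
```

On a 4-core machine with CONFIG_NO_HZ_FULL_ALL=y one would expect
housekeeping_cpus(read_cpulist("online"), read_cpulist("nohz_full"))
to come back as just CPU 0. Forcing a task onto another CPU would
then presumably mean setting its affinity explicitly, for example
with taskset or os.sched_setaffinity().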

2015-07-16 00:32:34

by Ken Moffat

Subject: Re: CONFIG_NO_HZ_FULL restricts cpu usage to the equivalent of one in 4.2

On Wed, Jul 15, 2015 at 09:11:46PM +0200, Frederic Weisbecker wrote:
> 2015-07-15 18:27 GMT+02:00 Ken Moffat:
> >
> > The config differences follow. Perhaps it is actually one of the
> > subsequent choices that is the problem. And I guess it could still
> > be a gcc-5.1 issue.
> >
> > --- config-4.2-initial 2015-07-15 16:25:12.548005751 +0100
> > +++ config-4.2-speed-ok 2015-07-15 17:00:50.919998703 +0100
> > @@ -104,11 +104,8 @@
> > CONFIG_TICK_ONESHOT=y
> > CONFIG_NO_HZ_COMMON=y
> > # CONFIG_HZ_PERIODIC is not set
> > -# CONFIG_NO_HZ_IDLE is not set
> > -CONFIG_NO_HZ_FULL=y
> > -CONFIG_NO_HZ_FULL_ALL=y
>
> You had CONFIG_NO_HZ_FULL_ALL enabled? Because that would indeed
> produce that effect since it isolates all CPUs but 0 off sched
> domains.
>
> Which means that basically only CPU 0 runs user tasks unless you
> force them elsewhere.

Thanks. I'll put it down to a bad .config choice, although it was
fine on early 4.1. While I was starting to bisect, I noticed that
on the A10 everything was happening on CPU 0 - not sure if that was
happening on the original box, but for the moment it sounds likely.
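
The "everything on CPU 0" observation can be checked without top by
reading the CPU each process last ran on, which proc(5) documents as
field 39 ("processor") of /proc/PID/stat. A minimal sketch follows;
the helper names are hypothetical and the histogram counts every
process, kernel threads included.

```python
from collections import Counter
from pathlib import Path


def last_cpu(stat_line):
    """Return field 39 ("processor") of a /proc/<pid>/stat line."""
    # comm (field 2) may contain spaces or parentheses, so split after
    # the last ')' instead of splitting the whole line on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 sits at index 39 - 3.
    return int(rest[36])


def cpu_histogram():
    """Count how many processes last ran on each CPU."""
    counts = Counter()
    for stat in Path("/proc").glob("[0-9]*/stat"):
        try:
            counts[last_cpu(stat.read_text())] += 1
        except (OSError, IndexError, ValueError):
            continue  # the process exited while we were reading it
    return counts
```

Running cpu_histogram() a few times during a make -j4 build should
show the compiler processes spread over several CPUs on a healthy
system, and piled up on CPU 0 in the situation described here.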

ĸen
--
This one goes up to eleven!

2015-07-16 00:34:32

by Ken Moffat

Subject: Re: CONFIG_NO_HZ_FULL restricts cpu usage to the equivalent of one in 4.2

On Wed, Jul 15, 2015 at 12:05:05PM -0700, Andy Lutomirski wrote:
>
> Before going nuts bisecting, it could be worth running perf record -a -g -e
> cycles (or perhaps -e task-clock instead of -e cycles). It could also be
> worth manually sampling /proc/PID/stack a few times for a process that isn't
> making as much progress as it should.
>
> --Andy

I think Frederic has already spotted the problem - in the early
bisects, everything was building on CPU 0. But thanks for the
suggestion.

ĸen
--
This one goes up to eleven!

2015-07-18 13:26:19

by Frederic Weisbecker

Subject: Re: CONFIG_NO_HZ_FULL restricts cpu usage to the equivalent of one in 4.2

On Thu, Jul 16, 2015 at 01:32:18AM +0100, Ken Moffat wrote:
> On Wed, Jul 15, 2015 at 09:11:46PM +0200, Frederic Weisbecker wrote:
> > 2015-07-15 18:27 GMT+02:00 Ken Moffat:
> > >
> > > The config differences follow. Perhaps it is actually one of the
> > > subsequent choices that is the problem. And I guess it could still
> > > be a gcc-5.1 issue.
> > >
> > > --- config-4.2-initial 2015-07-15 16:25:12.548005751 +0100
> > > +++ config-4.2-speed-ok 2015-07-15 17:00:50.919998703 +0100
> > > @@ -104,11 +104,8 @@
> > > CONFIG_TICK_ONESHOT=y
> > > CONFIG_NO_HZ_COMMON=y
> > > # CONFIG_HZ_PERIODIC is not set
> > > -# CONFIG_NO_HZ_IDLE is not set
> > > -CONFIG_NO_HZ_FULL=y
> > > -CONFIG_NO_HZ_FULL_ALL=y
> >
> > You had CONFIG_NO_HZ_FULL_ALL enabled? Because that would indeed
> > produce that effect since it isolates all CPUs but 0 off sched
> > domains.
> >
> > Which means that basically only CPU 0 runs user tasks unless you
> > force them elsewhere.
>
> Thanks. I'll put it down to a bad .config choice, although it was
> fine on early 4.1.

Yeah, we recently decided to add the nohz_full CPUs to
cpu_isolated_map, except CPU 0.

> While I was starting to bisect, I noticed that
> on the A10 everything was happening on CPU 0 - not sure if that was
> happening on the original box, but for the moment it sounds likely.
>
> ĸen
> --
> This one goes up to eleven!