Subject: Re: [.32-rc3] scheduler: iwlagn consistently high in "waiting for
 CPU"
From: Mike Galbraith <efault@gmx.de>
To: Frans Pop <elendil@planet.nl>
Cc: Arjan van de Ven <arjan@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@elte.hu>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-wireless@vger.kernel.org
In-Reply-To: <200910072034.57511.elendil@planet.nl>
References: <200910051500.55875.elendil@planet.nl>
	 <200910061749.02805.elendil@planet.nl>
	 <200910071910.53907.elendil@planet.nl>
	 <200910072034.57511.elendil@planet.nl>
Content-Type: text/plain
Date: Thu, 08 Oct 2009 06:05:43 +0200
Message-Id: <1254974743.7797.21.camel@marge.simson.net>
Mime-Version: 1.0
Sender: linux-wireless-owner@vger.kernel.org

On Wed, 2009-10-07 at 20:34 +0200, Frans Pop wrote:
> On Wednesday 07 October 2009, Frans Pop wrote:
> > On Tuesday 06 October 2009, Frans Pop wrote:
> > > I've checked for 2.6.31.1 now and iwlagn is listed high there too when
> > > the system is idle, but with normal values of 60-100 ms. And phy0 has
> > > normal values of below 10 ms.
> > > I've now rebooted with today's mainline git; phy0 now frequently shows
> > > with values of around 100 ms too (i.e. higher than last time).
> >
> > Mike privately sent me a script to try to capture the latencies with
> > perf, but the perf output does not show any high latencies at all. It
> > looks as if we may have found a bug in latencytop here instead.
> 
> Not sure if it's relevant or what it means, but I frequently see two lines
> for iwlagn, e.g.:
> 
>     Scheduler: waiting for cpu              102.4 msec         99.7 %
>     .                                         3.3 msec          0.3 %
> 
> I get the same results with both latencytop 0.4 and 0.5.

OK, I see latencytop spikes here on an idle box too, to the tune of up
to a _second_.  Booting with nohz=off seems to have cured it.
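
For anyone wanting to confirm that the workaround is really in effect
after a reboot, here is a minimal sketch (not part of the original
report) that checks the kernel command line for nohz=off. The helper
name nohz_disabled is made up for illustration; /proc/cmdline is where
the running kernel exposes its boot parameters.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel command line
# (default: the running kernel's /proc/cmdline) contains nohz=off
# as a whole, space-separated parameter.
nohz_disabled() {
    cmdline="${1:-$(cat /proc/cmdline 2>/dev/null)}"
    case " $cmdline " in
        *" nohz=off "*) return 0 ;;
        *)              return 1 ;;
    esac
}

if nohz_disabled; then
    echo "nohz=off is active"
else
    echo "nohz=off is not set"
fi
```

This only shows that the parameter was passed at boot; it says nothing
about whether the latencytop spikes are actually gone.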

I wanted to check whether this is the same problem as the perf sched
record -C N trouble (the option I warned you not to use when recording
with the script), but unfortunately, after pulling this morning...

marge:/root/tmp # perf sched lat --sort=max
Segmentation fault

...perf sched is busted.  It seems likely to be the same thing for both
though, as the magnitude and frequency of the bogons are very similar.

	-Mike