Message-ID: <50736456.1010407@us.ibm.com>
Date: Mon, 08 Oct 2012 16:40:06 -0700
From: John Stultz <johnstul@us.ibm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120912 Thunderbird/15.0.1
MIME-Version: 1.0
To: Fengguang Wu <fengguang.wu@intel.com>
CC: Avi Kivity <avi@redhat.com>, paulmck@linux.vnet.ibm.com,
        Josh Boyer <jwboyer@redhat.com>,
        Christian Hoffmann <email@christianhoffmann.info>,
        LKML <linux-kernel@vger.kernel.org>,
        "kvm@vger.kernel.org" <kvm@vger.kernel.org>, tglx@linutronix.de
Subject: Re: INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected
 by 0, t=10002 jiffies)
References: <20120926043445.GB2524@linux.vnet.ibm.com> <20120926081501.GA28485@localhost> <20120926164543.GD2467@linux.vnet.ibm.com> <20120927025400.GA11434@localhost> <20120927042849.GA6556@linux.vnet.ibm.com> <20120927044044.GA22753@localhost> <20120928033511.GB2480@linux.vnet.ibm.com> <506828BF.9050409@redhat.com> <20120930112308.GA18839@localhost> <50682DDE.7010002@redhat.com> <20120930115937.GA21989@localhost>
In-Reply-To: <20120930115937.GA21989@localhost>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 4075
Lines: 84

On 09/30/2012 04:59 AM, Fengguang Wu wrote:
> On Sun, Sep 30, 2012 at 01:32:46PM +0200, Avi Kivity wrote:
>> On 09/30/2012 01:23 PM, Fengguang Wu wrote:
>>> On Sun, Sep 30, 2012 at 01:10:55PM +0200, Avi Kivity wrote:
>>>> On 09/28/2012 05:35 AM, Paul E. McKenney wrote:
>>>>> On Thu, Sep 27, 2012 at 12:40:44PM +0800, Fengguang Wu wrote:
>>>>>> On Wed, Sep 26, 2012 at 09:28:50PM -0700, Paul E. McKenney wrote:
>>>>>>> On Thu, Sep 27, 2012 at 10:54:00AM +0800, Fengguang Wu wrote:
>>>>>>>> On Wed, Sep 26, 2012 at 09:45:43AM -0700, Paul E. McKenney wrote:
>>>>>>>>> On Wed, Sep 26, 2012 at 04:15:01PM +0800, Fengguang Wu wrote:
>>>>> [ . . . ]
>>>>>
>>>>>>>>> But could you also please send your .config file and a description of
>>>>>>>> .config attached.
>>>>>>>>
>>>>>>>>> the workload you are running?
>>>>>>>> It's basically the below commands. The exact initrd is not relevant in
>>>>>>>> this case because it's a boot time warning before user space is
>>>>>>>> started. The stalls roughly happen 1 time on every 10 boots.
>>>>>>> Yow!!!
>>>>>>>
>>>>>>> You have severe cross-CPU time-synchronization problems.  See for
>>>>>>> example the first dmesg, with the relevant part extracted right here.
>>>>>>> One CPU believes that it is about 37 seconds past boot, and the other
>>>>>>> CPU beleives that it is about 137 seconds past boot.  Given that large
>>>>>>> of a time difference, an RCU CPU stall warning is expected behavior.
>>>>>> Good spot! Yeah I noticed that huge timestamp gap, however didn't take
>>>>>> it seriously enough..
>>>>>>
>>>>>>> Get your two CPUs in agreement about what time it is, and I bet that
>>>>>>> the CPU stall warnings will go away.
>>>>>> Possibly KVM related? Because the warnings show up in many test boxes
>>>>>> running KVM and so is not likely some hardware specific issue.
>>>>> I vaguely recall seeing something recently.  But let's ask the KVM and
>>>>> timekeeping guys.
>>>> >From the logs it looks like hpet (why not kvmclock?) is used for the
>>>> clock, it should not generate such drifts since it is a global clock.
>>>> Can you verify current_clocksource on a boot that actually failed (in
>>>> case the clocksource is switched during runtime)?
>>> I've checked out the dmesg that's cited by Paul, attached. Yes it
>>> contains lines
>>>
>>> [    4.970051] Switching to clocksource hpet
>>>
>>> and then
>>>
>>> [    7.250353] Switching to clocksource tsc
>>>
>>> And there is no kvm-clock lines. Oh well for this particular kernel:
>>>
>> Ah, tsc will certainly break on kvm if the hardware doesn't provide a
>> constant tsc source.  I'm surprised the guest kernel didn't detect it
>> and switch back to hpet though.
> Thanks, it's good to know the root cause. All the dmesgs show the same hpet+tsc
> switching pattern (and never switch back):
>
> $ grep Switching dmesg-kvm_bisect2-inn-*21
> dmesg-kvm_bisect2-inn-41931-2012-09-27-10-37-51-3.6.0-rc7-bisect2-00078-g593d100-21:[    4.111415] Switching to clocksource hpet
> dmesg-kvm_bisect2-inn-41931-2012-09-27-10-37-51-3.6.0-rc7-bisect2-00078-g593d100-21:[    6.550098] Switching to clocksource tsc

Is this still an open issue? Fengguang's mail sounds like its resolved, 
but I'm not sure it is.

The switching from HPET -> TSC  I believe is expected, as the refined 
calibration will delay the TSC from being registered for a few seconds.  
However, its unclear why the TSC, if it is faulty, isn't being caught 
and demoted by the clocksource watchdog.

I'm also curious why this originally bisected down to 
06ae115a1d551cd952d8  (when using the kvm clock) if it was more of a 
hardware issue. And in those logs, I don't see the printk time-stamp 
inconsistencies that were alluded to in this thread.

Fengguang: Is this still reproducible? Do you have any details (dmesg) 
about host system as well?

thanks
-john

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/