Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757223AbbLAXFq (ORCPT ); Tue, 1 Dec 2015 18:05:46 -0500
Received: from aserp1040.oracle.com ([141.146.126.69]:44318 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756417AbbLAXFp (ORCPT ); Tue, 1 Dec 2015 18:05:45 -0500
Subject: Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.
To: Sander Eikelenboom
References: <80dcf47eb772c65a62652d7a56c8ed26@eikelenboom.it> <20151130214513.GF14317@char.us.oracle.com> <16a638d3c5e24963599e621f265181f8@eikelenboom.it> <565CD3A7.1060006@oracle.com> <1fbcd101856fde55adafb200b82ee396@eikelenboom.it>
Cc: Konrad Rzeszutek Wilk , david.vrabel@citrix.com, linux-kernel@vger.kernel.org, xen-devel@lists.xen.org
From: Boris Ostrovsky
Message-ID: <565E27C5.9050703@oracle.com>
Date: Tue, 1 Dec 2015 18:05:41 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0
MIME-Version: 1.0
In-Reply-To: <1fbcd101856fde55adafb200b82ee396@eikelenboom.it>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: userv0022.oracle.com [156.151.31.74]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4337
Lines: 149

On 12/01/2015 05:51 PM, Sander Eikelenboom wrote:
> On 2015-11-30 23:54, Boris Ostrovsky wrote:
>> On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:
>>> On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
>>>> On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom wrote:
>>>>> Hi all,
>>>>>
>>>>> I have just tested a 4.4-rc2 kernel (current linus tree) + the tip tree
>>>>> pulled on top.
>>>>>
>>>>> Running this kernel under Xen on PV-guests with multiple vcpus goes well
>>>>> (on idle < 10% cpu usage), but a guest with only a single vcpu doesn't
>>>>> idle at all, it seems a kworker thread is stuck:
>>>>> root 569 98.0 0.0 0 0 ? R 16:02 12:47 [kworker/0:1]
>>>>>
>>>>> Running a 4.3 kernel works fine with a single vcpu, bisecting would
>>>>> probably be quite painful since there were some breakages this merge
>>>>> window with respect to Xen pv-guests.
>>>>>
>>>>> There are some differences in the diffs from booting a 4.3, 4.4-single,
>>>>> 4.4-multi cpu boot:
>>>>
>>>> Boris has been tracking a bunch of them. I am attaching the latest set of
>>>> patches I've to carry on top of v4.4-rc3.
>>>
>>> Hi Konrad,
>>>
>>> I will test those, see if it fixes all my issues and report back
>>
>> They shouldn't help you ;-( (and I just saw a message from you
>> confirming this)
>>
>> The first one fixes a 32-bit bug (on bare metal too). The second fixes
>> a fatal bug for 32-bit PV guests. The other two are code
>> improvements/cleanup.
>>
>>
>>>
>>> Thanks :)
>>>
>>> -- Sander
>>>
>>>>> Between 4.3 and 4.4-single:
>>>>>
>>>>> -NR_IRQS:4352 nr_irqs:32 16
>>>>> +Using NULL legacy PIC
>>>>> +NR_IRQS:4352 nr_irqs:32 0
>>
>> This is fine, as long as you have
>> b4ff8389ed14b849354b59ce9b360bdefcdbf99c.
>>
>>>>>
>>>>> -cpu 0 spinlock event irq 17
>>>>> +cpu 0 spinlock event irq 1
>>
>> This is strange. I wouldn't expect spinlocks to use legacy irqs.
>>
>
> Could it be .. that with your fixup:
> xen/events: Always allocate legacy interrupts on PV guests
> (b4ff8389ed14b849354b59ce9b360bdefcdbf99c)
> for commit:
> x86/irq: Probe for PIC presence before allocating descs for legacy IRQs
> (8c058b0b9c34d8c8d7912880956543769323e2d8)
>
> that we now have the situation described in the commit message of
> 8c058b0b9c, but now for Xen PV instead of Hyper-V ?
> (seems both Xen and Hyper-V want to achieve the same but have
> different competing implementations ?)
>
> (BTW 8c058b0b9c has a CC for stable ... so could be destined to cause
> more trouble).

You mean my statement that irq 1 looks bad? That was a red herring, it
should be fine.

-boris

>
> --
> Sander
>
>
>>>>>
>>>>> and later on:
>>>>>
>>>>> -hctosys: unable to open rtc device (rtc0)
>>>>> +rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
>>>>>
>>>>> +genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)
>>>>> +hvc_open: request_irq failed with rc -16.
>>>>> +Warning: unable to open an initial console.
>>>>>
>>>>>
>>>>> between 4.4-single and 4.4-multi:
>>>>>
>>>>> Using NULL legacy PIC
>>>>> -NR_IRQS:4352 nr_irqs:32 0
>>>>> +NR_IRQS:4352 nr_irqs:48 0
>>
>> This is probably OK too since nr_irqs depends on number of CPUs.
>>
>> I think something is messed up with IRQ. I saw last week something
>> from setup_irq() generating a stack dump (warning) for rtc_cmos but
>> it appeared harmless at that time and now I don't see it anymore.
>>
>> -boris
>>
>>
>>>>>
>>>>> and later on:
>>>>>
>>>>> -rtc_cmos rtc_cmos: hctosys: unable to read the hardware clock
>>>>> +hctosys: unable to open rtc device (rtc0)
>>>>>
>>>>> -genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)
>>>>> -hvc_open: request_irq failed with rc -16.
>>>>> -Warning: unable to open an initial console.
>>>>>
>>>>> attached:
>>>>> - dmesg with 4.3 kernel with 1 vcpu
>>>>> - dmesg with 4.4 kernel with 1 vcpu
>>>>> - dmesg with 4.4 kernel with 2 vcpus
>>>>> - .config of the 4.4 kernel is attached.
>>>>>
>>>>> -- Sander
>>>>>
>>>>>