Subject: Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.
To: Sander Eikelenboom
References: <80dcf47eb772c65a62652d7a56c8ed26@eikelenboom.it>
 <20151130214513.GF14317@char.us.oracle.com>
 <16a638d3c5e24963599e621f265181f8@eikelenboom.it>
 <565CD3A7.1060006@oracle.com>
 <1454b921884da94b7c1e9f434c13eb4a@eikelenboom.it>
 <565E239E.8070507@oracle.com>
 <565E2AF1.7090902@oracle.com>
 <02865a792f7e7f183b4419feea9e1009@eikelenboom.it>
Cc: linux-kernel@vger.kernel.org, xen-devel@lists.xen.org, david.vrabel@citrix.com
From: Boris Ostrovsky
Message-ID: <565E301D.7020501@oracle.com>
Date: Tue, 1 Dec 2015 18:41:17 -0500
In-Reply-To: <02865a792f7e7f183b4419feea9e1009@eikelenboom.it>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:
> On 2015-12-02 00:19, Boris Ostrovsky wrote:
>> On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:
>>> On 2015-12-01 23:47, Boris Ostrovsky wrote:
>>>> On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:
>>>>> On 2015-11-30 23:54, Boris Ostrovsky wrote:
>>>>>> On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:
>>>>>>> On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
>>>>>>>> On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom
>>>>>>>> wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I have just
>>>>>>>>> tested a 4.4-rc2 kernel (current linus tree) + the tip tree
>>>>>>>>> pulled on top.
>>>>>>>>>
>>>>>>>>> Running this kernel under Xen on PV-guests with multiple vcpus
>>>>>>>>> goes well (on idle < 10% cpu usage),
>>>>>>>>> but a guest with only a single vcpu doesn't idle at all, it
>>>>>>>>> seems a kworker thread is stuck:
>>>>>>>>> root 569 98.0 0.0 0 0 ? R 16:02 12:47 [kworker/0:1]
>>>>>>>>>
>>>>>>>>> Running a 4.3 kernel works fine with a single vpcu, bisecting
>>>>>>>>> would probably quite painful since there were some breakages
>>>>>>>>> this merge window with respect to Xen pv-guests.
>>>>>>>>>
>>>>>>>>> There are some differences in the diff's from booting a 4.3,
>>>>>>>>> 4.4-single, 4.4-multi cpu boot:
>>>>>>>>
>>>>>>>> Boris has been tracking a bunch of them. I am attaching the
>>>>>>>> latest set of patches I've to carry on top of v4.4-rc3.
>>>>>>>
>>>>>>> Hi Konrad,
>>>>>>>
>>>>>>> i will test those, see if it fixes all my issues and report back
>>>>>>
>>>>>> They shouldn't help you ;-( (and I just saw a message from you
>>>>>> confirming this)
>>>>>>
>>>>>> The first one fixes a 32-bit bug (on bare metal too). The second
>>>>>> fixes a fatal bug for 32-bit PV guests. The other two are code
>>>>>> improvements/cleanup.
>>>>>
>>>>> One of these patches also fixes a bug i was having with a
>>>>> pci-passthrough device in a HVM that wasn't working (depending on
>>>>> which dom0-kernel i was using (4.3 or 4.4)), but didn't report yet.
>>>>>
>>>>> Fingers crossed but i think this pv-guest single vcpu issue is the
>>>>> last i'm troubled by for now ;)
>>>>
>>>> I could not reproduce this, including with your kernel config file.
>>>
>>> Hmm that's unpleasant :-\
>>>
>>> Hmm other strange thing is it doesn't seem to affect dom0 (which is
>>> also a PV guest), but only unprivileged ones
>>> All unprivileged pv-guests seem to have the irq issue, but only with
>>> a single vcpu i see to get the stuck kworker thread that got my
>>> attention, with a 2 vcpu that doesn't seem to happen, but you still
>>> get the dmesg output and warnings about hvc)
>>>
>>> Could it be that:
>>>
>>> arch/x86/include/asm/i8259.h
>>> static inline int nr_legacy_irqs(void)
>>> {
>>>         return legacy_pic->nr_legacy_irqs;
>>> }
>>>
>>> returns something different in some circumstances ?
>>
>> It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and 0
>> after that commit.
>>
>> This is the last number that you see in
>> NR_IRQS:4352 nr_irqs:48 0
>> line.
>>
>> I think you should be able to safely revert both
>> b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
>> 8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
>> difference.
>>
>>
>> -boris
>>
>
> That was already underway compiling :)
>
> And it does reveal that reverting both fixes the issue, no stuck
> kworker thread .. and no:
> genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)
> hvc_open: request_irq failed with rc -16.

Let me try it again tomorrow. Can you post your guest config file, Xen
version and host HW (Intel or AMD)? 'xl info' maybe?

-boris

>
> What i did get was an conflict reverting
> b4ff8389ed14b849354b59ce9b360bdefcdbf99c:
> arch/arm64/include/asm/irq.h, although that shouldn't matter because
> we are on x86 and not on arm.
>
> --
> Sander
>
>
>>>
>>> -- Sander
>>>
>>>>
>>>> -boris