Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932187AbbLAXs6 (ORCPT ); Tue, 1 Dec 2015 18:48:58 -0500
Received: from vserver.eikelenboom.it ([84.200.39.61]:38956 "EHLO smtp.eikelenboom.it" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755605AbbLAXs4 (ORCPT ); Tue, 1 Dec 2015 18:48:56 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Date: Wed, 02 Dec 2015 00:44:54 +0100
From: Sander Eikelenboom
To: Boris Ostrovsky
Cc: linux-kernel@vger.kernel.org, xen-devel@lists.xen.org, david.vrabel@citrix.com
Subject: Re: [Xen-devel] linux 4.4 Regression: 100% cpu usage on idle pv guest under Xen with single vcpu.
In-Reply-To: <565E301D.7020501@oracle.com>
References: <80dcf47eb772c65a62652d7a56c8ed26@eikelenboom.it>
 <20151130214513.GF14317@char.us.oracle.com>
 <16a638d3c5e24963599e621f265181f8@eikelenboom.it>
 <565CD3A7.1060006@oracle.com>
 <1454b921884da94b7c1e9f434c13eb4a@eikelenboom.it>
 <565E239E.8070507@oracle.com>
 <565E2AF1.7090902@oracle.com>
 <02865a792f7e7f183b4419feea9e1009@eikelenboom.it>
 <565E301D.7020501@oracle.com>
Message-ID: <9f451583066900c7c7db345c103f6530@eikelenboom.it>
User-Agent: Roundcube Webmail/0.9.5
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 6505
Lines: 191

On 2015-12-02 00:41, Boris Ostrovsky wrote:
> On 12/01/2015 06:30 PM, Sander Eikelenboom wrote:
>> On 2015-12-02 00:19, Boris Ostrovsky wrote:
>>> On 12/01/2015 06:00 PM, Sander Eikelenboom wrote:
>>>> On 2015-12-01 23:47, Boris Ostrovsky wrote:
>>>>> On 11/30/2015 05:55 PM, Sander Eikelenboom wrote:
>>>>>> On 2015-11-30 23:54, Boris Ostrovsky wrote:
>>>>>>> On 11/30/2015 04:46 PM, Sander Eikelenboom wrote:
>>>>>>>> On 2015-11-30 22:45, Konrad Rzeszutek Wilk wrote:
>>>>>>>>> On Sat, Nov 28, 2015 at 04:47:43PM +0100, Sander Eikelenboom
>>>>>>>>> wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I have just tested a 4.4-rc2
kernel (current linus tree) + the
>>>>>>>>>> tip tree pulled on top.
>>>>>>>>>>
>>>>>>>>>> Running this kernel under Xen on PV-guests with multiple vcpus
>>>>>>>>>> goes well (on idle < 10% cpu usage),
>>>>>>>>>> but a guest with only a single vcpu doesn't idle at all, it
>>>>>>>>>> seems a kworker thread is stuck:
>>>>>>>>>> root 569 98.0 0.0 0 0 ? R 16:02 12:47 [kworker/0:1]
>>>>>>>>>>
>>>>>>>>>> Running a 4.3 kernel works fine with a single vcpu, bisecting
>>>>>>>>>> would probably be quite painful since there were some breakages
>>>>>>>>>> this merge window with respect to Xen pv-guests.
>>>>>>>>>>
>>>>>>>>>> There are some differences in the diff's from booting a 4.3,
>>>>>>>>>> 4.4-single, 4.4-multi cpu boot:
>>>>>>>>>
>>>>>>>>> Boris has been tracking a bunch of them. I am attaching the
>>>>>>>>> latest set of patches I've to carry on top of v4.4-rc3.
>>>>>>>>
>>>>>>>> Hi Konrad,
>>>>>>>>
>>>>>>>> i will test those, see if it fixes all my issues and report back
>>>>>>>
>>>>>>> They shouldn't help you ;-( (and I just saw a message from you
>>>>>>> confirming this)
>>>>>>>
>>>>>>> The first one fixes a 32-bit bug (on bare metal too). The second
>>>>>>> fixes a fatal bug for 32-bit PV guests. The other two are code
>>>>>>> improvements/cleanup.
>>>>>>
>>>>>> One of these patches also fixes a bug i was having with a
>>>>>> pci-passthrough device in a HVM that wasn't working (depending
>>>>>> on which dom0-kernel i was using (4.3 or 4.4)),
>>>>>> but didn't report yet.
>>>>>>
>>>>>> Fingers crossed but i think this pv-guest single vcpu issue is the
>>>>>> last i'm troubled by for now ;)
>>>>>
>>>>> I could not reproduce this, including with your kernel config file.
>>>>
>>>> Hmm that's unpleasant :-\
>>>>
>>>> Hmm other strange thing is it doesn't seem to affect dom0 (which is
>>>> also a PV guest), but only unprivileged ones
>>>> All unprivileged pv-guests seem to have the irq issue, but only with
>>>> a single vcpu i seem to get the stuck kworker thread that got my
>>>> attention, with a 2 vcpu that doesn't seem to happen, but you still
>>>> get the dmesg output and warnings about hvc)
>>>>
>>>> Could it be that:
>>>>
>>>> arch/x86/include/asm/i8259.h
>>>> static inline int nr_legacy_irqs(void)
>>>> {
>>>>     return legacy_pic->nr_legacy_irqs;
>>>> }
>>>>
>>>> returns something different in some circumstances ?
>>>
>>> It should return 16 pre-8c058b0b9c34d8c8d7912880956543769323e2d8 and
>>> 0 after that commit.
>>>
>>> This is the last number that you see in
>>> NR_IRQS:4352 nr_irqs:48 0
>>> line.
>>>
>>> I think you should be able to safely revert both
>>> b4ff8389ed14b849354b59ce9b360bdefcdbf99c and
>>> 8c058b0b9c34d8c8d7912880956543769323e2d8 and see if it makes any
>>> difference.
>>>
>>>
>>> -boris
>>>
>>
>> That was already underway compiling :)
>>
>> And it does reveal that reverting both fixes the issue, no stuck
>> kworker thread .. and no:
>> genirq: Flags mismatch irq 8. 00000000 (hvc_console) vs. 00000000 (rtc0)
>> hvc_open: request_irq failed with rc -16.
>
>
> Let me try it again tomorrow. Can you post your guest config file, Xen
> version and host HW (Intel or AMD)? 'xl info' maybe?
>
> -boris

Guest config file == dom0 config file == the one I sent you earlier.
Host is an AMD Phenom X6.

# xl info
host                   : serveerstertje
release                : 4.4.0-rc3-20151201-linus-doflr-boris+
version                : #1 SMP Tue Dec 1 19:02:58 CET 2015
machine                : x86_64
nr_cpus                : 6
max_cpu_id             : 5
nr_nodes               : 1
cores_per_socket       : 6
threads_per_core       : 1
cpu_mhz                : 3200
hw_caps                : 178bf3ff:efd3fbff:00000000:00011300:00802001:00000000:000037ff:00000000
virt_caps              : hvm hvm_directio
total_memory           : 20479
free_memory            : 7745
sharing_freed_memory   : 0
sharing_used_memory    : 0
outstanding_claims     : 0
free_cpus              : 0
xen_major              : 4
xen_minor              : 7
xen_extra              : -unstable
xen_version            : 4.7-unstable
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : Thu Nov 26 20:58:13 2015 +0100 git:5252636-dirty
xen_commandline        : dom0_mem=1536M,max:1536M loglvl=all loglvl_guest=all console_timestamps=datems vga=gfx-1280x1024x32 cpuidle cpufreq=xen com1=38400,8n1 console=vga,com1 ivrs_ioapic[6]=00:14.0 iommu=on,verbose,debug,amd-iommu-debug conring_size=128k ucode=-1
cc_compiler            : gcc-4.9.real (Debian 4.9.2-10) 4.9.2
cc_compile_by          : root
cc_compile_domain      : dyndns.org
cc_compile_date        : Thu Nov 26 21:18:41 CET 2015
xend_config_format     : 4

If you need more info and could get it by having me run a debug patch
(since you can't reproduce this), don't hesitate to send one :)

Thanks so far !

-- Sander

>
>
>>
>> What i did get was a conflict reverting
>> b4ff8389ed14b849354b59ce9b360bdefcdbf99c:
>> arch/arm64/include/asm/irq.h, although that shouldn't matter because
>> we are on x86 and not on arm.
>>
>> -- Sander
>>
>>
>>>>
>>>> -- Sander
>>>>
>>>>>
>>>>> -boris
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@lists.xen.org
>>>> http://lists.xen.org/xen-devel