Subject: Re: [PATCH 0/2] powerpc/kvm: Enable running guests on RT Linux
From: Scott Wood
To: Purcareata Bogdan
CC: Sebastian Andrzej Siewior, Paolo Bonzini, Alexander Graf, Bogdan Purcareata,
    Thomas Gleixner, Laurentiu Tudor
Date: Thu, 23 Apr 2015 16:26:58 -0500
Message-ID: <1429824418.16357.26.camel@freescale.com>
In-Reply-To: <5538E624.8080904@freescale.com>

On Thu, 2015-04-23 at 15:31 +0300, Purcareata Bogdan wrote:
> On 23.04.2015 03:30, Scott Wood wrote:
> > On Wed, 2015-04-22 at 15:06 +0300, Purcareata Bogdan wrote:
> >> On 21.04.2015 03:52, Scott Wood wrote:
> >>> On Mon, 2015-04-20 at 13:53 +0300, Purcareata Bogdan wrote:
> >>>> There was a weird situation for .kvmppc_mpic_set_epr - its corresponding
> >>>> inner function is kvmppc_set_epr, which is a static inline. Removing the
> >>>> static inline yields a compiler crash (Segmentation fault (core dumped) -
> >>>> scripts/Makefile.build:441: recipe for target 'arch/powerpc/kvm/kvm.o'
> >>>> failed), but that's a different story, so I just let it be for now. The
> >>>> point is that the measured time may include other work done after the lock
> >>>> has been released but before the function actually returned. I noticed this
> >>>> was the case for .kvm_set_msi, which could work for up to 90 ms, not
> >>>> actually under the lock. This made me change what I'm looking at.
> >>>
> >>> kvm_set_msi does pretty much nothing outside the lock -- I suspect
> >>> you're measuring an interrupt that happened as soon as the lock was
> >>> released.
> >>
> >> That's exactly right. I've seen things like a timer interrupt occurring right
> >> after the spin_unlock_irqrestore, but before kvm_set_msi actually returned.
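For what it's worth, the effect you're describing looks roughly like this
(sketch only, kernel context assumed -- this is not the actual kvm_set_msi()
or mpic.c code, and the lock/probe names are illustrative):

static DEFINE_RAW_SPINLOCK(sketch_lock);

static u64 sketch_set_msi_timed(void)
{
	unsigned long flags;
	u64 t0, t1;

	t0 = ktime_get_ns();				/* entry probe */

	raw_spin_lock_irqsave(&sketch_lock, flags);
	/* the MSI delivery itself -- the only work really under the lock */
	raw_spin_unlock_irqrestore(&sketch_lock, flags);

	/*
	 * Interrupts are enabled again here, so a pending timer/host IRQ can
	 * be serviced before the function returns; an entry/exit probe then
	 * charges that time to kvm_set_msi even though the lock was free.
	 */

	t1 = ktime_get_ns();				/* exit probe */
	return t1 - t0;					/* includes the stray IRQ */
}

So for lock-hold-time purposes, the window that matters is lock-to-unlock,
not entry-to-exit.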
> >> [...]
> >>
> >>>> Or perhaps a different stress scenario involving a lot of VCPUs
> >>>> and external interrupts?
> >>>
> >>> You could instrument the MPIC code to find out how many loop iterations
> >>> you maxed out on, and compare that to the theoretical maximum.
> >>
> >> Numbers are pretty low, and I'll try to explain based on my observations.
> >>
> >> The problematic section in openpic_update_irq is this [1], since it loops
> >> through all VCPUs, and IRQ_local_pipe further calls IRQ_check, which loops
> >> through all pending interrupts for a VCPU [2].
> >>
> >> The guest interfaces are virtio-vhostnet, which are based on MSI
> >> (/proc/interrupts in the guest shows they are MSI). For external interrupts
> >> to the guest, the irq_source destmask is currently 0 and last_cpu is 0
> >> (uninitialized), so [1] will go on and deliver the interrupt directly and
> >> unicast (no VCPU loop).
> >>
> >> I activated the pr_debugs in arch/powerpc/kvm/mpic.c to see how many
> >> interrupts are actually pending for the destination VCPU. At most, there were
> >> 3 interrupts - n_IRQ = {224,225,226} - even for 24 flows of ping flood. I
> >> understand that guest virtio interrupts are cascaded over 1 or a couple of
> >> shared MSI interrupts.
> >>
> >> So the worst case, in this scenario, was checking the priorities for 3
> >> pending interrupts for 1 VCPU. Something like this (some of my prints
> >> included):
> >>
> >> [61010.582033] openpic_update_irq: destmask 1 last_cpu 0
> >> [61010.582034] openpic_update_irq: Only one CPU is allowed to receive this IRQ
> >> [61010.582036] IRQ_local_pipe: IRQ 224 active 0 was 1
> >> [61010.582037] IRQ_check: irq 226 set ivpr_pr=8 pr=-1
> >> [61010.582038] IRQ_check: irq 225 set ivpr_pr=8 pr=-1
> >> [61010.582039] IRQ_check: irq 224 set ivpr_pr=8 pr=-1
> >>
> >> It would be really helpful to get your comments on whether these are
> >> realistic numbers for everyday use, or whether they are relevant only to this
> >> particular scenario.
> >
> > RT isn't about "realistic numbers for everyday use". It's about worst
> > cases.
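What matters for the worst case is the shape of the code that runs with the
lock held. Stripped down to just the loop structure (a simplified sketch with
illustrative types and names, not the real arch/powerpc/kvm/mpic.c), it's
something like:

/* Sketch only: illustrative types/names, not the real openpic emulation. */
#define SKETCH_MAX_CPU	32
#define SKETCH_MAX_IRQ	256

struct sketch_dst {
	DECLARE_BITMAP(raised, SKETCH_MAX_IRQ);	/* pending (raised) sources */
};

struct sketch_openpic {
	int num_cpus;				/* VCPUs in the guest */
	u32 destmask;				/* per-source in reality */
	struct sketch_dst dst[SKETCH_MAX_CPU];
};

/* Everything below runs with the (raw) openpic lock already held. */
static void sketch_update_irq(struct sketch_openpic *opp)
{
	int cpu, irq;

	/* Directed delivery can target every VCPU in the guest... */
	for (cpu = 0; cpu < opp->num_cpus; cpu++) {
		if (!(opp->destmask & (1u << cpu)))
			continue;

		/*
		 * ...and per VCPU, the IRQ_check() step rescans all raised
		 * sources to find the highest-priority one.
		 */
		for (irq = 0; irq < SKETCH_MAX_IRQ; irq++)
			if (test_bit(irq, opp->dst[cpu].raised))
				; /* compare priorities, remember the winner */
	}
}

So the bound isn't the 1 VCPU / 3 MSIs you're seeing with vhost-net; it's
closer to (number of VCPUs) * (number of raised sources), and that's what the
raw lock has to be sized against.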
> >> - Can these interrupts be used in directed delivery, so that the destination
> >> mask can include multiple VCPUs?
> >
> > The Freescale MPIC does not support multiple destinations for most
> > interrupts, but the (non-FSL-specific) emulation code appears to allow
> > it.
> >
> >> The MPIC manual states that timer and IPI interrupts are supported for
> >> directed delivery, although I'm not sure how much of this is used in the
> >> emulation. I know that kvmppc uses the decrementer outside of the MPIC.
> >>
> >> - How are virtio interrupts cascaded over the shared MSI interrupts?
> >> /proc/device-tree/soc@e0000000/msi@41600/interrupts in the guest shows 8
> >> values - 224 - 231 - so at most there might be 8 pending interrupts in
> >> IRQ_check, is that correct?
> >
> > It looks like that's currently the case, but actual hardware supports
> > more than that, so it's possible (albeit unlikely any time soon) that
> > the emulation eventually does as well.
> >
> > But it's possible to have interrupts other than MSIs...
>
> Right.
>
> So given that the raw spinlock conversion is not suitable for all the
> scenarios supported by the OpenPIC emulation, is it OK if my next step is to
> send a patch containing both the raw spinlock conversion and a mandatory
> disable of the in-kernel MPIC? This is actually the last conclusion we came
> up with some time ago, but I guess it was good to get some more insight into
> how things actually work (at least for me).

Fine with me.  Have you given any thought to ways to restructure the code to
eliminate the problem?

-Scott
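P.S. If it helps when you write that up: the "mandatory disable" part can
probably be as small as a Kconfig guard along these lines (sketch from
memory -- the exact dependencies and the RT config symbol name may differ in
the tree you're targeting):

config KVM_MPIC
	bool "KVM in-kernel MPIC emulation"
	depends on KVM && E500
	depends on !PREEMPT_RT_FULL	# the new, RT-only restriction
	# ...existing selects and help text unchanged...

so that on an RT kernel the in-kernel device model simply isn't offered and
guests use the userspace MPIC emulation instead.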