Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753770AbbD2QdN (ORCPT ); Wed, 29 Apr 2015 12:33:13 -0400 Received: from smtp02.citrix.com ([66.165.176.63]:52298 "EHLO SMTP02.CITRIX.COM" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751647AbbD2QdK (ORCPT ); Wed, 29 Apr 2015 12:33:10 -0400 X-IronPort-AV: E=Sophos;i="5.11,671,1422921600"; d="scan'208";a="259965575" Message-ID: <554107A6.50604@citrix.com> Date: Wed, 29 Apr 2015 17:32:38 +0100 From: David Vrabel User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.4.0 MIME-Version: 1.0 To: Boris Ostrovsky , David Vrabel , CC: , , Subject: Re: [Xen-devel] [PATCH 1/4] xen/events: Clear cpu_evtchn_mask before resuming References: <1430236333-11905-1-git-send-email-boris.ostrovsky@oracle.com> <1430236333-11905-2-git-send-email-boris.ostrovsky@oracle.com> <553FB540.4040707@citrix.com> <553FD1A5.60409@oracle.com> In-Reply-To: <553FD1A5.60409@oracle.com> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 8bit X-DLP: MIA2 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4766 Lines: 109 On 28/04/15 19:29, Boris Ostrovsky wrote: > On 04/28/2015 12:28 PM, David Vrabel wrote: >> On 28/04/15 16:52, Boris Ostrovsky wrote: >>> When a guest is resumed, the hypervisor may change event channel >>> assignments. If this happens and the guest uses 2-level events it >>> is possible for the interrupt to be claimed by wrong VCPU since >>> cpu_evtchn_mask bits may be stale. This can happen even though >>> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that >>> is passed in is not necessarily the original one (from pre-migration >>> times) but instead is freshly allocated during resume and so any >>> information about which CPU the channel was bound to is lost. >>> >>> Thus we should clear the mask during resume. >>> >>> We also need to make sure that bits for xenstore and console channels >>> are set when these two subsystems are resumed. While rebind_evtchn_irq() >>> (which is invoked for both of them on a resume) calls >>> irq_set_affinity(), >>> the latter will in fact postpone setting affinity until handling the >>> interrupt. But because cpu_evtchn_mask will have bits for these two >>> cleared >>> we won't be able to take the interrupt. >>> >>> Setting IRQ_MOVE_PCNTXT flag for the two irqs avoids this problem by >>> allowing to set affinity immediately, which is safe for >>> event-channel-based >>> interrupts. >> [...] >>> --- a/drivers/tty/hvc/hvc_xen.c >>> +++ b/drivers/tty/hvc/hvc_xen.c >>> @@ -533,6 +533,7 @@ static int __init xen_hvc_init(void) >>> info = vtermno_to_xencons(HVC_COOKIE); >>> info->irq = bind_evtchn_to_irq(info->evtchn); >>> + irq_set_status_flags(info->irq, IRQ_MOVE_PCNTXT); >>> } >>> if (info->irq < 0) >>> info->irq = 0; /* NO_IRQ */ >>> diff --git a/drivers/xen/events/events_2l.c >>> b/drivers/xen/events/events_2l.c >>> index 5db43fc..7dd4631 100644 >>> --- a/drivers/xen/events/events_2l.c >>> +++ b/drivers/xen/events/events_2l.c >>> @@ -345,6 +345,15 @@ irqreturn_t xen_debug_interrupt(int irq, void >>> *dev_id) >>> return IRQ_HANDLED; >>> } >>> +static void evtchn_2l_resume(void) >>> +{ >>> + int i; >>> + >>> + for_each_online_cpu(i) >>> + memset(per_cpu(cpu_evtchn_mask, i), 0, sizeof(xen_ulong_t) * >>> + EVTCHN_2L_NR_CHANNELS/BITS_PER_EVTCHN_WORD); >>> +} >>> + >>> static const struct evtchn_ops evtchn_ops_2l = { >>> .max_channels = evtchn_2l_max_channels, >>> .nr_channels = evtchn_2l_max_channels, >>> @@ -356,6 +365,7 @@ static const struct evtchn_ops evtchn_ops_2l = { >>> .mask = evtchn_2l_mask, >>> .unmask = evtchn_2l_unmask, >>> .handle_events = evtchn_2l_handle_events, >>> + .resume = evtchn_2l_resume, >>> }; >>> void __init xen_evtchn_2l_init(void) >>> diff --git a/drivers/xen/xenbus/xenbus_comms.c >>> b/drivers/xen/xenbus/xenbus_comms.c >>> index fdb0f33..30203d1 100644 >>> --- a/drivers/xen/xenbus/xenbus_comms.c >>> +++ b/drivers/xen/xenbus/xenbus_comms.c >>> @@ -231,6 +231,7 @@ int xb_init_comms(void) >>> } >>> xenbus_irq = err; >>> + irq_set_status_flags(xenbus_irq, IRQ_MOVE_PCNTXT); >> IRQ_MOVE_PCNTXT means "Interrupt can be migrated from process context" >> which doesn't really sound relevant to me here? >> >> Thomas Glexnier is really not happy with mis-use of IRQ APIs. > > Yes, this is somewhat expanding definition of IRQ_MOVE_PCNTXT but then > there are other devices that do the same (HPET, for one). Ok. But why set it on only these two event channels and not every one? >> From the commit log the evtchn_2l_resume() fucntion that's added sounds >> like it fixes the problem on its own? > > It in fact makes this problem worse since now that cpu_evtchn_mask is > cleared during resume we cannot process the interrupt anymore in > evtchn_2l_handle_events(): irqs have to be bound to a cpu in order for > an interrupt to be processed. Perhaps evtchn_2l_resume() should set the local cpu mask for any bound event channels? And then you wouldn't need IRQ_MOVE_PCNTX. > We already have this issue even without clearing the mask if, say, > xenstore channel changes or if it gets bound to cpu other than 0 before > suspend. It's just that we rarely (if ever) see this happen. > > We can explicitly call events_base.c:set_affinity_irq() from > rebind_evtchn_irq() after irq_set_affinity() if we want to avoid using > IRQ_MOVE_PCNTXT flag but that will also looks kind of strange. David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/