Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1031065AbcCQRtk (ORCPT ); Thu, 17 Mar 2016 13:49:40 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:26005 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933182AbcCQRti (ORCPT ); Thu, 17 Mar 2016 13:49:38 -0400 Subject: Re: [Xen-devel] [PATCH] xen/events: Mask a moving irq To: David Vrabel , konrad.wilk@oracle.com References: <1458218750-5202-1-git-send-email-boris.ostrovsky@oracle.com> <56EAD55A.6020801@citrix.com> <56EAE0F7.4060703@oracle.com> <56EAE95D.6020006@citrix.com> Cc: xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org From: Boris Ostrovsky Message-ID: <56EAEE36.6050804@oracle.com> Date: Thu, 17 Mar 2016 13:49:42 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <56EAE95D.6020006@citrix.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-Source-IP: aserv0021.oracle.com [141.146.126.233] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2058 Lines: 52 On 03/17/2016 01:29 PM, David Vrabel wrote: > On 17/03/16 16:53, Boris Ostrovsky wrote: >> On 03/17/2016 12:03 PM, David Vrabel wrote: >>> On 17/03/16 12:45, Boris Ostrovsky wrote: >>>> Moving an unmasked irq may result in irq handler being invoked on both >>>> source and target CPUs. >>>> >>>> With 2-level this can happen as follows: >>>> >>>> On source CPU: >>>> evtchn_2l_handle_events() -> >>>> generic_handle_irq() -> >>>> handle_edge_irq() -> >>>> eoi_pirq(): >>>> irq_move_irq(data); >>>> >>>> /***** WE ARE HERE *****/ >>>> >>>> if (VALID_EVTCHN(evtchn)) >>>> clear_evtchn(evtchn); >>>> >>>> If at this moment target processor is handling an unrelated event in >>>> evtchn_2l_handle_events()'s loop it may pick up our event since target's >>>> cpu_evtchn_mask claims that this event belongs to it *and* the event is >>>> unmasked and still pending. At the same time, source CPU will continue >>>> executing its own handle_edge_irq(). >>>> >>>> With FIFO interrupt the scenario is similar: irq_move_irq() may result >>>> in a EVTCHNOP_unmask hypercall which, in turn, may make the event >>>> pending on the target CPU. >>>> >>>> We can avoid this situation by moving and clearing the event while >>>> keeping event masked. >>> Can you do: >>> >>> if (unlikely(irqd_is_setaffinity_pending(data))) { >>> masked = test_and_set_mask() >>> >>> clear_evtchn() >>> irq_move_masked_irq() >> I did think about this but then I wasn't sure whether this might open >> some other window for things to sneak in. It shouldn't but these things >> are rather subtle so I'd rather leave the order of how operations are >> done unchanged. > This is the order your patch has though. I'm confused. Ugh, sorry --- I misread what you wrote, I thought you wanted to clear before masking. Which wouldn't make any sense. So yes, what you are suggesting is better. -borsi