Subject: Re: [PATCH 0/7] xen/events: bug fixes and some diagnostic aids
To: Juergen Gross, xen-devel@lists.xenproject.org,
 linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
 netdev@vger.kernel.org, linux-scsi@vger.kernel.org
Cc: Boris Ostrovsky, Stefano Stabellini, stable@vger.kernel.org,
 Konrad Rzeszutek Wilk, Roger Pau Monné, Jens Axboe, Wei Liu,
 Paul Durrant, "David S. Miller", Jakub Kicinski
References: <20210206104932.29064-1-jgross@suse.com>
In-Reply-To: <20210206104932.29064-1-jgross@suse.com>
From: Julien Grall
Date: Sat, 6 Feb 2021 18:46:30 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Juergen,

On 06/02/2021 10:49, Juergen Gross wrote:
> The first three patches are fixes for XSA-332. The avoid WARN splats
> and a performance issue with interdomain events.

Thanks for helping to figure out the problem. Unfortunately, I still
reliably see the WARN splat with the latest Linux master
(1e0d27fce010) + your first 3 patches.

I am using Xen 4.11 (1c7d984645f9) and dom0 is forced to use the 2L
events ABI.

After some debugging, I think I have an idea of what went wrong. The
problem happens when the event is initially bound from vCPU0 to a
different vCPU.

From the comment in xen_rebind_evtchn_to_cpu(), we are masking the
event to prevent it being delivered on an unexpected vCPU. However, I
believe the following can happen:

vCPU0                   | vCPU1
                        |
                        | Call xen_rebind_evtchn_to_cpu()
receive event X         |
                        | mask event X
                        | bind to vCPU1
                        | unmask event X
                        |
                        | receive event X
                        |
                        | handle_edge_irq(X)
handle_edge_irq(X)      |  -> handle_irq_event()
                        |   -> set IRQD_IN_PROGRESS
 -> set IRQS_PENDING    |
                        |   -> evtchn_interrupt()
                        |   -> clear IRQD_IN_PROGRESS
                        |  -> IRQS_PENDING is set
                        |  -> handle_irq_event()
                        |   -> evtchn_interrupt()
                        |     -> WARN()
                        |

All the lateeoi handlers expect a ONESHOT semantic and
evtchn_interrupt() doesn't tolerate any deviation.

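To make the interleaving in the diagram concrete, here is a minimal
Python sketch of it. This is a toy model, not kernel code: the IrqDesc
class, its fields, the simulate() helper and the during_handler hook
are all names made up for illustration, and the model only keeps the
few bits of state the diagram mentions. The handler_disables_irq
switch is an assumption used to model a handler that disables the IRQ
before returning, so that the loop in handle_edge_irq() stops.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class IrqDesc:
    """Toy stand-in for the descriptor state used in the diagram (hypothetical)."""
    in_progress: bool = False  # models IRQD_IN_PROGRESS
    pending: bool = False      # models IRQS_PENDING
    disabled: bool = False     # models a disabled IRQ
    enabled: bool = True       # models the flag evtchn_interrupt() checks
    log: List[str] = field(default_factory=list)


def evtchn_interrupt(desc: IrqDesc, cpu: str, handler_disables_irq: bool) -> None:
    # ONESHOT expectation: the port must still be enabled on entry.
    desc.log.append(f"{cpu}: handled" if desc.enabled else f"{cpu}: WARN")
    desc.enabled = False
    if handler_disables_irq:
        desc.disabled = True


def handle_irq_event(desc: IrqDesc, cpu: str, handler_disables_irq: bool,
                     during_handler: Optional[Callable[[], None]]) -> None:
    desc.pending = False
    desc.in_progress = True
    if during_handler is not None:
        during_handler()  # another vCPU takes the event while we are in flight
    evtchn_interrupt(desc, cpu, handler_disables_irq)
    desc.in_progress = False


def handle_edge_irq(desc: IrqDesc, cpu: str, handler_disables_irq: bool = False,
                    during_handler: Optional[Callable[[], None]] = None) -> None:
    if desc.in_progress or desc.disabled:
        # Someone else is handling the IRQ: just remember that it fired.
        desc.pending = True
        desc.log.append(f"{cpu}: set pending")
        return
    while True:
        handle_irq_event(desc, cpu, handler_disables_irq, during_handler)
        during_handler = None  # inject the racing delivery only once
        if not (desc.pending and not desc.disabled):
            break


def simulate(handler_disables_irq: bool) -> List[str]:
    desc = IrqDesc()
    # vCPU1 handles event X; vCPU0 delivers the stale event X mid-handler.
    handle_edge_irq(desc, "vCPU1", handler_disables_irq,
                    during_handler=lambda: handle_edge_irq(desc, "vCPU0"))
    return desc.log


if __name__ == "__main__":
    print(simulate(handler_disables_irq=False))
    print(simulate(handler_disables_irq=True))
```

In this model, the first run ends with a second call to the handler on
vCPU1, which is where the WARN is recorded. In the second run the loop
exits after the first call because the IRQ is marked disabled.
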

I think the problem was introduced by 7f874a0447a9 ("xen/events: fix
lateeoi irq acknowledgment") because before that commit the interrupt
was disabled, so we wouldn't do another iteration in
handle_edge_irq().

Aside from the handlers, I think it may also impact the deferred EOI
mitigation: in theory, a 3rd vCPU could join the party (let's say
vCPU A migrates the event from vCPU B to vCPU C), so
info->{eoi_cpu, irq_epoch, eoi_time} could possibly get mangled?

For a fix, we may want to consider holding evtchn_rwlock with the
write permission, although I am not 100% sure this is going to
prevent everything.

Does my write-up make sense to you?

Cheers,

-- 
Julien Grall