Date: Sat, 19 Dec 2015 23:33:44 -0800
From: Jeremiah Mahler
To: Jiang Liu
Cc: Joe Lawrence, stable@vger.kernel.org, Thomas Gleixner, linux-kernel@vger.kernel.org
Subject: Re: [BUG, bisect, linux-next] do_IRQ: No irq handler for vector
Message-ID: <20151220073344.GA1487@hudson.localdomain>
References: <20151218034033.GA5351@hudson.localdomain>
In-Reply-To: <20151218034033.GA5351@hudson.localdomain>

Jiang Liu,

On Thu, Dec 17, 2015 at 07:40:33PM -0800, Jeremiah Mahler wrote:
> all,
>
> I just started getting these "No irq handler for vector" messages
> after upgrading to linux-next 20151217+.
>
>
> (from the first boot)
> ...
> [    2.282652] [drm] Initialized drm 1.1.0 20060810
> [    2.318806] AVX version of gcm_enc/dec engaged.
> [    2.318810] AES CTR mode by8 optimization enabled
> [    2.324446] do_IRQ: 0.35 No irq handler for vector
> [    2.366146] iTCO_vendor_support: vendor-support=0
> [    2.372762] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
> ...
> [    9.249887] wlan0: associate with 2c:5d:93:09:50:48 (try 1/3)
> [    9.265206] wlan0: RX AssocResp from 2c:5d:93:09:50:48 (capab=0x421 status=0 aid=8)
> [    9.284088] wlan0: associated
> [   10.453048] do_IRQ: 0.35 No irq handler for vector
> [   10.457923] do_IRQ: 0.35 No irq handler for vector
> [   10.457932] do_IRQ: 0.35 No irq handler for vector
> [   10.501026] do_IRQ: 0.35 No irq handler for vector
> [   10.501033] do_IRQ: 0.35 No irq handler for vector
> [   10.513951] do_IRQ: 0.35 No irq handler for vector
> ...
>
>
> (second boot, and after a resume)
> ...
> [10527.998694] PM: noirq resume of devices complete after 21.488 msecs
> [10527.999578] PM: early resume of devices complete after 0.850 msecs
> [10528.000525] rtc_cmos 00:02: System wakeup disabled by ACPI
> [10528.005265] do_IRQ: 0.84 No irq handler for vector
> [10528.005450] sd 0:0:0:0: [sda] Starting disk
> [10528.021257] tpm_tis 00:05: TPM is disabled/deactivated (0x6)
> ...
> [10530.005541] PM: resume of devices complete after 2005.925 msecs
> [10530.005690] usb 3-1.4:1.0: rebind failed: -517
> [10530.005696] usb 3-1.4:1.1: rebind failed: -517
> [10530.006575] Restarting tasks ...
> [10530.008347] do_IRQ: 0.84 No irq handler for vector
> [10530.021258] done.
> [10530.042883] Bluetooth: hci0: BCM: chip id 63
> ...
> [10559.005603] mei_me 0000:00:16.0: timer: init clients timeout hbm_state = 1.
> [10559.005612] mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS fw status = 1E000245 60000106
> [10559.009508] do_IRQ: 0.84 No irq handler for vector
> [10561.005639] mei_me 0000:00:16.0: wait hw ready failed
> [10561.005644] mei_me 0000:00:16.0: hw_start failed ret = -62
> ...
>
>
> I can test patches if anyone has any ideas :-)
>
> --
> - Jeremiah Mahler

I performed a bisect and found that the following patch introduced
the bug, which is still present in the latest linux-next 20151218+.

From 41c7518a5d14543fa4aa1b5b9994ac26b38c0406 Mon Sep 17 00:00:00 2001
From: Jiang Liu
Date: Mon, 30 Nov 2015 16:09:29 +0800
Subject: [PATCH] x86/irq: Fix a race condition between vector assigning and cleanup

Joe Lawrence reported an use after release issue related to x86 IRQ
management code. Please refer to the following link for more
information:
http://lkml.kernel.org/r/5653B688.4050809@stratus.com

Thomas pointed out that it's caused by a race condition between
__assign_irq_vector() and __send_cleanup_vector(). Based on Thomas'
draft patch, we solve this race condition by:
1) Use move_in_progress to signal that an IRQ cleanup IPI is needed
2) Use old_domain to save old CPU mask for IRQ cleanup
3) Use vector to protect move_in_progress and old_domain

This bugfix patch also helps to get rid of that atomic allocation in
__send_cleanup_vector().

Fixes: a782a7e46bb5 "x86/irq: Store irq descriptor in vector array"
Reported-and-tested-by: Joe Lawrence
Signed-off-by: Jiang Liu
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1448870970-1461-4-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner
---
 arch/x86/kernel/apic/vector.c | 77 +++++++++++++++++++------------------------
 1 file changed, 34 insertions(+), 43 deletions(-)

...

--
- Jeremiah Mahler