2015-12-18 03:40:34

by Jeremiah Mahler

Subject: [BUG, linux-next] do_IRQ: No irq handler for vector

all,

I just started getting these "No irq handler for vector" messages
after upgrading to linux-next 20151217+.


(from the first boot)
...
[ 2.282652] [drm] Initialized drm 1.1.0 20060810
[ 2.318806] AVX version of gcm_enc/dec engaged.
[ 2.318810] AES CTR mode by8 optimization enabled
[ 2.324446] do_IRQ: 0.35 No irq handler for vector
[ 2.366146] iTCO_vendor_support: vendor-support=0
[ 2.372762] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
...
[ 9.249887] wlan0: associate with 2c:5d:93:09:50:48 (try 1/3)
[ 9.265206] wlan0: RX AssocResp from 2c:5d:93:09:50:48 (capab=0x421 status=0 aid=8)
[ 9.284088] wlan0: associated
[ 10.453048] do_IRQ: 0.35 No irq handler for vector
[ 10.457923] do_IRQ: 0.35 No irq handler for vector
[ 10.457932] do_IRQ: 0.35 No irq handler for vector
[ 10.501026] do_IRQ: 0.35 No irq handler for vector
[ 10.501033] do_IRQ: 0.35 No irq handler for vector
[ 10.513951] do_IRQ: 0.35 No irq handler for vector
...


(second boot, and after a resume)
...
[10527.998694] PM: noirq resume of devices complete after 21.488 msecs
[10527.999578] PM: early resume of devices complete after 0.850 msecs
[10528.000525] rtc_cmos 00:02: System wakeup disabled by ACPI
[10528.005265] do_IRQ: 0.84 No irq handler for vector
[10528.005450] sd 0:0:0:0: [sda] Starting disk
[10528.021257] tpm_tis 00:05: TPM is disabled/deactivated (0x6)
...
[10530.005541] PM: resume of devices complete after 2005.925 msecs
[10530.005690] usb 3-1.4:1.0: rebind failed: -517
[10530.005696] usb 3-1.4:1.1: rebind failed: -517
[10530.006575] Restarting tasks ...
[10530.008347] do_IRQ: 0.84 No irq handler for vector
[10530.021258] done.
[10530.042883] Bluetooth: hci0: BCM: chip id 63
...
[10559.005603] mei_me 0000:00:16.0: timer: init clients timeout hbm_state = 1.
[10559.005612] mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS fw status = 1E000245 60000106
[10559.009508] do_IRQ: 0.84 No irq handler for vector
[10561.005639] mei_me 0000:00:16.0: wait hw ready failed
[10561.005644] mei_me 0000:00:16.0: hw_start failed ret = -62
...
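
A small sketch for tallying these messages from a saved log. It assumes the two numbers in "do_IRQ: 0.35" are the CPU number and the vector number, which is how the x86 do_IRQ() message is usually read. The function name count_unhandled_vectors and the sample text are illustrative only and not part of any kernel tool.

```python
import re
from collections import Counter

# Matches lines such as "[ 10.453048] do_IRQ: 0.35 No irq handler for vector".
# Assumption: the pair before the message is "<cpu>.<vector>".
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+do_IRQ: (?P<cpu>\d+)\.(?P<vector>\d+) "
    r"No irq handler for vector"
)


def count_unhandled_vectors(log_text):
    """Return a Counter keyed by (cpu, vector) for every matching log line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[(int(match["cpu"]), int(match["vector"]))] += 1
    return counts


# A few lines copied from the logs above, used as sample input.
sample = """\
[ 2.324446] do_IRQ: 0.35 No irq handler for vector
[ 9.284088] wlan0: associated
[ 10.453048] do_IRQ: 0.35 No irq handler for vector
[10528.005265] do_IRQ: 0.84 No irq handler for vector
"""

for (cpu, vector), hits in sorted(count_unhandled_vectors(sample).items()):
    print(f"cpu {cpu} vector {vector}: {hits} message(s)")
```

Grouping by (cpu, vector) makes it easy to see that the first boot only reports vector 35 while the boot with the resume only reports vector 84. The kernel rate-limits this message, so the counts are a lower bound.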


I can test patches if anyone has any ideas :-)

--
- Jeremiah Mahler


2015-12-20 07:33:48

by Jeremiah Mahler

Subject: Re: [BUG, bisect, linux-next] do_IRQ: No irq handler for vector

Jiang Liu,

On Thu, Dec 17, 2015 at 07:40:33PM -0800, Jeremiah Mahler wrote:
> all,
>
> I just started getting these "No irq handler for vector" messages
> after upgrading to linux-next 20151217+.
>
>
> (from the first boot)
> ...
> [ 2.282652] [drm] Initialized drm 1.1.0 20060810
> [ 2.318806] AVX version of gcm_enc/dec engaged.
> [ 2.318810] AES CTR mode by8 optimization enabled
> [ 2.324446] do_IRQ: 0.35 No irq handler for vector
> [ 2.366146] iTCO_vendor_support: vendor-support=0
> [ 2.372762] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
> ...
> [ 9.249887] wlan0: associate with 2c:5d:93:09:50:48 (try 1/3)
> [ 9.265206] wlan0: RX AssocResp from 2c:5d:93:09:50:48 (capab=0x421 status=0 aid=8)
> [ 9.284088] wlan0: associated
> [ 10.453048] do_IRQ: 0.35 No irq handler for vector
> [ 10.457923] do_IRQ: 0.35 No irq handler for vector
> [ 10.457932] do_IRQ: 0.35 No irq handler for vector
> [ 10.501026] do_IRQ: 0.35 No irq handler for vector
> [ 10.501033] do_IRQ: 0.35 No irq handler for vector
> [ 10.513951] do_IRQ: 0.35 No irq handler for vector
> ...
>
>
> (second boot, and after a resume)
> ...
> [10527.998694] PM: noirq resume of devices complete after 21.488 msecs
> [10527.999578] PM: early resume of devices complete after 0.850 msecs
> [10528.000525] rtc_cmos 00:02: System wakeup disabled by ACPI
> [10528.005265] do_IRQ: 0.84 No irq handler for vector
> [10528.005450] sd 0:0:0:0: [sda] Starting disk
> [10528.021257] tpm_tis 00:05: TPM is disabled/deactivated (0x6)
> ...
> [10530.005541] PM: resume of devices complete after 2005.925 msecs
> [10530.005690] usb 3-1.4:1.0: rebind failed: -517
> [10530.005696] usb 3-1.4:1.1: rebind failed: -517
> [10530.006575] Restarting tasks ...
> [10530.008347] do_IRQ: 0.84 No irq handler for vector
> [10530.021258] done.
> [10530.042883] Bluetooth: hci0: BCM: chip id 63
> ...
> [10559.005603] mei_me 0000:00:16.0: timer: init clients timeout hbm_state = 1.
> [10559.005612] mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS fw status = 1E000245 60000106
> [10559.009508] do_IRQ: 0.84 No irq handler for vector
> [10561.005639] mei_me 0000:00:16.0: wait hw ready failed
> [10561.005644] mei_me 0000:00:16.0: hw_start failed ret = -62
> ...
>
>
> I can test patches if anyone has any ideas :-)
>
> --
> - Jeremiah Mahler

I performed a bisect and found that the following patch introduced the bug,
which is still present in the latest linux-next 20151218+.

From 41c7518a5d14543fa4aa1b5b9994ac26b38c0406 Mon Sep 17 00:00:00 2001
From: Jiang Liu <[email protected]>
Date: Mon, 30 Nov 2015 16:09:29 +0800
Subject: [PATCH] x86/irq: Fix a race condition between vector assigning and cleanup

Joe Lawrence reported a use after release issue related to x86 IRQ
management code. Please refer to the following link for more
information: http://lkml.kernel.org/r/[email protected]

Thomas pointed out that it's caused by a race condition between
__assign_irq_vector() and __send_cleanup_vector(). Based on Thomas'
draft patch, we solve this race condition by:
1) Use move_in_progress to signal that an IRQ cleanup IPI is needed
2) Use old_domain to save old CPU mask for IRQ cleanup
3) Use vector to protect move_in_progress and old_domain

This bugfix patch also helps to get rid of that atomic allocation in
__send_cleanup_vector().

Fixes: a782a7e46bb5 "x86/irq: Store irq descriptor in vector array"
Reported-and-tested-by: Joe Lawrence <[email protected]>
Signed-off-by: Jiang Liu <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
arch/x86/kernel/apic/vector.c | 77 +++++++++++++++++++------------------------
1 file changed, 34 insertions(+), 43 deletions(-)
...
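
For anyone repeating the bisect, a minimal sketch of the good/bad decision at each step, assuming the only criterion is whether the message shows up in the dmesg output of the kernel under test. The helper name bisect_verdict is hypothetical, and the sample strings are lines taken from the logs above.

```python
# Text that marks a kernel as "bad" for this bisect.
MARKER = "No irq handler for vector"


def bisect_verdict(dmesg_text):
    """Return 'bad' if the log contains the marker, otherwise 'good'."""
    return "bad" if MARKER in dmesg_text else "good"


# Sample inputs: one boot log line with the message, one without.
bad_boot = "[ 2.324446] do_IRQ: 0.35 No irq handler for vector\n"
good_boot = "[ 2.318810] AES CTR mode by8 optimization enabled\n"

# Print the command to run by hand after testing each kernel.
print("git bisect", bisect_verdict(bad_boot))
print("git bisect", bisect_verdict(good_boot))
```

Since the message sometimes appears only after a resume, a "good" verdict is only as reliable as the amount of testing done on that kernel before the log is captured.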

--
- Jeremiah Mahler

2015-12-22 01:34:56

by Jeremiah Mahler

Subject: Re: [BUG, bisect, linux-next] do_IRQ: No irq handler for vector

somebody,

On Sat, Dec 19, 2015 at 11:33:44PM -0800, Jeremiah Mahler wrote:
[...]
> I performed a bisect and found that the following patch introduced the bug,
> which is still present in the latest linux-next 20151218+.
>
> From 41c7518a5d14543fa4aa1b5b9994ac26b38c0406 Mon Sep 17 00:00:00 2001
> From: Jiang Liu <[email protected]>
> Date: Mon, 30 Nov 2015 16:09:29 +0800
> Subject: [PATCH] x86/irq: Fix a race condition between vector assigning and cleanup
>
[...]

Thank you to whoever yanked this patch. It is gone from -next 20151221+
and my machine is working again.

--
- Jeremiah Mahler