Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755142AbdLFLNf (ORCPT ); Wed, 6 Dec 2017 06:13:35 -0500 Received: from mga04.intel.com ([192.55.52.120]:45185 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755032AbdLFLNd (ORCPT ); Wed, 6 Dec 2017 06:13:33 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.45,367,1508828400"; d="scan'208";a="15908602" Subject: Re: [GIT pull] x86 APIC updates for 4.15 To: Thomas Gleixner Cc: Linus Torvalds , LKML , Ingo Molnar , "H. Peter Anvin" , Borislav Petkov , Peter Zijlstra , Jani Nikula , Intel Graphics Development References: <7c1dd011-0b7b-7c57-71ed-bfab8a52c888@mblankhorst.nl> <373cf597-c068-4c90-354c-496c88692f9f@linux.intel.com> From: Maarten Lankhorst Message-ID: Date: Wed, 6 Dec 2017 12:13:27 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9604 Lines: 166 Hey, Op 30-11-17 om 23:47 schreef Thomas Gleixner: > On Thu, 30 Nov 2017, Maarten Lankhorst wrote: >> Op 30-11-17 om 10:18 schreef Thomas Gleixner: >> # cat /sys/kernel/debug/irq/irqs/28 >> handler: handle_edge_irq >> device: 0000:00:02.0 >> status: 0x00000000 >> istate: 0x00000000 >> ddepth: 0 >> wdepth: 0 >> dstate: 0x01401200 >> IRQD_ACTIVATED >> IRQD_IRQ_STARTED >> IRQD_SINGLE_TARGET >> IRQD_AFFINITY_SET >> node: 0 >> affinity: 4 >> effectiv: 4 >> pending: >> domain: PCI-MSI-2 >> hwirq: 0x8000 >> chip: PCI-MSI >> flags: 0x10 >> IRQCHIP_SKIP_SET_WAKE >> parent: >> domain: VECTOR >> hwirq: 0x1c >> chip: APIC >> flags: 0x0 > Ok. Fine. At that point the interrupt is assigned to CPU4 > >> # cat /sys/kernel/debug/irq/irqs/28 >> handler: handle_edge_irq >> device: 0000:00:02.0 >> status: 0x00000400 >> _IRQ_NOPROBE >> istate: 0x00000000 >> ddepth: 0 >> wdepth: 0 >> dstate: 0x01401300 >> IRQD_ACTIVATED >> IRQD_IRQ_STARTED >> IRQD_SINGLE_TARGET >> IRQD_AFFINITY_SET >> IRQD_SETAFFINITY_PENDING >> node: 0 >> affinity: 0,5-7 >> effectiv: 0 >> pending: 4 >> domain: PCI-MSI-2 >> hwirq: 0x8000 >> chip: PCI-MSI >> flags: 0x10 >> IRQCHIP_SKIP_SET_WAKE >> parent: >> domain: VECTOR >> hwirq: 0x1c >> chip: APIC >> flags: 0x0 > Now after suspend/resume it's affine to CPU 0 and a request to move it back > to CPU4 is pending. > > That looks halfways sane, but: > >> affinity: 0,5-7 >> effectiv: 0 >> pending: 4 > does not make sense because 4 is not in 0,5-7 > > Can you please enable the tracepoints: > > /sys/kernel/debug/tracing/events/irq_vectors/vector* > > and > > /sys/kernel/debug/tracing/events/irq_matrix/* > > and collect the trace from > > /sys/kernel/debug/trace > > right after resume or maybe after the timeout hit. > > That might not give us all required info, but at least it will allow me to > come up with the necessary extras required. Right now I'm tapping in the > dark The problem happens during hibernate, but seems might happen during s/r too, just not necessarily dying.. Maybe a irq at the right moment is causing the unexpected "No irq handler for vector" ? echo core > /sys/power/pm_test echo mem > /sys/power/state cat /sys/kernel/debug/irq/irqs/28: affinity: 0,5-7 effectiv: 0 pending: 4 Same issue, just didn't cause the death yet.. interrupts still working as expected.. cat /sys/kernel/debug -0 [000] d.h3 862.035000: irq_matrix_alloc: bit=33 cpu=1 online=1 avl=202 alloc=1 managed=0 online_maps=8 global_avl=1611, global_rsvd=15, total_alloc=13 -0 [000] d.h3 862.035003: vector_update: irq=30 vector=33 cpu=1 prev_vector=43 prev_cpu=0 -0 [000] d.h3 862.035004: vector_alloc: irq=30 vector=33 reserved=1 ret=0 -0 [000] d.h3 862.035005: vector_config: irq=30 vector=33 cpu=1 apicdest=0x00000002 -0 [000] d.h2 862.035065: vector_free_moved: irq=30 cpu=0 vector=43 is_managed=0 -0 [000] d.h2 862.035066: irq_matrix_free: bit=43 cpu=0 online=1 avl=196 alloc=7 managed=0 online_maps=8 global_avl=1612, global_rsvd=15, total_alloc=12 kworker/u16:2-149 [000] d..1 862.043955: vector_deactivate: irq=27 is_managed=0 can_reserve=1 early=0 kworker/u16:2-149 [000] d..2 862.043962: vector_clear: irq=27 vector=40 cpu=0 prev_vector=0 prev_cpu=0 kworker/u16:2-149 [000] d..2 862.043963: irq_matrix_free: bit=40 cpu=0 online=1 avl=197 alloc=6 managed=0 online_maps=8 global_avl=1613, global_rsvd=15, total_alloc=11 kworker/u16:2-149 [000] d..2 862.043964: irq_matrix_reserve: online_maps=8 global_avl=1613, global_rsvd=16, total_alloc=11 kworker/u16:2-149 [000] d..2 862.043964: vector_reserve: irq=27 ret=0 kworker/u16:2-149 [000] d..2 862.043965: vector_config: irq=27 vector=239 cpu=0 apicdest=0x00000001 kworker/u16:2-149 [000] d..1 862.044428: vector_teardown: irq=27 is_managed=0 has_reserved=1 kworker/u16:2-149 [000] d..1 862.044429: irq_matrix_remove_reserved: online_maps=8 global_avl=1613, global_rsvd=15, total_alloc=11 kworker/u16:5-1922 [001] d..1 862.049683: vector_deactivate: irq=30 is_managed=0 can_reserve=1 early=0 kworker/u16:5-1922 [001] d..2 862.049684: vector_clear: irq=30 vector=33 cpu=1 prev_vector=0 prev_cpu=0 kworker/u16:5-1922 [001] d..2 862.049684: irq_matrix_free: bit=33 cpu=1 online=1 avl=203 alloc=0 managed=0 online_maps=8 global_avl=1614, global_rsvd=15, total_alloc=10 kworker/u16:5-1922 [001] d..2 862.049685: irq_matrix_reserve: online_maps=8 global_avl=1614, global_rsvd=16, total_alloc=10 kworker/u16:5-1922 [001] d..2 862.049685: vector_reserve: irq=30 ret=0 kworker/u16:5-1922 [001] d..2 862.049686: vector_config: irq=30 vector=239 cpu=0 apicdest=0x00000001 kworker/u16:5-1922 [001] d..1 862.049711: vector_teardown: irq=30 is_managed=0 has_reserved=1 kworker/u16:5-1922 [001] d..1 862.049711: irq_matrix_remove_reserved: online_maps=8 global_avl=1614, global_rsvd=15, total_alloc=10 -0 [000] d.h3 862.094308: irq_matrix_alloc: bit=33 cpu=1 online=1 avl=202 alloc=1 managed=0 online_maps=8 global_avl=1613, global_rsvd=15, total_alloc=11 -0 [000] d.h3 862.094310: vector_update: irq=16 vector=33 cpu=1 prev_vector=37 prev_cpu=0 -0 [000] d.h3 862.094310: vector_alloc: irq=16 vector=33 reserved=1 ret=0 -0 [000] d.h3 862.094311: vector_config: irq=16 vector=33 cpu=1 apicdest=0x00000002 -0 [000] d.h2 862.094647: vector_free_moved: irq=16 cpu=0 vector=37 is_managed=0 -0 [000] d.h2 862.094648: irq_matrix_free: bit=37 cpu=0 online=1 avl=198 alloc=5 managed=0 online_maps=8 global_avl=1614, global_rsvd=15, total_alloc=10 -0 [001] d..2 867.272092: irq_matrix_online: online_maps=2 global_avl=396, global_rsvd=15, total_alloc=17 -0 [002] d..2 867.277213: irq_matrix_online: online_maps=3 global_avl=599, global_rsvd=15, total_alloc=17 -0 [003] d..2 867.281533: irq_matrix_online: online_maps=4 global_avl=802, global_rsvd=15, total_alloc=17 -0 [004] d..2 867.286371: irq_matrix_online: online_maps=5 global_avl=1005, global_rsvd=15, total_alloc=17 -0 [005] d..2 867.291078: irq_matrix_online: online_maps=6 global_avl=1208, global_rsvd=15, total_alloc=17 -0 [006] d..2 867.295910: irq_matrix_online: online_maps=7 global_avl=1411, global_rsvd=15, total_alloc=17 -0 [007] d..2 867.301120: irq_matrix_online: online_maps=8 global_avl=1614, global_rsvd=15, total_alloc=17 kworker/u16:24-1941 [006] d..1 867.335259: irq_matrix_reserve: online_maps=8 global_avl=1614, global_rsvd=16, total_alloc=17 kworker/u16:24-1941 [006] d..1 867.335262: vector_reserve: irq=27 ret=0 kworker/u16:24-1941 [006] d..1 867.335263: vector_config: irq=27 vector=239 cpu=0 apicdest=0x00000001 kworker/u16:24-1941 [006] .... 867.335265: vector_setup: irq=27 is_legacy=0 ret=0 kworker/u16:24-1941 [006] .... 867.335310: vector_activate: irq=27 is_managed=0 can_reserve=1 early=1 kworker/u16:24-1941 [006] d..1 867.335330: vector_config: irq=27 vector=239 cpu=0 apicdest=0x00000001 kworker/u16:76-1994 [004] d..1 867.335372: irq_matrix_reserve: online_maps=8 global_avl=1614, global_rsvd=17, total_alloc=17 kworker/u16:76-1994 [004] d..1 867.335373: vector_reserve: irq=30 ret=0 kworker/u16:76-1994 [004] d..1 867.335374: vector_config: irq=30 vector=239 cpu=0 apicdest=0x00000001 kworker/u16:76-1994 [004] .... 867.335375: vector_setup: irq=30 is_legacy=0 ret=0 kworker/u16:76-1994 [004] .... 867.335398: vector_activate: irq=30 is_managed=0 can_reserve=1 early=1 kworker/u16:76-1994 [004] d..1 867.335399: vector_config: irq=30 vector=239 cpu=0 apicdest=0x00000001 kworker/u16:76-1994 [004] d..1 867.335499: vector_activate: irq=30 is_managed=0 can_reserve=1 early=0 kworker/u16:76-1994 [004] d..2 867.335501: irq_matrix_alloc: bit=42 cpu=0 online=1 avl=192 alloc=11 managed=0 online_maps=8 global_avl=1613, global_rsvd=16, total_alloc=18 kworker/u16:76-1994 [004] d..2 867.335502: vector_update: irq=30 vector=42 cpu=0 prev_vector=0 prev_cpu=0 kworker/u16:76-1994 [004] d..2 867.335502: vector_alloc: irq=30 vector=42 reserved=1 ret=0 kworker/u16:76-1994 [004] d..2 867.335503: vector_config: irq=30 vector=42 cpu=0 apicdest=0x00000001 kworker/u16:76-1994 [004] d..3 867.335670: vector_config: irq=30 vector=42 cpu=0 apicdest=0x00000001 kworker/u16:24-1941 [006] d..1 867.335845: vector_activate: irq=27 is_managed=0 can_reserve=1 early=0 kworker/u16:24-1941 [006] d..2 867.335846: irq_matrix_alloc: bit=43 cpu=0 online=1 avl=191 alloc=12 managed=0 online_maps=8 global_avl=1612, global_rsvd=15, total_alloc=19 kworker/u16:24-1941 [006] d..2 867.335847: vector_update: irq=27 vector=43 cpu=0 prev_vector=0 prev_cpu=0 kworker/u16:24-1941 [006] d..2 867.335847: vector_alloc: irq=27 vector=43 reserved=1 ret=0 kworker/u16:24-1941 [006] d..2 867.335848: vector_config: irq=27 vector=43 cpu=0 apicdest=0x00000001 kworker/u16:24-1941 [006] d..3 867.335861: vector_config: irq=27 vector=43 cpu=0 apicdest=0x00000001 Hopefully this is enough? ~Maarten