Subject: Re: [LKP] [lkp] [x86/irq] 4c24cee6b2: IP-Config: Auto-configuration of network failed
To: Borislav Petkov, "Huang, Ying"
References: <87si39mnl5.fsf@yhuang-dev.intel.com> <566E63C8.3050000@linux.intel.com> <87d1u9ikqd.fsf@yhuang-dev.intel.com> <20151214095427.GA11638@pd.tnic>
Cc: Joe Lawrence, Thomas Gleixner, lkp@01.org, LKML, x86-ml
From: Jiang Liu
Organization: Intel
Message-ID: <566FC762.1040107@linux.intel.com>
Date: Tue, 15 Dec 2015 15:55:14 +0800
In-Reply-To: <20151214095427.GA11638@pd.tnic>

On 2015/12/14 17:54, Borislav Petkov wrote:
> On Mon, Dec 14, 2015 at 02:54:02PM +0800, Huang, Ying wrote:
>> No, there are no other systems reporting the same issue. I will queue
>> more tests for make sure this is not a false positive.
>
> I can trigger this too with my guest here.
>
> I have these two ontop of rc5:
>
> cc22b9b83f6a x86/irq: Enhance __assign_irq_vector() to rollback in case of failure
> 45dd79e03e1e x86/irq: Do not reuse struct apic_chip_data.old_domain as temporary buffer
> 9f9499ae8e64 Linux 4.4-rc5
>
> and my guest stalls while booting.
>
> The new thing I see in dmesg is this:
>
>  ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> +..MP-BIOS bug: 8254 timer not connected to IO-APIC
> +...trying to set up timer (IRQ0) through the 8259A ...
> +..... (found apic 0 pin 2) ...
> +....... failed.
> +...trying to set up timer as Virtual Wire IRQ...
> +..... failed.
> +...trying to set up timer as ExtINT IRQ...
> +..... works.
> +APIC calibration not consistent with PM-Timer: 111ms instead of 100ms
> +APIC delta adjusted to PM-Timer: 6248393 (6997337)
>
> which leads to boot stalling and timeoutting when loading the hdd
> driver:

Hi Boris and Ying,
	Aha, found a possible regression. Could you please apply the
attached bugfix patch on top of
"cc22b9b83f6a x86/irq: Enhance __assign_irq_vector() to rollback in case of failure"?

Hi Ying,
	I have pushed this patch to GitHub, so it should reach the 0day
test farm soon :)

Thanks,
Gerry

>
> ...
> [    3.973447] console [netcon0] enabled
> [    3.976099] netconsole: network logging started
> [    3.979604] rtc_cmos 00:00: setting system clock to 2015-12-14 10:45:35 UTC (1450089935)
> [    3.985348] PM: Checking hibernation image partition /dev/sdb1
> [    6.600706] usb 1-1: New USB device found, idVendor=0627, idProduct=0001
> [    6.613651] usb 1-1: New USB device strings: Mfr=1, Product=3, SerialNumber=5
> [    6.636905] usb 1-1: Product: QEMU USB Tablet
> [    6.642248] usb 1-1: Manufacturer: QEMU
> [    6.647109] usb 1-1: SerialNumber: 42
> [    7.580995] ata2.00: qc timeout (cmd 0xa0)
> [    7.589300] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [    7.750715] ata2.01: NODEV after polling detection
> [    7.759605] ata2.00: configured for MWDMA2
> [    8.585691] input: QEMU QEMU USB Tablet as /devices/pci0000:00/0000:00:01.2/usb1/1-1/1-1:1.0/0003:0627:0001.0001/input/input1
> [    8.602467] hid-generic 0003:0627:0001.0001: input,hidraw0: USB HID v0.01 Pointer [QEMU QEMU USB Tablet] on usb-0000:00:01.2-1/input0
> [   12.760846] ata2.00: qc timeout (cmd 0xa0)
> [   12.786543] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [   12.796576] ata2.00: limiting speed to MWDMA2:PIO3
> [   12.958455] ata2.01: NODEV after polling detection
> [   12.969693] ata2.00: configured for MWDMA2
> [   17.972782] ata2.00: qc timeout (cmd 0xa0)
> [   17.978967] ata2.00: TEST_UNIT_READY failed (err_mask=0x5)
> [   17.983495] ata2.00: disabled
> [   17.986352] ata2: soft resetting link
> [   18.146586] ata2.01: NODEV after polling detection
> [   18.151413] ata2: EH complete
> [   32.745227] ata1: lost interrupt (Status 0x50)
> [   32.748470] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [   32.756586] ata1.00: failed command: READ DMA
> [   32.761251] ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
> [   32.761251]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [   32.773928] ata1.00: status: { DRDY }
> [   32.777028] ata1: soft resetting link
> [   32.934437] ata1.01: NODEV after polling detection
> [   32.946663] ata1.00: configured for MWDMA2
> [   32.949964] ata1.00: device reported invalid CHS sector 0
> [   32.953793] ata1: EH complete
> [   63.849089] ata1: lost interrupt (Status 0x50)
> [   63.857470] ata1.00: limiting speed to MWDMA1:PIO4
> [   63.860982] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> [   63.865862] ata1.00: failed command: READ DMA
> [   63.883697] ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
> [   63.883697]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [   63.899573] ata1.00: status: { DRDY }
> [   63.902649] ata1: soft resetting link
> [   64.062580] ata1.01: NODEV after polling detection
> [   64.073800] ata1.00: configured for MWDMA1
> [   64.076813] ata1.00: device reported invalid CHS sector 0
> [   64.096188] ata1: EH complete
>

From c7c3cc3a048576fd1e196e67b11ae0193e7fba1e Mon Sep 17 00:00:00 2001
From: Jiang Liu
Date: Tue, 15 Dec 2015 15:40:43 +0800
Subject: [PATCH]

Signed-off-by: Jiang Liu
---
 arch/x86/kernel/apic/vector.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index f03957e7c50d..fce2853f70d9 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -116,14 +116,13 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
 	 */
 	static int current_vector = FIRST_EXTERNAL_VECTOR + VECTOR_OFFSET_START;
 	static int current_offset = VECTOR_OFFSET_START % 16;
-	int cpu, err;
-	unsigned int dest = d->cfg.dest_apicid;
+	int cpu, err = -ENOSPC;
+	unsigned int dest;
 
 	if (d->move_in_progress)
 		return -EBUSY;
 
 	/* Only try and allocate irqs on cpus that are present */
-	err = -ENOSPC;
 	cpumask_clear(d->old_domain);
 	cpumask_clear(used_cpumask);
 	cpu = cpumask_first_and(mask, cpu_online_mask);
@@ -133,9 +132,6 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
 		apic->vector_allocation_domain(cpu, vector_cpumask, mask);
 
 		if (cpumask_subset(vector_cpumask, d->domain)) {
-			err = 0;
-			if (cpumask_equal(vector_cpumask, d->domain))
-				break;
 			/*
 			 * New cpumask using the vector is a proper subset of
 			 * the current in use mask. So cleanup the vector
@@ -144,7 +140,7 @@ static int __assign_irq_vector(int irq, struct apic_chip_data *d,
 			cpumask_and(used_cpumask, d->domain, vector_cpumask);
 			err = apic->cpu_mask_to_apicid_and(mask, used_cpumask,
 							   &dest);
-			if (err)
+			if (err || cpumask_equal(vector_cpumask, d->domain))
 				break;
 			cpumask_andnot(d->old_domain, d->domain,
 				       vector_cpumask);
-- 
1.7.10.4