Date: Fri, 4 Sep 2009 12:55:25 -0400
From: Stefan Assmann
To: linux-kernel@vger.kernel.org
Cc: jcm@redhat.com, sdietrich@novell.com, linux-acpi@vger.kernel.org,
    andi@firstfloor.org, hpa@zytor.com, Stefan Assmann, mingo@elte.hu,
    Olaf.Dabrunz@gmx.net, ktokunag@redhat.com, tglx@linutronix.de,
    lenb@kernel.org
Message-Id: <20090904165525.26294.31112.sendpatchset@t500>
Subject: [RFC][PATCH 0/2] boot interrupts on Intel X58 and 55x0

This patchset is meant to disable boot interrupts on Intel X58 and 55x0
chipsets (Tylersburg). A lot of effort from Kei Tokunaga has gone into
these patches. Thanks a lot, Kei!

The patchset consists of 2 patches because the PCI config space of the
configuration device used to disable boot interrupts on these chipsets
is not always accessible by default. The first patch ensures that the
device is visible, while the second patch applies the necessary changes
to stop the generation of boot interrupts.

We're not really sure whether the final X58 and 55x0 chipsets have the
configuration device visible or not, so patch #1 might be superfluous,
but we've seen at least 2 machines where the device is not visible.
That's one of the reasons why this patchset is marked as RFC. The other
reason is more serious: the onboard NIC (8086:10c9) malfunctions on
some of our test systems if the second patch is applied.
It fails to acquire an IP address from DHCP and we're pretty clueless
about this issue right now. Help is greatly appreciated!

A quick summary of why boot interrupts are better off than on.

Boot interrupts are generated by the chipset if the interrupt line of a
non-primary IO-APIC is masked and an IRQ arrives there. In that case a
boot interrupt is forwarded to the PIC _and_ the primary IO-APIC. We're
not quite sure why it arrives at the primary IO-APIC as well, but this
has been observed on various chipsets. As there is no interrupt handler
installed for the boot interrupt on the primary IO-APIC, the interrupt
is counted as spurious. If too many spurious interrupts accumulate, the
kernel may disable the entire interrupt line. The problem only shows up
if the primary IO-APIC already has an interrupt handler installed on
that line; otherwise the line would be masked anyway and the boot
interrupt silently ignored (which makes it tricky to observe).

When does this become a problem?

Any device connected to a non-primary IO-APIC (that doesn't use MSIs)
will trigger the generation of boot interrupts if its IO-APIC pin is
masked. There can be many reasons for that, for example:
- The interrupt is shared and a buggy device driver (for another
  device) causes the interrupt to get disabled by the kernel.
- The RT kernel masks interrupt lines during handling (threaded
  IRQ handling).
- Kei reported issues in the kdump case, when the first kernel disables
  the IO-APICs before the second kernel starts booting.

It becomes a problem when too many interrupts are counted as spurious
(the boot interrupts cannot be handled because the kernel doesn't
expect them) and the kernel decides to disable the interrupt line.
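
To illustrate the last point, here is a rough userspace sketch of how
unhandled interrupts can add up until a line gets disabled. It is not
the kernel's code. The struct, the function names and the two constants
are made up for this example, and the thresholds (a window of 100000
interrupts, more than 99900 of them unhandled) are an assumption
modeled on the spurious IRQ check in kernel/irq/spurious.c. Details of
the real check, such as timing, are left out.

```c
#include <stdbool.h>

/* Assumed thresholds for this model (hypothetical names). */
#define SPURIOUS_WINDOW     100000u  /* interrupts per evaluation window */
#define SPURIOUS_THRESHOLD   99900u  /* unhandled count that trips the check */

/* Hypothetical per-line bookkeeping, not a kernel structure. */
struct irq_line_model {
    unsigned int irq_count;       /* interrupts seen in the current window */
    unsigned int irqs_unhandled;  /* of those, how many nobody handled */
    bool disabled;                /* set once the line has been shut down */
};

/*
 * Account for one interrupt on the line. 'handled' is false for a boot
 * interrupt, since no handler expects it. Returns true if this call
 * caused the line to be disabled.
 */
static bool model_note_interrupt(struct irq_line_model *line, bool handled)
{
    if (line->disabled)
        return false;

    if (!handled)
        line->irqs_unhandled++;

    /* Only evaluate once a full window of interrupts has been seen. */
    if (++line->irq_count < SPURIOUS_WINDOW)
        return false;

    line->irq_count = 0;
    if (line->irqs_unhandled > SPURIOUS_THRESHOLD) {
        /* Nearly everything in the window was spurious: give up. */
        line->disabled = true;
        return true;
    }

    /* Healthy window, start counting from scratch. */
    line->irqs_unhandled = 0;
    return false;
}

/* Feed n identical interrupts; returns true if the line got disabled. */
static bool model_feed(struct irq_line_model *line, unsigned int n,
                       bool handled)
{
    bool tripped = false;

    for (unsigned int i = 0; i < n; i++) {
        if (model_note_interrupt(line, handled))
            tripped = true;
    }
    return tripped;
}
```

In this model a line that only ever sees unhandled boot interrupts is
disabled at the end of the first window, while a line where at least
100 interrupts per window are handled stays enabled. That is also why
the effect depends on how busy the legitimate device on the primary
IO-APIC line is.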