Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753537AbbBSU3h (ORCPT ); Thu, 19 Feb 2015 15:29:37 -0500
Received: from mail-ie0-f182.google.com ([209.85.223.182]:34897 "EHLO
        mail-ie0-f182.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1753381AbbBSU3f (ORCPT );
        Thu, 19 Feb 2015 15:29:35 -0500
MIME-Version: 1.0
References: <20150218222544.GA17717@twins.programming.kicks-ass.net>
Date: Thu, 19 Feb 2015 12:29:35 -0800
X-Google-Sender-Auth: 68yAgaoEjD_8uFEKDyye7KvZTzc
Subject: Re: smp_call_function_single lockups
From: Linus Torvalds
To: Rafael David Tinoco, Ingo Molnar, Peter Anvin, Jiang Liu
Cc: Peter Zijlstra, LKML, Jens Axboe, Frederic Weisbecker, Gema Gomez,
        Christopher Arges, "the arch/x86 maintainers"
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2292
Lines: 54

On Thu, Feb 19, 2015 at 9:39 AM, Linus Torvalds wrote:
> On Thu, Feb 19, 2015 at 8:59 AM, Linus Torvalds
> wrote:
>>
>> Are there known errata for the x2apic?
>
> .. and in particular, do we still have to worry about the traditional
> local apic "if there are more than two pending interrupts per priority
> level, things get lost" problem?
>
> I forget the exact details. Hopefully somebody remembers.

I can't find it in the docs. I find the "two-entries per vector", but
not anything that is per priority level (group of 16 vectors). Maybe
that was the IO-APIC, in which case it's immaterial for IPI's.

However, having now mostly re-acquainted myself with the APIC details,
it strikes me that we do have some oddities here.

In particular, a few interrupt types are very special: NMI, SMI, INIT,
ExtINT, or SIPI are handled early in the interrupt acceptance logic,
and are sent directly to the CPU core, without going through the usual
intermediate IRR/ISR dance.

And why might this matter?
It's important because it means that those kinds of interrupts must
*not* do the apic EOI that ack_APIC_irq() does.

And we correctly don't do ack_APIC_irq() for NMI etc, but it strikes me
that ExtINT is odd and special.

I think we still use ExtINT for some odd cases. We used to have some
magic with the legacy timer interrupt, for example. And I think they
all go through the normal "do_IRQ()" logic regardless of whether they
are ExtINT or not.

Now, what happens if we send an EOI for an ExtINT interrupt? It
basically ends up being a spurious IPI. And I *think* that what
normally happens is absolutely nothing at all. But if in addition to
the ExtINT, there was a pending IPI (or other pending ISR bit set),
maybe we lose interrupts..

.. and it's entirely possible that I'm just completely full of shit.
Who is the poor bastard who has worked most with things like ExtINT,
and can educate me? I'm adding Ingo, hpa and Jiang Liu as primary
contacts..

                Linus
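
The concern about an EOI sent for an ExtINT can be made concrete with a
toy model. This is only a sketch, not kernel code and not a faithful
hardware model: struct toy_lapic, the toy_* helpers, and the vector 0xfb
are all hypothetical names chosen for illustration, and the task
priority register is ignored. The one property it relies on is that an
EOI write carries no vector number and retires the highest bit set in
the ISR. Whether real hardware actually loses an interrupt this way is
exactly the open question in the mail.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_VECTORS 256

/* Toy local APIC state: one bit per vector (hypothetical model). */
struct toy_lapic {
    uint32_t irr[NR_VECTORS / 32]; /* requested, not yet delivered */
    uint32_t isr[NR_VECTORS / 32]; /* delivered, waiting for EOI */
};

static void map_set(uint32_t *map, unsigned v)
{
    map[v / 32] |= 1u << (v % 32);
}

static void map_clear(uint32_t *map, unsigned v)
{
    map[v / 32] &= ~(1u << (v % 32));
}

static bool map_test(const uint32_t *map, unsigned v)
{
    return ((map[v / 32] >> (v % 32)) & 1u) != 0;
}

/* Highest vector set in the map, or -1 if the map is empty. */
static int map_highest(const uint32_t *map)
{
    for (int v = NR_VECTORS - 1; v >= 0; v--)
        if (map_test(map, (unsigned)v))
            return v;
    return -1;
}

/* Priority level: each group of 16 vectors shares one level. */
static unsigned prio_class(unsigned vector)
{
    return vector >> 4;
}

/* A fixed-mode interrupt such as an IPI is latched in the IRR. */
static void toy_raise(struct toy_lapic *a, unsigned vector)
{
    map_set(a->irr, vector);
}

/*
 * Deliver the highest requested vector if its priority level is above
 * the one currently in service. Moves the bit from IRR to ISR and
 * returns the vector, or -1 if nothing can be delivered.
 */
static int toy_deliver(struct toy_lapic *a)
{
    int req = map_highest(a->irr);
    int svc = map_highest(a->isr);

    if (req < 0)
        return -1;
    if (svc >= 0 && prio_class((unsigned)req) <= prio_class((unsigned)svc))
        return -1;
    map_clear(a->irr, (unsigned)req);
    map_set(a->isr, (unsigned)req);
    return req;
}

/* EOI names no vector: it retires whatever is highest in the ISR. */
static int toy_eoi(struct toy_lapic *a)
{
    int svc = map_highest(a->isr);

    if (svc >= 0)
        map_clear(a->isr, (unsigned)svc);
    return svc;
}

/*
 * Scenario from the mail: an IPI is in service, then an ExtINT arrives.
 * The ExtINT bypasses IRR/ISR, so it changes no state in this model,
 * yet its handler issues an EOI anyway. Returns the vector that the
 * misplaced EOI retired, or -1 if the ISR was empty.
 */
static int extint_eoi_scenario(void)
{
    struct toy_lapic a;

    memset(&a, 0, sizeof(a));
    toy_raise(&a, 0xfb);   /* hypothetical IPI vector */
    toy_deliver(&a);       /* IPI now in service, ISR bit 0xfb set */
    /* ExtINT delivered here: no IRR/ISR bit is touched. */
    return toy_eoi(&a);    /* EOI meant for the ExtINT */
}
```

In this model extint_eoi_scenario() returns 0xfb: the EOI issued for
the ExtINT retires the IPI's in-service bit instead. With an empty ISR
the same EOI retires nothing, which matches the "absolutely nothing at
all" case.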