From: ebiederm@xmission.com (Eric W. Biederman)
To: Linus Torvalds
Cc: Muli Ben-Yehuda, Ingo Molnar, Thomas Gleixner, Benjamin Herrenschmidt,
    Rajesh Shah, Andi Kleen, "Protasevich, Natalie", "Luck, Tony",
    Andrew Morton, Linux-Kernel, Badari Pulavarty
Subject: Re: 2.6.19-rc1 genirq causes either boot hang or "do_IRQ: cannot handle IRQ -1"
References: <20061005212216.GA10912@rhun.haifa.ibm.com>
Date: Fri, 06 Oct 2006 12:48:55 -0600
In-Reply-To: (Linus Torvalds's message of "Fri, 6 Oct 2006 11:08:08 -0700 (PDT)")

Linus Torvalds writes:

> On Fri, 6 Oct 2006, Eric W. Biederman wrote:
>>
>> Forcing irqs to specific cpus is not something this patch adds. That
>> is the way the ioapic routes irqs.
>
> What that patch adds is to make it an ERROR if some irq goes to an
> unexpected cpu.
>
> And that very much is wrong.

Agreed. Not recovering from an irq that hits the wrong cpu if we can
recover from it is a problem. That part must be fixed.

>> Yes. A single problem over several months of testing has been found.
>
> Umm. It got found the moment it became part of the standard tree.
>
> The fact is, "months of testing" is not actually very much, if it's the
> -mm tree. That's at best a "good vetting", but it really doesn't prove
> anything.

I'm not trying to prove anything, just saying that I tried. All it
shows is that there is an interesting subset of systems that work.
The fact that the system that failed has a comparatively low-volume
chipset from IBM lets me entertain an atypical hardware hypothesis.

>> So this is fairly fundamentally an irq migration problem. If you
>> never change which cpu an irq is pointed at you don't have problems,
>> as there are no races.
>
> So? Does that change the issue that this new model seems inherently racy?

If it is inherently racy (i.e. it cannot be fixed), I don't have a
problem removing the code.

>> The current irq migration logic does everything in the irq handler
>> after an irq has been received so we can avoid various kinds of races.
>
> No. You don't understand, or you refuse to face the issue.
>
> The races are in _hardware_, outside the CPU. The fact that we do things
> in an irq handler doesn't seem to change a lot.

(As an aside, the problem does not appear on the irq migration path,
because the kernel has not made it far enough for that to be possible.)

I think I don't understand the race you see.

I believe the premise the irq migration code works under is that while
an irq is pending, a second irq will not be sent from the ioapic.
If that premise is true, and we disable that irq on the ioapic while
the irq is still pending, that should successfully prevent the hardware
from sending any further instances of that irq while we manipulate its
routing.

There are a few more details, but that is why I think that path is safe.

> And what do you intend to do if it turns out that the reason it doesn't
> work on x366 is that the _hardware_ just is incompatible with your
> model?

If the code is fundamentally unfixable, the code must go.

> I'm not saying that's the case, and maybe there's some stupid bug that has
> been overlooked, and maybe it can all work fine. But the new model _does_
> seem to be at least _potentially_ fundamentally broken.

The BUG_ON certainly is; I will work up a patch to get rid of that.
I'm hoping to understand how it could possibly happen before I fix
that, now that I have a reproducer of that condition, because it may
influence the fix.

But dropping an irq on the floor is certainly better than crashing the
entire system.

Eric