From: ebiederm@xmission.com (Eric W. Biederman)
To: Andrew Morton
Cc: "Lu, Yinghai", "Luigi Genoni", Ingo Molnar, Natalie Protasevich, Andi Kleen
Subject: Re: [PATCH 2/2] x86_64 irq: Handle irqs pending in IRR during irq migration.
Date: Fri, 02 Feb 2007 18:39:15 -0700
In-Reply-To: <20070202170500.57b6c3a3.akpm@linux-foundation.org> (Andrew Morton's message of "Fri, 2 Feb 2007 17:05:00 -0800")
References: <200701221116.13154.luigi.genoni@pirelli.com> <200702021848.55921.luigi.genoni@pirelli.com> <200702021905.39922.luigi.genoni@pirelli.com> <20070202170500.57b6c3a3.akpm@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Andrew Morton writes:

> So is this a for-2.6.20 thing? The bug was present in 2.6.19, so
> I assume it doesn't affect many people?

If it's not too late and this patch isn't too scary, yes.

A really rare set of circumstances triggers it, but the possibility
of being hit is pretty widespread: anything with more than one cpu
and more than one irq could see this.

The easiest way to trigger this is to have two level triggered irqs on
two different cpus using the same vector. In that case, if one acks
its irq while the other irq is migrating to a different cpu, 2.6.19
gets completely confused and stops handling interrupts properly.

With my previous bug fix (not dropping the ack when we are confused)
the machine will stay up. That fix is obviously correct and can't
affect anything else, so it is probably a candidate for the stable
tree. With this fix everything just works.

I don't know how often the exact same irq legitimately goes off twice
in a row, but that is a possibility as well, especially with edge
triggered interrupts.

Setting up the test scenario was a pain, but by severely limiting my
choice of vectors I was able to confirm that I survived several
hundred of these events within a couple of minutes with no problem.

Eric