Date: Fri, 14 Feb 2003 16:02:16 -0500
From: Pete Zaitcev
Message-Id: <200302142102.h1EL2GE28325@devserv.devel.redhat.com>
To: James Bourne
cc: zaitcev@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: lockups with 2.4.20 (tg3? net/core/dev.c|deliver_to_old_ones)
X-Mailing-List: linux-kernel@vger.kernel.org

> Since sometime in December two systems we have on site using P4 HT (one
> Dell 2650 and one Dell 4600, both dual CPU, both ht/mce capable) have been
> locking up without any kernel output and without sysrq keys working (the
> keyboard is locked solid).
>[...]
> Using nmi_watchdog I've managed to get a stack track and ran ksymoops over
> it (attached).

Good report. To tell the truth, I know that this lockup exists; there's
an RH issue-tracker item against me on this. It is different from
the old "porkchop" lockup, which DaveM and Jeff Garzik fixed.

The stumbling block is that the NMI oopser catches a thread which gets
stuck because of the lock, but this does not explain how the lock
was taken. I think the best resolution would be an instrumentation
patch which records lock takers, and prints them when the thing
is forcefully oopsed. I should come up with it eventually, if someone
does not beat me to it (I wish they did, actually :-)

-- Pete
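For illustration only, the instrumentation idea above (record who takes a
lock, print that record when the machine is forcibly oopsed) can be sketched
in user-space C. This is not the patch discussed in the message and not kernel
code: struct tracked_lock, TRACKED_LOCK(), tracked_unlock() and
tracked_lock_report() are hypothetical names, and the model is
single-threaded, so it only shows the bookkeeping, not real locking.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space model of a lock that remembers its last taker.
 * It is NOT the kernel's spinlock_t; it only illustrates the bookkeeping. */
struct tracked_lock {
    int locked;        /* 1 while the lock is held */
    const char *file;  /* source file of the most recent taker */
    int line;          /* source line of the most recent taker */
    const char *func;  /* function name of the most recent taker */
};

/* Take the lock and record the call site that took it. */
static void tracked_lock_acquire(struct tracked_lock *l, const char *file,
                                 int line, const char *func)
{
    /* A real lock would spin here; this single-threaded model just
     * insists that the lock is free. */
    assert(!l->locked);
    l->locked = 1;
    l->file = file;
    l->line = line;
    l->func = func;
}

/* Capture the caller's location automatically. */
#define TRACKED_LOCK(l) \
    tracked_lock_acquire((l), __FILE__, __LINE__, __func__)

/* Release the lock but keep the taker record for a later post-mortem. */
static void tracked_unlock(struct tracked_lock *l)
{
    assert(l->locked);
    l->locked = 0;
}

/* Format the taker record into buf, as a forced-oops handler might print it.
 * Returns the value returned by snprintf(). */
static int tracked_lock_report(const struct tracked_lock *l,
                               char *buf, size_t len)
{
    if (!l->file)
        return snprintf(buf, len, "lock %p: never taken", (const void *)l);
    return snprintf(buf, len, "lock %p: %s, last taken at %s:%d in %s()",
                    (const void *)l, l->locked ? "held" : "free",
                    l->file, l->line, l->func);
}
```

With something like this in place, a thread found spinning on a lock can be
reported together with the call site that last took that lock, which is the
piece of information the NMI trace alone does not provide.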