Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755345AbZJPGg3 (ORCPT ); Fri, 16 Oct 2009 02:36:29 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754912AbZJPGg1 (ORCPT ); Fri, 16 Oct 2009 02:36:27 -0400
Received: from dwdmx5.dwd.de ([141.38.3.242]:53364 "EHLO dwdmx5.dwd.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754850AbZJPGg0 (ORCPT ); Fri, 16 Oct 2009 02:36:26 -0400
Date: Fri, 16 Oct 2009 06:24:32 +0000 (GMT)
From: Holger Kiehl
X-X-Sender: kiehl@praktifix.dwd.de
To: linux-kernel
cc: netdev@vger.kernel.org
Subject: e1000_clean_tx_irq: Detected Tx Unit Hang
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 8505
Lines: 127

Hello,

I have received the following error on a busy network:

Oct 15 22:01:13 hermes kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Oct 15 22:01:13 hermes kernel: Tx Queue <0>
Oct 15 22:01:13 hermes kernel: TDH
Oct 15 22:01:13 hermes kernel: TDT
Oct 15 22:01:13 hermes kernel: next_to_use
Oct 15 22:01:13 hermes kernel: next_to_clean
Oct 15 22:01:13 hermes kernel: buffer_info[next_to_clean]
Oct 15 22:01:13 hermes kernel: time_stamp <1031cfe6d>
Oct 15 22:01:13 hermes kernel: next_to_watch <2>
Oct 15 22:01:13 hermes kernel: jiffies <1031d0000>
Oct 15 22:01:13 hermes kernel: next_to_watch.status <0>
Oct 15 22:01:15 hermes kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Oct 15 22:01:15 hermes kernel: Tx Queue <0>
Oct 15 22:01:15 hermes kernel: TDH
Oct 15 22:01:15 hermes kernel: TDT
Oct 15 22:01:15 hermes kernel: next_to_use
Oct 15 22:01:15 hermes kernel: next_to_clean
Oct 15 22:01:15 hermes kernel: buffer_info[next_to_clean]
Oct 15 22:01:15 hermes kernel: time_stamp <1031cfe6d>
Oct 15 22:01:15 hermes kernel: next_to_watch <2>
Oct 15 22:01:15 hermes kernel: jiffies <1031d01f4>
Oct 15 22:01:15 hermes kernel: next_to_watch.status <0>
Oct 15 22:01:17 hermes kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Oct 15 22:01:17 hermes kernel: Tx Queue <0>
Oct 15 22:01:17 hermes kernel: TDH
Oct 15 22:01:17 hermes kernel: TDT
Oct 15 22:01:17 hermes kernel: next_to_use
Oct 15 22:01:17 hermes kernel: next_to_clean
Oct 15 22:01:17 hermes kernel: buffer_info[next_to_clean]
Oct 15 22:01:17 hermes kernel: time_stamp <1031cfe6d>
Oct 15 22:01:17 hermes kernel: next_to_watch <2>
Oct 15 22:01:17 hermes kernel: jiffies <1031d03e8>
Oct 15 22:01:17 hermes kernel: next_to_watch.status <0>
Oct 15 22:01:18 hermes kernel: ------------[ cut here ]------------
Oct 15 22:01:18 hermes kernel: WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0x143/0x1eb()
Oct 15 22:01:18 hermes kernel: Hardware name: PRIMERGY RX300 S4
Oct 15 22:01:18 hermes kernel: NETDEV WATCHDOG: eth0 (e1000): transmit queue 0 timed out
Oct 15 22:01:18 hermes kernel: Modules linked in: coretemp ipmi_devintf ipmi_si ipmi_msghandler bonding nf_conntrack_ftp binfmt_misc usbhid i2c_i801 i5000_edac i2c_core i5k_amb uhci_hcd ehci_hcd sg usbcore [last unloaded: microcode]
Oct 15 22:01:18 hermes kernel: Pid: 0, comm: swapper Not tainted 2.6.31.4 #4
Oct 15 22:01:18 hermes kernel: Call Trace:
Oct 15 22:01:18 hermes kernel: [] warn_slowpath_common+0x88/0xb6
Oct 15 22:01:18 hermes kernel: [] warn_slowpath_fmt+0x4b/0x61
Oct 15 22:01:18 hermes kernel: [] ? netdev_drivername+0x52/0x70
Oct 15 22:01:18 hermes kernel: [] dev_watchdog+0x143/0x1eb
Oct 15 22:01:18 hermes kernel: [] ? __queue_work+0x44/0x61
Oct 15 22:01:18 hermes kernel: [] run_timer_softirq+0x1a8/0x238
Oct 15 22:01:18 hermes kernel: [] ? clockevents_program_event+0x88/0xa5
Oct 15 22:01:18 hermes kernel: [] __do_softirq+0xab/0x160
Oct 15 22:01:18 hermes kernel: [] call_softirq+0x1c/0x28
Oct 15 22:01:18 hermes kernel: [] do_softirq+0x51/0xae
Oct 15 22:01:18 hermes kernel: [] irq_exit+0x52/0xa3
Oct 15 22:01:18 hermes kernel: [] smp_apic_timer_interrupt+0x9c/0xc1
Oct 15 22:01:18 hermes kernel: [] apic_timer_interrupt+0x13/0x20
Oct 15 22:01:18 hermes kernel: [] ? acpi_idle_enter_simple+0x17e/0x1c6
Oct 15 22:01:18 hermes kernel: [] ? acpi_idle_enter_simple+0x177/0x1c6
Oct 15 22:01:18 hermes kernel: [] ? cpuidle_idle_call+0x9b/0xe7
Oct 15 22:01:18 hermes kernel: [] ? cpu_idle+0xb0/0xf3
Oct 15 22:01:18 hermes kernel: [] ? start_secondary+0x1b8/0x1d3
Oct 15 22:01:18 hermes kernel: ---[ end trace 5d760977cd95430f ]---
Oct 15 22:01:18 hermes kernel: bonding: bond0: link status definitely down for interface eth0, disabling it
Oct 15 22:01:18 hermes kernel: bonding: bond0: making interface eth2 the new active one.
Oct 15 22:01:21 hermes kernel: e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Oct 15 22:01:21 hermes kernel: bonding: bond0: link status definitely up for interface eth0.

This happened with a plain kernel.org kernel 2.6.31.4. The ethernet card is a PCI-X card (i.e. it uses the e1000 driver). Here is the output of lspci:

05:04.0 Ethernet controller: Intel Corporation 82546GB Gigabit Ethernet Controller (rev 03)
        Subsystem: Intel Corporation Device 118a
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR-