Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751982AbbBVMBV (ORCPT ); Sun, 22 Feb 2015 07:01:21 -0500
Received: from mail-qa0-f53.google.com ([209.85.216.53]:55857
	"EHLO mail-qa0-f53.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751801AbbBVMBT (ORCPT );
	Sun, 22 Feb 2015 07:01:19 -0500
From: "Justin Piszcz"
To:
Subject: 3.19: ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
Date: Sun, 22 Feb 2015 07:01:17 -0500
Message-ID: <000001d04e97$43b4b950$cb1e2bf0$@lucidpixels.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Outlook 14.0
Thread-Index: AdBOlvMlmz0RWXcqThyaZ0dg57bRog==
Content-Language: en-us
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 9228
Lines: 186

Hello,

Kernel: 3.19.0

Issue: When using robocopy to copy files (from Windows 8/8.1) to
Linux/samba, the 10GbE NIC resets - dmesg [1] below. To get it working
again, I have to down/up the interface. Jumbo frames are being used
(MTU of 9014) on each side. The lspci output is listed below.

Are there any other recommended workarounds for this issue? LRO is
already off for me, as shown below.

When using Linux<->Linux with rsync or NFS, there are no errors with
10GbE. When using Samba<->Windows 8 over 10GbE, this issue occurs
persistently as shown below when a copy is running.
# ethtool -k eth4|grep large
large-receive-offload: off [fixed]

There is/was a similar issue as reported here:
https://communities.intel.com/message/207408

[1] dmesg

[538576.098186] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[541013.223961] ------------[ cut here ]------------
[541013.223970] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0x227/0x230()
[541013.223971] NETDEV WATCHDOG: eth4 (ixgbe): transmit queue 0 timed out
[541013.223972] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.19.0 #2
[541013.223973] Hardware name: Supermicro X9SRL-F/X9SRL-F, BIOS 3.0a 12/05/2013
[541013.223974]  ffffffff81d3a6ae ffff88107fc03da8 ffffffff819d07d7 ffffffff81e34d98
[541013.223976]  ffff88107fc03df8 ffff88107fc03de8 ffffffff810dbdab 0000000000000000
[541013.223977]  0000000000000000 ffff881036304000 0000000000000000 0000000000000010
[541013.223979] Call Trace:
[541013.223979] [] dump_stack+0x45/0x57
[541013.223985] [] warn_slowpath_common+0x7b/0xc0
[541013.223987] [] warn_slowpath_fmt+0x41/0x50
[541013.223990] [] ? __queue_work+0xfc/0x290
[541013.223996] [] dev_watchdog+0x227/0x230
[541013.223997] [] ? qdisc_rcu_free+0x40/0x40
[541013.223998] [] ? qdisc_rcu_free+0x40/0x40
[541013.224001] [] call_timer_fn.isra.29+0x17/0x80
[541013.224002] [] run_timer_softirq+0x1c9/0x280
[541013.224004] [] __do_softirq+0xff/0x200
[541013.224005] [] irq_exit+0x76/0xa0
[541013.224007] [] smp_apic_timer_interrupt+0x41/0x50
[541013.224009] [] apic_timer_interrupt+0x6a/0x70
[541013.224009] [] ? cpuidle_enter_state+0x48/0xc0
[541013.224013] [] ? cpuidle_enter_state+0x3d/0xc0
[541013.224014] [] cpuidle_enter+0x12/0x20
[541013.224017] [] cpu_startup_entry+0x272/0x2f0
[541013.224018] [] rest_init+0x6d/0x70
[541013.224021] [] start_kernel+0x353/0x360
[541013.224022] [] x86_64_start_reservations+0x2a/0x2c
[541013.224023] [] x86_64_start_kernel+0xc8/0xcc
[541013.224024] ---[ end trace 59877113cf8b7358 ]---
[541013.224026] ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
[541013.224036] ixgbe 0000:01:00.0 eth4: Reset adapter
[541020.099402] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX

( .. it continues, but without the trace later .. )

[567457.771728] ixgbe 0000:01:00.0 eth4: NIC Link is Down
[567458.140112] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[567561.611941] ixgbe 0000:01:00.0 eth4: NIC Link is Down
[567568.188422] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[570130.483823] ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
[570130.483924] ixgbe 0000:01:00.0 eth4: Reset adapter
[570137.252167] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[572094.256452] ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
[572094.256538] ixgbe 0000:01:00.0 eth4: Reset adapter
[572101.130915] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[573967.946084] ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
[573967.946097] ixgbe 0000:01:00.0 eth4: Reset adapter
[573974.676387] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[575766.574731] ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
[575766.574753] ixgbe 0000:01:00.0 eth4: Reset adapter
[575773.315067] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[585476.513732] perf interrupt took too long (5003 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[597267.959412] ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
[597267.959452] ixgbe 0000:01:00.0 eth4: Reset adapter
[597274.709728] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: RX/TX

[2] lspci

01:00.0 Ethernet controller: Intel Corporation 82598EB 10-Gigabit AT2 Server Adapter (rev 01)
	Subsystem: Intel Corporation 82598EB 10-Gigabit AT2 Server Adapter
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-