Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1753782Ab3CNXTk (ORCPT );
    Thu, 14 Mar 2013 19:19:40 -0400
Received: from mail-ee0-f53.google.com ([74.125.83.53]:48403 "EHLO
    mail-ee0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1753612Ab3CNXTj (ORCPT );
    Thu, 14 Mar 2013 19:19:39 -0400
Message-ID: <1363303174.29475.46.camel@edumazet-glaptop>
Subject: Re: BUG: IPv4: Attempt to release TCP socket in state 1
From: Eric Dumazet
To: dormando
Cc: Cong Wang , linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Date: Fri, 15 Mar 2013 00:19:34 +0100
In-Reply-To:
References: <51356AC1.4090302@gmail.com>
    <1362460046.15793.111.camel@edumazet-glaptop>
    <1362494795.15793.113.camel@edumazet-glaptop>
    <1362663990.15793.208.camel@edumazet-glaptop>
    <1363301786.29475.40.camel@edumazet-glaptop>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.2.3-0ubuntu6
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5577
Lines: 102

On Thu, 2013-03-14 at 16:15 -0700, dormando wrote:
> *sigh*. it's been a long month, sorry:
>
> [58377.436522] IPv4: Attempt to release TCP socket family 2 in state 1
> ffff8813fbad9500
> [58377.436539] ------------[ cut here ]------------
> [58377.436545] WARNING: at net/ipv4/af_inet.c:146
> inet_sock_destruct+0x176/0x200()
> [58377.436546] Hardware name: X9DR3-F
> [58377.436547] Modules linked in: bridge coretemp ghash_clmulni_intel
> ipmi_watchdog ipmi_devintf gpio_ich microcode ixgbe sb_edac edac_core mei
> lpc_ich mfd_core mdio ipmi_si ipmi_msghandler iptable_nat nf_nat_ipv4
> nf_nat isci libsas igb ptp pps_core
> [58377.436563] Pid: 0, comm: swapper/0 Not tainted 3.8.2 #3
> [58377.436564] Call Trace:
> [58377.436566] [] warn_slowpath_common+0x7f/0xc0
> [58377.436572] [] warn_slowpath_null+0x1a/0x20
> [58377.436574] [] inet_sock_destruct+0x176/0x200
> [58377.436578] [] ? tcp_write_timer_handler+0x1b0/0x1b0
> [58377.436581] [] __sk_free+0x1d/0x140
> [58377.436583] [] ? tcp_write_timer_handler+0x1b0/0x1b0
> [58377.436585] [] sk_free+0x25/0x30
> [58377.436586] [] tcp_write_timer+0x49/0x70
> [58377.436590] [] call_timer_fn+0x49/0x130
> [58377.436593] [] ? scheduler_tick+0x15f/0x190
> [58377.436596] [] run_timer_softirq+0x224/0x290
> [58377.436598] [] ? update_process_times+0x76/0x90
> [58377.436600] [] ? tcp_write_timer_handler+0x1b0/0x1b0
> [58377.436602] [] ? ktime_get+0x54/0xe0
> [58377.436604] [] __do_softirq+0xc7/0x230
> [58377.436608] [] call_softirq+0x1c/0x30
> [58377.436611] [] do_softirq+0x55/0x90
> [58377.436613] [] irq_exit+0x85/0xa0
> [58377.436616] [] smp_apic_timer_interrupt+0x6e/0x99
> [58377.436618] [] apic_timer_interrupt+0x6a/0x70
> [58377.436619] [] ? __schedule+0x3ac/0x750
> [58377.436625] [] ? mwait_idle+0xad/0x1f0
> [58377.436627] [] cpu_idle+0xb3/0x100
> [58377.436629] [] rest_init+0x72/0x80
> [58377.436633] [] start_kernel+0x3ac/0x3b9
> [58377.436635] [] ? repair_env_string+0x5b/0x5b
> [58377.436636] [] x86_64_start_reservations+0x131/0x136
> [58377.436638] [] x86_64_start_kernel+0xed/0xf4
> [58377.436639] ---[ end trace 9e57364162374433 ]---
>
> ^ pretty sure that's the WARN_ON_ONCE(1)
>
> Then a short while later the usual:
>
> [58394.689801] ------------[ cut here ]------------
> [58394.689817] WARNING: at net/sched/sch_generic.c:254
> dev_watchdog+0x258/0x270()
> [58394.689820] Hardware name: X9DR3-F
> [58394.689836] NETDEV WATCHDOG: eth2 (ixgbe): transmit queue 14 timed out
> [58394.689837] Modules linked in: bridge coretemp ghash_clmulni_intel
> ipmi_watchdog ipmi_devintf gpio_ich microcode ixgbe sb_edac edac_core mei
> lpc_ich mfd_core mdio ipmi_si ipmi_msghandler iptable_nat nf_nat_ipv4
> nf_nat isci libsas igb ptp pps_core
> [58394.689853] Pid: 0, comm: swapper/0 Tainted: G W
> 3.8.2 #3
> [58394.689854] Call Trace:
> [58394.689856] [] warn_slowpath_common+0x7f/0xc0
> [58394.689863] [] warn_slowpath_fmt+0x46/0x50
> [58394.689865] [] dev_watchdog+0x258/0x270
> [58394.689868] [] ? __netdev_watchdog_up+0x80/0x80
> [58394.689872] [] call_timer_fn+0x49/0x130
> [58394.689875] [] ? scheduler_tick+0x15f/0x190
> [58394.689877] [] run_timer_softirq+0x224/0x290
> [58394.689880] [] ? update_process_times+0x76/0x90
> [58394.689882] [] ? __netdev_watchdog_up+0x80/0x80
> [58394.689884] [] ? ktime_get+0x54/0xe0
> [58394.689886] [] __do_softirq+0xc7/0x230
> [58394.689890] [] call_softirq+0x1c/0x30
> [58394.689894] [] do_softirq+0x55/0x90
> [58394.689895] [] irq_exit+0x85/0xa0
> [58394.689898] [] smp_apic_timer_interrupt+0x6e/0x99
> [58394.689900] [] apic_timer_interrupt+0x6a/0x70
> [58394.689901] [] ? __schedule+0x3ac/0x750
> [58394.689907] [] ? mwait_idle+0xad/0x1f0
> [58394.689909] [] cpu_idle+0xb3/0x100
> [58394.689911] [] rest_init+0x72/0x80
> [58394.689915] [] start_kernel+0x3ac/0x3b9
> [58394.689917] [] ? repair_env_string+0x5b/0x5b
> [58394.689918] [] x86_64_start_reservations+0x131/0x136
> [58394.689920] [] x86_64_start_kernel+0xed/0xf4
> [58394.689922] ---[ end trace 9e57364162374434 ]---
> [58394.689965] ixgbe 0000:83:00.0 eth2: Reset adapter
> [58447.665326] INFO: rcu_sched self-detected stall on CPU { 8} (t=15001
> jiffies g=3607787 c=3607786 q=332913)
>
> (then tons of stuck processes getting timed out)

Thanks, that's really useful. We might be missing a socket refcount
increment in a timer setup.
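
For readers following the thread, here is a rough userspace model of that hypothesis. It is only a sketch, not kernel code: the `Sock` class, `arm_timer()`, `timer_fires()` and `run()` are made-up names for illustration, and `sock_hold()` / `sock_put()` merely imitate the kernel helpers of the same name. The assumption being modeled is that an armed timer owns one socket reference and that the timer handler drops it on the way out, which matches the first trace (tcp_write_timer -> sk_free -> __sk_free -> inet_sock_destruct). If one arming path forgets to take that reference, the handler's put releases the socket while it is still in state 1 (TCP_ESTABLISHED).

```python
TCP_ESTABLISHED = 1  # the "state 1" reported in the warning
TCP_CLOSE = 7        # the only state in which a TCP socket should be destroyed


class Sock:
    """Toy stand-in for a kernel socket: a state plus a reference count."""

    def __init__(self):
        self.state = TCP_ESTABLISHED
        self.refcnt = 1           # reference owned by the socket's user
        self.timer_armed = False
        self.warnings = []        # messages a destructor would have logged


def sock_hold(sk):
    """Take one reference (model of the kernel helper, not the real thing)."""
    sk.refcnt += 1


def sock_put(sk):
    """Drop one reference; on the last put, check the state like a destructor."""
    sk.refcnt -= 1
    if sk.refcnt == 0 and sk.state != TCP_CLOSE:
        sk.warnings.append(
            f"Attempt to release TCP socket in state {sk.state}")


def arm_timer(sk, take_ref=True):
    """Arm the timer. A newly armed timer is assumed to own a reference.

    take_ref=False models the suspected bug: a path that arms the timer
    without incrementing the refcount.
    """
    if not sk.timer_armed:
        sk.timer_armed = True
        if take_ref:
            sock_hold(sk)


def timer_fires(sk):
    """Timer handler: drops the reference the timer is assumed to own."""
    sk.timer_armed = False
    sock_put(sk)


def run(take_ref):
    """Arm the timer once, let it fire, and return the socket for inspection."""
    sk = Sock()
    arm_timer(sk, take_ref=take_ref)
    timer_fires(sk)
    return sk
```

With `run(True)` the count goes 1 -> 2 -> 1 and nothing is logged. With `run(False)` the handler's put takes the count from 1 to 0 on a still-established socket, so the model records the same kind of message as the first warning in the quoted log.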