Date: Sat, 26 Dec 2009 09:57:23 -0800
From: Stephen Hemminger
To: Michael Breuer
Cc: Andrew Morton, "Berck E. Nash", linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: sky2 panic in 2.6.32.1 under load (new oops)
Message-ID: <20091226095723.7ac82b18@nehalam>
In-Reply-To: <4B3581C7.8000702@majjas.com>
References: <4B300A2A.8040305@gmail.com> <4B300E30.9090707@majjas.com> <4B3114E3.1070602@majjas.com> <4B329FA3.9090904@majjas.com> <20091223230102.4bb0100e.akpm@linux-foundation.org> <4B34E847.8010809@majjas.com> <20091225152200.1cf11dfe@nehalam> <4B3581C7.8000702@majjas.com>
Organization: Linux Foundation

On Fri, 25 Dec 2009 22:23:51 -0500 Michael Breuer wrote:

> On 12/25/2009 6:22 PM, Stephen Hemminger wrote:
> > On Fri, 25 Dec 2009 11:28:55 -0500
> > Michael Breuer wrote:
> >
> >> More data points - I'm able to reliably recreate this now.
> >> While I thought it was coincidence, each and every time I hit this issue
> >> there is a DHCP renew event immediately before the first error.
> >> The crash occurs while under load - in my case seems that the traffic is
> >> actually IPV6 (hadn't noticed that before).
> >> I ran nethogs on a remote display - the reported rx rate on the IPV6 smb
> >> connection at the time of the lockup was 33889.688 KB/sec on a 1gbit
> >> nic. I've got two events like this - don't recall if the earlier one was
> >> the exact same # - but it was in the ballpark.
> >>
> >> On 12/24/2009 2:01 AM, Andrew Morton wrote:
> >>
> >>> cc's added again.
> >>>
> >>> On Wed, 23 Dec 2009 17:54:27 -0500 Michael Breuer wrote:
> >>>
> >>>> Ok - not the firmware. Ran another Windows backup and sky2 went down.
> >>>>
> >>>> Nothing in dmesg.old - have oops in syslog. System became unresponsive
> >>>> and watchdog kicked in after a minute.
> >>>>
> >>>> Also note that I have a similar oops with VT-D disabled (posted here on
> >>>> 12/5). I'm attaching the oops from that below this oops for comparison.
> >>>> That also happened under similar load.
> >>>>
> >>>> On the assumption that I can recreate this (although it takes a while)
> >>>> please let me know how I can help.
> >>>>
> >>>> What's in my log (starting with an smbd error about 2 min before the
> >>>> oops (note: the dhcpd is not the system doing the backup).
> >>>>
> >>> This (nastily wordwrapped) oops appears to be quite different from
> >>> Berck's one.
> >>>
> > What is the MTU?
> >
> 1500
> >>

It looks like the problem only shows up for packets generated by DHCP, and
these come through AF_PACKET. The problem may be related to how these packets
are fragmented into header and page, in a different way than other packets,
confusing the driver or DMA engine.

Does this help?
-----
--- a/drivers/net/sky2.c	2009-12-26 09:50:20.869565022 -0800
+++ b/drivers/net/sky2.c	2009-12-26 09:55:54.620645355 -0800
@@ -1616,6 +1616,13 @@ static netdev_tx_t sky2_xmit_frame(struc
 	if (unlikely(tx_avail(sky2) < tx_le_req(skb)))
 		return NETDEV_TX_BUSY;
 
+	if (!pskb_may_pull(skb, ETH_HLEN)) {
+		if (net_ratelimit())
+			pr_info(PFX "%s: packet missing ether header (%d)?",
+				dev->name, skb->len);
+		goto drop;
+	}
+
 	len = skb_headlen(skb);
 	mapping = pci_map_single(hw->pdev, skb->data, len, PCI_DMA_TODEVICE);
 
@@ -1761,6 +1768,7 @@ mapping_unwind:
 mapping_error:
 	if (net_ratelimit())
 		dev_warn(&hw->pdev->dev, "%s: tx mapping error\n", dev->name);
+drop:
 	dev_kfree_skb(skb);
 	return NETDEV_TX_OK;
 }
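
For anyone who wants to reason about the check outside the kernel, here is a
rough userspace model of what the added pskb_may_pull() test is meant to
guarantee before the linear area is handed to pci_map_single(). It is only a
sketch: struct model_skb and every model_* name are hypothetical stand-ins, the
"pull" is pure bookkeeping (the real helper may copy data out of the page
fragments and can fail on allocation), and it says nothing about whether this
is the actual cause of the oops.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ETH_HLEN 14u	/* same value as the kernel's ETH_HLEN */

/* Hypothetical stand-in for struct sk_buff: only the two length fields. */
struct model_skb {
	unsigned int len;	/* total bytes: linear area + page fragments */
	unsigned int data_len;	/* bytes that live in page fragments only */
};

/* Mirrors skb_headlen(): bytes currently in the linear area. */
static unsigned int model_headlen(const struct model_skb *skb)
{
	assert(skb->data_len <= skb->len);
	return skb->len - skb->data_len;
}

/*
 * Simplified model of pskb_may_pull(): on success, at least `want` bytes
 * are in the linear area. Assumption: moving bytes out of the fragments
 * always succeeds here, which the kernel does not promise.
 */
static bool model_may_pull(struct model_skb *skb, unsigned int want)
{
	unsigned int headlen = model_headlen(skb);

	if (want <= headlen)
		return true;		/* header already linear */
	if (want > skb->len)
		return false;		/* packet shorter than the header */
	skb->data_len -= want - headlen; /* account for bytes moved to linear */
	return true;
}

enum model_verdict { MODEL_TX_MAPPED, MODEL_TX_DROPPED };

/*
 * Models the patched transmit path: drop when no Ethernet header can be
 * made linear, otherwise report how many linear bytes would be DMA-mapped.
 */
static enum model_verdict model_xmit(struct model_skb *skb,
				     unsigned int *mapped_len)
{
	if (!model_may_pull(skb, MODEL_ETH_HLEN)) {
		*mapped_len = 0;
		return MODEL_TX_DROPPED;
	}
	*mapped_len = model_headlen(skb);
	return MODEL_TX_MAPPED;
}
```

Calling model_xmit() on a packet whose bytes all sit in page fragments (len
equal to data_len) shows the case the patch guards against: without the pull
the linear length would be zero, and with it the first 14 bytes are linear
before mapping. A packet shorter than 14 bytes is dropped instead.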