Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1422966AbXBATWm (ORCPT ); Thu, 1 Feb 2007 14:22:42 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1422964AbXBATWm (ORCPT ); Thu, 1 Feb 2007 14:22:42 -0500
Received: from faui03.informatik.uni-erlangen.de ([131.188.30.103]:57405
        "EHLO faui03.informatik.uni-erlangen.de" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1422963AbXBATWl (ORCPT );
        Thu, 1 Feb 2007 14:22:41 -0500
Date: Thu, 1 Feb 2007 19:55:32 +0100
From: Thomas Glanzmann
To: LKML
Cc: shemminger@linux-foundation.org, netdev@vger.kernel.org
Subject: sky2 hangs
Message-ID: <20070201185532.GL13130@cip.informatik.uni-erlangen.de>
Mail-Followup-To: Thomas Glanzmann , LKML ,
        shemminger@linux-foundation.org, netdev@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.11-2006-07-11
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3358
Lines: 66

Hello,

I have a sky2 network card in my Intel Mac mini. It stops working under
heavy network load, such as watching a DivX over http/sshfs. However, if
I remove the driver module and load it again, it works, and even the TCP
connection doesn't get shut down.

I automated the above procedure using a userland watchdog which
basically does the same thing and is written entirely by me, because the
traditional watchdog wasn't that reliable and produced a lot of false
positives:

* Check every ten seconds whether my default router is pingable (3
  pings, at least one has to come back). If it isn't, I call the
  fix_network script (the script is called only once after a ping gets
  lost.
  To run the script again, at least one ping has to arrive first.)

        (mini) [~] cat /usr/local/sbin/fix_network
        #!/bin/bash
        export PATH=/bin:/usr/bin:/usr/sbin:/sbin
        rmmod sky2
        modprobe sky2
        ifdown eth0
        ifup eth0

  If after that no ping is received from the default router for another
  90 seconds, I tell init to reboot and stop feeding the kernel software
  watchdog.

* My watchdog also checks whether the sshd process is running. If it is
  down for more than 100 seconds, it reboots the machine, too.

Jan 27 22:35:35 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
Jan 27 22:35:35 mini watchdog-tg[4146]: Running fix_network script.
Jan 27 22:38:46 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
Jan 27 22:38:46 mini watchdog-tg[4146]: Running fix_network script.
Jan 27 22:44:17 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
Jan 27 22:44:17 mini watchdog-tg[4146]: Running fix_network script.
Jan 29 12:00:13 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
Jan 29 12:00:13 mini watchdog-tg[4146]: Running fix_network script.
Jan 29 19:18:59 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
Jan 29 19:18:59 mini watchdog-tg[4146]: Running fix_network script.
Jan 31 15:56:29 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
Jan 31 15:56:29 mini watchdog-tg[4146]: Running fix_network script.
Feb 1 08:56:57 mini watchdog-tg[4146]: No PONG received from 192.168.0.3 (failure 1 of 10)
Feb 1 08:56:57 mini watchdog-tg[4146]: Running fix_network script.

I have a question about this: I wonder why the Linux kernel (no longer?)
increments the use count of an Ethernet driver (I saw this with sky2 and
e1000) while the interface is up, running and configured. I can unload
the sky2 driver without doing an 'ifconfig eth0 down' beforehand. Could
someone provide me with some background on this? With that, everything
works.
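
For illustration only, the watchdog logic described above could be
sketched roughly as follows. This is not the actual watchdog-tg code:
the function names (router_alive, next_action, watchdog_loop), the
variable names and the "run" argument are hypothetical, the sshd check
and the feeding of the kernel software watchdog are left out, and
"shutdown -r now" merely stands in for telling init to reboot.

```shell
#!/bin/bash
# Hypothetical sketch of the ping watchdog; all names are assumptions.

ROUTER=${ROUTER:-192.168.0.3}                 # default router from the log
FIX_SCRIPT=${FIX_SCRIPT:-/usr/local/sbin/fix_network}
CHECK_INTERVAL=${CHECK_INTERVAL:-10}          # seconds between checks
REBOOT_AFTER=${REBOOT_AFTER:-90}              # seconds without reply after fix

# Succeeds if at least one of three pings to host $1 is answered.
router_alive() {
    ping -c 3 "$1" >/dev/null 2>&1
}

# Decide the next step.
#   $1: ping result (0 = reply received, non-zero = no reply)
#   $2: 1 if the fix script already ran during this outage, else 0
#   $3: seconds elapsed since the fix script ran
# Prints one of: ok, fix, wait, reboot.
next_action() {
    local ping_rc=$1 fixed=$2 since_fix=$3
    if [ "$ping_rc" -eq 0 ]; then
        echo ok
    elif [ "$fixed" -eq 0 ]; then
        echo fix
    elif [ "$since_fix" -ge "$REBOOT_AFTER" ]; then
        echo reboot
    else
        echo wait
    fi
}

watchdog_loop() {
    local fixed=0 fix_time=0 rc now
    while true; do
        if router_alive "$ROUTER"; then rc=0; else rc=1; fi
        now=$(date +%s)
        case $(next_action "$rc" "$fixed" $((now - fix_time))) in
            ok)     fixed=0 ;;                    # a ping arrived: re-arm the fix
            fix)    "$FIX_SCRIPT"; fixed=1; fix_time=$(date +%s) ;;
            reboot) shutdown -r now ;;            # stand-in for telling init to reboot
            wait)   ;;                            # fix already ran, keep waiting
        esac
        sleep "$CHECK_INTERVAL"
    done
}

# Start the loop only when explicitly asked to, e.g. "./watchdog.sh run".
if [ "${1:-}" = "run" ]; then
    watchdog_loop
fi
```

Keeping the decision in next_action separate from the loop means the
"run the fix only once per outage" rule can be checked without touching
the network or unloading the driver.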

If someone is interested in my userland watchdog, just send me an
e-mail.

@Sam: I can provide you with access to my hardware, including root
access via the wifi driver, so that you can debug this network driver
lockup, if you want to.

        Thomas