Date: Wed, 28 Jan 2009 17:09:14 -0800 (PST)
From: Linus Torvalds
To: Parag Warudkar
cc: netdev@vger.kernel.org, Linux Kernel Mailing List, "David S. Miller", Andrew Morton
Subject: Re: 2.6.29-rc3: tg3 dead after resume

On Wed, 28 Jan 2009, Parag Warudkar wrote:
>
> This is similar to the issue reported back in Jul 2007 -
> http://kerneltrap.org/mailarchive/linux-kernel/2007/8/1/154073/thread
> which was fixed with a patch to unconditionally save/restore pci config
> space - that one is still in tg3.c.

In fact, the new PCI suspend/restore code should have made that
unnecessary, since the PCI layer now makes sure that a save/restore is
done even if the driver hadn't done it. But at the same time, still
having the driver do it certainly shouldn't have _hurt_ anything either.

But it's quite possible that the tg3 thing is very sensitive to the
exact order things happen in - there's a lot of comments about bugs in
there ;)

> After resume tg3 complains that no firmware is running and eth0 is
> non-existent. Rmmoding and modprobing tg3 again causes some timeouts and
> errors from tg3 and the link still doesn't work.

That seems to imply that even the reset failed, which is interesting.
But it also possibly means that the problem is not necessarily the
driver itself, but some cached state that we keep around in "struct
pci_dev" even across a module load/unload.

For example, if we get the "dev->current_state" cache wrong, then we may
not actually end up changing it when we should, because we think we
already match the target state. I don't _think_ that is it, but that's
the kind of thing that could happen.

Can you do a

	lspci -vvxxx -s [tg3-device]

before-and-after suspend? Is there some state that looks like it got
corrupted?

		Linus
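
The stale-cache scenario described above can be illustrated with a small
user-space sketch. This is not kernel code: `fake_hw`, `fake_dev`,
`set_power_state()` and `refresh_cached_state()` are hypothetical
stand-ins invented for illustration, and the only thing borrowed from the
message is the idea of a cached `current_state` that may no longer match
the hardware.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
enum power_state { STATE_D0 = 0, STATE_D3 = 3 };

struct fake_hw {
    enum power_state actual;        /* state the device is really in */
};

struct fake_dev {
    enum power_state current_state; /* cached view of the device state */
    struct fake_hw *hw;
};

/*
 * Move the device to 'target'. Returns 1 if the hardware was written,
 * 0 if the write was skipped because the cache already matched.
 */
static int set_power_state(struct fake_dev *dev, enum power_state target)
{
    if (dev->current_state == target)
        return 0;                   /* cache claims nothing needs doing */
    dev->hw->actual = target;
    dev->current_state = target;
    return 1;
}

/* Re-read the hardware so the cache matches reality again. */
static void refresh_cached_state(struct fake_dev *dev)
{
    dev->current_state = dev->hw->actual;
}

/* Shows the skipped transition; returns the result of the stale call. */
static int demo_stale_cache(void)
{
    struct fake_hw hw = { STATE_D0 };
    struct fake_dev dev = { STATE_D0, &hw };
    int wrote;

    hw.actual = STATE_D3;           /* state changes behind the cache */
    wrote = set_power_state(&dev, STATE_D0);
    printf("wrote=%d cached=D%d actual=D%d\n",
           wrote, (int)dev.current_state, (int)hw.actual);
    return wrote;
}
```

Calling `demo_stale_cache()` shows the request being skipped while the
simulated hardware stays in D3; calling `refresh_cached_state()` first
would let the same request go through. Whether anything like this happens
with tg3 is exactly what the lspci comparison is meant to help find out.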