Message-ID: <4A25EA78.7070705@kernel.org>
Date: Wed, 03 Jun 2009 12:14:00 +0900
From: Tejun Heo
To: Niel Lambrechts
CC: Alan Cox, "linux.kernel", Theodore Tso
Subject: Re: 2.6.29 regression: ATA bus errors on resume
References: <4A17C39E.2030302@gmail.com> <4A19F006.3000303@kernel.org>
 <20090525091534.13ae103c@lxorguk.ukuu.org.uk> <4A1B164B.1010108@gmail.com>
 <4A1B76EB.9040500@kernel.org> <4A1B8193.1010703@gmail.com>
 <4A1B8328.80801@kernel.org> <4A1B8873.1040101@gmail.com>
 <4A1BEFB6.80205@kernel.org> <4A1C316C.9040201@gmail.com>
 <4A1C8444.9040605@kernel.org> <4A1D47C6.1070504@gmail.com>
 <4A2424A2.5020704@gmail.com>
In-Reply-To: <4A2424A2.5020704@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello, Niel.

Niel Lambrechts wrote:
> Did you perhaps have any time to look into my feedback around the
> readahead patch?

Yeah, I've been thinking about it and am a bit out of ideas, so no
immediate follow-up.

> From my side, I tried on Saturday to bisect this problem again, doing
> 5-8 hibernates per each bisect from 2.6.28.
> I stopped at 2.6.30-rc2 due to
> time (or fatigue), and did not manage to replicate the problem at all
> which is strange since I was playing audio, doing finds and even doing
> an entire dd of the root partition. I saved the bisect logs so perhaps I
> can continue to see if the problem becomes more prevalent in later
> versions - the first time it ever happened to me was somewhere in 2.6.29
> originally.
>
> The other interesting thing was to see that "hard resetting link"
> messages seem to first start appearing at v2.6.29-rc7 or perhaps rc8. Is
> it worth trying to track down the commit that led to this?
>
> Do you have any other debug patches to try, or should I try to delve
> deeper into finding commits that can be reverted? I'm running out of
> ideas, I even tried to find later firmware for my drive, but I seem to
> be on the latest level.

Given the non-deterministic nature of the failure, I think bisection
would be quite difficult. I think the best way to diagnose the
problem is to track down the owner of the failed request. It has
FAILFAST set and yet the issuer fails to deal with an error condition
which is expected when FAILFAST is set. I'll prep a patch to track
request / bio failure.

Thanks.

-- 
tejun