Subject: Re: [fixed] [patch] Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+
From: Patrick McManus
To: Ilpo Järvinen
Cc: Ingo Molnar, David Miller, peterz@infradead.org, LKML, Netdev, rjw@sisk.pl, Andrew Morton, johnpol@2ka.mipt.ru
References: <20080603094057.GA29480@elte.hu> <20080603.150344.145518113.davem@davemloft.net> <20080605142244.GA19216@elte.hu> <1212708571.19522.10.camel@tng>
Date: Fri, 06 Jun 2008 13:11:33 -0400
Message-Id: <1212772293.23706.22.camel@tng>
X-Mailing-List: linux-kernel@vger.kernel.org

> This Ingo's testcase should anyway be quite "simple", I mean that distcc
> shouldn't do anything unexpected in a sense it shouldn't abort the flows
> by not sending data, close the listening socket or other things like that.
>

maybe - I've noted that I can get the distcc server to crash with just a
little fuzz (telnet to it and close the telnet) - but it is true I
haven't seen anything odd using the distcc client.

Anyhow, my news is that using rc5 I have managed to reproduce it on
localhost - so it isn't just Ingo anymore !
;)

I didn't have a 16 cpu machine at my disposal, so I was concerned I
wouldn't be able to make it happen - but I set up a 64-bit kvm image with
-smp 4, running on my core2-duo, and created a makefile with 3 src files
that I just compiled as "while true; do make -j3 all; done" - the
makefile uses distcc to localhost and has intentionally broken
dependencies so it just keeps recompiling stuff. The input files are
approximately 135k, 98k, and 16k after running gcc -E on them (which is
what I assume distcc does before putting them down the socket).

On rc5 I could get the lockup in under 20 minutes.. usually 10. I think
I did it 4 times. My compile test is probably a better trigger than the
kernel compile because the distcc connects are never staggered like they
would be in a large directory of files. (3 files, -j4).

When I apply the locking patch you (Ilpo) wrote, I cannot reproduce the
error at all in the first 90 minutes of testing. I'll let the test run
and update the list.

I'm holding out hope that Ingo's report did not have the locking patch
on the distcc server end - because it certainly makes a difference for
me.

-Patrick
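
For anyone wanting to try something similar, here is a minimal sketch of the kind of setup described above. It is not the makefile used in the test. It assumes GNU make, and distcc with a distccd listening on localhost. The file names (a.c, b.c, c.c), the function names, and the FORCE prerequisite used to get "broken" dependencies are all illustrative assumptions, and the stub sources are far smaller than the 135k/98k/16k inputs mentioned in the message.

```shell
#!/bin/sh
# Hypothetical helpers; names and file layout are assumptions for illustration.

# Write three tiny stub sources (placeholders for the real, much larger inputs).
write_sources() {
    dir=$1
    for name in a b c; do
        printf 'int %s_fn(void) { return 0; }\n' "$name" > "$dir/$name.c"
    done
}

# Write a Makefile whose objects depend on FORCE, a target that never
# exists as a file, so every "make all" recompiles all three files.
write_makefile() {
    dir=$1
    {
        printf 'CC = distcc gcc\n'
        printf 'OBJS = a.o b.o c.o\n\n'
        printf 'all: $(OBJS)\n\n'
        printf '%%.o: %%.c FORCE\n'
        printf '\t$(CC) -c $< -o $@\n\n'
        printf 'FORCE:\n'
    } > "$dir/Makefile"
}

# Rebuild forever, sending compile jobs to distccd on localhost.
# DISTCC_HOSTS is the environment variable distcc reads for its host list.
run_loop() {
    dir=$1
    export DISTCC_HOSTS=localhost
    while true; do
        make -C "$dir" -j3 all
    done
}
```

Usage would be along the lines of: create a scratch directory, call write_sources and write_makefile on it, then run_loop on it inside the test guest and watch for a stalled connection.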