From: ebiederm@xmission.com (Eric W. Biederman)
To: Jean Delvare
Cc: Andrew Morton, Oleg Nesterov, KAMEZAWA Hiroyuki, linux-kernel@vger.kernel.org, ak@suse.de
Subject: Re: [PATCH] proc: readdir race fix (take 3)
Date: Thu, 07 Sep 2006 07:57:52 -0600

Jean Delvare writes:

> On Thursday 7 September 2006 00:43, Eric W. Biederman wrote:
>> Have you tested 2.6.18-rc6 without my patch?
>
> Yes I did, it didn't crash after a couple hours. Of course it doesn't
> prove anything as the crash appears to be the result of a race.
>
> I'll now apply Oleg's fix and see if things get better.
>
>> I guess the practical question is what was your test methodology to
>> reproduce this problem? A couple of more people running the same
>> test on a few more machines might at least give us confidence in what
>> is going on.
>
> "My" test program forks 1000 children who sleep for 1 second then look for
> themselves in /proc, warn if they can't find themselves, and exit. So
> basically the idea is that the process list will shrink very rapidly at
> the same moment every child does readdir(/proc).
>
> I attached the test program, I take no credit (nor shame) for it, it was
> provided to me by IBM (possibly on behalf of one of their own customers)
> as a way to demonstrate and reproduce the original readdir(/proc) race
> bug.

Ok. So whatever is creating lots of child threads that tripped you up
is probably peculiar to the environment on your laptop.

Eric
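
For readers without the attachment, the test methodology Jean describes (fork many children, have each sleep, then look for its own PID in /proc and warn if it is missing) can be sketched roughly as below. This is not the IBM-provided program, which is not reproduced in this message; the function names `find_self_in_proc` and `run_race_test`, their parameters, and the exit-code convention are all assumptions made for illustration.

```python
import os
import time


def find_self_in_proc(proc_root="/proc"):
    """Return True if this process's PID appears in a listing of proc_root."""
    return str(os.getpid()) in os.listdir(proc_root)


def run_race_test(n_children=1000, delay=1.0, proc_root="/proc"):
    """Fork n_children; each sleeps, scans proc_root for itself, then exits.

    Returns the number of children that could not find themselves.
    The names and defaults here are illustrative, not taken from the
    original test program.
    """
    pids = []
    for _ in range(n_children):
        pid = os.fork()
        if pid == 0:
            # Child: sleep so that the children wake, scan, and exit at
            # roughly the same moment, shrinking the process list quickly.
            status = 0
            try:
                time.sleep(delay)
                if not find_self_in_proc(proc_root):
                    msg = "pid %d not found in %s\n" % (os.getpid(), proc_root)
                    os.write(2, msg.encode())
                    status = 1  # assumed convention: 1 means "missed myself"
            except BaseException:
                status = 2  # any other failure, e.g. proc_root unreadable
            finally:
                # Leave without running the parent's cleanup handlers.
                os._exit(status)
        pids.append(pid)

    # Parent: reap every child and count the ones that reported a miss.
    missing = 0
    for pid in pids:
        _, wstatus = os.waitpid(pid, 0)
        if os.WIFEXITED(wstatus) and os.WEXITSTATUS(wstatus) == 1:
            missing += 1
    return missing
```

Calling `run_race_test()` with its defaults mirrors the 1000-child, 1-second setup from the description. A nonzero return value would suggest that a readdir of /proc skipped live entries, though, as noted in the thread, a clean run does not prove the race is absent.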