Message-ID: <48450864.6080707@suse.cz>
Date: Tue, 03 Jun 2008 11:01:24 +0200
From: Petr Tesarik
To: Luming Yu
CC: Roland McGrath, LKML, linux-ia64@vger.kernel.org
Subject: Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race
In-Reply-To: <3877989d0806022304w35764b17p9d4c3c95eceae0f5@mail.gmail.com>

Luming Yu wrote:
> On Tue, May 27, 2008 at 2:25 PM, Petr Tesarik wrote:
>> On Mon, 2008-05-26 at 23:12 -0700, Roland McGrath wrote:
>>>>  [] skip_rbs_switch+0xe0/0x110
>>>>                                 sp=e000000141c9fe30 bsp=e000000141c90cf8
>>>>  [] __kernel_syscall_via_break+0x0/0x20
>>>>                                 sp=e000000141ca0000 bsp=e000000141c90cf8
>> Indeed, there seems to be a large hole here. So, this is either a bug in
>> the unwinder, or a bug in the RBS synchronization, which causes
>> corruption. My test machine currently needs some work to run 2.6.25
>> again, but I'll try your test case as soon as I re-install it later this
>> week.
>
> Just want to check if the test case works for you?

Yes, the test case hangs here too. But the problem seems to be
elsewhere. Did you look into the strace output? This line is pretty
suspicious:

3258  clone2(child_stack=0, stack_size=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x200000000004e290) = 1

Obviously, strace cannot attach PID 1, and since it is not designed to
handle this situation, it hangs.

I'm going to investigate why the return value of the clone2 syscall is
seen as 1 by the tracer. Might even turn out to be a bug in strace...

Petr Tesarik
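
For anyone who wants to check their own strace log for the symptom described in this message, here is a minimal sketch. It assumes the log was captured with `strace -f`, so that every line starts with the PID of the traced process, and it flags clone/clone2 calls whose reported return value is 1. The function name `find_suspicious_clones` and the regular expression are illustrative only; they are not part of strace or of the test case discussed in the thread.

```python
import re

# Assumption: "strace -f" style lines, e.g.
#   3258  clone2(child_stack=0, ...) = 1
# The traced PID comes first, the return value follows the closing paren.
CLONE_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<call>clone2?)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)"
)


def find_suspicious_clones(lines):
    """Return (tracee_pid, syscall, return_value) for clone calls that report PID 1.

    A successful clone returns the new child's PID to the parent. Seeing 1
    here means the tracer would try to follow PID 1 (init), which it
    cannot attach to.
    """
    hits = []
    for line in lines:
        m = CLONE_RE.match(line.strip())
        if m and int(m.group("ret")) == 1:
            hits.append((int(m.group("pid")), m.group("call"), 1))
    return hits


# The line quoted in the message, joined back into a single log line.
sample = [
    "3258  clone2(child_stack=0, stack_size=0, "
    "flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, "
    "child_tidptr=0x200000000004e290) = 1",
]

print(find_suspicious_clones(sample))
```

A hit only shows that the tracer recorded a return value of 1. It does not say whether the kernel or strace produced the wrong value, which is the open question in the message.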