From: Peter 1 Oberparleiter
Date: Tue, 10 Jun 2008 10:39:42 +0200
To: linux-mm@kvack.org
Cc: balbir@linux.vnet.ibm.com, linux-kernel@vger.kernel.org, Mariusz Kozlowski, Andrew Morton
Subject: Re: 2.6.26-rc5-mm1
In-Reply-To: <20080609220149.d930d141.akpm@linux-foundation.org>

Andrew Morton wrote on 10.06.2008 07:01:49:

> On Tue, 10 Jun 2008 06:57:02 +0200 Mariusz Kozlowski kozlowski@tuxland.pl> wrote:
>
> > Hello,
> >
> > > On Mon, 9 Jun 2008 21:14:54 +0200
> > > Mariusz Kozlowski wrote:
> > >
> > > > Hello Balbir,
> > > >
> > > > > Andrew Morton wrote:
> > > > > > Temporarily at
> > > > > >
> > > > > > http://userweb.kernel.org/~akpm/2.6.26-rc5-mm1/
> > > > > >
> > > > >
> > > > > I've hit a segfault, the last few lines on my console are
> > > > >
> > > > >
> > > > > Testing -fstack-protector-all feature
> > > > > registered taskstats version 1
> > > > > debug: unmapping init memory ffffffff80c03000..ffffffff80dd8000
> > > > > init[1]: segfault at 7fff701fe880 ip 7fff701fee5e sp 7fff7006e6d0 error 7
> > > > >
> > > > > With absolutely no stack trace. I'll dig deeper.
> > > >
> > > > Hey, I see something similar and I actually have a stack trace. Here it goes:
> > > >
> > > > bash[498] segfault at ffffffff80868b58 ip ffffffffff600412 sp 7fffa3d010f0 error 7
> > > > init[1] segfault at ffffffff80868b58 ip ffffffffff600412 sp 7fff9e97f640 error 7
> > > > init[1] segfault at ffffffff80868b58 ip ffffffffff600412 sp 7fff9e97eed0 error 7
> > > > Kernel panic - not syncing: Attemted to kill init!
> > > > Pid 1, comm: init Not tainted 2.6.26-rc5-mm1 #1
> > > >
> > > > Call Trace:
> > > > [] panic+0xe2/0x260
> > > > [] ? __slab_free+0x10a/0x630
> > > > [] ? __sigqueue_free+0x5e/0x70
> > > > [] ? trace_hardirqs_off+0x1b/0x30
> > > > [] ? trace_hardirqs_off+0x1b/0x30
> > > > [] do_exit+0xb84/0xc30
> > > > [] do_group_exit+0x5a/0x110
> > > > [] get_signal_to_deliver+0x2c5/0x620
> > > > [] do_notify_resume+0x11b/0xd10
> > > > [] ? trace_hardirqs_on+0x1b/0x30
> > > > [] ? _spin_unlock_irqrestore+0x93/0x130
> > > > [] ? force_sig_info+0x10c/0x130
> > > > [] ? force_sig_info_fault+0x2c/0x40
> > > > [] ? print_vma_addr+0x10d/0x1d0
> > > > [] ? trace_hardirqs_on_thunk+0x3a/0x3f
> > > > [] ? trace_hardirqs_on_caller+0x15a/0x2c0
> > > > [] retint_signal+0x46/0x8d
> > > >
> > > > This was copied manually so typos are possible.
> > > >
> > >
> > > Thanks. Could someone send a config please? Or a bisection result ;)
> >
> > In my case it turns out to be gcov patches - in which I'm interested
> > in to see (and play with) the tests coverage.
> >
> > #
> > # gcov
> > #
> > kernel-call-constructors.patch
> > kernel-introduce-gcc_version_lower-macro.patch
> > seq_file-add-function-to-write-binary-data.patch
> > GOOD
> > gcov-add-gcov-profiling-infrastructure.patch
> > GOOD
> > gcov-create-links-to-gcda-files-in-build-directory.patch
> > gcov-architecture-specific-compile-flag-adjustments.patch
> > BAD
> >
> > I can not bisect between the last two due to build error. Config is attached.
> >
>
> (cc Peter)

Thanks for the report.
These look like the "known architecture problems" that I hinted at in
the gcov announcement post (I'm assuming this is x86_64, as I've seen
similar reports in the past). Possible reasons:

1) initrd overwrites kernel: When kernel and initrd are loaded to
   static addresses, the oversized gcov kernel may overlap with the
   initrd load address. Solution: move the initrd load address.

2) out-of-memory: Kernel plus profiling code may simply no longer fit
   into a minimal memory configuration. Solution: add memory.

3) write-protection of kernel code: gcc keeps profiling code and data
   close together in the .text section. Solution: any mechanism that
   protects .text against writes should be disabled when running a
   profiled kernel.

4) as yet undiscovered incompatibilities between arch-dependent code
   and gcc's -fprofile-arcs option. Examples would be:
   * code which is run before memory access preparations were made
   * hard coded section sizes
   * relative address displacements which are out of range

Unfortunately I have neither access to a machine nor the skill to
debug 4) myself, so if 1)-3) can be ruled out, I'd like to ask for more
help on this one:

First off, someone needs to track down the offending file(s). This is
done by putting a line containing "GCOV := n" in all Makefiles below
arch/x86_64 (or go one step further back and set
CONFIG_GCOV_PROFILE_ALL=n). If my assumption is correct, then the
kernel should boot fine afterwards. In that case, remove the lines
again one by one, compiling and booting after each change.

If the problem can be narrowed down to a single Makefile, replace the
single "GCOV := n" line with multiple "GCOV_file.o := n" lines, one for
each generated object file. Then take the same approach as before:
remove those lines one at a time, compiling and booting until it
breaks. Finally, post your results.
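The Makefile-level narrowing described above could be scripted roughly
as follows. This is only a sketch under assumptions: the helper names
and the boots_ok callback are hypothetical (the callback stands for
"rebuild, boot, report whether init survived", which will usually be a
manual step), and the "GCOV := n" line is taken verbatim from this mail,
so it may be spelled differently in other versions of the gcov patches.

```python
import os

# Marker line as quoted in the mail; an assumption for other patch versions.
MARKER = "GCOV := n"


def find_makefiles(root):
    """Return all files named 'Makefile' below root, in a stable order."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "Makefile" in filenames:
            found.append(os.path.join(dirpath, "Makefile"))
    return sorted(found)


def disable_gcov(makefile):
    """Append the marker line unless it is already present."""
    with open(makefile) as f:
        lines = f.read().splitlines()
    if MARKER not in lines:
        lines.append(MARKER)
        with open(makefile, "w") as f:
            f.write("\n".join(lines) + "\n")


def enable_gcov(makefile):
    """Remove the marker line again, leaving all other lines untouched."""
    with open(makefile) as f:
        lines = f.read().splitlines()
    with open(makefile, "w") as f:
        f.write("\n".join(l for l in lines if l != MARKER) + "\n")


def first_breaking_makefile(root, boots_ok):
    """Exclude everything below root from profiling, then re-enable
    profiling one Makefile at a time until the kernel stops booting.

    boots_ok is a hypothetical caller-supplied function that rebuilds and
    boots the kernel and returns True if it came up.
    """
    makefiles = find_makefiles(root)
    for mk in makefiles:
        disable_gcov(mk)
    if not boots_ok():
        # Still broken with all arch code excluded: the cause lies elsewhere.
        raise RuntimeError("kernel fails even with profiling disabled")
    for mk in makefiles:
        enable_gcov(mk)
        if not boots_ok():
            return mk
    return None  # every Makefile re-enabled without a failure
```

Calling first_breaking_makefile("arch/x86_64", boots_ok) from the top of
the source tree would return the first Makefile whose re-enabled
profiling breaks the boot. The per-object step ("GCOV_file.o := n") can
then be done by hand in that one Makefile.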
At this point we would need someone with x86_64 arch skills to look at
the file and find out why this code is broken with "-fprofile-arcs"
enabled (on s390 we discovered at least one actual code bug this way,
so the analysis might just be of general use). Alternatively, we can
just keep these files from being profiled.

Regards,
  Peter