Message-ID: <48B45387.8090205@sgi.com>
Date: Tue, 26 Aug 2008 12:03:35 -0700
From: Mike Travis
To: Ingo Molnar
CC: Linus Torvalds, "Alan D. Brunelle", Thomas Gleixner, "Rafael J. Wysocki", Linux Kernel Mailing List, Kernel Testers List, Andrew Morton, Arjan van de Ven, Rusty Russell
Subject: Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

Ingo Molnar wrote:
> * Linus Torvalds wrote:
>
>> On Mon, 25 Aug 2008, Linus Torvalds wrote:
>>> checkstack.pl shows these things as the top problems:
>>>
>>>   0xffffffff80266234 smp_call_function_mask [vmlinux]: 2736
>>>   0xffffffff80234747 __build_sched_domains [vmlinux]: 2232
>>>   0xffffffff8023523f __build_sched_domains [vmlinux]: 2232
>>>
>>> Anyway, the reason smp_call_function_mask and friends have such _huge_
>>> stack usages for you is that they contain a 'cpumask_t' on the stack.
>> In fact, they contain multiple CPU-masks, each 4k-bits - 512 bytes - in
>> size. And they tend to call each other.
>>
>> Quite frankly, I don't think we were really ready for 4k CPU's. I'm
>> going to commit this patch to make sure others don't do that many
>> CPU's by mistake. It marks MAXCPU's as being 'broken' so you cannot
>> select it, and also limits the number of CPU's that you _can_ select
>> to "just" 512.
>
> yeah, that's OK i guess - distros can still enable 4K support if they
> wish to. Someone interested in improving the stack footprint situation
> should dust off the max-stack-footprint tracer so that we can catch
> these things in a more structured way.
>
> And i guess the next generation of 4K CPUs support should just get away
> from cpumask_t-on-kernel-stack model altogether, as the current model is
> not maintainable. We tried the on-kernel-stack variant, and it really
> does not work reliably. We can fix this in v2.6.28.
>
> 	Ingo

I would be most interested in any tools to analyze call-trees and
accumulated stack usages. My current method of using kdb is really
time consuming.

Thanks!
Mike
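
For reference, the "4k-bits - 512 bytes" figure quoted above can be checked with a small user-space sketch. This is only an illustration: `demo_cpumask_t`, `DEMO_NR_CPUS`, `demo_cpu_set()` and `demo_stack_bytes()` are made-up names for this example and are not the kernel's definitions. It assumes a mask is a plain bitmap with one bit per possible CPU.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define DEMO_NR_CPUS 4096
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) \
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* One bit per possible CPU, stored as an array of unsigned longs. */
typedef struct {
	unsigned long bits[DEMO_BITS_TO_LONGS(DEMO_NR_CPUS)];
} demo_cpumask_t;

/* Mark one CPU in the mask. */
static void demo_cpu_set(demo_cpumask_t *mask, unsigned int cpu)
{
	assert(cpu < DEMO_NR_CPUS);
	mask->bits[cpu / DEMO_BITS_PER_LONG] |=
		1UL << (cpu % DEMO_BITS_PER_LONG);
}

/* Stack bytes taken by one function that keeps `nmasks` masks as locals. */
static size_t demo_stack_bytes(size_t nmasks)
{
	return nmasks * sizeof(demo_cpumask_t);
}
```

With 4096 possible CPUs, `sizeof(demo_cpumask_t)` is 4096 / 8 = 512 bytes, so a function holding four such masks as locals would need `demo_stack_bytes(4)` = 2048 bytes for the masks alone, before any callee adds its own.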