Date: Tue, 26 Aug 2008 10:35:05 -0700 (PDT)
From: Linus Torvalds
To: Rusty Russell
cc: "Alan D. Brunelle", "Rafael J. Wysocki", Linux Kernel Mailing List, Kernel Testers List, Andrew Morton, Arjan van de Ven, Ingo Molnar
Subject: Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected
In-Reply-To: <200808261111.19205.rusty@rustcorp.com.au>
References: <48B313E0.1000501@hp.com> <200808261111.19205.rusty@rustcorp.com.au>

On Tue, 26 Aug 2008, Rusty Russell wrote:
>
> Your workaround is very random, and that scares me. I think a huge number of
> CPUs needs a real solution (an actual cpumask allocator, then do something
> clever if we come across an actual fastpath).

The thing is, the inlining thing is a separate issue. Yes, the cpumasks
were what made stack pressure so critical to begin with, but no, a
400-byte stack frame in a deep callchain isn't acceptable _regardless_ of
any cpumask_t issues.

Gcc inlining is a total and utter pile of shit. And _that_ is the problem.
I seriously think we shouldn't allow gcc to inline anything at all unless
we tell it to. That's how it used to work, and quite frankly, that's how
it _should_ work.
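
A minimal sketch of what "inline only when told" can look like in plain C, assuming GCC's `noinline` and `always_inline` function attributes. The macro and function names (`my_noinline`, `my_always_inline`, `clamp_len`, `checksum_name`, `load_one`) are hypothetical and are not taken from the kernel or from the patch under discussion; the 400-byte buffer only mirrors the frame size mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper macros around GCC function attributes. */
#define my_noinline      __attribute__((noinline))
#define my_always_inline inline __attribute__((always_inline))

/* Small helper: inlining is requested explicitly by the programmer. */
static my_always_inline size_t clamp_len(size_t len, size_t max)
{
	return len < max ? len : max;
}

/*
 * Large frame: the 400-byte scratch buffer lives in this function's own
 * frame and is gone again on return. If the compiler inlined this
 * function, the buffer could end up in the caller's frame and stay
 * allocated for the rest of a deep call chain.
 */
static my_noinline size_t checksum_name(const char *name)
{
	char scratch[400];
	size_t i, n, sum = 0;

	assert(name != NULL);
	n = clamp_len(strlen(name), sizeof(scratch) - 1);
	memcpy(scratch, name, n);
	scratch[n] = '\0';

	/* Sum the bytes of the (possibly truncated) copy. */
	for (i = 0; i < n; i++)
		sum += (unsigned char)scratch[i];
	return sum;
}

/* Caller that sits in the deep chain: it declares no large locals. */
size_t load_one(const char *name)
{
	return checksum_name(name) + 1;
}
```

The result is the same whether or not the helper is inlined; only the placement of the scratch buffer on the stack differs, which is the stack-usage concern raised above.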

The downsides of inlining are big enough from both a debugging and a real
code generation angle (eg stack usage like this), that the upsides
(_sometimes_ smaller kernel, possibly slightly faster code) simply aren't
relevant.

So the "noinline" was random, yes, but this is a real issue. Looking at
checkstack output for a saner config (NR_CPUS=16), the top entries for me
are things like

	ide_generic_init [vmlinux]:		1384
	idefloppy_ioctl [vmlinux]:		1208
	e1000_check_options [vmlinux]:		1152
	...

which are "leaf" functions. They are broken as hell (the e1000 is
apparently because it builds structs on the stack that should all be
"static const", for example), but they are different from something like
the module init sequence in that they are not going to affect anything
else.

It would be interesting to see what "-fno-default-inline" does to the
kernel. It really would get rid of a _lot_ of gcc version issues too.
Inlining behavior of gcc has long been a problem for us.

		Linus