Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755204AbXIOPPm (ORCPT ); Sat, 15 Sep 2007 11:15:42 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751445AbXIOPPg (ORCPT ); Sat, 15 Sep 2007 11:15:36 -0400 Received: from ug-out-1314.google.com ([66.249.92.171]:59372 "EHLO ug-out-1314.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751429AbXIOPPf (ORCPT ); Sat, 15 Sep 2007 11:15:35 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:subject:from:reply-to:to:cc:in-reply-to:references:content-type:date:message-id:mime-version:x-mailer:content-transfer-encoding; b=jaY5TEhFOaD8CsYztrGZlJ3ERoI498ifcAaHIbUdPK52miBHjZhSmCOEYKGCD9mKFlY/411AFsKcm3BcBlf6JmIv92IliIq5wd+ubaG2HI71qNXLgoUg5rYURDo1kqYM2TC/90yLiKLW+N9v+tRAknCeagIGWKWTQUcVJ4gCAfk= Subject: Re: [Minor patch] Reduce __print_symbol/sprint_symbol stack usage. From: Gilboa Davara Reply-To: gilboad@gmail.com To: Satyam Sharma Cc: linux-kernel@vger.kernel.org, Eric Sandeen In-Reply-To: References: <1189856129.18191.11.camel@gilboa-home-dev.localdomain> Content-Type: text/plain Date: Sat, 15 Sep 2007 18:15:29 +0300 Message-Id: <1189869329.18191.77.camel@gilboa-home-dev.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.10.3 (2.10.3-4.fc7) Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3642 Lines: 91 On Sat, 2007-09-15 at 18:32 +0530, Satyam Sharma wrote: > Hi, > > > On 9/15/07, Gilboa Davara wrote: > > Hello all, > > > > In a small exchange in fedora-kernel-list [1] Eric Sandeen has pointed > > out a possible stack overflow... when CONFIG_DEBUG_STACKOVERFLOW is > > enabled. (Though not limited to it) > > Yeah, I have experienced this phenomenon/problem myself. > > > > Code path is simple: do_IRQ detects a a near stack overflow condition > > and calls show_trace_log_lvl which, down the line uses __print_symbol > > and sprint_symbol to print the call stack. > > However, both __print_symbol + sprint_symbol are eating no-less then > > 128+223 bytes on static char arrays, which, given the fact that this > > code path is actually generated by low stack warning (< 512 bytes), > > might turn a minor (?) problem (low stack) into a full blown crash. > > __print_symbol() and sprint_symbol() are called multiple times during > oopsen / panics. I think those buffers were static char arrays for a good > reason ... OK. Point taken. I pull this patch pending some additional thinking. > > The patch itself is fairly simple and non-intrusive. [2] > > Both functions allocate memory for their buffers - falling back to > > minimal address display if memory allocation fails. > > > > P.S. Can anyone please point me to the maintainer of kernel/syms? (I > > rather not spam world + dog for such a minor patch) > > Anything that touches the panic codepath is important, not minor at all. Bad wording on my part. By minor I meant, changes a single source file, doesn't change interfaces, doesn't change code behavior beyond it's local scope. > > [2]. In theory, there's a second option: pre-allocating memory on a > > per_cpu basis, however: > > A. dump_trace/stack are usually called when something bad has happened - > > reducing the need for performance optimizations. > > That's not a performance optimization -- avoiding repeated kmalloc()'s in the > panic codepath sounds like a *requirement* to me. ACK. Though in my defense, solution [2] requires a massive surgery that would have made this patch far more intrusive. > > > > B. per_cpu allocation will also require local_irq_disable/enable as both > > functions are being called from multiple contexts. Too much hassle. > > I think not bothering about any locking in these codepaths may not be an > entirely unreasonable thing to do (sorry about the triple negation in the > sentence). What I mean is that there are places in these codepaths where > we already don't bother with locking ... > > Overall I don't much like introducing kmalloc(GFP_ATOMIC) in these codepaths > and would ask you guys to consider some other pre-allocation (i.e. static > allocation not on stack but in .data) alternative instead ... > > Satyam No locking what-so-ever is a bad idea. dump_stack/trace are being called by non-fatal sources (sleep while atomic; stack-check; debugging) that may produce problematic results if a static/shared buffer is being used with no locks. We can agree that using in-stack char buffer is very problematic - especially given the fact that 4K is becoming the default build option. I'll try and create an option 2 (static allocation, minimal locking) patch and post ASAP. Hopefully it'll fare better. (While keeping the current interface intact and reducing the damage/noise) - Gilboa - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/