Hello all,
In a small exchange in fedora-kernel-list [1] Eric Sandeen has pointed
out a possible stack overflow... when CONFIG_DEBUG_STACKOVERFLOW is
enabled. (Though not limited to it)
Code path is simple: do_IRQ detects a a near stack overflow condition
and calls show_trace_log_lvl which, down the line uses __print_symbol
and sprint_symbol to print the call stack.
However, both __print_symbol + sprint_symbol are eating no-less then
128+223 bytes on static char arrays, which, given the fact that this
code path is actually generated by low stack warning (< 512 bytes),
might turn a minor (?) problem (low stack) into a full blown crash.
The patch itself is fairly simple and non-intrusive. [2]
Both functions allocate memory for their buffers - falling back to
minimal address display if memory allocation fails.
P.S. Can anyone please point me to the maintainer of kernel/syms? (I
rather not spam world + dog for such a minor patch)
--
Gilboa Davara <[email protected]>
[1]
http://www.mail-archive.com/[email protected]/msg00640.html
[2]. In theory, there's a second option: pre-allocating memory on a
per_cpu basis, however:
A. dump_trace/stack are usually called when something bad has happened -
reducing the need for performance optimizations.
B. per_cpu allocation will also require local_irq_disable/enable as both
functions are being called from multiple contexts. Too much hassle.
--- linux-2.6/kernel/kallsyms.orig 2007-09-15 11:46:54.000000000 +0300
+++ linux-2.6/kernel/kallsyms.c 2007-09-15 14:25:21.000000000 +0300
@@ -309,30 +309,62 @@ int lookup_symbol_attrs(unsigned long ad
/* Look up a kernel symbol and return it in a text buffer. */
int sprint_symbol(char *buffer, unsigned long address)
{
- char *modname;
- const char *name;
unsigned long offset, size;
- char namebuf[KSYM_NAME_LEN];
+ const char *name = NULL;
+ char *namebuf = NULL;
+ char *modname;
+ int ret;
+
+
+ /* Static buffer allocation.
+ Required in-order to reduce stack footprint on
+ do_IRQ/4KSTACK/i386 */
+ namebuf = kmalloc(KSYM_NAME_LEN, GFP_ATOMIC);
+ if (namebuf)
+ name = kallsyms_lookup(address, &size, &offset,
+ &modname, namebuf);
- name = kallsyms_lookup(address, &size, &offset, &modname, namebuf);
if (!name)
- return sprintf(buffer, "0x%lx", address);
+ ret = sprintf(buffer, "0x%lx", address);
+ else {
+ if (modname)
+ ret = sprintf(buffer, "%s+%#lx/%#lx [%s]",
+ name, offset, size, modname);
+ else
+ ret = sprintf(buffer, "%s+%#lx/%#lx",
+ name, offset, size);
+ }
- if (modname)
- return sprintf(buffer, "%s+%#lx/%#lx [%s]", name, offset,
- size, modname);
- else
- return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
+ if (namebuf)
+ kfree(namebuf);
+
+ return ret;
}
/* Look up a kernel symbol and print it to the kernel messages. */
void __print_symbol(const char *fmt, unsigned long address)
{
- char buffer[KSYM_SYMBOL_LEN];
+ char *buffer = NULL;
- sprint_symbol(buffer, address);
- printk(fmt, buffer);
+ /* Static buffer allocation.
+ Required in-order to reduce stack footprint on
+ do_IRQ/4KSTACK/i386 */
+ buffer = kmalloc(KSYM_SYMBOL_LEN, GFP_ATOMIC);
+ if (buffer) {
+ sprint_symbol(buffer, address);
+ printk(fmt, buffer);
+ kfree(buffer);
+ } else {
+ /* Address + '0x' + NULL. */
+ char sbuffer[(BITS_PER_LONG / 4) + 3];
+
+ /* Fall-back mode.
+ Memory allocation failed.
+ Convert the address to string and display it. */
+ sprintf(sbuffer, "0x%lx", address);
+ printk(fmt, sbuffer);
+ }
}
/* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */
Hi,
On 9/15/07, Gilboa Davara <[email protected]> wrote:
> Hello all,
>
> In a small exchange in fedora-kernel-list [1] Eric Sandeen has pointed
> out a possible stack overflow... when CONFIG_DEBUG_STACKOVERFLOW is
> enabled. (Though not limited to it)
Yeah, I have experienced this phenomenon/problem myself.
> Code path is simple: do_IRQ detects a a near stack overflow condition
> and calls show_trace_log_lvl which, down the line uses __print_symbol
> and sprint_symbol to print the call stack.
> However, both __print_symbol + sprint_symbol are eating no-less then
> 128+223 bytes on static char arrays, which, given the fact that this
> code path is actually generated by low stack warning (< 512 bytes),
> might turn a minor (?) problem (low stack) into a full blown crash.
__print_symbol() and sprint_symbol() are called multiple times during
oopsen / panics. I think those buffers were static char arrays for a good
reason ...
> The patch itself is fairly simple and non-intrusive. [2]
> Both functions allocate memory for their buffers - falling back to
> minimal address display if memory allocation fails.
>
> P.S. Can anyone please point me to the maintainer of kernel/syms? (I
> rather not spam world + dog for such a minor patch)
Anything that touches the panic codepath is important, not minor at all.
> Gilboa Davara <[email protected]>
>
> [1]
> http://www.mail-archive.com/[email protected]/msg00640.html
>
> [2]. In theory, there's a second option: pre-allocating memory on a
> per_cpu basis, however:
> A. dump_trace/stack are usually called when something bad has happened -
> reducing the need for performance optimizations.
That's not a performance optimization -- avoiding repeated kmalloc()'s in the
panic codepath sounds like a *requirement* to me.
> B. per_cpu allocation will also require local_irq_disable/enable as both
> functions are being called from multiple contexts. Too much hassle.
I think not bothering about any locking in these codepaths may not be an
entirely unreasonable thing to do (sorry about the triple negation in the
sentence). What I mean is that there are places in these codepaths where
we already don't bother with locking ...
Overall I don't much like introducing kmalloc(GFP_ATOMIC) in these codepaths
and would ask you guys to consider some other pre-allocation (i.e. static
allocation not on stack but in .data) alternative instead ...
Satyam
> --- linux-2.6/kernel/kallsyms.orig 2007-09-15 11:46:54.000000000 +0300
> +++ linux-2.6/kernel/kallsyms.c 2007-09-15 14:25:21.000000000 +0300
> @@ -309,30 +309,62 @@ int lookup_symbol_attrs(unsigned long ad
> /* Look up a kernel symbol and return it in a text buffer. */
> int sprint_symbol(char *buffer, unsigned long address)
> {
> - char *modname;
> - const char *name;
> unsigned long offset, size;
> - char namebuf[KSYM_NAME_LEN];
> + const char *name = NULL;
> + char *namebuf = NULL;
> + char *modname;
> + int ret;
> +
> +
> + /* Static buffer allocation.
> + Required in-order to reduce stack footprint on
> + do_IRQ/4KSTACK/i386 */
> + namebuf = kmalloc(KSYM_NAME_LEN, GFP_ATOMIC);
> + if (namebuf)
> + name = kallsyms_lookup(address, &size, &offset,
> + &modname, namebuf);
>
> - name = kallsyms_lookup(address, &size, &offset, &modname, namebuf);
> if (!name)
> - return sprintf(buffer, "0x%lx", address);
> + ret = sprintf(buffer, "0x%lx", address);
> + else {
> + if (modname)
> + ret = sprintf(buffer, "%s+%#lx/%#lx [%s]",
> + name, offset, size, modname);
> + else
> + ret = sprintf(buffer, "%s+%#lx/%#lx",
> + name, offset, size);
> + }
>
> - if (modname)
> - return sprintf(buffer, "%s+%#lx/%#lx [%s]", name, offset,
> - size, modname);
> - else
> - return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
> + if (namebuf)
> + kfree(namebuf);
> +
> + return ret;
> }
>
> /* Look up a kernel symbol and print it to the kernel messages. */
> void __print_symbol(const char *fmt, unsigned long address)
> {
> - char buffer[KSYM_SYMBOL_LEN];
> + char *buffer = NULL;
>
> - sprint_symbol(buffer, address);
>
> - printk(fmt, buffer);
> + /* Static buffer allocation.
> + Required in-order to reduce stack footprint on
> + do_IRQ/4KSTACK/i386 */
> + buffer = kmalloc(KSYM_SYMBOL_LEN, GFP_ATOMIC);
> + if (buffer) {
> + sprint_symbol(buffer, address);
> + printk(fmt, buffer);
> + kfree(buffer);
> + } else {
> + /* Address + '0x' + NULL. */
> + char sbuffer[(BITS_PER_LONG / 4) + 3];
> +
> + /* Fall-back mode.
> + Memory allocation failed.
> + Convert the address to string and display it. */
> + sprintf(sbuffer, "0x%lx", address);
> + printk(fmt, sbuffer);
> + }
> }
>
> /* To avoid using get_symbol_offset for every symbol, we carry prefix
> along. */
On Sat, 2007-09-15 at 18:32 +0530, Satyam Sharma wrote:
> Hi,
>
>
> On 9/15/07, Gilboa Davara <[email protected]> wrote:
> > Hello all,
> >
> > In a small exchange in fedora-kernel-list [1] Eric Sandeen has pointed
> > out a possible stack overflow... when CONFIG_DEBUG_STACKOVERFLOW is
> > enabled. (Though not limited to it)
>
> Yeah, I have experienced this phenomenon/problem myself.
>
>
> > Code path is simple: do_IRQ detects a a near stack overflow condition
> > and calls show_trace_log_lvl which, down the line uses __print_symbol
> > and sprint_symbol to print the call stack.
> > However, both __print_symbol + sprint_symbol are eating no-less then
> > 128+223 bytes on static char arrays, which, given the fact that this
> > code path is actually generated by low stack warning (< 512 bytes),
> > might turn a minor (?) problem (low stack) into a full blown crash.
>
> __print_symbol() and sprint_symbol() are called multiple times during
> oopsen / panics. I think those buffers were static char arrays for a good
> reason ...
OK. Point taken.
I pull this patch pending some additional thinking.
> > The patch itself is fairly simple and non-intrusive. [2]
> > Both functions allocate memory for their buffers - falling back to
> > minimal address display if memory allocation fails.
> >
> > P.S. Can anyone please point me to the maintainer of kernel/syms? (I
> > rather not spam world + dog for such a minor patch)
>
> Anything that touches the panic codepath is important, not minor at all.
Bad wording on my part.
By minor I meant, changes a single source file, doesn't change
interfaces, doesn't change code behavior beyond it's local scope.
> > [2]. In theory, there's a second option: pre-allocating memory on a
> > per_cpu basis, however:
> > A. dump_trace/stack are usually called when something bad has happened -
> > reducing the need for performance optimizations.
>
> That's not a performance optimization -- avoiding repeated kmalloc()'s in the
> panic codepath sounds like a *requirement* to me.
ACK.
Though in my defense, solution [2] requires a massive surgery that would
have made this patch far more intrusive.
>
>
> > B. per_cpu allocation will also require local_irq_disable/enable as both
> > functions are being called from multiple contexts. Too much hassle.
>
> I think not bothering about any locking in these codepaths may not be an
> entirely unreasonable thing to do (sorry about the triple negation in the
> sentence). What I mean is that there are places in these codepaths where
> we already don't bother with locking ...
>
> Overall I don't much like introducing kmalloc(GFP_ATOMIC) in these codepaths
> and would ask you guys to consider some other pre-allocation (i.e. static
> allocation not on stack but in .data) alternative instead ...
>
> Satyam
No locking what-so-ever is a bad idea. dump_stack/trace are being called
by non-fatal sources (sleep while atomic; stack-check; debugging) that
may produce problematic results if a static/shared buffer is being used
with no locks.
We can agree that using in-stack char buffer is very problematic -
especially given the fact that 4K is becoming the default build option.
I'll try and create an option 2 (static allocation, minimal locking)
patch and post ASAP.
Hopefully it'll fare better. (While keeping the current interface intact
and reducing the damage/noise)
- Gilboa
Hello all, Satyam,
This is my second stab at solving the "stack over flow due to
dump_strace when close to stack-overflow is detected by do_IRQ" problem.
(Hopefully) this patch is creates less noise then the previous one.
[snip]
> I'll try and create an option 2 (static allocation, minimal locking)
> patch and post ASAP.
> Hopefully it'll fare better. (While keeping the current interface intact
> and reducing the damage/noise)
>
> - Gilboa
- Gilboa
--- linux-2.6/kernel/kallsyms.orig 2007-09-15 11:46:54.000000000 +0300
+++ linux-2.6/kernel/kallsyms.c 2007-09-15 21:06:55.000000000 +0300
@@ -306,13 +306,14 @@ int lookup_symbol_attrs(unsigned long ad
return lookup_module_symbol_attrs(addr, size, offset, modname, name);
}
-/* Look up a kernel symbol and return it in a text buffer. */
-int sprint_symbol(char *buffer, unsigned long address)
+/* Internal version:
+ Look up a kernel symbol and module name and return them to the
+ caller's buffer/namebuf buffers. */
+int __sprint_symbol(char *buffer, char *namebuf, unsigned long address)
{
- char *modname;
- const char *name;
unsigned long offset, size;
- char namebuf[KSYM_NAME_LEN];
+ const char *name;
+ char *modname;
name = kallsyms_lookup(address, &size, &offset, &modname, namebuf);
if (!name)
@@ -325,14 +326,35 @@ int sprint_symbol(char *buffer, unsigned
return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
}
+/* Exported version:
+ Look up a kernel symbol and return it in a text buffer. */
+int sprint_symbol(char *buffer, unsigned long address)
+{
+ char namebuf[KSYM_NAME_LEN];
+
+ return __sprint_symbol(buffer, namebuf, address);
+}
+
+static DEFINE_SPINLOCK(symbol_lock);
+
/* Look up a kernel symbol and print it to the kernel messages. */
void __print_symbol(const char *fmt, unsigned long address)
{
- char buffer[KSYM_SYMBOL_LEN];
+ /* Use static buffers instead of char array to reduce
+ stack footprint in i386/4KSTACKS.
+ Buffers must be protected against re-entry. */
+ static char namebuf[KSYM_NAME_LEN];
+ static char buffer[KSYM_SYMBOL_LEN];
+ unsigned long flags;
+
- sprint_symbol(buffer, address);
+ spin_lock_irqsave(&symbol_lock, flags);
+
+ __sprint_symbol(buffer, namebuf, address);
printk(fmt, buffer);
+
+ spin_unlock_irqrestore(&symbol_lock, flags);
}
/* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */
Hi Gilboa,
On Sat, 15 Sep 2007, Gilboa Davara wrote:
>
> This is my second stab at solving the "stack over flow due to
> dump_strace when close to stack-overflow is detected by do_IRQ" problem.
> (Hopefully) this patch is creates less noise then the previous one.
>
> [snip]
> > I'll try and create an option 2 (static allocation, minimal locking)
> > patch and post ASAP.
> > Hopefully it'll fare better. (While keeping the current interface intact
> > and reducing the damage/noise)
>
> - Gilboa
>
> --- linux-2.6/kernel/kallsyms.orig 2007-09-15 11:46:54.000000000 +0300
> +++ linux-2.6/kernel/kallsyms.c 2007-09-15 21:06:55.000000000 +0300
> @@ -306,13 +306,14 @@ int lookup_symbol_attrs(unsigned long ad
> return lookup_module_symbol_attrs(addr, size, offset, modname, name);
> }
>
> -/* Look up a kernel symbol and return it in a text buffer. */
> -int sprint_symbol(char *buffer, unsigned long address)
> +/* Internal version:
> + Look up a kernel symbol and module name and return them to the
> + caller's buffer/namebuf buffers. */
/*
* ...
* ...
*/
is the general coding style here ...
> +int __sprint_symbol(char *buffer, char *namebuf, unsigned long address)
> {
> - char *modname;
> - const char *name;
> unsigned long offset, size;
> - char namebuf[KSYM_NAME_LEN];
> + const char *name;
> + char *modname;
>
> name = kallsyms_lookup(address, &size, &offset, &modname, namebuf);
> if (!name)
> @@ -325,14 +326,35 @@ int sprint_symbol(char *buffer, unsigned
> return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
> }
>
> +/* Exported version:
> + Look up a kernel symbol and return it in a text buffer. */
ditto.
> +int sprint_symbol(char *buffer, unsigned long address)
> +{
> + char namebuf[KSYM_NAME_LEN];
Hmm, don't we intend to push this array out of the stack too?
+ static char namebuf[KSYM_NAME_LEN];
+ static DEFINE_SPINLOCK(namebuf_lock);
here ?
> +
> + return __sprint_symbol(buffer, namebuf, address);
And you'd need to wrap spin_lock_irqsave()/spin_unlock_irqrestore()
around this call.
> +}
> +static DEFINE_SPINLOCK(symbol_lock);
Try to keep the declarations of a lock, and the data that it protects,
close together. Since this lock is being used to protect "buffer", it
makes sense to ...
> /* Look up a kernel symbol and print it to the kernel messages. */
> void __print_symbol(const char *fmt, unsigned long address)
> {
> - char buffer[KSYM_SYMBOL_LEN];
> + /* Use static buffers instead of char array to reduce
> + stack footprint in i386/4KSTACKS.
> + Buffers must be protected against re-entry. */
> + static char namebuf[KSYM_NAME_LEN];
> + static char buffer[KSYM_SYMBOL_LEN];
... have it:
+ static DEFINE_SPINLOCK(buffer_lock);
here (note the name that exactly describes what the lock protects).
And the namebuf array isn't required here, it's already there in
sprint_symbol(), which you can call from ...
> + unsigned long flags;
> +
>
> - sprint_symbol(buffer, address);
> + spin_lock_irqsave(&symbol_lock, flags);
> +
> + __sprint_symbol(buffer, namebuf, address);
here ... sprint_symbol() ?
> printk(fmt, buffer);
> +
> + spin_unlock_irqrestore(&symbol_lock, flags);
But I still don't much like this :-(
More importantly, if a panic occurs *below* this callchain (and let's
say we ended up in this callchain because somebody put in a dump_stack()
somewhere for debugging purposes), then we'd have a deadlock on our hands,
and nothing gets printed for that panic.
I don't know who maintains this part of kernel code, but you can try
resubmitting (with the changes suggested above) to someone appropriate ...
Satyam
Satyam Sharma wrote:
> On Sat, 15 Sep 2007, Gilboa Davara wrote:
>> printk(fmt, buffer);
>> +
>> + spin_unlock_irqrestore(&symbol_lock, flags);
>
> But I still don't much like this :-(
I must say I agree with Satyam here.
Locking in the panic path might leave us without some critical debug
information, which is much more important than all this.
Maybe it would be better to change the print_symbol interface to avoid
having a "char buffer[KSYM_SYMBOL_LEN];" at all.
Most print_symbol callers use something like "yada yada %s" as the
format string, with an optional "\n" in the end.
if we change the interface from "print_symbol(fmt, addr)" to
"print_symbol(prefix, addr, int newline)" we can simply do:
printk(prefix);
printk_symbol(addr);
if (newline)
printk("\n");
where "printk_symbol" is a new function that does the same as
sprint_symbol, but does "printk" instead of "sprintf".
This should reduce immensely the stack usage of print_symbol without the
need for locking.
Of course this requires changing _all_ callers of print_symbol to use
the new interface, but these are less than 100 ;)
Comments?
--
Paulo Marques - http://www.grupopie.com
"You're just jealous because the voices only talk to me."
Hello Satyam,
On Wed, 2007-09-19 at 06:30 +0530, Satyam Sharma wrote:
> Hi Gilboa,
>
>
> On Sat, 15 Sep 2007, Gilboa Davara wrote:
> >
> > This is my second stab at solving the "stack over flow due to
> > dump_strace when close to stack-overflow is detected by do_IRQ" problem.
> > (Hopefully) this patch is creates less noise then the previous one.
> >
> > [snip]
> > > I'll try and create an option 2 (static allocation, minimal locking)
> > > patch and post ASAP.
> > > Hopefully it'll fare better. (While keeping the current interface intact
> > > and reducing the damage/noise)
> >
> > - Gilboa
> >
> > --- linux-2.6/kernel/kallsyms.orig 2007-09-15 11:46:54.000000000 +0300
> > +++ linux-2.6/kernel/kallsyms.c 2007-09-15 21:06:55.000000000 +0300
> > @@ -306,13 +306,14 @@ int lookup_symbol_attrs(unsigned long ad
> > return lookup_module_symbol_attrs(addr, size, offset, modname, name);
> > }
> >
> > -/* Look up a kernel symbol and return it in a text buffer. */
> > -int sprint_symbol(char *buffer, unsigned long address)
> > +/* Internal version:
> > + Look up a kernel symbol and module name and return them to the
> > + caller's buffer/namebuf buffers. */
>
> /*
> * ...
> * ...
> */
>
> is the general coding style here ...
ACK.
Will be fixed ASAP.
>
> > +int __sprint_symbol(char *buffer, char *namebuf, unsigned long address)
> > {
> > - char *modname;
> > - const char *name;
> > unsigned long offset, size;
> > - char namebuf[KSYM_NAME_LEN];
> > + const char *name;
> > + char *modname;
> >
> > name = kallsyms_lookup(address, &size, &offset, &modname, namebuf);
> > if (!name)
> > @@ -325,14 +326,35 @@ int sprint_symbol(char *buffer, unsigned
> > return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
> > }
> >
> > +/* Exported version:
> > + Look up a kernel symbol and return it in a text buffer. */
>
> ditto.
>
> > +int sprint_symbol(char *buffer, unsigned long address)
> > +{
> > + char namebuf[KSYM_NAME_LEN];
>
> Hmm, don't we intend to push this array out of the stack too?
>
> + static char namebuf[KSYM_NAME_LEN];
> + static DEFINE_SPINLOCK(namebuf_lock);
>
> here ?
I did my best to keep the old interface (sprint_symbol), char arrays and
all, fixing only the critical dump_stack path. (Hence, print_symbol no
longer calls sprint_symbol - it calls the __sprint_symbol helper
function instead.
>
> > +
> > + return __sprint_symbol(buffer, namebuf, address);
>
> And you'd need to wrap spin_lock_irqsave()/spin_unlock_irqrestore()
> around this call.
As I'm keeping the old interface (with it's static char arrays) for
non-dump_stack paths, there's no need to use locks around sprint_symbol.
>
> > +}
>
>
> > +static DEFINE_SPINLOCK(symbol_lock);
>
> Try to keep the declarations of a lock, and the data that it protects,
> close together. Since this lock is being used to protect "buffer", it
> makes sense to ...
ACK.
>
>
> > /* Look up a kernel symbol and print it to the kernel messages. */
> > void __print_symbol(const char *fmt, unsigned long address)
> > {
> > - char buffer[KSYM_SYMBOL_LEN];
> > + /* Use static buffers instead of char array to reduce
> > + stack footprint in i386/4KSTACKS.
> > + Buffers must be protected against re-entry. */
> > + static char namebuf[KSYM_NAME_LEN];
> > + static char buffer[KSYM_SYMBOL_LEN];
>
> ... have it:
>
> + static DEFINE_SPINLOCK(buffer_lock);
ACK.
>
> here (note the name that exactly describes what the lock protects).
>
> And the namebuf array isn't required here, it's already there in
> sprint_symbol(), which you can call from ...
>
> > + unsigned long flags;
> > +
> >
> > - sprint_symbol(buffer, address);
> > + spin_lock_irqsave(&symbol_lock, flags);
> > +
> > + __sprint_symbol(buffer, namebuf, address);
>
> here ... sprint_symbol() ?
>
> > printk(fmt, buffer);
> > +
> > + spin_unlock_irqrestore(&symbol_lock, flags);
>
> But I still don't much like this :-(
Yep. Still needs work.
... I'll get there. Dead or alive. ;)
>
> More importantly, if a panic occurs *below* this callchain (and let's
> say we ended up in this callchain because somebody put in a dump_stack()
> somewhere for debugging purposes), then we'd have a deadlock on our hands,
> and nothing gets printed for that panic.
The -best- solution is to use locking in normal circumstances (E.g.
debug code, sleep while atomic ,etc) -but- drop all locks when BUG/Oops
is being hit.
I believe the next patch will take care of this problem.
>
> I don't know who maintains this part of kernel code, but you can try
> resubmitting (with the changes suggested above) to someone appropriate ...
Any suggestions?
- Gilboa
Hello Paulo,
[snip]
> I must say I agree with Satyam here.
>
> Locking in the panic path might leave us without some critical debug
> information, which is much more important than all this.
>
> Maybe it would be better to change the print_symbol interface to avoid
> having a "char buffer[KSYM_SYMBOL_LEN];" at all.
>
> Most print_symbol callers use something like "yada yada %s" as the
> format string, with an optional "\n" in the end.
>
> if we change the interface from "print_symbol(fmt, addr)" to
> "print_symbol(prefix, addr, int newline)" we can simply do:
>
> printk(prefix);
> printk_symbol(addr);
> if (newline)
> printk("\n");
>
> where "printk_symbol" is a new function that does the same as
> sprint_symbol, but does "printk" instead of "sprintf".
>
> This should reduce immensely the stack usage of print_symbol without the
> need for locking.
I fully agree.
... Further more, multiple printk_symbols should be combined into a
single, multi-line printk transaction. (To prevent debug printk's from
trashing a BUG() dump_stack).
>
> Of course this requires changing _all_ callers of print_symbol to use
> the new interface, but these are less than 100 ;)
This is my first contribution to the Linux kernel. As such I rather
start small, and work my way up slowly. (Read: solve the immediate stack
over-run now, think about changing the symbol_display interface later)
>
> Comments?
I do agree that the current interface needs work.
... But as I said, I rather start slowly and on small scale. (Though I
did find a rather problematic place to start at... ;))
- Gilboa
Gilboa Davara wrote:
> Hello Paulo,
Hi, Gilboa
> [snip]
>> [...]
>> if we change the interface from "print_symbol(fmt, addr)" to
>> "print_symbol(prefix, addr, int newline)" we can simply do:
>>
>> printk(prefix);
>> printk_symbol(addr);
>> if (newline)
>> printk("\n");
>>
>> where "printk_symbol" is a new function that does the same as
>> sprint_symbol, but does "printk" instead of "sprintf".
>>
>> This should reduce immensely the stack usage of print_symbol without the
>> need for locking.
>
> I fully agree.
> ... Further more, multiple printk_symbols should be combined into a
> single, multi-line printk transaction. (To prevent debug printk's from
> trashing a BUG() dump_stack).
Usually the developer can separate the output by hand in the unlikely
case of simultaneous concurrent users of printk, so I don't think this
is really a big problem.
>> Of course this requires changing _all_ callers of print_symbol to use
>> the new interface, but these are less than 100 ;)
>
> This is my first contribution to the Linux kernel. As such I rather
> start small, and work my way up slowly. (Read: solve the immediate stack
> over-run now, think about changing the symbol_display interface later)
Yes, but this is a sensitive area, so you can not just implement
something now that can cause regressions, and just fix it later.
Kernel panics are quite rare and the information provided by the stack
dump is _extremely_ precious.
Even more, risking deadlocks caused by something that should only be
used to produce debug information is even worse.
>> Comments?
>
> I do agree that the current interface needs work.
>
> ... But as I said, I rather start slowly and on small scale. (Though I
> did find a rather problematic place to start at... ;))
Well, if we agree that this is the way to go, then the way to start
slowly would be to submit a patch that makes both interfaces available
for a while and changes the most stack-critical callers to the new
interface.
Then slowly keep submitting patches to change other callers
progressively until there are no more callers of the old interface. At
that point we can just drop it entirely.
--
Paulo Marques - http://www.grupopie.com
"God is love. Love is blind. Ray Charles is blind. Ray Charles is God."
Hello all,
(1) Problem:
I. When CONFIG_4KSTACKS and CONFIG_DEBUG_STACKOVERFLOW are enabled on
i386 kernels, do_IRQ calls dump_stack which, down the path, uses
print_symbol (display) and sprint_symbol (format) to format and display
the function name/address/module.
Both function use stack based char array (~350 bytes) that, given the
initial state (<512 bytes of stack space) may overrun the stack.
II. (Comments - previous patches) Using spinlock protected static
storage within these functions might block or even deadlock dump_stack
(E.g. Crash within dump_stack itself)
(2) Solution:
I. Break sprint_symbol into sprint_symbol (API functions; keeps the
current interface) and sprint_symbol_helper (helper function with
minimal local storage).
II. Replace the char array in __print_symbol with two spinlock protected
static char arrays; call the __sprint_symbol helper function instead of
sprint_symbol.
III. Ignore the spinlock if oops_in_progress is set.
(3) Comments:
I. Currently, if oops_in_progress is set, multiple callers can trash
each other.
II. In order to solve it, print_symbol (and/or dump_stack itself) must
be aware of it's own context. (Read: if it's being called by oops, or by
a "debug/warning" code).
III. Possible solutions:
A. Add oops_in_progress_cpuid to bust_spinlocks (Hack, but generates far
less noise).
B. Add priority tag to all the functions that handle the calls stuck.
(Nicer, requires a massive tree-changing patch.)
Signed-off-by: Gilboa Davara <[email protected]>
Satyam, Paulo,
A. I hope this patch better complies with your comments.
- Gilboa
--- linux-2.6/kernel/kallsyms.orig 2007-09-20 19:17:49.000000000 +0200
+++ linux-2.6/kernel/kallsyms.c 2007-09-21 16:24:03.000000000 +0200
@@ -306,13 +306,17 @@ int lookup_symbol_attrs(unsigned long ad
return lookup_module_symbol_attrs(addr, size, offset, modname, name);
}
-/* Look up a kernel symbol and return it in a text buffer. */
-int sprint_symbol(char *buffer, unsigned long address)
+/*
+ * Helper function:
+ *
+ * Look up a kernel symbol and module name and return them to the
+ * caller's buffer/namebuf buffers.
+*/
+int sprint_symbol_helper(char *buffer, char *namebuf, unsigned long
address)
{
- char *modname;
- const char *name;
unsigned long offset, size;
- char namebuf[KSYM_NAME_LEN];
+ const char *name;
+ char *modname;
name = kallsyms_lookup(address, &size, &offset, &modname, namebuf);
if (!name)
@@ -325,14 +329,52 @@ int sprint_symbol(char *buffer, unsigned
return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
}
+/*
+ * API function:
+ *
+ * Look up a kernel symbol and return it in a text buffer.
+ */
+int sprint_symbol(char *buffer, unsigned long address)
+{
+ char namebuf[KSYM_NAME_LEN];
+
+ return sprint_symbol_helper(buffer, namebuf, address);
+}
+
+
/* Look up a kernel symbol and print it to the kernel messages. */
void __print_symbol(const char *fmt, unsigned long address)
{
- char buffer[KSYM_SYMBOL_LEN];
-
- sprint_symbol(buffer, address);
+ /*
+ * Use static buffers instead of char array to reduce the
+ * stack footprint in i386/4KSTACKS.
+ * Buffers must be protected against re-entry.
+ * ( As long as Oops is not in progress. )
+ */
+ static DEFINE_SPINLOCK(symbol_lock);
+ static char namebuf[KSYM_NAME_LEN];
+ static char buffer[KSYM_SYMBOL_LEN];
+ unsigned long flags;
+ int locked = 0;
+
+
+ /*
+ * Bust all spinlocks if oops is in progress.
+ * FIXME: Might generate broken output if BUG()/oops
+ * generates a callstack while another one uses
+ * dump_stack for debug purposes.
+ */
+ if (!oops_in_progress) {
+ /* Normal lock mode. */
+ spin_lock_irqsave(&symbol_lock, flags);
+ locked = 1;
+ }
+ sprint_symbol_helper(buffer, namebuf, address);
printk(fmt, buffer);
+
+ if (locked)
+ spin_unlock_irqrestore(&symbol_lock, flags);
}
/* To avoid using get_symbol_offset for every symbol, we carry prefix
along. */
On Sat, Sep 15, 2007 at 02:35:29PM +0300, Gilboa Davara wrote:
> - return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
> + if (namebuf)
> + kfree(namebuf);
Note, the if condition is not needed, kfree handles null pointers fine.
-- Steve
On Fri, 2007-09-21 at 10:47 -0400, Steven Rostedt wrote:
> On Sat, Sep 15, 2007 at 02:35:29PM +0300, Gilboa Davara wrote:
> > - return sprintf(buffer, "%s+%#lx/%#lx", name, offset, size);
> > + if (namebuf)
> > + kfree(namebuf);
>
> Note, the if condition is not needed, kfree handles null pointers fine.
>
> -- Steve
Steve,
This patch has been scrapped and by another, (hopefully) less
problematic one [1].
Thanks for the comments,
Gilboa
[1] http://lkml.org/lkml/2007/9/21/180
On Wed, Sep 19, 2007 at 03:25:15PM +0100, Paulo Marques wrote:
>
> if we change the interface from "print_symbol(fmt, addr)" to
> "print_symbol(prefix, addr, int newline)" we can simply do:
>
> printk(prefix);
> printk_symbol(addr);
> if (newline)
> printk("\n");
NACK
I just wrote something that does "print_symbol(" %s)\n", addr);"
Notice the ")" in the output.
-- Steve
>
> where "printk_symbol" is a new function that does the same as
> sprint_symbol, but does "printk" instead of "sprintf".
>
> This should reduce immensely the stack usage of print_symbol without the
> need for locking.
>
> Of course this requires changing _all_ callers of print_symbol to use the
> new interface, but these are less than 100 ;)
>
> Comments?
>
> --
> Paulo Marques - http://www.grupopie.com
>
> "You're just jealous because the voices only talk to me."
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
On Fri, 2007-09-21 at 15:21 +0100, Paulo Marques wrote:
> Gilboa Davara wrote:
> > Hello Paulo,
>
> Hi, Gilboa
>
> Usually the developer can separate the output by hand in the unlikely
> case of simultaneous concurrent users of printk, so I don't think this
> is really a big problem.
In general I agree, but as the number of cores increases dramatically,
hitting a warning/BUG/oops simultaneously on multiple cores (especially
during development) becomes more of a problem.
At least to me, It's not rare that when working on 8C or above machine,
an oops message becomes completely unreadable due to multiple debug
printks that trash the serial port output.
Never the less, I do agree that this problem is more developer related
and should not hit an end user running a debug-free distribution kernel.
>
> >> Of course this requires changing _all_ callers of print_symbol to use
> >> the new interface, but these are less than 100 ;)
> >
> > This is my first contribution to the Linux kernel. As such I rather
> > start small, and work my way up slowly. (Read: solve the immediate stack
> > over-run now, think about changing the symbol_display interface later)
>
> Yes, but this is a sensitive area, so you can not just implement
> something now that can cause regressions, and just fix it later.
True.
My aim is to create a patch that
A. Doesn't break the current interface/APIs while keeping the
patch/noise to a minimum.
B. Does not cause any type of regression.
>
> Kernel panics are quite rare and the information provided by the stack
> dump is _extremely_ precious.
I fully agree.
>
> Even more, risking deadlocks caused by something that should only be
> used to produce debug information is even worse.
Hopefully the latest patch moves one step further in removing your
concerns.
>
> >> Comments?
> >
> > I do agree that the current interface needs work.
> >
> > ... But as I said, I rather start slowly and on small scale. (Though I
> > did find a rather problematic place to start at... ;))
>
> Well, if we agree that this is the way to go, then the way to start
> slowly would be to submit a patch that makes both interfaces available
> for a while and changes the most stack-critical callers to the new
> interface.
>
> Then slowly keep submitting patches to change other callers
> progressively until there are no more callers of the old interface. At
> that point we can just drop it entirely.
If I cannot find a working solution that does meet my initial goals
(Minimal patch/no regression) I'll start from scratch - creating a new
trace call path.
- Gilboa
Steven Rostedt wrote:
> On Wed, Sep 19, 2007 at 03:25:15PM +0100, Paulo Marques wrote:
>> if we change the interface from "print_symbol(fmt, addr)" to
>> "print_symbol(prefix, addr, int newline)" we can simply do:
>>
>> printk(prefix);
>> printk_symbol(addr);
>> if (newline)
>> printk("\n");
>
> NACK
>
> I just wrote something that does "print_symbol(" %s)\n", addr);"
> Notice the ")" in the output.
We can just change that to "print_symbol(prefix, addr, suffix)" instead.
The concept is basically the same (and I wasn't very fond of that
newline argument either).
As an added bonus we stop needing an extra layer to check that the
string passed is in the right format with a single "%s" in it.
--
Paulo Marques - http://www.grupopie.com
"Very funny Scotty. Now beam up my clothes."
Gilboa Davara wrote:
> Hello all,
Hi, Gilboa
> (1) Problem:
> I. When CONFIG_4KSTACKS and CONFIG_DEBUG_STACKOVERFLOW are enabled on
> i386 kernels, do_IRQ calls dump_stack which, down the path, uses
> print_symbol (display) and sprint_symbol (format) to format and display
> the function name/address/module.
> Both function use stack based char array (~350 bytes) that, given the
> initial state (<512 bytes of stack space) may overrun the stack.
> II. (Comments - previous patches) Using spinlock protected static
> storage within these functions might block or even deadlock dump_stack
> (E.g. Crash within dump_stack itself)
>
> (2) Solution:
> I. Break sprint_symbol into sprint_symbol (API functions; keeps the
> current interface) and sprint_symbol_helper (helper function with
> minimal local storage).
> II. Replace the char array in __print_symbol with two spinlock protected
> static char arrays; call the __sprint_symbol helper function instead of
> sprint_symbol.
> III. Ignore the spinlock if oops_in_progress is set.
This is getting more and more convoluted :(
The problem with the spinlock isn't just that during a panic, we can not
trust the kernel structures enough to use spinlocks. It might well
happen that lockdep code might want to use print_symbol (and I think it
does, so this is not just theoretical) to dump the stack when someone
calls spin_lock_irqsave.
But now, because print_symbol itself uses spin_lock_irqsave, we might
get into a recursive situation and a produce a deadlock.
On the other hand, if you take the other approach of reducing the stack
usage by creating a printk_symbol interface, the stack usage would drop
from 350 bytes to 128 bytes and your problem would go away entirely.
--
Paulo Marques - http://www.grupopie.com
"All I ask is a chance to prove that money can't make me happy."
On Fri, 2007-09-21 at 17:02 +0100, Paulo Marques wrote:
> Gilboa Davara wrote:
> > Hello all,
>
> Hi, Gilboa
>
> > (1) Problem:
> > I. When CONFIG_4KSTACKS and CONFIG_DEBUG_STACKOVERFLOW are enabled on
> > i386 kernels, do_IRQ calls dump_stack which, down the path, uses
> > print_symbol (display) and sprint_symbol (format) to format and display
> > the function name/address/module.
> > Both function use stack based char array (~350 bytes) that, given the
> > initial state (<512 bytes of stack space) may overrun the stack.
> > II. (Comments - previous patches) Using spinlock protected static
> > storage within these functions might block or even deadlock dump_stack
> > (E.g. Crash within dump_stack itself)
> >
> > (2) Solution:
> > I. Break sprint_symbol into sprint_symbol (API functions; keeps the
> > current interface) and sprint_symbol_helper (helper function with
> > minimal local storage).
> > II. Replace the char array in __print_symbol with two spinlock protected
> > static char arrays; call the __sprint_symbol helper function instead of
> > sprint_symbol.
> > III. Ignore the spinlock if oops_in_progress is set.
>
> This is getting more and more convoluted :(
>
> The problem with the spinlock isn't just that during a panic, we can not
> trust the kernel structures enough to use spinlocks. It might well
> happen that lockdep code might want to use print_symbol (and I think it
> does, so this is not just theoretical) to dump the stack when someone
> calls spin_lock_irqsave.
> But now, because print_symbol itself uses spin_lock_irqsave, we might
> get into a recursive situation and a produce a deadlock.
>
> On the other hand, if you take the other approach of reducing the stack
> usage by creating a printk_symbol interface, the stack usage would drop
> from 350 bytes to 128 bytes and your problem would go away entirely.
>
OK. I'll wait for additional comments while thinking about a third
route. (Following your suggestion)
- Gilboa