Add a FRAME_POINTER option and when it is enabled, use frame pointers
to walk the stack during a backtrace dump. This eliminates printout of
confusing "function calls" corresponding to stack values that look like they
might be return addresses, but aren't.
This patch is dependent upon
[PATCH] microblaze: Begin stack dump with caller of dump_stack()
I'm not certain whether the MMU compiler generates frame pointers the same
way as the noMMU compiler I am using. I'm also not sure what all the
ramifications of providing FRAME_POINTER are. It looks like tracing
functionality makes use of it. Need someone familiar with these areas
to comment on the patch.
Signed-off-by: Steven J. Magnani <[email protected]>
---
diff -uprN a/arch/microblaze/Kconfig.debug b/arch/microblaze/Kconfig.debug
--- a/arch/microblaze/Kconfig.debug 2010-02-25 13:52:30.000000000 -0600
+++ b/arch/microblaze/Kconfig.debug 2010-02-25 13:52:49.000000000 -0600
@@ -26,4 +26,11 @@ config DEBUG_BOOTMEM
depends on DEBUG_KERNEL
bool "Debug BOOTMEM initialization"
+config FRAME_POINTER
+ bool "Use frame pointers"
+ default n
+ help
+ If you say N here, the resulting kernel will be slightly smaller and
+ faster. However, stack dumps will be much harder to interpret.
+
endmenu
diff -uprN a/arch/microblaze/kernel/traps.c b/arch/microblaze/kernel/traps.c
--- a/arch/microblaze/kernel/traps.c 2010-02-25 13:50:00.000000000 -0600
+++ b/arch/microblaze/kernel/traps.c 2010-02-25 13:51:11.000000000 -0600
@@ -8,6 +8,7 @@
* for more details.
*/
+#include <generated/autoconf.h>
#include <linux/kernel.h>
#include <linux/kallsyms.h>
#include <linux/module.h>
@@ -44,7 +45,7 @@ void show_trace(struct task_struct *task
printk(KERN_NOTICE "\n");
#endif
while (!kstack_end(stack)) {
- addr = *stack++;
+ addr = *stack;
/*
* If the address is either in the text segment of the
* kernel, or in the region which contains vmalloc'ed
@@ -55,6 +56,13 @@ void show_trace(struct task_struct *task
*/
if (kernel_text_address(addr))
print_ip_sym(addr);
+
+#if defined(CONFIG_FRAME_POINTER)
+ /* Fetch the caller's frame pointer */
+ stack = (unsigned long *) stack[7];
+#else
+ stack++;
+#endif
}
printk(KERN_NOTICE "\n");
Steven J. Magnani wrote:
> Add a FRAME_POINTER option and when it is enabled, use frame pointers
> to walk the stack during a backtrace dump. This eliminates printout of
> confusing "function calls" corresponding to stack values that look like they
> might be return addresses, but aren't.
>
> This patch is dependent upon
> [PATCH] microblaze: Begin stack dump with caller of dump_stack()
yes.
>
> I'm not certain whether the MMU compiler generates frame pointers the same
> way as the noMMU compiler I am using. I'm also not sure what all the
> ramifications of providing FRAME_POINTER are. It looks like tracing
> functionality makes use of it. Need someone familiar with these areas
> to comment on the patch.
Firstly I was surprise that you create any frame pointer solution but
1. It is not frame pointer because Microblaze not use it
2. it is just one optimization which could help but IMHO not. Your patch
expects that every stack frame size has 7/8 (doesn't matter right now)
items but that's not correct expectation. (Do objdump from vmlinux and
look at cpu_idle, prom_add_property and others) - that's why I think
that your patch won't work.
3. The next question is, if we can expect that every function record has
at least 7/8 items. If yes than look at my function below.
4. One more thing is that function still use kernel_text_address() which
is silly because we are still not sure if the address there is correct
or not. It is just checking and if we are using, it is just mean that
there is any expectation which is not correct.
I did some testing several months/weeks ago and I tried to solve it in
the same way as you. But found that it is based on any expectation which
is not correct.
I know that stack tracing is pain but I am not convinced that this is
way how to solve it. :-(
Michal
>
> Signed-off-by: Steven J. Magnani <[email protected]>
> ---
> diff -uprN a/arch/microblaze/Kconfig.debug b/arch/microblaze/Kconfig.debug
> --- a/arch/microblaze/Kconfig.debug 2010-02-25 13:52:30.000000000 -0600
> +++ b/arch/microblaze/Kconfig.debug 2010-02-25 13:52:49.000000000 -0600
> @@ -26,4 +26,11 @@ config DEBUG_BOOTMEM
> depends on DEBUG_KERNEL
> bool "Debug BOOTMEM initialization"
>
> +config FRAME_POINTER
> + bool "Use frame pointers"
> + default n
> + help
> + If you say N here, the resulting kernel will be slightly smaller and
> + faster. However, stack dumps will be much harder to interpret.
> +
depends on !MMU
> endmenu
> diff -uprN a/arch/microblaze/kernel/traps.c b/arch/microblaze/kernel/traps.c
> --- a/arch/microblaze/kernel/traps.c 2010-02-25 13:50:00.000000000 -0600
> +++ b/arch/microblaze/kernel/traps.c 2010-02-25 13:51:11.000000000 -0600
> @@ -8,6 +8,7 @@
> * for more details.
> */
>
> +#include <generated/autoconf.h>
why? I don't think that this is necessary.
> #include <linux/kernel.h>
> #include <linux/kallsyms.h>
> #include <linux/module.h>
> @@ -44,7 +45,7 @@ void show_trace(struct task_struct *task
> printk(KERN_NOTICE "\n");
> #endif
> while (!kstack_end(stack)) {
> - addr = *stack++;
> + addr = *stack;
> /*
> * If the address is either in the text segment of the
> * kernel, or in the region which contains vmalloc'ed
> @@ -55,6 +56,13 @@ void show_trace(struct task_struct *task
> */
> if (kernel_text_address(addr))
> print_ip_sym(addr);
> +
> +#if defined(CONFIG_FRAME_POINTER)
> + /* Fetch the caller's frame pointer */
> + stack = (unsigned long *) stack[7];
If is calculation correct then some comments, why you use number 7, will
be necessary.
> +#else
> + stack++;
> +#endif
> }
> printk(KERN_NOTICE "\n");
>
>
Look at this code which should be better than yours.
if (kernel_text_address(addr)) {
print_ip_sym(addr);
/* Fetch the caller's frame pointer */
stack = (unsigned long *) stack[7];
}
stack++;
--
Michal Simek, Ing. (M.Eng)
w: http://www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
Michal,
On Fri, 2010-02-26 at 09:06 +0100, Michal Simek wrote:
> Firstly I was surprise that you create any frame pointer solution but
> 1. It is not frame pointer because Microblaze not use it
Can you explain this in different words? I'm not following you.
The code compiles differently in these two cases:
ifdef CONFIG_FRAME_POINTER
KBUILD_CFLAGS += -fno-omit-frame-pointer -fno-optimize-sibling-calls
else
KBUILD_CFLAGS += -fomit-frame-pointer
endif
Empirically speaking, the former causes the compiler to use r19 to hold
the frame pointer. The latter allows the compiler to use r19 for other
purposes.
> 2. it is just one optimization which could help but IMHO not. Your patch
> expects that every stack frame size has 7/8 (doesn't matter right now)
> items but that's not correct expectation. (Do objdump from vmlinux and
> look at cpu_idle, prom_add_property and others) - that's why I think
> that your patch won't work.
The patch expects only that frames involved in a backtrace are _at
least_ 8 words deep and that the frame pointer is always the 8th word of
the frame (index 7).
In my build, cpu_idle() begins like this:
4b8: 3021ffd8 addik r1, r1, -40
4bc: fa61001c swi r19, r1, 28
4c0: f9e10000 swi r15, r1, 0
...which is a frame of 40 bytes, and a frame pointer stored 7 words into
the frame. prom_add_property() has a frame of 48 bytes and a frame
pointer stored 7 words in.
Now, disable_hlt() has a runt frame of only 8 bytes when compiled with
-fno-omit-frame-pointer. But it is a leaf function and should never show
up in a backtrace, at least in a noMMU kernel. For MMU I suppose it's
possible for a leaf function to oops. I don't know the implications of
that.
Although the examples you cite don't prove your point, in looking more
closely, I see that there _are_ non-leaf functions where the frame
pointer is being placed elsewhere, for example do_one_initcall():
20000064: 3021ffc0 addik r1, r1, -64
20000068: fa61002c swi r19, r1, 44
This of course is a killer. I wonder if this is something that could be
changed in the Microblaze gcc someday.
> 3. The next question is, if we can expect that every function record has
> at least 7/8 items. If yes than look at my function below.
> 4. One more thing is that function still use kernel_text_address() which
> is silly because we are still not sure if the address there is correct
> or not. It is just checking and if we are using, it is just mean that
> there is any expectation which is not correct.
It may be that the function can be improved further; that's beyond the
scope of what I was trying to accomplish.
> > ---
> > diff -uprN a/arch/microblaze/Kconfig.debug b/arch/microblaze/Kconfig.debug
> > --- a/arch/microblaze/Kconfig.debug 2010-02-25 13:52:30.000000000 -0600
> > +++ b/arch/microblaze/Kconfig.debug 2010-02-25 13:52:49.000000000 -0600
> > @@ -26,4 +26,11 @@ config DEBUG_BOOTMEM
> > depends on DEBUG_KERNEL
> > bool "Debug BOOTMEM initialization"
> >
> > +config FRAME_POINTER
> > + bool "Use frame pointers"
> > + default n
> > + help
> > + If you say N here, the resulting kernel will be slightly smaller and
> > + faster. However, stack dumps will be much harder to interpret.
> > +
>
> depends on !MMU
>
> > endmenu
> > diff -uprN a/arch/microblaze/kernel/traps.c b/arch/microblaze/kernel/traps.c
> > --- a/arch/microblaze/kernel/traps.c 2010-02-25 13:50:00.000000000 -0600
> > +++ b/arch/microblaze/kernel/traps.c 2010-02-25 13:51:11.000000000 -0600
> > @@ -8,6 +8,7 @@
> > * for more details.
> > */
> >
> > +#include <generated/autoconf.h>
>
> why? I don't think that this is necessary.
Otherwise CONFIG_FRAME_POINTER is always undefined.
>
> > #include <linux/kernel.h>
> > #include <linux/kallsyms.h>
> > #include <linux/module.h>
> > @@ -44,7 +45,7 @@ void show_trace(struct task_struct *task
> > printk(KERN_NOTICE "\n");
> > #endif
> > while (!kstack_end(stack)) {
> > - addr = *stack++;
> > + addr = *stack;
> > /*
> > * If the address is either in the text segment of the
> > * kernel, or in the region which contains vmalloc'ed
> > @@ -55,6 +56,13 @@ void show_trace(struct task_struct *task
> > */
> > if (kernel_text_address(addr))
> > print_ip_sym(addr);
> > +
> > +#if defined(CONFIG_FRAME_POINTER)
> > + /* Fetch the caller's frame pointer */
> > + stack = (unsigned long *) stack[7];
>
> If is calculation correct then some comments, why you use number 7, will
> be necessary.
>
> > +#else
> > + stack++;
> > +#endif
> > }
> > printk(KERN_NOTICE "\n");
> >
> >
>
> Look at this code which should be better than yours.
>
> if (kernel_text_address(addr)) {
> print_ip_sym(addr);
> /* Fetch the caller's frame pointer */
> stack = (unsigned long *) stack[7];
> }
There would need to be an 'else' here. Also, once the code goes off the
frame pointer rail, it should stay off.
> stack++;
>
Regards,
------------------------------------------------------------------------
Steven J. Magnani "I claim this network for MARS!
http://www.digidescorp.com Earthling, return my space modulator!"
#include <standard.disclaimer>
On Fri, Feb 26, 2010 at 10:49:58AM -0600, Steven J. Magnani wrote:
> On Fri, 2010-02-26 at 09:06 +0100, Michal Simek wrote:
> > 2. it is just one optimization which could help but IMHO not. Your patch
> > expects that every stack frame size has 7/8 (doesn't matter right now)
> > items but that's not correct expectation. (Do objdump from vmlinux and
> > look at cpu_idle, prom_add_property and others) - that's why I think
> > that your patch won't work.
>
> The patch expects only that frames involved in a backtrace are _at
> least_ 8 words deep and that the frame pointer is always the 8th word of
> the frame (index 7).
>
> In my build, cpu_idle() begins like this:
>
> 4b8: 3021ffd8 addik r1, r1, -40
> 4bc: fa61001c swi r19, r1, 28
> 4c0: f9e10000 swi r15, r1, 0
>
> ...which is a frame of 40 bytes, and a frame pointer stored 7 words into
> the frame. prom_add_property() has a frame of 48 bytes and a frame
> pointer stored 7 words in.
>
> Now, disable_hlt() has a runt frame of only 8 bytes when compiled with
> -fno-omit-frame-pointer. But it is a leaf function and should never show
> up in a backtrace, at least in a noMMU kernel. For MMU I suppose it's
> possible for a leaf function to oops. I don't know the implications of
> that.
>
> Although the examples you cite don't prove your point, in looking more
> closely, I see that there _are_ non-leaf functions where the frame
> pointer is being placed elsewhere, for example do_one_initcall():
>
> 20000064: 3021ffc0 addik r1, r1, -64
> 20000068: fa61002c swi r19, r1, 44
>
> This of course is a killer. I wonder if this is something that could be
> changed in the Microblaze gcc someday.
>
This doesn't look that bad compared to what some of the other
architectures have to deal with. If the frame pointer is always setup
using these addik/swi pairs then you can trivially scan an arbitrary
number of instructions attempting to match before giving up. We do
similar things for sh64 where we also have to figure out how stack frames
were created in order to roll them back.
In any event, take a look at arch/sh/kernel/cpu/sh5/unwind.c, you should
be able to use a similar scheme without the need for undue complexity.