2004-04-20 15:44:23

by Eli Cohen

[permalink] [raw]
Subject: sysrq shows impossible call stack

Hi,
I am exsperiencing a hang, probably caused by a deadlock, so I can do
sysrq commands. However I can see that in some cases the exspected call
stack has some functions in between as if two call stack are
interleaved. For example, if A and B denote two possible paths, I may see:
A0
A1
B0
A2
B1

Does anyone can explain that?
thanks
Eli


2004-04-20 16:28:39

by Roland Dreier

[permalink] [raw]
Subject: Re: sysrq shows impossible call stack

Eli> Hi, I am exsperiencing a hang, probably caused by a deadlock,
Eli> so I can do sysrq commands. However I can see that in some
Eli> cases the exspected call stack has some functions in between
Eli> as if two call stack are interleaved.

Hi Eli,

I think what you are seeing is old values left on the stack. If a
function allocates space for local variables but bumping the stack
pointer, but then never actually uses those variables, then old values
including old return addresses may be left in the stack.

- Roland

2004-04-20 17:34:58

by Eli Cohen

[permalink] [raw]
Subject: Re: sysrq shows impossible call stack

Roland Dreier wrote:

> Eli> Hi, I am exsperiencing a hang, probably caused by a deadlock,
> Eli> so I can do sysrq commands. However I can see that in some
> Eli> cases the exspected call stack has some functions in between
> Eli> as if two call stack are interleaved.
>
>Hi Eli,
>
>I think what you are seeing is old values left on the stack. If a
>function allocates space for local variables but bumping the stack
>pointer, but then never actually uses those variables, then old values
>including old return addresses may be left in the stack.
>
> - Roland
>
>
Thanks Roland

2004-04-20 17:51:11

by Roland Dreier

[permalink] [raw]
Subject: stack dumps, CONFIG_FRAME_POINTER and i386 (was Re: sysrq shows impossible call stack)

Your question prompted me to look at show_trace() in
arch/i386/kernel/traps.c. I see that even in kernels as new as 2.6.5,
there is no attempt to use frame pointers for stack dumps even when
CONFIG_FRAME_POINTER is set. I seem to remember some patches to do
this floating around a while ago. How did that discussion end up?

(Red Hat seems to have something to do this in the source for the
version of 2.4.21 that they ship in RHEL 3, but none of their shipped
kernels turn on CONFIG_FRAME_POINTER).

Thanks,
Roland

2004-04-20 22:01:25

by Adam Litke

[permalink] [raw]
Subject: Re: stack dumps, CONFIG_FRAME_POINTER and i386 (was Re: sysrq shows impossible call stack)

On Tue, 2004-04-20 at 10:51, Roland Dreier wrote:
> Your question prompted me to look at show_trace() in
> arch/i386/kernel/traps.c. I see that even in kernels as new as 2.6.5,
> there is no attempt to use frame pointers for stack dumps even when
> CONFIG_FRAME_POINTER is set. I seem to remember some patches to do
> this floating around a while ago. How did that discussion end up?

This problem was annoying me a few months ago so I coded up a stack
trace patch that actually uses the frame pointer. It is currently
maintained in -mjb but I have pasted below. Hope this helps.

diff -upN reference/arch/i386/kernel/traps.c current/arch/i386/kernel/traps.c
--- reference/arch/i386/kernel/traps.c 2004-04-09 11:53:00.000000000 -0700
+++ current/arch/i386/kernel/traps.c 2004-04-09 11:53:04.000000000 -0700
@@ -124,20 +124,62 @@ void breakpoint(void)
#define CHK_REMOTE_DEBUG(trapnr,signr,error_code,regs,after)
#endif

+#define STACK_PRINT_DEPTH 32

-static int kstack_depth_to_print = 24;
+#ifdef CONFIG_FRAME_POINTER
+#define valid_stack_ptr(task, p) \
+ ((p > (unsigned long)task->thread_info) && \
+ (p < (unsigned long)task->thread_info+THREAD_SIZE))

-void show_trace(struct task_struct *task, unsigned long * stack)
+void show_stack_frame(unsigned long start, unsigned long end)
+{
+ int i;
+
+ printk(" ");
+ for (i = start; i < end; i += 4) {
+ if ((i - start) && ((i - start)%24 == 0))
+ printk("\n ");
+ printk("%08lx ", *(unsigned long *) i);
+ }
+ printk("\n");
+}
+
+void show_trace_fp(struct task_struct *task, unsigned long * stack)
+{
+ unsigned long addr, ebp;
+
+ if (!task)
+ task = current;
+
+ if (task == current) {
+ /* Grab ebp right from our regs */
+ asm ("movl %%ebp, %0" : "=r" (ebp) : );
+ } else {
+ /* ebp is the last reg pushed by switch_to */
+ ebp = *(unsigned long *) task->thread.esp;
+ }
+
+ show_stack_frame((unsigned long) stack, ebp+4);
+ while (valid_stack_ptr(task, ebp)) {
+ addr = *(unsigned long *) (ebp + 4);
+ printk(" [<%08lx>] ", addr);
+ print_symbol("%s\n", addr);
+
+ /* Show the stack frame starting with args */
+ show_stack_frame(ebp + 8, (*(unsigned long *) ebp) + 4);
+ ebp = *(unsigned long *) ebp;
+ }
+}
+
+#else /* !CONFIG_FRAME_POINTER */
+
+void show_trace_guess(unsigned long * stack)
{
unsigned long addr;

if (!stack)
stack = (unsigned long*)&stack;

- printk("Call Trace:");
-#ifdef CONFIG_KALLSYMS
- printk("\n");
-#endif
while (1) {
struct thread_info *context;
context = (struct thread_info*) ((unsigned long)stack & (~(THREAD_SIZE - 1)));
@@ -153,6 +195,20 @@ void show_trace(struct task_struct *task
break;
printk(" =======================\n");
}
+}
+#endif
+
+void show_trace(struct task_struct *task, unsigned long * stack)
+{
+ printk("Call Trace:");
+#ifdef CONFIG_KALLSYMS
+ printk("\n");
+#endif
+#ifdef CONFIG_FRAME_POINTER
+ show_trace_fp(task, stack);
+#else
+ show_trace_guess(stack);
+#endif
printk("\n");
}

@@ -168,8 +224,10 @@ void show_trace_task(struct task_struct

void show_stack(struct task_struct *task, unsigned long *esp)
{
+#ifndef CONFIG_FRAME_POINTER
unsigned long *stack;
int i;
+#endif

if (esp == NULL) {
if (task)
@@ -178,8 +236,9 @@ void show_stack(struct task_struct *task
esp = (unsigned long *)&esp;
}

+#ifndef CONFIG_FRAME_POINTER
stack = esp;
- for(i = 0; i < kstack_depth_to_print; i++) {
+ for(i = 0; i < STACK_PRINT_DEPTH; i++) {
if (kstack_end(stack))
break;
if (i && ((i % 8) == 0))
@@ -187,6 +246,7 @@ void show_stack(struct task_struct *task
printk("%08lx ", *stack++);
}
printk("\n");
+#endif
show_trace(task, esp);
}


--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

2004-04-21 00:13:02

by Roland Dreier

[permalink] [raw]
Subject: Re: stack dumps, CONFIG_FRAME_POINTER and i386 (was Re: sysrq shows impossible call stack)

Adam> This problem was annoying me a few months ago so I coded up
Adam> a stack trace patch that actually uses the frame pointer.
Adam> It is currently maintained in -mjb but I have pasted below.
Adam> Hope this helps.

Thanks, that looks really useful. What is the chance of this moving
from -mjb to mainline?

- R.

2004-04-21 01:41:44

by Andrew Morton

[permalink] [raw]
Subject: Re: stack dumps, CONFIG_FRAME_POINTER and i386 (was Re: sysrq shows impossible call stack)

Andrew Morton <[email protected]> wrote:
>
> Roland Dreier <[email protected]> wrote:
> >
> > Adam> This problem was annoying me a few months ago so I coded up
> > Adam> a stack trace patch that actually uses the frame pointer.
> > Adam> It is currently maintained in -mjb but I have pasted below.
> > Adam> Hope this helps.
> >
> > Thanks, that looks really useful. What is the chance of this moving
> > from -mjb to mainline?
>
> Good, but it needs to be updated to do the right thing with 4k stacks when
> called from interrupt context.

Also, I'd like to be convinced that it does the right thing across assembly
code which doesn't set up a correct frame pointer, such as down().

I assume it will simply skip that frame altogether?

2004-04-21 01:40:23

by Andrew Morton

[permalink] [raw]
Subject: Re: stack dumps, CONFIG_FRAME_POINTER and i386 (was Re: sysrq shows impossible call stack)

Roland Dreier <[email protected]> wrote:
>
> Adam> This problem was annoying me a few months ago so I coded up
> Adam> a stack trace patch that actually uses the frame pointer.
> Adam> It is currently maintained in -mjb but I have pasted below.
> Adam> Hope this helps.
>
> Thanks, that looks really useful. What is the chance of this moving
> from -mjb to mainline?

Good, but it needs to be updated to do the right thing with 4k stacks when
called from interrupt context.

See how the the current version of show_trace() does this.

2004-04-21 23:30:38

by Adam Litke

[permalink] [raw]
Subject: Re: stack dumps, CONFIG_FRAME_POINTER and i386 (was Re: sysrq shows impossible call stack)

On Tue, 2004-04-20 at 18:41, Andrew Morton wrote:
> Andrew Morton <[email protected]> wrote:
> >
> > Roland Dreier <[email protected]> wrote:
> > >
> > > Adam> This problem was annoying me a few months ago so I coded up
> > > Adam> a stack trace patch that actually uses the frame pointer.
> > > Adam> It is currently maintained in -mjb but I have pasted below.
> > > Adam> Hope this helps.
> > >
> > > Thanks, that looks really useful. What is the chance of this moving
> > > from -mjb to mainline?
> >
> > Good, but it needs to be updated to do the right thing with 4k stacks when
> > called from interrupt context.

The show_trace() for the CONFIG_FRAME_POINTER case will now be called
the same way as the existing code. This brings up a question though.
It doesn't appear to me that anyone is actually calling
show_trace_task() yet. Am I missing something or should we change all
the callers of show_trace() to use show_trace_task()?

>
> Also, I'd like to be convinced that it does the right thing across assembly
> code which doesn't set up a correct frame pointer, such as down().
>
> I assume it will simply skip that frame altogether?

I hacked read_profile() a bit so I could catch it inside down(). This
looks good to me. In fact, even the inline asm functions were
recognized.

cat D 00000246 0 215 212 (NOTLB)
Call Trace:
f296feec 00000086 f296ff60 00000246 f296ff10 f681dd60
00000004 f2b59ce8 f2b59d6c f296fed0 c04b4c80 c04b6680
f296fee4 c01d7f2a bffffafc f296ff08 002e58e9 08848a79
00000018 c1a0cbe0 f29a2160 f29a2328 f296ff18
[<c01079b2>] __down+0x76/0xe0
00000400 00000000 0804c038 f29a2160 00000001 f29a2160
c0118c8c f296ff6c f296ff6c f296ff2c
[<c0107b7f>] __down_failed+0xb/0x14
f296ff60 f296ff68 f296ff6c f296ff74
[<c017887e>] .text.lock.proc_misc+0xf/0x21
00000000 f2a851e0 f2a85200 000011b4 00000000 40869be4
15b4a440 40869be4 15b4a440 40869be4 00000004 ffffffff
00000001 00000001 f296ff10 f296ff10 f296ff98
[<c014a62e>] vfs_read+0x9e/0xd0
f2a851e0 0804c038 00000400 f2a85200 f2a851e0 fffffff7
0804c038 f296ffbc
[<c014a810>] sys_read+0x30/0x50
f2a851e0 0804c038 00000400 f2a85200 00000003 00000400
00000000 f296e000
[<c0108aff>] syscall_call+0x7/0xb

The top stack frame is defined to be all values from esp to ebp.
Subsequent frames are defined as ebp to *ebp (if ebp is a valid stack
address). Therefore in the worst case, a stack frame may include data
from two "logical functions" if a new frame was not defined (but we will
not lose any data). GCC seems to be generating frame-pointer enabled
code even for inline asm calls so we should only see this "worst-case"
behavior around actual assembly functions.

Here is the updated patch (2.6.5)...

diff -purN linux-2.6.5/arch/i386/kernel/traps.c linux-2.6.5+stack/arch/i386/kernel/traps.c
--- linux-2.6.5/arch/i386/kernel/traps.c Mon Apr 5 16:26:55 2004
+++ linux-2.6.5+stack/arch/i386/kernel/traps.c Wed Apr 21 15:24:04 2004
@@ -92,6 +92,58 @@ asmlinkage void alignment_check(void);
asmlinkage void spurious_interrupt_bug(void);
asmlinkage void machine_check(void);

+#ifdef CONFIG_FRAME_POINTER
+#define valid_stack_ptr(task, p) \
+ ((p > (unsigned long)task->thread_info) && \
+ (p < (unsigned long)task->thread_info+THREAD_SIZE))
+
+void show_stack_frame(unsigned long start, unsigned long end)
+{
+ unsigned long i;
+
+ printk(" ");
+ for (i = start; i < end; i += 4) {
+ if ((i - start) && ((i - start)%24 == 0))
+ printk("\n ");
+ printk("%08lx ", *(unsigned long *) i);
+ }
+ printk("\n");
+}
+
+void show_trace(struct task_struct *task, unsigned long * stack)
+{
+ unsigned long addr, ebp;
+
+ if (!task)
+ task = current;
+
+ if (!valid_stack_ptr(task, (unsigned long)stack)) {
+ printk("Stack pointer is garbage, not printing trace\n");
+ return;
+ }
+
+ if (task == current) {
+ /* Grab ebp right from our regs */
+ asm ("movl %%ebp, %0" : "=r" (ebp) : );
+ } else {
+ /* ebp is the last reg pushed by switch_to */
+ ebp = *(unsigned long *) task->thread.esp;
+ }
+
+ printk("Call Trace:\n");
+ show_stack_frame((unsigned long) stack, ebp+4);
+ while (valid_stack_ptr(task, ebp)) {
+ addr = *(unsigned long *) (ebp + 4);
+ printk(" [<%08lx>] ", addr);
+ print_symbol("%s", addr);
+ printk("\n");
+
+ /* Show the stack frame starting with args */
+ show_stack_frame(ebp + 8, (*(unsigned long *) ebp) + 4);
+ ebp = *(unsigned long *) ebp;
+ }
+}
+#else
static int kstack_depth_to_print = 24;

void show_trace(struct task_struct *task, unsigned long * stack)
@@ -101,10 +153,7 @@ void show_trace(struct task_struct *task
if (!stack)
stack = (unsigned long*)&stack;

- printk("Call Trace:");
-#ifdef CONFIG_KALLSYMS
- printk("\n");
-#endif
+ printk("Call Trace:\n");
while (!kstack_end(stack)) {
addr = *stack++;
if (kernel_text_address(addr)) {
@@ -114,6 +163,7 @@ void show_trace(struct task_struct *task
}
printk("\n");
}
+#endif

void show_trace_task(struct task_struct *tsk)
{
@@ -127,9 +177,6 @@ void show_trace_task(struct task_struct

void show_stack(struct task_struct *task, unsigned long *esp)
{
- unsigned long *stack;
- int i;
-
if (esp == NULL) {
if (task)
esp = (unsigned long*)task->thread.esp;
@@ -137,6 +184,10 @@ void show_stack(struct task_struct *task
esp = (unsigned long *)&esp;
}

+#ifndef CONFIG_FRAME_POINTER
+ {
+ unsigned long *stack;
+ int i;
stack = esp;
for(i = 0; i < kstack_depth_to_print; i++) {
if (kstack_end(stack))
@@ -146,6 +197,8 @@ void show_stack(struct task_struct *task
printk("%08lx ", *stack++);
}
printk("\n");
+ }
+#endif
show_trace(task, esp);
}

--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

2004-04-21 23:49:17

by Andrew Morton

[permalink] [raw]
Subject: Re: stack dumps, CONFIG_FRAME_POINTER and i386 (was Re: sysrq shows impossible call stack)

Adam Litke <[email protected]> wrote:
>
> On Tue, 2004-04-20 at 18:41, Andrew Morton wrote:
> > Andrew Morton <[email protected]> wrote:
> > >
> > > Roland Dreier <[email protected]> wrote:
> > > >
> > > > Adam> This problem was annoying me a few months ago so I coded up
> > > > Adam> a stack trace patch that actually uses the frame pointer.
> > > > Adam> It is currently maintained in -mjb but I have pasted below.
> > > > Adam> Hope this helps.
> > > >
> > > > Thanks, that looks really useful. What is the chance of this moving
> > > > from -mjb to mainline?
> > >
> > > Good, but it needs to be updated to do the right thing with 4k stacks when
> > > called from interrupt context.
>
> The show_trace() for the CONFIG_FRAME_POINTER case will now be called
> the same way as the existing code.

I still don't see any code in there to handle the transition from the
interrupt stack page to the non-interrupt stack page in the 4k-stacks case?

> This brings up a question though.
> It doesn't appear to me that anyone is actually calling
> show_trace_task() yet. Am I missing something or should we change all
> the callers of show_trace() to use show_trace_task()?

You're right - we've killed off all of its callers. Neat. I shall
administer the coup de grace.

2004-04-22 18:27:50

by Adam Litke

[permalink] [raw]
Subject: Re: stack dumps, CONFIG_FRAME_POINTER and i386 (was Re: sysrq shows impossible call stack)

On Wed, 2004-04-21 at 16:50, Andrew Morton wrote:
> Adam Litke <[email protected]> wrote:
> >
> > On Tue, 2004-04-20 at 18:41, Andrew Morton wrote:
> > > Andrew Morton <[email protected]> wrote:
> > > >
> > > > Roland Dreier <[email protected]> wrote:
> > > > >
> > > > > Adam> This problem was annoying me a few months ago so I coded up
> > > > > Adam> a stack trace patch that actually uses the frame pointer.
> > > > > Adam> It is currently maintained in -mjb but I have pasted below.
> > > > > Adam> Hope this helps.
> > > > >
> > > > > Thanks, that looks really useful. What is the chance of this moving
> > > > > from -mjb to mainline?
> > > >
> > > > Good, but it needs to be updated to do the right thing with 4k stacks when
> > > > called from interrupt context.
> >
> > The show_trace() for the CONFIG_FRAME_POINTER case will now be called
> > the same way as the existing code.
>
> I still don't see any code in there to handle the transition from the
> interrupt stack page to the non-interrupt stack page in the 4k-stacks case?

Ok here is the latest version of the patch. I added support for the
4k-stacks interrupt stack case. I also moved some things around to
minimize code duplication. Andrew, how does this look?

diff -purN linux-2.6.5-bk/arch/i386/kernel/traps.c linux-2.6.5-bk-stack/arch/i386/kernel/traps.c
--- linux-2.6.5-bk/arch/i386/kernel/traps.c Thu Apr 22 09:10:57 2004
+++ linux-2.6.5-bk-stack/arch/i386/kernel/traps.c Thu Apr 22 10:57:39 2004
@@ -92,29 +92,84 @@ asmlinkage void alignment_check(void);
asmlinkage void spurious_interrupt_bug(void);
asmlinkage void machine_check(void);

-static int kstack_depth_to_print = 24;
+#define valid_stack_ptr(task, p) \
+ ((struct thread_info*)p > task->thread_info) && \
+ !kstack_end((unsigned long*)p)

-void show_trace(struct task_struct *task, unsigned long * stack)
+#ifdef CONFIG_FRAME_POINTER
+void show_stack_frame(unsigned long start, unsigned long end)
+{
+ unsigned long i;
+
+ printk(" ");
+ for (i = start; i < end; i += 4) {
+ if ((i - start) && ((i - start)%24 == 0))
+ printk("\n ");
+ printk("%08lx ", *(unsigned long *) i);
+ }
+ printk("\n");
+}
+
+void print_context_stack(struct task_struct *task, unsigned long * stack,
+ unsigned long ebp)
{
unsigned long addr;

- if (!stack)
- stack = (unsigned long*)&stack;
+ show_stack_frame((unsigned long) stack, ebp + 4);
+ while (valid_stack_ptr(task, ebp)) {
+ addr = *(unsigned long *) (ebp + 4);
+ printk(" [<%08lx>] ", addr);
+ print_symbol("%s", addr);
+ printk("\n");
+
+ /* Show the stack frame (excluding the frame pointer) */
+ show_stack_frame(ebp + 8, (*(unsigned long *) ebp) + 4);
+ ebp = *(unsigned long *) ebp;
+ }
+}
+#else
+int kstack_depth_to_print = 24;
+
+void print_context_stack(struct task_struct *task, unsigned long * stack,
+ unsigned long ebp)
+{
+ unsigned long addr;

- printk("Call Trace:");
-#ifdef CONFIG_KALLSYMS
- printk("\n");
+ while (!kstack_end(stack)) {
+ addr = *stack++;
+ if (kernel_text_address(addr)) {
+ printk(" [<%08lx>] ", addr);
+ print_symbol("%s\n", addr);
+ }
+ }
+}
#endif
+
+void show_trace(struct task_struct *task, unsigned long * stack)
+{
+ unsigned long ebp;
+
+ if (!task)
+ task = current;
+
+ if (!valid_stack_ptr(task, stack)) {
+ printk("Stack pointer is garbage, not printing trace\n");
+ return;
+ }
+
+ if (task == current) {
+ /* Grab ebp right from our regs */
+ asm ("movl %%ebp, %0" : "=r" (ebp) : );
+ } else {
+ /* ebp is the last reg pushed by switch_to */
+ ebp = *(unsigned long *) task->thread.esp;
+ }
+
while (1) {
struct thread_info *context;
- context = (struct thread_info*) ((unsigned long)stack & (~(THREAD_SIZE - 1)));
- while (!kstack_end(stack)) {
- addr = *stack++;
- if (kernel_text_address(addr)) {
- printk(" [<%08lx>] ", addr);
- print_symbol("%s\n", addr);
- }
- }
+ context = (struct thread_info*)
+ ((unsigned long)stack & (~(THREAD_SIZE - 1)));
+ print_context_stack(task, stack, ebp);
stack = (unsigned long*)context->previous_esp;
if (!stack)
break;
@@ -135,9 +190,6 @@ void show_trace_task(struct task_struct

void show_stack(struct task_struct *task, unsigned long *esp)
{
- unsigned long *stack;
- int i;
-
if (esp == NULL) {
if (task)
esp = (unsigned long*)task->thread.esp;
@@ -145,6 +197,10 @@ void show_stack(struct task_struct *task
esp = (unsigned long *)&esp;
}

+#ifndef CONFIG_FRAME_POINTER
+ {
+ unsigned long *stack;
+ int i;
stack = esp;
for(i = 0; i < kstack_depth_to_print; i++) {
if (kstack_end(stack))
@@ -154,6 +210,8 @@ void show_stack(struct task_struct *task
printk("%08lx ", *stack++);
}
printk("\n");
+ }
+#endif
show_trace(task, esp);
}


--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center