2004-10-03 15:54:20

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH] alternate stack dump fix.

Hello.

Kirill Korotaev wrote:
>
> This patch fixes incorrect check for stack ptr in
> show_trace()->valid_stack_ptr(). When called from hardirq/softirq
> show_trace() prints "Stack pointer is garbage, not printing trace"
> message instead of call traces.

There is another problem in show_trace(). With CONFIG_FRAME_POINTER
every call to print_context_stack() will now print entire call chain,
switching the stacks transparently, beacause valid_stack_ptr()
now accepts ebp in irq stack.

Then show trace switch the stack, and calls print_context_stack()
again with the same value in ebp, and we have the same dump
after printk(" =======================\n").

What do you think about the following patch?

Against 2.6.9-rc3.

Oleg.

Signed-off-by: Oleg Nesterov <[email protected]>

--- 2.6.9-rc3/arch/i386/kernel/traps.c~ Sun Oct 3 18:45:52 2004
+++ 2.6.9-rc3/arch/i386/kernel/traps.c Sun Oct 3 19:54:07 2004
@@ -107,36 +107,27 @@ int register_die_notifier(struct notifie
return err;
}

-static int valid_stack_ptr(struct task_struct *task, void *p)
+static int valid_stack_ptr(struct thread_info *tinfo, void *p)
{
- if (p <= (void *)task->thread_info)
- return 0;
- if (kstack_end(p))
- return 0;
- return 1;
+ return p > (void *)tinfo &&
+ p < (void *)tinfo + THREAD_SIZE - 3;
}

-#ifdef CONFIG_FRAME_POINTER
-static void print_context_stack(struct task_struct *task, unsigned long *stack,
+static unsigned long print_context_stack(struct thread_info *tinfo, unsigned long *stack,
unsigned long ebp)
{
unsigned long addr;

- while (valid_stack_ptr(task, (void *)ebp)) {
+#ifdef CONFIG_FRAME_POINTER
+ while (valid_stack_ptr(tinfo, (void *)ebp)) {
addr = *(unsigned long *)(ebp + 4);
printk(" [<%08lx>] ", addr);
print_symbol("%s", addr);
printk("\n");
ebp = *(unsigned long *)ebp;
}
-}
#else
-static void print_context_stack(struct task_struct *task, unsigned long *stack,
- unsigned long ebp)
-{
- unsigned long addr;
-
- while (!kstack_end(stack)) {
+ while (valid_stack_ptr(tinfo, stack)) {
addr = *stack++;
if (__kernel_text_address(addr)) {
printk(" [<%08lx>]", addr);
@@ -144,9 +135,11 @@ static void print_context_stack(struct t
printk("\n");
}
}
-}
#endif

+ return ebp;
+}
+
void show_trace(struct task_struct *task, unsigned long * stack)
{
unsigned long ebp;
@@ -154,11 +147,6 @@ void show_trace(struct task_struct *task
if (!task)
task = current;

- if (!valid_stack_ptr(task, stack)) {
- printk("Stack pointer is garbage, not printing trace\n");
- return;
- }
-
if (task == current) {
/* Grab ebp right from our regs */
asm ("movl %%ebp, %0" : "=r" (ebp) : );
@@ -171,7 +159,7 @@ void show_trace(struct task_struct *task
struct thread_info *context;
context = (struct thread_info *)
((unsigned long)stack & (~(THREAD_SIZE - 1)));
- print_context_stack(task, stack, ebp);
+ ebp = print_context_stack(context, stack, ebp);
stack = (unsigned long*)context->previous_esp;
if (!stack)
break;


2004-10-03 17:08:25

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] alternate stack dump fix.

Oleg Nesterov <[email protected]> wrote:
>
> There is another problem in show_trace(). With CONFIG_FRAME_POINTER
> every call to print_context_stack() will now print entire call chain,
> switching the stacks transparently, beacause valid_stack_ptr()
> now accepts ebp in irq stack.
>
> Then show trace switch the stack, and calls print_context_stack()
> again with the same value in ebp, and we have the same dump
> after printk(" =======================\n").
>
> What do you think about the following patch?
>
> Against 2.6.9-rc3.

But it conflicts in a big way with Kirill's patch. Could you redo it
against 2.6.9-rc3-mm1, or against just

fix-of-stack-dump-in-soft-hardirqs.patch
fix-of-stack-dump-in-soft-hardirqs-cleanup.patch
fix-of-stack-dump-in-soft-hardirqs-build-fix.patch

from ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc3/2.6.9-rc3-mm1/broken-out?

Thanks.

2004-10-04 08:16:03

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] alternate stack dump fix.

Andrew Morton wrote:
>
> But it conflicts in a big way with Kirill's patch. Could you redo it
> against 2.6.9-rc3-mm1, or against just
>
> fix-of-stack-dump-in-soft-hardirqs.patch
> fix-of-stack-dump-in-soft-hardirqs-cleanup.patch
> fix-of-stack-dump-in-soft-hardirqs-build-fix.patch
>

Rediffed against rc3-mm1.

For your convenience, i will post the same patch against mm tree with those
3 patches reverted in a separate message.

Oleg.

Signed-off-by: Oleg Nesterov <[email protected]>

diff -rup rc3-mm1-clean/arch/i386/kernel/irq.c rc3-mm1-trace/arch/i386/kernel/irq.c
--- rc3-mm1-clean/arch/i386/kernel/irq.c 2004-10-04 10:26:26.000000000 +0400
+++ rc3-mm1-trace/arch/i386/kernel/irq.c 2004-10-04 10:38:06.000000000 +0400
@@ -10,14 +10,12 @@
* io_apic.c.)
*/

+#include <asm/uaccess.h>
#include <linux/module.h>
#include <linux/seq_file.h>
#include <linux/interrupt.h>
#include <linux/kernel_stat.h>

-#include <asm/uaccess.h>
-#include <asm/hardirq.h>
-
#ifdef CONFIG_4KSTACKS
/*
* per-CPU IRQ handling contexts (thread information and stack)
@@ -178,23 +176,8 @@ asmlinkage void do_softirq(void)

local_irq_restore(flags);
}
-EXPORT_SYMBOL(do_softirq);
-
-int is_irq_stack_ptr(struct task_struct *task, void *p)
-{
- unsigned long off;
-
- off = task->thread_info->cpu * THREAD_SIZE;
- if (p >= (void *)hardirq_stack + off &&
- p < (void *)hardirq_stack + off + THREAD_SIZE)
- return 1;
- if (p >= (void *)softirq_stack + off &&
- p < (void *)softirq_stack + off + THREAD_SIZE)
- return 1;
-
- return 0;
-}

+EXPORT_SYMBOL(do_softirq);
#endif

/*
diff -rup rc3-mm1-clean/arch/i386/kernel/traps.c rc3-mm1-trace/arch/i386/kernel/traps.c
--- rc3-mm1-clean/arch/i386/kernel/traps.c 2004-10-04 10:26:26.000000000 +0400
+++ rc3-mm1-trace/arch/i386/kernel/traps.c 2004-10-04 10:55:03.000000000 +0400
@@ -50,7 +50,6 @@
#include <asm/smp.h>
#include <asm/arch_hooks.h>
#include <asm/kdebug.h>
-#include <asm/hardirq.h>

#include <linux/irq.h>
#include <linux/module.h>
@@ -106,18 +105,6 @@ int register_die_notifier(struct notifie
return err;
}

-static int valid_stack_ptr(struct task_struct *task, void *p)
-{
- if (is_irq_stack_ptr(task, p))
- return 1;
- if (p >= (void *)task->thread_info &&
- p < (void *)task->thread_info + THREAD_SIZE &&
- !kstack_end(p))
- return 1;
-
- return 0;
-}
-
#ifdef CONFIG_KGDB
extern void sysenter_past_esp(void);
#include <asm/kgdb.h>
@@ -151,28 +138,27 @@ void breakpoint(void)
#define CHK_REMOTE_DEBUG(trapnr,signr,error_code,regs,after)
#endif

+static int valid_stack_ptr(struct thread_info *tinfo, void *p)
+{
+ return p > (void *)tinfo &&
+ p < (void *)tinfo + THREAD_SIZE - 3;
+}

-#ifdef CONFIG_FRAME_POINTER
-static void print_context_stack(struct task_struct *task, unsigned long *stack,
+static unsigned long print_context_stack(struct thread_info *tinfo, unsigned long *stack,
unsigned long ebp)
{
unsigned long addr;

- while (valid_stack_ptr(task, (void *)ebp)) {
+#ifdef CONFIG_FRAME_POINTER
+ while (valid_stack_ptr(tinfo, (void *)ebp)) {
addr = *(unsigned long *)(ebp + 4);
printk(" [<%08lx>] ", addr);
print_symbol("%s", addr);
printk("\n");
ebp = *(unsigned long *)ebp;
}
-}
#else
-static void print_context_stack(struct task_struct *task, unsigned long *stack,
- unsigned long ebp)
-{
- unsigned long addr;
-
- while (!kstack_end(stack)) {
+ while (valid_stack_ptr(tinfo, stack)) {
addr = *stack++;
if (__kernel_text_address(addr)) {
printk(" [<%08lx>]", addr);
@@ -180,8 +166,9 @@ static void print_context_stack(struct t
printk("\n");
}
}
-}
#endif
+ return ebp;
+}

void show_trace(struct task_struct *task, unsigned long * stack)
{
@@ -190,11 +177,6 @@ void show_trace(struct task_struct *task
if (!task)
task = current;

- if (!valid_stack_ptr(task, stack)) {
- printk("Stack pointer is garbage, not printing trace\n");
- return;
- }
-
if (task == current) {
/* Grab ebp right from our regs */
asm ("movl %%ebp, %0" : "=r" (ebp) : );
@@ -207,7 +189,7 @@ void show_trace(struct task_struct *task
struct thread_info *context;
context = (struct thread_info *)
((unsigned long)stack & (~(THREAD_SIZE - 1)));
- print_context_stack(task, stack, ebp);
+ ebp = print_context_stack(context, stack, ebp);
stack = (unsigned long*)context->previous_esp;
if (!stack)
break;
diff -rup rc3-mm1-clean/include/asm-i386/hardirq.h rc3-mm1-trace/include/asm-i386/hardirq.h
--- rc3-mm1-clean/include/asm-i386/hardirq.h 2004-10-04 10:26:41.000000000 +0400
+++ rc3-mm1-trace/include/asm-i386/hardirq.h 2004-10-04 10:38:04.000000000 +0400
@@ -36,15 +36,4 @@ static inline void ack_bad_irq(unsigned
#endif
}

-struct task_struct;
-
-#ifdef CONFIG_4KSTACKS
-int is_irq_stack_ptr(struct task_struct *task, void *p);
-#else
-static inline int is_irq_stack_ptr(struct task_struct *task, void *p)
-{
- return 0;
-}
-#endif
-
#endif /* __ASM_HARDIRQ_H */

2004-10-04 08:17:19

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] alternate stack dump fix.

Oleg Nesterov wrote:
>
> > Andrew Morton wrote:
> >
> > But it conflicts in a big way with Kirill's patch. Could you redo it
> > against 2.6.9-rc3-mm1, or against just
>
> For your convenience, i will post the same patch against mm tree with those
> 3 patches reverted in a separate message.

Against 2.6.9-rc3-mm1 +
-R fix-of-stack-dump-in-soft-hardirqs-build-fix.patch +
-R fix-of-stack-dump-in-soft-hardirqs-cleanup.patch +
-R fix-of-stack-dump-in-soft-hardirqs.patch

Oleg.

Signed-off-by: Oleg Nesterov <[email protected]>

--- rc3-mm1/arch/i386/kernel/traps.c~ Mon Oct 4 10:46:10 2004
+++ rc3-mm1/arch/i386/kernel/traps.c Mon Oct 4 10:56:22 2004
@@ -105,15 +105,6 @@ int register_die_notifier(struct notifie
return err;
}

-static int valid_stack_ptr(struct task_struct *task, void *p)
-{
- if (p <= (void *)task->thread_info)
- return 0;
- if (kstack_end(p))
- return 0;
- return 1;
-}
-
#ifdef CONFIG_KGDB
extern void sysenter_past_esp(void);
#include <asm/kgdb.h>
@@ -147,28 +138,27 @@ void breakpoint(void)
#define CHK_REMOTE_DEBUG(trapnr,signr,error_code,regs,after)
#endif

+static int valid_stack_ptr(struct thread_info *tinfo, void *p)
+{
+ return p > (void *)tinfo &&
+ p < (void *)tinfo + THREAD_SIZE - 3;
+}

-#ifdef CONFIG_FRAME_POINTER
-static void print_context_stack(struct task_struct *task, unsigned long *stack,
+static unsigned long print_context_stack(struct thread_info *tinfo, unsigned long *stack,
unsigned long ebp)
{
unsigned long addr;

- while (valid_stack_ptr(task, (void *)ebp)) {
+#ifdef CONFIG_FRAME_POINTER
+ while (valid_stack_ptr(tinfo, (void *)ebp)) {
addr = *(unsigned long *)(ebp + 4);
printk(" [<%08lx>] ", addr);
print_symbol("%s", addr);
printk("\n");
ebp = *(unsigned long *)ebp;
}
-}
#else
-static void print_context_stack(struct task_struct *task, unsigned long *stack,
- unsigned long ebp)
-{
- unsigned long addr;
-
- while (!kstack_end(stack)) {
+ while (valid_stack_ptr(tinfo, stack)) {
addr = *stack++;
if (__kernel_text_address(addr)) {
printk(" [<%08lx>]", addr);
@@ -176,8 +166,9 @@ static void print_context_stack(struct t
printk("\n");
}
}
-}
#endif
+ return ebp;
+}

void show_trace(struct task_struct *task, unsigned long * stack)
{
@@ -186,11 +177,6 @@ void show_trace(struct task_struct *task
if (!task)
task = current;

- if (!valid_stack_ptr(task, stack)) {
- printk("Stack pointer is garbage, not printing trace\n");
- return;
- }
-
if (task == current) {
/* Grab ebp right from our regs */
asm ("movl %%ebp, %0" : "=r" (ebp) : );
@@ -203,7 +189,7 @@ void show_trace(struct task_struct *task
struct thread_info *context;
context = (struct thread_info *)
((unsigned long)stack & (~(THREAD_SIZE - 1)));
- print_context_stack(task, stack, ebp);
+ ebp = print_context_stack(context, stack, ebp);
stack = (unsigned long*)context->previous_esp;
if (!stack)
break;

2004-10-04 08:29:59

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] alternate stack dump fix.

Oleg Nesterov <[email protected]> wrote:
>
> For your convenience, i will post the same patch against mm tree with those
> 3 patches reverted in a separate message.

In that case I'm not understanding you. Are you saying that your patch
fixes the same problem as that which Kirill's patch fixed? That wasn't at
all clear from your earlier comments.

If so, please describe precisely what problems your patch fixes, and how.

Thanks.

2004-10-04 09:21:18

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] alternate stack dump fix.

Andrew Morton wrote:
>
> In that case I'm not understanding you. Are you saying that your patch
> fixes the same problem as that which Kirill's patch fixed?

It seems to me, yes.

Here the pseudo code for CONFIG_FRAME_POINTER:

int valid_stack_ptr(struct thread_info *tinfo, void *p)
{
return p > (void *)tinfo &&
p < (void *)tinfo + THREAD_SIZE - 3;
}

long print_context_stack(thread_info *tinfo, long *stack, long ebp)
{
while (valid_stack_ptr(tinfo, ebp))
{
print_symbol("%s", *(ebp+4));
ebp = *(unsigned long *)ebp;
}

return ebp;
}

void show_trace(struct task_struct *task, unsigned long * stack)
{
while (1) {
struct thread_info *context = (struct thread_info *)
((unsigned long)stack & (~(THREAD_SIZE - 1)));

ebp = print_context_stack(context, stack, ebp);

stack = (unsigned long*)context->previous_esp;
if (!stack)
break;
printk(" =======================\n");
}
}

show_trace() now does not use task argument in the main
loop. Instead, it converts stack to thread_info* context,
and passes it to print_context_stack() and (implicitly)
to valid_stack_ptr().

valid_stack_ptr() does not care now whether or not it is
irq stack, it just does bounds checking.

Please note, i simply deleted this printk("Stack pointer is garbage"),
it can be restored, if neccessary.

Did i miss something?

> That wasn't at all clear from your earlier comments.

Yes, sorry.

Oleg.