2005-05-18 12:36:43

by Vivek Goyal

[permalink] [raw]
Subject: Re: [Fastboot] [1/2] kdump: Use real pt_regs from exception

On Sat, May 14, 2005 at 10:49:58PM +0200, Alexander Nyberg wrote:
> Makes kexec_crashdump() take a pt_regs * as an argument. This allows to
> get exact register state at the point of the crash. If we come from
> direct panic assertion NULL will be passed and the current registers
> saved before crashdump.
>
> This hooks into two places:
> die(): check the conditions under which we will panic when calling
> do_exit and go there directly with the pt_regs that caused the fatal
> fault.
>
> die_nmi(): If we receive an NMI lockup while in the kernel use the
> pt_regs and go directly to crash_kexec(). We're probably nested up badly
> at this point so this might be the only chance to escape with proper
> information.
>
>
> Signed-off-by: Alexander Nyberg <[email protected]>
>


Largely looks ok except few comments . Copying to LKML for more eyes.

I am tempted to think that if panic had been passed pt_regs, then
crash_kexec() call could have been placed at a single place that too in
architecture independent manner. But panic is called at too many a places
to change the function signature. Just a passing thought.


> Index: mm/include/linux/reboot.h
> ===================================================================
> --- mm.orig/include/linux/reboot.h 2005-05-14 22:29:16.000000000 +0200
> +++ mm/include/linux/reboot.h 2005-05-14 22:29:29.000000000 +0200
> @@ -50,9 +50,10 @@
> extern void machine_restart(char *cmd);
> extern void machine_halt(void);
> extern void machine_power_off(void);
> -
> extern void machine_shutdown(void);
> -extern void machine_crash_shutdown(void);
> +
> +struct pt_regs;
> +extern void machine_crash_shutdown(struct pt_regs *);
>
> #endif
>
> Index: mm/arch/i386/kernel/traps.c
> ===================================================================
> --- mm.orig/arch/i386/kernel/traps.c 2005-05-14 22:29:16.000000000 +0200
> +++ mm/arch/i386/kernel/traps.c 2005-05-14 22:48:14.000000000 +0200
> @@ -27,6 +27,7 @@
> #include <linux/ptrace.h>
> #include <linux/utsname.h>
> #include <linux/kprobes.h>
> +#include <linux/kexec.h>
>
> #ifdef CONFIG_EISA
> #include <linux/ioport.h>
> @@ -326,6 +327,9 @@
> printk("Kernel BUG\n");
> }
>
> +/* This is gone through when something in the kernel
> + * has done something bad and is about to be terminated.
> +*/
> void die(const char * str, struct pt_regs * regs, long err)
> {
> static struct {
> @@ -382,6 +386,10 @@
> bust_spinlocks(0);
> die.lock_owner = -1;
> spin_unlock_irq(&die.lock);
> +
> + if (in_interrupt() || !current->pid || current->pid == 1)
> + crash_kexec(regs);
> +


Would be nice if panic_on_oops case also can be covered. There also we want
to capture actual pt_regs register state.


> if (in_interrupt())
> panic("Fatal exception in interrupt");
>
> @@ -616,6 +624,15 @@
> console_silent();
> spin_unlock(&nmi_print_lock);
> bust_spinlocks(0);
> +
> + /* If we are in kernel we are probably nested up pretty bad
> + * and might aswell get out now while we still can.
> + */
> + if (!user_mode(regs)) {
> + current->thread.trap_no = 2;
> + crash_kexec(regs);
> + }
> +
> do_exit(SIGSEGV);
> }
>
> Index: mm/arch/i386/kernel/crash.c
> ===================================================================
> --- mm.orig/arch/i386/kernel/crash.c 2005-05-14 22:29:16.000000000 +0200
> +++ mm/arch/i386/kernel/crash.c 2005-05-14 22:48:53.000000000 +0200
> @@ -98,12 +98,19 @@
> regs->eip = (unsigned long)current_text_addr();
> }
>
> -static void crash_save_self(void)
> +/* We may have saved_regs from where the error came from
> + * or it is NULL if via a direct panic().
> + */
> +static void crash_save_self(struct pt_regs *saved_regs)
> {
> struct pt_regs regs;
> int cpu;
> cpu = smp_processor_id();
> - crash_get_current_regs(&regs);
> +
> + if (saved_regs)
> + memcpy(&regs, saved_regs, sizeof(regs));


saved_regs will not have proper value of ss and esp if exception occurred
which thread was running in kernel mode as cpu has not pushed these on stack
A fix something like following also might go here.

if (saved_regs) {
memcpy(&regs, saved_regs, sizeof(regs));
if (!user_mode(saved_regs)) {
regs.esp = (unsigned long)&(saved_regs->esp);
__asm__ __volatile__("xorl %eax, %eax;");
__asm__ __volatile__ ("movw %%ss, %%ax;"
:"=a"(regs.xss));
}
}


> + else
> + crash_get_current_regs(&regs);
> crash_save_this_cpu(&regs, cpu);
> }
>
> @@ -175,7 +182,7 @@
> }
> #endif
>
> -void machine_crash_shutdown(void)
> +void machine_crash_shutdown(struct pt_regs *regs)
> {
> /* This function is only called after the system
> * has paniced or is otherwise in a critical state.
> @@ -192,5 +199,5 @@
> #if defined(CONFIG_X86_IO_APIC)
> disable_IO_APIC();
> #endif
> - crash_save_self();
> + crash_save_self(regs);
> }
> Index: mm/kernel/panic.c
> ===================================================================
> --- mm.orig/kernel/panic.c 2005-05-14 22:29:16.000000000 +0200
> +++ mm/kernel/panic.c 2005-05-14 22:29:29.000000000 +0200
> @@ -83,7 +83,7 @@
> * everything else.
> * Do we want to call this before we try to display a message?
> */
> - crash_kexec();
> + crash_kexec(NULL);
>
> #ifdef CONFIG_SMP
> /*
> Index: mm/kernel/kexec.c
> ===================================================================
> --- mm.orig/kernel/kexec.c 2005-05-14 22:29:16.000000000 +0200
> +++ mm/kernel/kexec.c 2005-05-14 22:29:29.000000000 +0200
> @@ -1010,7 +1010,7 @@
> }
> #endif
>
> -void crash_kexec(void)
> +void crash_kexec(struct pt_regs *regs)
> {
> struct kimage *image;
> int locked;
> @@ -1028,7 +1028,7 @@
> if (!locked) {
> image = xchg(&kexec_crash_image, NULL);
> if (image) {
> - machine_crash_shutdown();
> + machine_crash_shutdown(regs);
> machine_kexec(image);
> }
> xchg(&kexec_lock, 0);
> Index: mm/include/linux/kexec.h
> ===================================================================
> --- mm.orig/include/linux/kexec.h 2005-05-14 22:29:16.000000000 +0200
> +++ mm/include/linux/kexec.h 2005-05-14 22:29:29.000000000 +0200
> @@ -99,7 +99,7 @@
> unsigned long flags);
> #endif
> extern struct page *kimage_alloc_control_pages(struct kimage *image, unsigned int order);
> -extern void crash_kexec(void);
> +extern void crash_kexec(struct pt_regs *);
> extern struct kimage *kexec_image;
>
> #define KEXEC_ON_CRASH 0x00000001
> @@ -123,6 +123,6 @@
> extern struct resource crashk_res;
>
> #else /* !CONFIG_KEXEC */
> -static inline void crash_kexec(void) { }
> +static inline void crash_kexec(struct pt_regs *regs) { }
> #endif /* CONFIG_KEXEC */
> #endif /* LINUX_KEXEC_H */
> Index: mm/drivers/char/sysrq.c
> ===================================================================
> --- mm.orig/drivers/char/sysrq.c 2005-05-14 22:29:16.000000000 +0200
> +++ mm/drivers/char/sysrq.c 2005-05-14 22:29:29.000000000 +0200
> @@ -119,7 +119,7 @@
> static void sysrq_handle_crashdump(int key, struct pt_regs *pt_regs,
> struct tty_struct *tty)
> {
> - crash_kexec();
> + crash_kexec(pt_regs);
> }
> static struct sysrq_key_op sysrq_crashdump_op = {
> .handler = sysrq_handle_crashdump,
>
>

> _______________________________________________
> fastboot mailing list
> [email protected]
> http://lists.osdl.org/mailman/listinfo/fastboot


2005-05-18 14:55:46

by Alexander Nyberg

[permalink] [raw]
Subject: Re: [Fastboot] [1/2] kdump: Use real pt_regs from exception

Vivek, new version where comments are fixed and added
kexec_should_crash() instead of open-coding panic conditions.

Makes kexec_crashdump() take a pt_regs * as an argument. This allows to
get exact register state at the point of the crash. If we come from
direct panic assertion NULL will be passed and the current registers
saved before crashdump.

This hooks into two places:
die(): check the conditions under which we will panic when calling
do_exit and go there directly with the pt_regs that caused the fatal
fault.

die_nmi(): If we receive an NMI lockup while in the kernel use the
pt_regs and go directly to crash_kexec(). We're probably nested up badly
at this point so this might be the only chance to escape with proper
information.


Signed-off-by: Alexander Nyberg <[email protected]>

Index: mm/include/linux/reboot.h
===================================================================
--- mm.orig/include/linux/reboot.h 2005-05-16 18:28:52.000000000 +0200
+++ mm/include/linux/reboot.h 2005-05-18 16:25:03.000000000 +0200
@@ -52,7 +52,7 @@
extern void machine_power_off(void);

extern void machine_shutdown(void);
-extern void machine_crash_shutdown(void);
+extern void machine_crash_shutdown(struct pt_regs *);

#endif

Index: mm/arch/i386/kernel/traps.c
===================================================================
--- mm.orig/arch/i386/kernel/traps.c 2005-05-16 18:28:45.000000000 +0200
+++ mm/arch/i386/kernel/traps.c 2005-05-18 16:42:36.000000000 +0200
@@ -27,6 +27,7 @@
#include <linux/ptrace.h>
#include <linux/utsname.h>
#include <linux/kprobes.h>
+#include <linux/kexec.h>

#ifdef CONFIG_EISA
#include <linux/ioport.h>
@@ -326,6 +327,9 @@
printk("Kernel BUG\n");
}

+/* This is gone through when something in the kernel
+ * has done something bad and is about to be terminated.
+*/
void die(const char * str, struct pt_regs * regs, long err)
{
static struct {
@@ -382,6 +386,10 @@
bust_spinlocks(0);
die.lock_owner = -1;
spin_unlock_irq(&die.lock);
+
+ if (kexec_should_crash(current))
+ crash_kexec(regs);
+
if (in_interrupt())
panic("Fatal exception in interrupt");

@@ -616,6 +624,15 @@
console_silent();
spin_unlock(&nmi_print_lock);
bust_spinlocks(0);
+
+ /* If we are in kernel we are probably nested up pretty bad
+ * and might aswell get out now while we still can.
+ */
+ if (!user_mode(regs)) {
+ current->thread.trap_no = 2;
+ crash_kexec(regs);
+ }
+
do_exit(SIGSEGV);
}

Index: mm/arch/i386/kernel/crash.c
===================================================================
--- mm.orig/arch/i386/kernel/crash.c 2005-05-16 18:28:45.000000000 +0200
+++ mm/arch/i386/kernel/crash.c 2005-05-18 15:26:43.000000000 +0200
@@ -98,12 +98,31 @@
regs->eip = (unsigned long)current_text_addr();
}

-static void crash_save_self(void)
+/* CPU does not save ss and esp on stack if execution is already
+ * running in kernel mode at the time of NMI occurrence. This code
+ * fixes it.
+ */
+static void crash_setup_regs(struct pt_regs *newregs, struct pt_regs *oldregs)
+{
+ memcpy(newregs, oldregs, sizeof(*newregs));
+ newregs->esp = (unsigned long)&(oldregs->esp);
+ __asm__ __volatile__("xorl %eax, %eax;");
+ __asm__ __volatile__ ("movw %%ss, %%ax;" :"=a"(newregs->xss));
+}
+
+/* We may have saved_regs from where the error came from
+ * or it is NULL if via a direct panic().
+ */
+static void crash_save_self(struct pt_regs *saved_regs)
{
struct pt_regs regs;
int cpu;
cpu = smp_processor_id();
- crash_get_current_regs(&regs);
+
+ if (saved_regs)
+ crash_setup_regs(&regs, saved_regs);
+ else
+ crash_get_current_regs(&regs);
crash_save_this_cpu(&regs, cpu);
}

@@ -115,15 +134,8 @@
struct pt_regs fixed_regs;
local_irq_disable();

- /* CPU does not save ss and esp on stack if execution is already
- * running in kernel mode at the time of NMI occurrence. This code
- * fixes it.
- */
if (!user_mode(regs)) {
- memcpy(&fixed_regs, regs, sizeof(*regs));
- fixed_regs.esp = (unsigned long)&(regs->esp);
- __asm__ __volatile__("xorl %eax, %eax;");
- __asm__ __volatile__ ("movw %%ss, %%ax;" :"=a"(fixed_regs.xss));
+ crash_setup_regs(&fixed_regs, regs);
regs = &fixed_regs;
}
crash_save_this_cpu(regs, cpu);
@@ -175,7 +187,7 @@
}
#endif

-void machine_crash_shutdown(void)
+void machine_crash_shutdown(struct pt_regs *regs)
{
/* This function is only called after the system
* has paniced or is otherwise in a critical state.
@@ -192,5 +204,5 @@
#if defined(CONFIG_X86_IO_APIC)
disable_IO_APIC();
#endif
- crash_save_self();
+ crash_save_self(regs);
}
Index: mm/kernel/panic.c
===================================================================
--- mm.orig/kernel/panic.c 2005-05-16 18:28:52.000000000 +0200
+++ mm/kernel/panic.c 2005-05-18 15:06:21.000000000 +0200
@@ -83,7 +83,7 @@
* everything else.
* Do we want to call this before we try to display a message?
*/
- crash_kexec();
+ crash_kexec(NULL);

#ifdef CONFIG_SMP
/*
Index: mm/kernel/kexec.c
===================================================================
--- mm.orig/kernel/kexec.c 2005-05-16 18:28:52.000000000 +0200
+++ mm/kernel/kexec.c 2005-05-18 16:34:34.000000000 +0200
@@ -32,6 +32,13 @@
.flags = IORESOURCE_BUSY | IORESOURCE_MEM
};

+int kexec_should_crash(struct task_struct *p)
+{
+ if (in_interrupt() || !p->pid || p->pid == 1 || panic_on_oops)
+ return 1;
+ return 0;
+}
+
/*
* When kexec transitions to the new kernel there is a one-to-one
* mapping between physical and virtual addresses. On processors
@@ -1010,7 +1017,7 @@
}
#endif

-void crash_kexec(void)
+void crash_kexec(struct pt_regs *regs)
{
struct kimage *image;
int locked;
@@ -1028,7 +1035,7 @@
if (!locked) {
image = xchg(&kexec_crash_image, NULL);
if (image) {
- machine_crash_shutdown();
+ machine_crash_shutdown(regs);
machine_kexec(image);
}
xchg(&kexec_lock, 0);
Index: mm/include/linux/kexec.h
===================================================================
--- mm.orig/include/linux/kexec.h 2005-05-16 18:28:52.000000000 +0200
+++ mm/include/linux/kexec.h 2005-05-18 16:40:17.000000000 +0200
@@ -99,7 +99,8 @@
unsigned long flags);
#endif
extern struct page *kimage_alloc_control_pages(struct kimage *image, unsigned int order);
-extern void crash_kexec(void);
+extern void crash_kexec(struct pt_regs *);
+int kexec_should_crash(struct task_struct *);
extern struct kimage *kexec_image;

#define KEXEC_ON_CRASH 0x00000001
@@ -123,6 +124,7 @@
extern struct resource crashk_res;

#else /* !CONFIG_KEXEC */
-static inline void crash_kexec(void) { }
+static inline void crash_kexec(struct pt_regs *regs) { }
+static inline int kexec_should_crash(struct task_struct *p) { return 0; }
#endif /* CONFIG_KEXEC */
#endif /* LINUX_KEXEC_H */
Index: mm/drivers/char/sysrq.c
===================================================================
--- mm.orig/drivers/char/sysrq.c 2005-05-16 18:28:46.000000000 +0200
+++ mm/drivers/char/sysrq.c 2005-05-18 15:06:21.000000000 +0200
@@ -119,7 +119,7 @@
static void sysrq_handle_crashdump(int key, struct pt_regs *pt_regs,
struct tty_struct *tty)
{
- crash_kexec();
+ crash_kexec(pt_regs);
}
static struct sysrq_key_op sysrq_crashdump_op = {
.handler = sysrq_handle_crashdump,


2005-05-25 09:08:55

by Andrew Morton

[permalink] [raw]
Subject: Re: [Fastboot] [1/2] kdump: Use real pt_regs from exception

Alexander Nyberg <[email protected]> wrote:
>
> -extern void machine_crash_shutdown(void);
> +extern void machine_crash_shutdown(struct pt_regs *);

That'll break x86_64, ppc, ppc64 and s/390.

2005-05-25 12:15:29

by Alexander Nyberg

[permalink] [raw]
Subject: Re: [Fastboot] [1/2] kdump: Use real pt_regs from exception

ons 2005-05-25 klockan 02:07 -0700 skrev Andrew Morton:
> Alexander Nyberg <[email protected]> wrote:
> >
> > -extern void machine_crash_shutdown(void);
> > +extern void machine_crash_shutdown(struct pt_regs *);
>
> That'll break x86_64, ppc, ppc64 and s/390.

I'm such an idiot.

Make sure all arches take pt_regs * as argument to
machine_crash_shutdown(). (now cross-compiled on above arches except
s/390).


Signed-off-by: Alexander Nyberg <[email protected]>

Index: mm/arch/ppc/kernel/machine_kexec.c
===================================================================
--- mm.orig/arch/ppc/kernel/machine_kexec.c 2005-05-25 13:17:41.000000000 +0200
+++ mm/arch/ppc/kernel/machine_kexec.c 2005-05-25 13:18:06.000000000 +0200
@@ -34,7 +34,7 @@
}
}

-void machine_crash_shutdown(void)
+void machine_crash_shutdown(struct pt_regs *regs)
{
if (ppc_md.machine_crash_shutdown) {
ppc_md.machine_crash_shutdown();
Index: mm/arch/x86_64/kernel/crash.c
===================================================================
--- mm.orig/arch/x86_64/kernel/crash.c 2005-05-25 13:13:18.000000000 +0200
+++ mm/arch/x86_64/kernel/crash.c 2005-05-25 13:15:44.000000000 +0200
@@ -22,7 +22,7 @@

note_buf_t crash_notes[NR_CPUS];

-void machine_crash_shutdown(void)
+void machine_crash_shutdown(struct pt_regs *regs)
{
/* This function is only called after the system
* has paniced or is otherwise in a critical state.
Index: mm/arch/s390/kernel/crash.c
===================================================================
--- mm.orig/arch/s390/kernel/crash.c 2005-05-25 13:13:18.000000000 +0200
+++ mm/arch/s390/kernel/crash.c 2005-05-25 13:15:58.000000000 +0200
@@ -12,6 +12,6 @@

note_buf_t crash_notes[NR_CPUS];

-void machine_crash_shutdown(void)
+void machine_crash_shutdown(struct pt_regs *regs)
{
}
Index: mm/arch/ppc64/kernel/machine_kexec.c
===================================================================
--- mm.orig/arch/ppc64/kernel/machine_kexec.c 2005-05-25 13:13:18.000000000 +0200
+++ mm/arch/ppc64/kernel/machine_kexec.c 2005-05-25 13:15:07.000000000 +0200
@@ -34,7 +34,7 @@
* and if what it will achieve. Letting it be now to compile the code
* in generic kexec environment
*/
-void machine_crash_shutdown(void)
+void machine_crash_shutdown(struct pt_regs *regs)
{
/* do nothing right now */
/* smp_relase_cpus() if we want smp on panic kernel */
Index: mm/include/linux/reboot.h
===================================================================
--- mm.orig/include/linux/reboot.h 2005-05-25 13:13:39.000000000 +0200
+++ mm/include/linux/reboot.h 2005-05-25 13:51:49.000000000 +0200
@@ -52,6 +52,7 @@
extern void machine_power_off(void);

extern void machine_shutdown(void);
+struct pt_regs;
extern void machine_crash_shutdown(struct pt_regs *);

#endif
Index: mm/include/linux/kexec.h
===================================================================
--- mm.orig/include/linux/kexec.h 2005-05-25 13:13:39.000000000 +0200
+++ mm/include/linux/kexec.h 2005-05-25 13:47:47.000000000 +0200
@@ -124,6 +124,8 @@
extern struct resource crashk_res;

#else /* !CONFIG_KEXEC */
+struct pt_regs;
+struct task_struct;
static inline void crash_kexec(struct pt_regs *regs) { }
static inline int kexec_should_crash(struct task_struct *p) { return 0; }
#endif /* CONFIG_KEXEC */


2005-05-25 13:06:18

by Vivek Goyal

[permalink] [raw]
Subject: Re: [Fastboot] [1/2] kdump: Use real pt_regs from exception

On Wed, May 25, 2005 at 02:14:56PM +0200, Alexander Nyberg wrote:
> ons 2005-05-25 klockan 02:07 -0700 skrev Andrew Morton:
> > Alexander Nyberg <[email protected]> wrote:
> > >
> > > -extern void machine_crash_shutdown(void);
> > > +extern void machine_crash_shutdown(struct pt_regs *);
> >
> > That'll break x86_64, ppc, ppc64 and s/390.
>
> I'm such an idiot.
>
> Make sure all arches take pt_regs * as argument to
> machine_crash_shutdown(). (now cross-compiled on above arches except
> s/390).
>

Alexander, I face following warning if I build my kernel without HIGHMEM
support. Fianally linker fails in the end.

CC kernel/kexec.o
kernel/kexec.c: In function `kexec_should_crash':
kernel/kexec.c:37: warning: implicit declaration of function `in_interrupt'

If I include HIGHMEM support, it compiles fine.

You might have to include include/linux/hardirq.h in kexec.c to
resolve the problem.

Thanks
Vivek

2005-05-25 13:17:12

by Alexander Nyberg

[permalink] [raw]
Subject: Re: [Fastboot] [1/2] kdump: Use real pt_regs from exception

ons 2005-05-25 klockan 18:36 +0530 skrev Vivek Goyal:
> On Wed, May 25, 2005 at 02:14:56PM +0200, Alexander Nyberg wrote:
> > ons 2005-05-25 klockan 02:07 -0700 skrev Andrew Morton:
> > > Alexander Nyberg <[email protected]> wrote:
> > > >
> > > > -extern void machine_crash_shutdown(void);
> > > > +extern void machine_crash_shutdown(struct pt_regs *);
> > >
> > > That'll break x86_64, ppc, ppc64 and s/390.
> >
> > I'm such an idiot.
> >
> > Make sure all arches take pt_regs * as argument to
> > machine_crash_shutdown(). (now cross-compiled on above arches except
> > s/390).
> >
>
> Alexander, I face following warning if I build my kernel without HIGHMEM
> support. Fianally linker fails in the end.
>
> CC kernel/kexec.o
> kernel/kexec.c: In function `kexec_should_crash':
> kernel/kexec.c:37: warning: implicit declaration of function `in_interrupt'
>
> If I include HIGHMEM support, it compiles fine.
>
> You might have to include include/linux/hardirq.h in kexec.c to
> resolve the problem.
>

Yeah this is fixed in -mm, thanks