The address provided by the SMBIOS/DMI CRU information is mapped via
ioremap() in the virtual address space. However, since the address
is executed (i.e. call'd), we need to set that pages as executable.
Without that, I get following oops on a HP ProLiant DL385 G2
machine with BIOS from 05/29/2008 when I trigger crashdump:
BUG: unable to handle kernel paging request at ffffc20011090c00
IP: [<ffffc20011090c00>] 0xffffc20011090c00
PGD 12f813067 PUD 7fe6a067 PMD 7effe067 PTE 80000000fffd3173
Oops: 0011 [1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 1
Modules linked in: autofs4 ipv6 af_packet cpufreq_conservative cpufreq_userspace
cpufreq_powersave powernow_k8 fuse loop dm_mod rtc_cmos ipmi_si sg rtc_core i2c
_piix4 ipmi_msghandler bnx2 sr_mod container button i2c_core hpilo joydev pcspkr
rtc_lib shpchp hpwdt cdrom pci_hotplug usbhid hid ff_memless ohci_hcd ehci_hcd
uhci_hcd usbcore edd ext3 mbcache jbd fan ide_pci_generic serverworks ide_core p
ata_serverworks pata_acpi cciss ata_generic libata scsi_mod dock thermal process
or thermal_sys hwmon
Supported: Yes
Pid: 0, comm: swapper Not tainted 2.6.27.5-HEAD_20081111100657-default #1
RIP: 0010:[<ffffc20011090c00>] [<ffffc20011090c00>] 0xffffc20011090c00
RSP: 0018:ffff88012f6f9e68 EFLAGS: 00010046
RAX: 0000000000000d02 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88012f6f9e98 R08: 666666666666660a R09: ffffffffa1006fc0
R10: 0000000000000000 R11: ffff88012f6f3ea8 R12: ffffc20011090c00
R13: ffff88012f6f9ee8 R14: 000000000000000e R15: 0000000000000000
FS: 00007ff70b29a6f0(0000) GS:ffff88012f6512c0(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffc20011090c00 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff88012f6f2000, task ffff88007fa8a1c0)
Stack: ffffffffa0f8502b 0000000000000002 ffffffff80738d50 0000000000000000
0000000000000046 0000000000000046 00000000fffffffe ffffffffa0f852ec
0000000000000000 ffffffff804ad9a6 0000000000000000 0000000000000000
Call Trace:
Inexact backtrace:
<NMI> [<ffffffffa0f8502b>] ? asminline_call+0x2b/0x55 [hpwdt]
[<ffffffffa0f852ec>] hpwdt_pretimeout+0x3c/0xa0 [hpwdt]
[<ffffffff804ad9a6>] ? notifier_call_chain+0x29/0x4c
[<ffffffff802587e4>] ? notify_die+0x2d/0x32
[<ffffffff804abbdc>] ? default_do_nmi+0x53/0x1d9
[<ffffffff804abd90>] ? do_nmi+0x2e/0x43
[<ffffffff804ab552>] ? nmi+0xa2/0xd0
[<ffffffff80221ef9>] ? native_safe_halt+0x2/0x3
<<EOE>> [<ffffffff8021345d>] ? default_idle+0x38/0x54
[<ffffffff8021359a>] ? c1e_idle+0x118/0x11c
[<ffffffff8020b3b5>] ? cpu_idle+0xa9/0xf1
Code: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff <55> 50 e8 00 00 00 00 58 48 2d 07 10 40 00 48 8b e8 58 e9 68 02
RIP [<ffffc20011090c00>] 0xffffc20011090c00
RSP <ffff88012f6f9e68>
CR2: ffffc20011090c00
Kernel panic - not syncing: Fatal exception
Signed-off-by: Bernhard Walle <[email protected]>
---
drivers/watchdog/hpwdt.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index 9890dff..e83e1ac 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -40,6 +40,7 @@
#include <linux/bootmem.h>
#include <linux/slab.h>
#include <asm/desc.h>
+#include <asm/cacheflush.h>
#define PCI_BIOS32_SD_VALUE 0x5F32335F /* "_32_" */
#define CRU_BIOS_SIGNATURE_VALUE 0x55524324
@@ -394,6 +395,8 @@ static void __devinit dmi_find_cru(const struct dmi_header *dm)
smbios_cru64_ptr->double_offset;
cru_rom_addr = ioremap(cru_physical_address,
smbios_cru64_ptr->double_length);
+ set_memory_x((unsigned long)cru_rom_addr & PAGE_MASK,
+ smbios_cru64_ptr->double_length >> PAGE_SHIFT);
}
}
}
--
1.6.0.2
On Fri, 14 Nov 2008 15:47:03 +0100
Bernhard Walle <[email protected]> wrote:
> The address provided by the SMBIOS/DMI CRU information is mapped via
> ioremap() in the virtual address space. However, since the address
> is executed (i.e. call'd), we need to set that pages as executable.
>
> Without that, I get following oops on a HP ProLiant DL385 G2
> machine with BIOS from 05/29/2008 when I trigger crashdump:
>
> BUG: unable to handle kernel paging request at ffffc20011090c00
> IP: [<ffffc20011090c00>] 0xffffc20011090c00
> PGD 12f813067 PUD 7fe6a067 PMD 7effe067 PTE 80000000fffd3173
> Oops: 0011 [1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
> CPU 1
> Modules linked in: autofs4 ipv6 af_packet cpufreq_conservative cpufreq_userspace
> cpufreq_powersave powernow_k8 fuse loop dm_mod rtc_cmos ipmi_si sg rtc_core i2c
> _piix4 ipmi_msghandler bnx2 sr_mod container button i2c_core hpilo joydev pcspkr
> rtc_lib shpchp hpwdt cdrom pci_hotplug usbhid hid ff_memless ohci_hcd ehci_hcd
> uhci_hcd usbcore edd ext3 mbcache jbd fan ide_pci_generic serverworks ide_core p
> ata_serverworks pata_acpi cciss ata_generic libata scsi_mod dock thermal process
> or thermal_sys hwmon
> Supported: Yes
> Pid: 0, comm: swapper Not tainted 2.6.27.5-HEAD_20081111100657-default #1
> RIP: 0010:[<ffffc20011090c00>] [<ffffc20011090c00>] 0xffffc20011090c00
> RSP: 0018:ffff88012f6f9e68 EFLAGS: 00010046
> RAX: 0000000000000d02 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff88012f6f9e98 R08: 666666666666660a R09: ffffffffa1006fc0
> R10: 0000000000000000 R11: ffff88012f6f3ea8 R12: ffffc20011090c00
> R13: ffff88012f6f9ee8 R14: 000000000000000e R15: 0000000000000000
> FS: 00007ff70b29a6f0(0000) GS:ffff88012f6512c0(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: ffffc20011090c00 CR3: 0000000000201000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffff88012f6f2000, task ffff88007fa8a1c0)
> Stack: ffffffffa0f8502b 0000000000000002 ffffffff80738d50 0000000000000000
> 0000000000000046 0000000000000046 00000000fffffffe ffffffffa0f852ec
> 0000000000000000 ffffffff804ad9a6 0000000000000000 0000000000000000
> Call Trace:
> Inexact backtrace:
>
> <NMI> [<ffffffffa0f8502b>] ? asminline_call+0x2b/0x55 [hpwdt]
> [<ffffffffa0f852ec>] hpwdt_pretimeout+0x3c/0xa0 [hpwdt]
> [<ffffffff804ad9a6>] ? notifier_call_chain+0x29/0x4c
> [<ffffffff802587e4>] ? notify_die+0x2d/0x32
> [<ffffffff804abbdc>] ? default_do_nmi+0x53/0x1d9
> [<ffffffff804abd90>] ? do_nmi+0x2e/0x43
> [<ffffffff804ab552>] ? nmi+0xa2/0xd0
> [<ffffffff80221ef9>] ? native_safe_halt+0x2/0x3
> <<EOE>> [<ffffffff8021345d>] ? default_idle+0x38/0x54
> [<ffffffff8021359a>] ? c1e_idle+0x118/0x11c
> [<ffffffff8020b3b5>] ? cpu_idle+0xa9/0xf1
>
>
> Code: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff <55> 50 e8 00 00 00 00 58 48 2d 07 10 40 00 48 8b e8 58 e9 68 02
> RIP [<ffffc20011090c00>] 0xffffc20011090c00
> RSP <ffff88012f6f9e68>
> CR2: ffffc20011090c00
> Kernel panic - not syncing: Fatal exception
>
>
> Signed-off-by: Bernhard Walle <[email protected]>
> ---
> drivers/watchdog/hpwdt.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> index 9890dff..e83e1ac 100644
> --- a/drivers/watchdog/hpwdt.c
> +++ b/drivers/watchdog/hpwdt.c
> @@ -40,6 +40,7 @@
> #include <linux/bootmem.h>
> #include <linux/slab.h>
> #include <asm/desc.h>
> +#include <asm/cacheflush.h>
>
> #define PCI_BIOS32_SD_VALUE 0x5F32335F /* "_32_" */
> #define CRU_BIOS_SIGNATURE_VALUE 0x55524324
> @@ -394,6 +395,8 @@ static void __devinit dmi_find_cru(const struct dmi_header *dm)
> smbios_cru64_ptr->double_offset;
> cru_rom_addr = ioremap(cru_physical_address,
> smbios_cru64_ptr->double_length);
> + set_memory_x((unsigned long)cru_rom_addr & PAGE_MASK,
> + smbios_cru64_ptr->double_length >> PAGE_SHIFT);
> }
> }
> }
This is also needed in 2.6.27.x, yes?
* Andrew Morton [2008-11-18 14:30]:
>
> This is also needed in 2.6.27.x, yes?
Right. But I was waiting for feedback from the author since I'm really
not familiar with low level memory management and that driver. :(
Regards,
Bernhard
Yes. I agree with this fix. The HP ProLiant systems have an RBSU setting in the BIOS for the NX bit but we need to make certain we can execute as the default setting for this RBSU option may switch between enable/disable.
Tom
-----Original Message-----
From: Andrew Morton [mailto:[email protected]]
Sent: Tuesday, November 18, 2008 4:30 PM
To: Bernhard Walle
Cc: Mingarelli, Thomas; [email protected]; [email protected]; [email protected]; [email protected]
Subject: Re: [PATCH] [WATCHDOG] [hpwdt] Set the mapped BIOS address space as executable
On Fri, 14 Nov 2008 15:47:03 +0100
Bernhard Walle <[email protected]> wrote:
> The address provided by the SMBIOS/DMI CRU information is mapped via
> ioremap() in the virtual address space. However, since the address
> is executed (i.e. call'd), we need to set that pages as executable.
>
> Without that, I get following oops on a HP ProLiant DL385 G2
> machine with BIOS from 05/29/2008 when I trigger crashdump:
>
> BUG: unable to handle kernel paging request at ffffc20011090c00
> IP: [<ffffc20011090c00>] 0xffffc20011090c00
> PGD 12f813067 PUD 7fe6a067 PMD 7effe067 PTE 80000000fffd3173
> Oops: 0011 [1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
> CPU 1
> Modules linked in: autofs4 ipv6 af_packet cpufreq_conservative cpufreq_userspace
> cpufreq_powersave powernow_k8 fuse loop dm_mod rtc_cmos ipmi_si sg rtc_core i2c
> _piix4 ipmi_msghandler bnx2 sr_mod container button i2c_core hpilo joydev pcspkr
> rtc_lib shpchp hpwdt cdrom pci_hotplug usbhid hid ff_memless ohci_hcd ehci_hcd
> uhci_hcd usbcore edd ext3 mbcache jbd fan ide_pci_generic serverworks ide_core p
> ata_serverworks pata_acpi cciss ata_generic libata scsi_mod dock thermal process
> or thermal_sys hwmon
> Supported: Yes
> Pid: 0, comm: swapper Not tainted 2.6.27.5-HEAD_20081111100657-default #1
> RIP: 0010:[<ffffc20011090c00>] [<ffffc20011090c00>] 0xffffc20011090c00
> RSP: 0018:ffff88012f6f9e68 EFLAGS: 00010046
> RAX: 0000000000000d02 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff88012f6f9e98 R08: 666666666666660a R09: ffffffffa1006fc0
> R10: 0000000000000000 R11: ffff88012f6f3ea8 R12: ffffc20011090c00
> R13: ffff88012f6f9ee8 R14: 000000000000000e R15: 0000000000000000
> FS: 00007ff70b29a6f0(0000) GS:ffff88012f6512c0(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: ffffc20011090c00 CR3: 0000000000201000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffff88012f6f2000, task ffff88007fa8a1c0)
> Stack: ffffffffa0f8502b 0000000000000002 ffffffff80738d50 0000000000000000
> 0000000000000046 0000000000000046 00000000fffffffe ffffffffa0f852ec
> 0000000000000000 ffffffff804ad9a6 0000000000000000 0000000000000000
> Call Trace:
> Inexact backtrace:
>
> <NMI> [<ffffffffa0f8502b>] ? asminline_call+0x2b/0x55 [hpwdt]
> [<ffffffffa0f852ec>] hpwdt_pretimeout+0x3c/0xa0 [hpwdt]
> [<ffffffff804ad9a6>] ? notifier_call_chain+0x29/0x4c
> [<ffffffff802587e4>] ? notify_die+0x2d/0x32
> [<ffffffff804abbdc>] ? default_do_nmi+0x53/0x1d9
> [<ffffffff804abd90>] ? do_nmi+0x2e/0x43
> [<ffffffff804ab552>] ? nmi+0xa2/0xd0
> [<ffffffff80221ef9>] ? native_safe_halt+0x2/0x3
> <<EOE>> [<ffffffff8021345d>] ? default_idle+0x38/0x54
> [<ffffffff8021359a>] ? c1e_idle+0x118/0x11c
> [<ffffffff8020b3b5>] ? cpu_idle+0xa9/0xf1
>
>
> Code: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff <55> 50 e8 00 00 00 00 58 48 2d 07 10 40 00 48 8b e8 58 e9 68 02
> RIP [<ffffc20011090c00>] 0xffffc20011090c00
> RSP <ffff88012f6f9e68>
> CR2: ffffc20011090c00
> Kernel panic - not syncing: Fatal exception
>
>
> Signed-off-by: Bernhard Walle <[email protected]>
> ---
> drivers/watchdog/hpwdt.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> index 9890dff..e83e1ac 100644
> --- a/drivers/watchdog/hpwdt.c
> +++ b/drivers/watchdog/hpwdt.c
> @@ -40,6 +40,7 @@
> #include <linux/bootmem.h>
> #include <linux/slab.h>
> #include <asm/desc.h>
> +#include <asm/cacheflush.h>
>
> #define PCI_BIOS32_SD_VALUE 0x5F32335F /* "_32_" */
> #define CRU_BIOS_SIGNATURE_VALUE 0x55524324
> @@ -394,6 +395,8 @@ static void __devinit dmi_find_cru(const struct dmi_header *dm)
> smbios_cru64_ptr->double_offset;
> cru_rom_addr = ioremap(cru_physical_address,
> smbios_cru64_ptr->double_length);
> + set_memory_x((unsigned long)cru_rom_addr & PAGE_MASK,
> + smbios_cru64_ptr->double_length >> PAGE_SHIFT);
> }
> }
> }
This is also needed in 2.6.27.x, yes?
We have also verified that returning NOTIFY_OK from the hpwdt_pretimeout routine makes the KDUMP feature work correctly.
Bernhard pointed that out a week or so ago and we have since verified it.
Thanks,
Tom
-----Original Message-----
From: Bernhard Walle [mailto:[email protected]]
Sent: Tuesday, November 18, 2008 4:32 PM
To: Andrew Morton
Cc: Mingarelli, Thomas; [email protected]; [email protected]; [email protected]
Subject: Re: [PATCH] [WATCHDOG] [hpwdt] Set the mapped BIOS address space as executable
* Andrew Morton [2008-11-18 14:30]:
>
> This is also needed in 2.6.27.x, yes?
Right. But I was waiting for feedback from the author since I'm really
not familiar with low level memory management and that driver. :(
Regards,
Bernhard
On Wed, 19 Nov 2008 14:05:06 +0000 "Mingarelli, Thomas" <[email protected]> wrote:
>
> -----Original Message-----
> > From: Bernhard Walle [mailto:[email protected]]
> > Sent: Tuesday, November 18, 2008 4:32 PM
> > To: Andrew Morton
> > Cc: Mingarelli, Thomas; [email protected]; [email protected]; [email protected]
> > Subject: Re: [PATCH] [WATCHDOG] [hpwdt] Set the mapped BIOS address space as executable
> >
> > * Andrew Morton [2008-11-18 14:30]:
> > >
> > > This is also needed in 2.6.27.x, yes?
> >
> > Right. But I was waiting for feedback from the author since I'm really
> > not familiar with low level memory management and that driver. :(
> >
>
> We have also verified that returning NOTIFY_OK from the hpwdt_pretimeout routine makes the KDUMP feature work correctly.
>
> Bernhard pointed that out a week or so ago and we have since verified it.
>
(top-posting repaired. Please don't top-post!)
I haven't seen any patch which alters hpwdt_pretimeout() and there is
no such patch in linux-next. Perhaps it got lost?
* Andrew Morton [2008-11-19 09:30]:
> >
>
> (top-posting repaired. Please don't top-post!)
>
> I haven't seen any patch which alters hpwdt_pretimeout() and there is
> no such patch in linux-next. Perhaps it got lost?
I need to repost.
Regards,
Bernhard
Hi All,
> We have also verified that returning NOTIFY_OK from the hpwdt_pretimeout routine makes the KDUMP feature work correctly.
>
> Bernhard pointed that out a week or so ago and we have since verified it.
I'll create the patches for the linux-2.6-watchdog trees tonight.
Kind regards,
Wim.
Hi All,
> * Andrew Morton [2008-11-19 09:30]:
> > >
> >
> > (top-posting repaired. Please don't top-post!)
> >
> > I haven't seen any patch which alters hpwdt_pretimeout() and there is
> > no such patch in linux-next. Perhaps it got lost?
This was the patch (see below). It's in the linux-2.6-watchdog-next tree now,
so it should go into the linux-next tree soon.
The other patch is in the linux-2.6-watchdog-next tree also.
Kind regards,
Wim.
----------------------------------------------------------
commit ed22ea64f9efe4531be5130c0f77130c2ad74130
Author: Bernhard Walle <[email protected]>
Date: Sun Oct 26 15:59:37 2008 +0100
[WATCHDOG] hpwdt: Fix kdump when using hpwdt
When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
opened), then kdump does not work. The panic kernel either does not start at
all or crash in various places.
The problem is that hpwdt_pretimeout is registered with register_die_notifier()
with the highest possible priority. Because it returns NOTIFY_STOP, the
crash_nmi_callback which is also registered with register_die_notifier()
is never executed. This causes the shutdown of other CPUs to fail.
Reverting the order is no option: The crash_nmi_callback executes HLT
and so never returns normally. Because of that, it must be executed as
last notifier, which currently is done.
So, that patch returns NOTIFY_OK in case allow_kdump is set as module parameter
in the hpwdt module. Also, it changes the default of allow_kdump to 1. Kdump is
quite common and should be working as default.
Signed-off-by: Bernhard Walle <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
Signed-off-by: Thomas Mingarelli <[email protected]>
Cc: Vivek Goyal <[email protected]>
diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index f6cff7b..e83e1ac 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -117,7 +117,7 @@ static unsigned int reload; /* the computed soft_margin */
static int nowayout = WATCHDOG_NOWAYOUT;
static char expect_release;
static unsigned long hpwdt_is_open;
-static unsigned int allow_kdump;
+static unsigned int allow_kdump = 1;
static void __iomem *pci_mem_addr; /* the PCI-memory address */
static unsigned long __iomem *hpwdt_timer_reg;
@@ -485,7 +485,11 @@ static int hpwdt_pretimeout(struct notifier_block *nb, unsigned long ulReason,
"Management Log for details.\n");
}
- return NOTIFY_STOP;
+ /*
+ * for kdump, we must return NOTIFY_OK here to execute the
+ * crash_nmi_callback afterwards, see arch/x86/kernel/crash.c
+ */
+ return allow_kdump ? NOTIFY_OK : NOTIFY_STOP;
}
/*
* Wim Van Sebroeck [2008-11-20 00:00]:
>
> Hi All,
>
> > * Andrew Morton [2008-11-19 09:30]:
> > > >
> > >
> > > (top-posting repaired. Please don't top-post!)
> > >
> > > I haven't seen any patch which alters hpwdt_pretimeout() and there is
> > > no such patch in linux-next. Perhaps it got lost?
>
> This was the patch (see below). It's in the linux-2.6-watchdog-next tree now,
> so it should go into the linux-next tree soon.
>
> The other patch is in the linux-2.6-watchdog-next tree also.
Wasn't the conclusion that NOTIFY_OK always works and we should not
rely on that 'allow_kdump' option?
Regards,
Bernhard
>
> static void __iomem *pci_mem_addr; /* the PCI-memory address */
> static unsigned long __iomem *hpwdt_timer_reg;
> @@ -485,7 +485,11 @@ static int hpwdt_pretimeout(struct notifier_block *nb, unsigned long ulReason,
> "Management Log for details.\n");
> }
>
> - return NOTIFY_STOP;
> + /*
> + * for kdump, we must return NOTIFY_OK here to execute the
> + * crash_nmi_callback afterwards, see arch/x86/kernel/crash.c
> + */
> + return allow_kdump ? NOTIFY_OK : NOTIFY_STOP;
> }
>
> /*
Hi Bernhard,
> Wasn't the conclusion that NOTIFY_OK always works and we should not
> rely on that 'allow_kdump' option?
Wasn't completely clear to me, but it would indeed be much cleaner.
Hi Tom,
What did you test exactly?
Thanks in advance,
Wim.
I tested changing the return value to NOTIFY_OK always.
-----Original Message-----
From: Wim Van Sebroeck [mailto:[email protected]]
Sent: Wednesday, November 19, 2008 5:11 PM
To: Bernhard Walle; Mingarelli, Thomas
Cc: Andrew Morton; [email protected]; [email protected]
Subject: Re: [PATCH] [WATCHDOG] [hpwdt] Set the mapped BIOS address space as executable
Hi Bernhard,
> Wasn't the conclusion that NOTIFY_OK always works and we should not
> rely on that 'allow_kdump' option?
Wasn't completely clear to me, but it would indeed be much cleaner.
Hi Tom,
What did you test exactly?
Thanks in advance,
Wim.
Hi All,
> I tested changing the return value to NOTIFY_OK always.
Splendid. I changed the patch (see below).
Can you check if all is ok now, so that I can sent them off for mainline inclusion?
Kind regards,
Wim.
-----------------------------------------------------------------------------------------
commit 1adecc3d59d01ac7ffe9ab695cbd7311ab50198e
Author: Bernhard Walle <[email protected]>
Date: Sun Oct 26 15:59:37 2008 +0100
[WATCHDOG] hpwdt: Fix kdump when using hpwdt
When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
opened), then kdump does not work. The panic kernel either does not start at
all or crash in various places.
The problem is that hpwdt_pretimeout is registered with register_die_notifier()
with the highest possible priority. Because it returns NOTIFY_STOP, the
crash_nmi_callback which is also registered with register_die_notifier()
is never executed. This causes the shutdown of other CPUs to fail.
Reverting the order is no option: The crash_nmi_callback executes HLT
and so never returns normally. Because of that, it must be executed as
last notifier, which currently is done.
So, this patch returns NOTIFY_OK instead of NOTIFY_STOP.
Also, it changes the default of allow_kdump to 1.
Kdump is quite common and should be working as default.
Signed-off-by: Bernhard Walle <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
Signed-off-by: Thomas Mingarelli <[email protected]>
Cc: Vivek Goyal <[email protected]>
diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index f6cff7b..8900989 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -117,7 +117,7 @@ static unsigned int reload; /* the computed soft_margin */
static int nowayout = WATCHDOG_NOWAYOUT;
static char expect_release;
static unsigned long hpwdt_is_open;
-static unsigned int allow_kdump;
+static unsigned int allow_kdump = 1;
static void __iomem *pci_mem_addr; /* the PCI-memory address */
static unsigned long __iomem *hpwdt_timer_reg;
@@ -485,7 +485,7 @@ static int hpwdt_pretimeout(struct notifier_block *nb, unsigned long ulReason,
"Management Log for details.\n");
}
- return NOTIFY_STOP;
+ return NOTIFY_OK;
}
/*
* Wim Van Sebroeck [2008-11-20 21:24]:
>
> > I tested changing the return value to NOTIFY_OK always.
>
> Splendid. I changed the patch (see below).
> Can you check if all is ok now, so that I can sent them off for mainline inclusion?
Sorry for my late reply, I'm really busy these days. :-(
In that case we don't need to change the allow_kdump variable. I send a
patch with corrected patch description just afterwards.
Regards,
Bernhard