2005-05-08 20:37:40

by Ralf Hildebrandt

Subject: kexec?

I know kexec used to be a patch, but has it gone into the mainstream
kernels yet?

--
Ralf Hildebrandt (i.A. des IT-Zentrums) [email protected]
Charite - Universitätsmedizin Berlin Tel. +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin Fax. +49 (0)30-450 570-962
IT-Zentrum Standort CBF send no mail to [email protected]


2005-05-10 01:34:57

by Randy.Dunlap

Subject: Re: kexec?

On Sun, 8 May 2005 22:20:50 +0200 Ralf Hildebrandt wrote:

| I know kexec used to be a patch, but has it gone into the mainstream
| kernels yet?

Nope, it's only in the -mm patchset.
Testing/reporting that could help....

---
~Randy

2005-05-10 07:00:51

by Coywolf Qi Hunt

Subject: Re: kexec?

On 5/10/05, Randy.Dunlap <[email protected]> wrote:
> On Sun, 8 May 2005 22:20:50 +0200 Ralf Hildebrandt wrote:
>
> | I know kexec used to be a patch, but has it gone into the mainstream
> | kernels yet?
>
> Nope, it's only in the -mm patchset.
> Testing/reporting that could help....

coywolf@prodigy:~/kexec-tools-1.95/objdir/build/bin$ ./kexec_test
Segmentation fault

prodigy:/home/coywolf/kexec-tools-1.95/objdir/build/sbin# ./kexec -l
/var/local/build/vmlinux
kexec_load failed: Cannot assign requested address
entry = (nil)
nr_segments = 2
segment[0].buf = 0x80b4558
segment[0].bufsz = 15c
segment[0].mem = (nil)
segment[0].memsz = 15c
segment[1].buf = 0xb7d53008
segment[1].bufsz = 2a0086
segment[1].mem = 0x100000
segment[1].memsz = 2c8a78

prodigy:/home/coywolf/kexec-tools-1.95/objdir/build/sbin# ./kexec -l
/var/local/build/arch/i386/boot/bzImage
kexec_load failed: Cannot assign requested address
entry = 0x91734
nr_segments = 2
segment[0].buf = 0x80b4480
segment[0].bufsz = 1850
segment[0].mem = 0x90000
segment[0].memsz = 1850
segment[1].buf = 0xb7eaa008
segment[1].bufsz = 14032d
segment[1].mem = 0x100000
segment[1].memsz = 14032d

--
Coywolf Qi Hunt
http://sosdg.org/~coywolf/
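
One possible reading of the dumps above, not confirmed in this thread: kexec_load(2) documents EADDRNOTAVAIL ("Cannot assign requested address") for a segment whose mem or memsz is not a multiple of the page size. The sketch below assumes 4 KiB pages on i386 and uses the values from the bzImage dump; `unaligned_fields` is a hypothetical helper, not part of kexec-tools.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on i386

# (mem, memsz) pairs copied from the bzImage dump above; values are hexadecimal.
BZIMAGE_SEGMENTS = [
    (0x90000, 0x1850),
    (0x100000, 0x14032d),
]


def unaligned_fields(segments, page_size=PAGE_SIZE):
    """Return (index, field, value) for every mem/memsz that is not page-aligned."""
    problems = []
    for index, (mem, memsz) in enumerate(segments):
        for field, value in (("mem", mem), ("memsz", memsz)):
            if value % page_size != 0:
                problems.append((index, field, value))
    return problems


if __name__ == "__main__":
    for index, field, value in unaligned_fields(BZIMAGE_SEGMENTS):
        print(f"segment[{index}].{field} = {value:#x} is not a multiple of {PAGE_SIZE}")
```

Both memsz values in that dump fail the check, which would fit a mismatch between the userspace tools and the kernel interface, but that remains a guess.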

2005-05-10 12:30:59

by Borislav Petkov

Subject: Re: kexec?

On Tuesday 10 May 2005 09:00, Coywolf Qi Hunt wrote:
> On 5/10/05, Randy.Dunlap <[email protected]> wrote:
> > On Sun, 8 May 2005 22:20:50 +0200 Ralf Hildebrandt wrote:
> > | I know kexec used to be a patch, but has it gone into the mainstream
> > | kernels yet?
> >
> > Nope, it's only in the -mm patchset.
> > Testing/reporting that could help....
>
> coywolf@prodigy:~/kexec-tools-1.95/objdir/build/bin$ ./kexec_test
> Segmentation fault
>
> prodigy:/home/coywolf/kexec-tools-1.95/objdir/build/sbin# ./kexec -l
> /var/local/build/vmlinux
> kexec_load failed: Cannot assign requested address
> entry = (nil)
> nr_segments = 2
> segment[0].buf = 0x80b4558
> segment[0].bufsz = 15c
> segment[0].mem = (nil)
> segment[0].memsz = 15c
> segment[1].buf = 0xb7d53008
> segment[1].bufsz = 2a0086
> segment[1].mem = 0x100000
> segment[1].memsz = 2c8a78
>
> prodigy:/home/coywolf/kexec-tools-1.95/objdir/build/sbin# ./kexec -l
> /var/local/build/arch/i386/boot/bzImage
> kexec_load failed: Cannot assign requested address
> entry = 0x91734
> nr_segments = 2
> segment[0].buf = 0x80b4480
> segment[0].bufsz = 1850
> segment[0].mem = 0x90000
> segment[0].memsz = 1850
> segment[1].buf = 0xb7eaa008
> segment[1].bufsz = 14032d
> segment[1].mem = 0x100000
> segment[1].memsz = 14032d

Hi,

I've been doing some kexec tests (as described in Documentation/kdump.txt) too,
but I can't get the image to load and I get similar error messages. Let me know
if you need more info about the hardware. The first kernel was booted with
"crashkernel=64M@16M", and the 16M value was configured into the second kernel
via the Kconfig option "Physical address where the kernel is loaded" as 0x1000000.

[root@zmei]: kexec -p vmlinux --args-linux --append="root=/dev/hda1 maxcpus=1 init 1"

kexec_load failed: Cannot assign requested address
entry = 0x1498 flags = 1
nr_segments = 4
segment[0].buf = 0x8067ba0
segment[0].bufsz = 80c0
segment[0].mem = 0x1000
segment[0].memsz = a000
segment[1].buf = 0x806fd80
segment[1].bufsz = 1000
segment[1].mem = 0xb000
segment[1].memsz = 1000
segment[2].buf = 0xb6dbc008
segment[2].bufsz = 3d619c
segment[2].mem = 0x1000000
segment[2].memsz = 3d7000
segment[3].buf = 0xb7193008
segment[3].bufsz = 2b086
segment[3].mem = 0x13d8000
segment[3].memsz = 55000
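
A hedged sketch of one check that could explain this failure: kexec_load(2) documents EADDRNOTAVAIL for a crash load (the dump shows flags = 1, which should correspond to KEXEC_ON_CRASH) when a segment lies outside the memory reserved for the crash kernel. The helper names below are hypothetical, and only the simple "size@offset" form with K/M/G suffixes is handled.

```python
import re

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_crashkernel(value):
    """Parse 'size@offset' (e.g. '64M@16M') into (start, end) byte addresses."""
    match = re.fullmatch(r"(\d+)([KMG])@(\d+)([KMG])", value)
    if match is None:
        raise ValueError(f"unsupported crashkernel value: {value!r}")
    size = int(match.group(1)) * UNITS[match.group(2)]
    start = int(match.group(3)) * UNITS[match.group(4)]
    return start, start + size


def segments_outside(segments, region):
    """Return the indexes of segments not fully inside [start, end)."""
    start, end = region
    return [
        index
        for index, (mem, memsz) in enumerate(segments)
        if mem < start or mem + memsz > end
    ]


# (mem, memsz) pairs copied from the failed `kexec -p` dump above.
PANIC_SEGMENTS = [
    (0x1000, 0xa000),
    (0xb000, 0x1000),
    (0x1000000, 0x3d7000),
    (0x13d8000, 0x55000),
]

if __name__ == "__main__":
    region = parse_crashkernel("64M@16M")
    print("reserved:", [hex(address) for address in region])
    print("segments outside the reservation:", segments_outside(PANIC_SEGMENTS, region))
```

With these values, segments 0 and 1 (at 0x1000 and 0xb000) fall below the 16M start of the reservation, while segments 2 and 3 fit inside it.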



2005-05-10 13:11:35

by Coywolf Qi Hunt

Subject: Re: kexec?

On 5/10/05, Borislav Petkov <[email protected]> wrote:
> I've been doing some kexec tests (as described in Documentation/kdump.txt) too
> but can't get to load the image and get similar error messages. Let me know
> if you need more info about the hardware. The first_kernel was booted with
> "crashkernel=64M@16M" and the 16M value was configured into the second during
> kconfig in "Physical address where the kernel is loaded" as 0x1000000.
>
> [root@zmei]: kexec -p vmlinux --args-linux --append="root=/dev/hda1 maxcpus=1
> init 1"

kexec-tools-1.101 loads for me, but if cmdline is used, it hangs up
after "Starting new kernel"

--
Coywolf Qi Hunt
http://sosdg.org/~coywolf/

2005-05-11 06:06:07

by Maneesh Soni

Subject: Re: kexec?

On Tue, May 10, 2005 at 09:11:31PM +0800, Coywolf Qi Hunt wrote:
> On 5/10/05, Borislav Petkov <[email protected]> wrote:
> > I've been doing some kexec tests (as described in Documentation/kdump.txt) too
> > but can't get to load the image and get similar error messages. Let me know
> > if you need more info about the hardware. The first_kernel was booted with
> > "crashkernel=64M@16M" and the 16M value was configured into the second during
> > kconfig in "Physical address where the kernel is loaded" as 0x1000000.
> >
> > [root@zmei]: kexec -p vmlinux --args-linux --append="root=/dev/hda1 maxcpus=1
> > init 1"
>
> kexec-tools-1.101 loads for me, but if cmdline is used, it hangs up
> after "Starting new kernel"

Thanks for trying this out. As Vivek mentioned, can you please try building the
second (dump capture) kernel with CONFIG_SMP=N and _without_ the maxcpus= option?
Basically, the second kernel's job is just to save the dump, so it does not need
to be an SMP kernel. There are some issues with booting an SMP kernel as the dump
capture kernel.
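
The advice above can be turned into a quick pre-flight check. This is only a sketch: the function names are hypothetical, and it assumes the capture kernel's .config text and the intended --append string are available as plain strings.

```python
def smp_enabled(config_text):
    """True if the kernel .config text contains the line CONFIG_SMP=y."""
    return any(line.strip() == "CONFIG_SMP=y" for line in config_text.splitlines())


def has_maxcpus(cmdline):
    """True if the command line passes a maxcpus= option."""
    return any(arg.startswith("maxcpus=") for arg in cmdline.split())


def capture_kernel_warnings(config_text, cmdline):
    """Collect warnings about settings the advice above says to avoid."""
    warnings = []
    if smp_enabled(config_text):
        warnings.append("capture kernel is built with CONFIG_SMP=y")
    if has_maxcpus(cmdline):
        warnings.append("command line still passes maxcpus=")
    return warnings


if __name__ == "__main__":
    sample_config = "# CONFIG_SMP is not set\n"
    print(capture_kernel_warnings(sample_config, "root=/dev/hda1 maxcpus=1 init 1"))
```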

Also, it would be a great help if you could send us some hardware details
about the system you are testing on, such as lspci and /proc/cpuinfo output,
and which kernel you tried. I am maintaining a web page that consolidates the
kexec/kdump test reports at
http://lse.sourceforge.net/kdump/kdump-test.html

Thanks
Maneesh



--
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: [email protected]
Phone: 91-80-25044990

2005-05-11 11:53:56

by Borislav Petkov

Subject: Re: kexec?

On Wednesday 11 May 2005 08:04, Maneesh Soni wrote:
<snip>
> > > [root@zmei]: kexec -p vmlinux --args-linux --append="root=/dev/hda1
> > > maxcpus=1 init 1"
> >
> > kexec-tools-1.101 loads for me, but if cmdline is used, it hangs up
> > after "Starting new kernel"
>
> Thanks for trying this out. As Vivek mentioned can you please try with
> building second or dump capture kernel with CONFIG_SMP=N and _without_
> maxcpus= option. Basically the second kernel's job is just to save the dump
> and it does not need to be an SMP kernel. There are some issues with booting
> SMP kernel as dump capture kernel.

Hm, without 'maxcpus' it seems to work. However, when booting into the new
kernel, the rootfs had to be fsck'ed due to "/ was not cleanly unmounted,
check forced.", and then I was forced to reboot Linux because of
inconsistencies in the fs. I simply did kexec -l <vmlinux> --args-linux
--append="root=/dev/hda1 init 1" and then kexec -e to execute the loaded
image. It seems that the filesystems are not unmounted properly before the
second kernel is booted (or I am missing something, which is more likely :))

> Also, it would be great help if you can also send us some hardware details
> about the system you are trying, like lspci,
[root@zmei] lspci -vv
0000:00:00.0 Host bridge: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE DRAM
Controller/Host-Hub Interface (rev 03)
Subsystem: Unknown device 1849:2560
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
Latency: 0
Region 0: Memory at e0000000 (32-bit, prefetchable) [size=64M]
Capabilities: [e4] #09 [6105]
Capabilities: [a0] AGP version 2.0
Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+
AGP3- Rate=x1,x2,x4
Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=<none>

0000:00:01.0 PCI bridge: Intel Corp. 82845G/GL[Brookdale-G]/GE/PE Host-to-AGP
Bridge (rev 03) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32
Bus: primary=00, secondary=01, subordinate=02, sec-latency=32
I/O behind bridge: 0000c000-0000cfff
Memory behind bridge: dfd00000-dfdfffff
Prefetchable memory behind bridge: bfb00000-dfafffff
BridgeCtl: Parity+ SERR- NoISA+ VGA+ MAbort- >Reset- FastB2B-

0000:00:1d.0 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #1 (rev 02) (prog-if 00 [UHCI])
Subsystem: Unknown device 1849:24c0
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 0
Interrupt: pin A routed to IRQ 16
Region 4: I/O ports at e400 [size=32]

0000:00:1d.1 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #2 (rev 02) (prog-if 00 [UHCI])
Subsystem: Unknown device 1849:24c0
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 0
Interrupt: pin B routed to IRQ 19
Region 4: I/O ports at e800 [size=32]

0000:00:1d.2 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
USB UHCI Controller #3 (rev 02) (prog-if 00 [UHCI])
Subsystem: Unknown device 1849:24c0
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 0
Interrupt: pin C routed to IRQ 18
Region 4: I/O ports at ec00 [size=32]

0000:00:1d.7 USB Controller: Intel Corp. 82801DB/DBM (ICH4/ICH4-M) USB 2.0
EHCI Controller (rev 02) (prog-if 20 [EHCI])
Subsystem: Unknown device 1849:24c0
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 0
Interrupt: pin D routed to IRQ 23
Region 0: Memory at dffffc00 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] #0a [2080]

0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev 82) (prog-if 00
[Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR+
Latency: 0
Bus: primary=00, secondary=03, subordinate=03, sec-latency=32
I/O behind bridge: 0000d000-0000dfff
Memory behind bridge: dfe00000-dfefffff
Prefetchable memory behind bridge: dfb00000-dfbfffff
BridgeCtl: Parity- SERR+ NoISA+ VGA- MAbort- >Reset- FastB2B-

0000:00:1f.0 ISA bridge: Intel Corp. 82801DB/DBL (ICH4/ICH4-L) LPC Bridge (rev
02)
Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 0

0000:00:1f.1 IDE interface: Intel Corp. 82801DB/DBL (ICH4/ICH4-L) UltraATA-100
IDE Controller (rev 02) (prog-if 8a [Master SecP PriP])
Subsystem: Unknown device 1849:24c0
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 0
Interrupt: pin A routed to IRQ 18
Region 0: I/O ports at <unassigned>
Region 1: I/O ports at <unassigned>
Region 2: I/O ports at <unassigned>
Region 3: I/O ports at <unassigned>
Region 4: I/O ports at fc00 [size=16]
Region 5: Memory at 20000000 (32-bit, non-prefetchable) [size=1K]

0000:01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon
9200 SE] (rev 01) (prog-if 00 [VGA])
Subsystem: C.P. Technology Co. Ltd CN-AG92E
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32 (2000ns min), Cache Line Size: 0x08 (32 bytes)
Interrupt: pin A routed to IRQ 16
Region 0: Memory at c0000000 (32-bit, prefetchable) [size=256M]
Region 1: I/O ports at c800 [size=256]
Region 2: Memory at dfdf0000 (32-bit, non-prefetchable) [size=64K]
Expansion ROM at dfdc0000 [disabled] [size=128K]
Capabilities: [58] AGP version 2.0
Status: RQ=80 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+
AGP3- Rate=x1,x2,x4
Command: RQ=1 ArqSz=0 Cal=0 SBA+ AGP- GART64- 64bit- FW- Rate=<none>
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

0000:03:04.0 Multimedia audio controller: Yamaha Corporation YMF-724 (rev 05)
Subsystem: Yamaha Corporation YMF724-Based PCI Audio Adapter
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR+
Latency: 32 (1250ns min, 6250ns max)
Interrupt: pin A routed to IRQ 17
Region 0: Memory at dfef8000 (32-bit, non-prefetchable) [size=32K]
Capabilities: [50] Power Management version 1
Flags: PMEClk- DSI- D1- D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

0000:03:06.0 Communication controller: Conexant HSF 56k HSFi Modem (rev 01)
Subsystem: Unknown device 182d:3525
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32
Interrupt: pin A routed to IRQ 3
Region 0: Memory at dfee0000 (32-bit, non-prefetchable) [size=64K]
Region 1: I/O ports at dc00 [size=8]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-

0000:03:07.0 Multimedia video controller: Brooktree Corporation Bt878 Video
Capture (rev 02)
Subsystem: Hauppauge computer works Inc. WinTV Series
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32 (4000ns min, 10000ns max)
Interrupt: pin A routed to IRQ 16
Region 0: Memory at dfbfe000 (32-bit, prefetchable) [size=4K]

0000:03:07.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture
(rev 02)
Subsystem: Hauppauge computer works Inc. WinTV Series
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32 (1000ns min, 63750ns max)
Interrupt: pin A routed to IRQ 11
Region 0: Memory at dfbff000 (32-bit, prefetchable) [size=4K]

0000:03:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)
Subsystem: Unknown device 1849:8139
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32 (8000ns min, 16000ns max)
Interrupt: pin A routed to IRQ 17
Region 0: I/O ports at d800 [size=256]
Region 1: Memory at dfef7f00 (32-bit, non-prefetchable) [size=256]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

[root@zmei]: cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.60GHz
stepping : 9
cpu MHz : 2606.716
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 5216.96

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.60GHz
stepping : 9
cpu MHz : 2606.716
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips : 5211.17

The kernel I used is:
2.6.12-rc3-mm3

Regards,
Boris.

2005-05-11 14:46:19

by Coywolf Qi Hunt

Subject: Re: kexec?

On 5/11/05, Borislav Petkov <[email protected]> wrote:
> On Wednesday 11 May 2005 08:04, Maneesh Soni wrote:
> <snip>
> > > > [root@zmei]: kexec -p vmlinux --args-linux --append="root=/dev/hda1
> > > > maxcpus=1 init 1"
> > >
> > > kexec-tools-1.101 loads for me, but if cmdline is used, it hangs up
> > > after "Starting new kernel"
> >
> > Thanks for trying this out. As Vivek mentioned can you please try with
> > building second or dump capture kernel with CONFIG_SMP=N and _without_
> > maxcpus= option. Basically the second kernel's job is just to save the dump
> > and it does not need to be an SMP kernel. There are some issues with booting
> > SMP kernel as dump capture kernel.
>
> Hm, without 'maxcpus' seems to work. However, when booting into the new
> kernel, the rootfs had to be fsck'ed due to "/ was not cleanly unmounted,
> check forced." and then was forced to reboot linux due to inconsistency in
> the fs. I simply did kexec -l <vmlinux> --args-linux --append="root=/dev/hda1
> init 1" and then kexec -e to execute the loaded image. It seems that the
> filesystems are not unmounted properly before loading the second kernel, (or
> I am missing something..., which is more likely :))

kexec is like a bare reboot: nothing is synced or unmounted for you.
Add kexec -l and -e just above the reboot line in your
/etc/init.d/reboot script,
or sync and unmount manually (sysrq+s, sysrq+u).
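
A minimal dry-run sketch of that ordering, assuming the same kexec options used earlier in the thread. It only prints the commands and never executes them; `clean_kexec_steps` is a hypothetical helper, and a distribution's own reboot script may unmount differently.

```python
import shlex


def clean_kexec_steps(kernel, append):
    """Return the commands for a kexec reboot that syncs and unmounts first."""
    return [
        ["kexec", "-l", kernel, "--args-linux", f"--append={append}"],
        ["sync"],                # flush dirty buffers to disk
        ["umount", "-a", "-r"],  # unmount everything, remounting read-only on failure
        ["kexec", "-e"],         # jump into the loaded kernel
    ]


if __name__ == "__main__":
    for step in clean_kexec_steps("vmlinux", "root=/dev/hda1 init 1"):
        print(shlex.join(step))
```

Loading first and executing last keeps the window between unmounting and the jump as short as possible.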

--
Coywolf Qi Hunt
http://sosdg.org/~coywolf/

2005-05-12 06:42:59

by Maneesh Soni

Subject: Re: kexec?

On Wed, May 11, 2005 at 01:51:41PM +0200, Borislav Petkov wrote:
> On Wednesday 11 May 2005 08:04, Maneesh Soni wrote:
> <snip>
> > > > [root@zmei]: kexec -p vmlinux --args-linux --append="root=/dev/hda1
> > > > maxcpus=1 init 1"
> > >
> > > kexec-tools-1.101 loads for me, but if cmdline is used, it hangs up
> > > after "Starting new kernel"
> >
> > Thanks for trying this out. As Vivek mentioned can you please try with
> > building second or dump capture kernel with CONFIG_SMP=N and _without_
> > maxcpus= option. Basically the second kernel's job is just to save the dump
> > and it does not need to be an SMP kernel. There are some issues with booting
> > SMP kernel as dump capture kernel.
>
> Hm, without 'maxcpus' seems to work. However, when booting into the new
> kernel, the rootfs had to be fsck'ed due to "/ was not cleanly unmounted,
> check forced." and then was forced to reboot linux due to inconsistency in
> the fs. I simply did kexec -l <vmlinux> --args-linux --append="root=/dev/hda1
> init 1" and then kexec -e to execute the loaded image. It seems that the
> filesystems are not unmounted properly before loading the second kernel, (or
> I am missing something..., which is more likely :))
>
> > Also, it would be great help if you can also send us some hardware details
> > about the system you are trying, like lspci,
> [root@zmei] lspci -vv

Thanks Boris, I have updated the kdump test page with details you provided.

http://lse.sourceforge.net/kdump/kdump-test.html

It would be nice if you could also try kdump along similar lines.

Thanks
Maneesh

--
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: [email protected]
Phone: 91-80-25044990

2005-05-16 22:16:47

by Borislav Petkov

Subject: Re: kexec?

<snip>
>
> It will be nice if you could try kdump also on the similar lines.

Hi,

After patching kexec-tools with the kdump patch, here's what I did according to
the test plan:

0. boot the first kernel with crashkernel=64M@16M
1. kexec -p vmlinux --args-linux --append="root=/dev/hda1 init 1" (loads fine)
2. sysrq+c
At this point the system prints "SysRq: Trigger a crashdump" and hangs so hard
that even SysRq is dead.


Regards,
Boris.

2005-05-17 10:02:48

by Vivek Goyal

Subject: Re: kexec?

On Tue, May 17, 2005 at 12:11:43AM +0200, Borislav Petkov wrote:
Hi,

> <snip>
> >
> > It will be nice if you could try kdump also on the similar lines.
>
> HI,
>
> After patching kexec-tools with the kdump patch here's what I did according to
> the test plan:
>
> 0. load kernel with crashkernel=64M@16M
> 1. kexec -p vmlinux --args-linux --append="root=/dev/hda1 init 1" (loads fine)
> 2. sysrq+c
> the system issues here : SysRq: Trigger a crashdump and hangs so that even
> SysRq is dead.
>

Thanks for testing this out. So kexec on panic seems to be hanging. Did you
boot the first kernel with the nmi_watchdog command-line option? We have a
known issue with nmi_watchdog, and I have just posted a patch for it.

Could you please try loading the new kernel with the --console-vga or
--console-serial option (depending on which console you are on) and post
the output?

Could you also post the .config files for both kernels, as well as the serial
console output (if one is set up)? This should help.

Thanks
Vivek
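
The first question can be answered mechanically from the running kernel's command line. A small sketch, with hypothetical function names, that reports whether nmi_watchdog is present and which crashkernel= reservation was requested:

```python
def cmdline_options(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    options = {}
    for arg in cmdline.split():
        name, separator, value = arg.partition("=")
        options[name] = value if separator else None
    return options


def first_kernel_report(cmdline):
    """Summarize the two options relevant to the kexec-on-panic hang."""
    options = cmdline_options(cmdline)
    return {
        "nmi_watchdog": "nmi_watchdog" in options,
        "crashkernel": options.get("crashkernel"),
    }
```

On a running system the input string would be the contents of /proc/cmdline.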



2005-05-18 08:11:02

by Borislav Petkov

Subject: Re: kexec?

On Tuesday 17 May 2005 12:02, Vivek Goyal wrote:
> On Tue, May 17, 2005 at 12:11:43AM +0200, Borislav Petkov wrote:
> Hi,
>
> > <snip>
> >
> > > It will be nice if you could try kdump also on the similar lines.
> >
> > HI,
> >
> > After patching kexec-tools with the kdump patch here's what I did
> > according to the test plan:
> >
> > 0. load kernel with crashkernel=64M@16M
> > 1. kexec -p vmlinux --args-linux --append="root=/dev/hda1 init 1" (loads
> > fine)
> > 2. sysrq+c
> > the system issues here : SysRq: Trigger a crashdump and hangs so that
> > even SysRq is dead.
>
> Thanks for testing this out. So kexec on panic seems to be hanging. Are you
> booted in first kernel with commandline option nmi_watchdog? We have a
> known issue with nmi_watchdog and just now I have posted a patch.

No, my kernel commandline options are: root=/dev/hda1 vga=0
crashkernel=64M@16M

> Could you please try loading the new kernel with --console-vga or
> --console-serial option (depending on what console you are on) and post
> the output.

Same thing happens - total lockup.

> Is it possible to post the .config file for both the kernels as well serial
> console output (if one is setup). This should help.

.configs and console output attached. The serial console output sheds no more
light on the matter, unfortunately. It seems that the lockup happens pretty
early and the kernel suffers an early death without even a last breath of
goodbye :)

Regards,
Boris.


Attachments:
(No filename) (1.43 kB)
1st_config (34.68 kB)
2nd_config (29.47 kB)
minicom.cap (15.92 kB)

2005-05-20 10:53:10

by Vivek Goyal

Subject: Re: kexec?

On Wed, May 18, 2005 at 09:58:03AM +0200, Borislav Petkov wrote:
> On Tuesday 17 May 2005 12:02, Vivek Goyal wrote:
> > On Tue, May 17, 2005 at 12:11:43AM +0200, Borislav Petkov wrote:
> > Hi,
> >
> > > <snip>
> > >
> > > > It will be nice if you could try kdump also on the similar lines.
> > >
> > > HI,
> > >
> > > After patching kexec-tools with the kdump patch here's what I did
> > > according to the test plan:
> > >
> > > 0. load kernel with crashkernel=64M@16M
> > > 1. kexec -p vmlinux --args-linux --append="root=/dev/hda1 init 1" (loads
> > > fine)
> > > 2. sysrq+c
> > > the system issues here : SysRq: Trigger a crashdump and hangs so that
> > > even SysRq is dead.
> >
> > Thanks for testing this out. So kexec on panic seems to be hanging. Are you
> > booted in first kernel with commandline option nmi_watchdog? We have a
> > known issue with nmi_watchdog and just now I have posted a patch.
>
> No, my kernel commandline options are: root=/dev/hda1 vga=0
> crashkernel=64M@16M


Boris, I used your config files and it is working for me. I disabled kgdb
in your first config and enabled serial console output in the second config.


>
> > Could you please try loading the new kernel with --console-vga or
> > --console-serial option (depending on what console you are on) and post
> > the output.
>
> Same thing happens - total lockup.

The second kernel did not have serial console output enabled in its config
file. Could you test once more with the serial console enabled? Maybe also
disable kgdb in the first kernel.

With the --console-serial option given while loading the panic kernel
(kexec -p), I expect at least the following message on the serial console
after SysRq-c:

"I am in purgatory".

It tells me whether the purgatory code started executing or not.

Thanks
Vivek
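
A small sketch of how a captured serial log could be checked for that indicator. The marker strings are taken from this thread and their exact spelling in a real log may differ; `crash_boot_stage` is a hypothetical helper.

```python
PURGATORY_MARKER = "I am in purgatory"
SYSRQ_MARKER = "SysRq: Trigger a crashdump"  # spelling as reported in the thread


def crash_boot_stage(log_text):
    """Classify how far a crash boot got, judging only by the serial log text."""
    if PURGATORY_MARKER in log_text:
        return "purgatory started"
    if SYSRQ_MARKER in log_text:
        return "crash triggered, purgatory not seen"
    return "no crash trigger seen"


if __name__ == "__main__":
    sample = "SysRq: Trigger a crashdump\nI am in purgatory\n"
    print(crash_boot_stage(sample))
```

A log that stops right after the SysRq line suggests the hang happens before purgatory runs, which narrows the problem to the first kernel's crash path.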

2005-05-21 08:21:57

by Borislav Petkov

Subject: Re: kexec?

<snip>
> Second kernel did not have serial console output enabled in config file. Is
> it possible to test it out once again with serial console enabled. May be
> disable kgdb in first kernel.
>
> With --console-serial option enabled while loading panic kernel (kexec -p)
> I am expecting at least following message on serial console after Sysrq-c.
>
> "I am in purgatory".
>
Hi Vivek,

Well, kgdb was the culprit here. After disabling it and booting into
the first kernel, everything worked just fine. I even managed to
save /proc/vmcore successfully, so kdump works too :) Log attached.

Regards,
Boris.
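
For reference, saving the dump from the capture kernel can be as simple as a file copy. The sketch below assumes /proc/vmcore is exported in ELF core format, as the kdump documentation describes; the destination name and the `save_vmcore` helper are hypothetical.

```python
import shutil

ELF_MAGIC = b"\x7fELF"


def save_vmcore(source="/proc/vmcore", destination="vmcore.saved"):
    """Copy the dump to disk and report whether the copy starts with the ELF magic."""
    with open(source, "rb") as src, open(destination, "wb") as dst:
        shutil.copyfileobj(src, dst)
    with open(destination, "rb") as saved:
        return saved.read(4) == ELF_MAGIC
```

A False result would mean the copy does not look like an ELF core file and is worth inspecting before rebooting out of the capture kernel.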


Attachments:
(No filename) (613.00 B)
minicom.cap (31.60 kB)

2005-05-23 06:43:59

by Maneesh Soni

Subject: Re: kexec?

On Sat, May 21, 2005 at 10:20:35AM +0200, Borislav Petkov wrote:
> <snip>
> > Second kernel did not have serial console output enabled in config file. Is
> > it possible to test it out once again with serial console enabled. May be
> > disable kgdb in first kernel.
> >
> > With --console-serial option enabled while loading panic kernel (kexec -p)
> > I am expecting at least following message on serial console after Sysrq-c.
> >
> > "I am in purgatory".
> >
> HI Vivek,
>
> well kgdb was the offending problem here. After disabling it and booting into
> the first kernel, everything worked just fine. I even got to
> save /proc/vmcore successfully so kdump works also :) Log attached.
>
> Regards,
> Boris.

Thanks Boris, I have updated this on the test webpage. I have also noted the
issue of kgdb interactions.

http://lse.sourceforge.net/kdump/kdump-test.html


Thanks
Maneesh

--
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: [email protected]
Phone: 91-80-25044990