2005-04-23 03:30:13

by Vivek Goyal

Subject: Re: [Fastboot] Re: Kdump Testing

Quoting "Eric W. Biederman" <[email protected]>:

> Nagesh Sharyathi <[email protected]> writes:
>
> > Here is the console boot log, before the machine jumps to BIOS
> > after hang during panic kernel boot
>
> Ok thanks. So this is manually triggered with SysRq
> and the kexec part works but the recover kernel simply fails
> to boot.
>
> It looks like that hunk of the ACPI code that messes up maxcpus=1
> needs to be looked at.

I faced a similar issue on one of my machines. A little debugging showed that
the boot CPU sends an INIT IPI to the application processor to wake it up, and
then the boot CPU loses its way and jumps to the BIOS. Strange....

Further, in my case this problem was noticed only if the crash happened on a
non-boot CPU.

It works well with a uniprocessor capture kernel. For the time being that is
sufficient to capture the dump, but it is always a good idea to be able to boot
an SMP kernel as well.


Thanks
Vivek


2005-04-25 12:16:06

by Nagesh Sharyathi

Subject: Re: [Fastboot] Re: Kdump Testing

Vivek Goyal wrote on 23/04/2005 09:00:03:

> Quoting "Eric W. Biederman" <[email protected]>:

> > Nagesh Sharyathi <[email protected]> writes:
> >
> > > Here is the console boot log, before the machine jumps to BIOS
> > > after hang during panic kernel boot
> >
> > Ok thanks. So this is manually triggered with SysRq
> > and the kexec part works but the recover kernel simply fails
> > to boot.
> >
> > It looks like that hunk of the ACPI code that messes up maxcpus=1
> > needs to be looked at.

> It works well with a uniprocessor capture kernel. For the time being that is
> sufficient to capture the dump, but it is always a good idea to be able to
> boot an SMP kernel as well.
>
> Vivek

I verified this on my machine, where kdump used to fail earlier. After
disabling CONFIG_SMP (i.e. CONFIG_SMP=n), the crash kernel boots properly and
I am able to take the memory dump.

Regards
Sharyathi

2005-04-25 23:09:53

by Randy.Dunlap

Subject: Re: [Fastboot] Re: Kdump Testing

On Mon, 25 Apr 2005 17:45:43 +0530
Nagesh Sharyathi <[email protected]> wrote:

> Vivek Goyal wrote on 23/04/2005 09:00:03:
>
> > Quoting "Eric W. Biederman" <[email protected]>:
>
> > > Nagesh Sharyathi <[email protected]> writes:
> > >
> > > > Here is the console boot log, before the machine jumps to BIOS
> > > > after hang during panic kernel boot
> > >
> > > Ok thanks. So this is manually triggered with SysRq
> > > and the kexec part works but the recover kernel simply fails
> > > to boot.
> > >
> > > It looks like that hunk of the ACPI code that messes up maxcpus=1
> > > needs to be looked at.
>
> > It works well with a uniprocessor capture kernel. For the time being that is
> > sufficient to capture the dump, but it is always a good idea to be able to
> > boot an SMP kernel as well.
> >
> > Vivek
> I verified this on my machine, where kdump used to fail earlier. After
> disabling CONFIG_SMP (i.e. CONFIG_SMP=n), the crash kernel boots properly and
> I am able to take the memory dump.


Thanks for those hints. However, my testing didn't go quite
as well as that.


2.6.12-rc2-mm3 reboots vmlinux-recover-UP on panic.
(vmlinux-recover-SMP hangs during [early] reboot, but -UP
goes further....)

(BTW, how do I do serial console from the second
kernel...? It has the drivers, but not the command
line info? TBD.)

vmlinux-recover-UP gets to this point, hand-written,
several lines missing:

kfree_debugcheck: bad ptr c3dbffb0h. ( == %esi)
kernel BUG at <bad filename>:23128!
invalid operand: 0000 [#1]
DEBUG_PAGEALLOC
EIP is at kfree_debugcheck+0x45/0x50

Stack dump shows lots of ext3 cache and inode functions...

On a dual-proc P4 with 1 GB RAM.
--
~Randy

2005-04-26 08:54:53

by Vivek Goyal

Subject: Re: [Fastboot] Re: Kdump Testing

>
> 2.6.12-rc2-mm3 reboots vmlinux-recover-UP on panic.
> (vmlinux-recover-SMP hangs during [early] reboot, but -UP
> goes further....)
>
> (BTW, how does I do serial console from the second
> kernel...? It has the drivers, but not the command
> line info? TBD.)
>


While pre-loading the capture kernel using kexec, you can specify the command
line options for the second kernel using --append="". You must already be
passing the root device. Add your serial console parameters as well, something
like --append="console=ttyS0,38400"
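
As a rough sketch of the advice above, the snippet below assembles a kexec
command line that pre-loads a capture kernel with both the root device and a
serial console in --append. It assumes a kexec-tools build that accepts -p
(load a panic kernel) and --append. The helper name build_kexec_load_command
is hypothetical, and the kernel image and root device are simply taken from
names mentioned elsewhere in this thread. The console value is written without
a space after the comma, since kernel parameters are split on whitespace.

```python
import shlex


def build_kexec_load_command(kernel_image, root_device, console="ttyS0,38400"):
    """Return argv for pre-loading a capture kernel (hypothetical helper).

    Assumes kexec-tools understands -p and --append; other options that a
    particular kexec-tools version may require are deliberately left out.
    """
    # Everything the second kernel should see goes into one --append string.
    append = f"root={root_device} console={console}"
    return ["kexec", "-p", kernel_image, f"--append={append}"]


# Illustrative values only: image name and root device as seen in this thread.
argv = build_kexec_load_command("vmlinux-recover-UP", "/dev/hda9")

# Print a shell-quoted form that could be pasted into a terminal.
print(shlex.join(argv))
```

The command is only printed, not executed, so it can be reviewed and adjusted
for the local kexec-tools version before use.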


> vmlinux-recover-UP gets to this point, hand-written,
> several lines missing:
>
> kfree_debugcheck: bad ptr c3dbffb0h. ( == %esi)
> kernel BUG at <bad filename>:23128!
> invalid operand: 0000 [#1]
> DEBUG_PAGEALLOC
> EIP is at kfree_debugcheck+0x45/0x50
>
> Stack dump shows lots of ext3 cache and inode functions...
>

Can you post a full serial console output of second kernel? That would help.

Thanks
Vivek

2005-04-27 16:46:38

by Randy.Dunlap

Subject: Re: [Fastboot] Re: Kdump Testing

On Tue, 26 Apr 2005 14:24:48 +0530
Vivek Goyal <[email protected]> wrote:

> >
> > 2.6.12-rc2-mm3 reboots vmlinux-recover-UP on panic.
> > (vmlinux-recover-SMP hangs during [early] reboot, but -UP
> > goes further....)
> >
> > (BTW, how does I do serial console from the second
> > kernel...? It has the drivers, but not the command
> > line info? TBD.)
> >
>
>
> While pre-loading the capture kernel using kexec, you can specify the command
> line options to second kernel using --append="". You must already be passing
> the root device. Add your serial console parameters as well, something like
> --append="console=ttyS0,38400"

Yes, that's what I was planning to try anyway, thanks for the
confirmation. Finally got it working.


> > vmlinux-recover-UP gets to this point, hand-written,
> > several lines missing:
> >
> > kfree_debugcheck: bad ptr c3dbffb0h. ( == %esi)
> > kernel BUG at <bad filename>:23128!
> > invalid operand: 0000 [#1]
> > DEBUG_PAGEALLOC
> > EIP is at kfree_debugcheck+0x45/0x50
> >
> > Stack dump shows lots of ext3 cache and inode functions...
> >
>
> Can you post a full serial console output of second kernel? That would help.

Here:

Linux version 2.6.12-rc2-mm3 (rddunlap@gargoyle) (gcc version 3.3.3 (SuSE Linux)) #25 Tue Apr 26 17:52:39 PDT 2005
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000100 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
user-defined physical RAM map:
user: 0000000000000000 - 00000000000a0000 (usable)
user: 0000000001000000 - 000000000144d000 (usable)
user: 00000000014ed400 - 0000000005000000 (usable)
0MB HIGHMEM available.
80MB LOWMEM available.
DMI 2.3 present.
Allocating PCI resources starting at 05000000 (gap: 05000000:fb000000)
Built 1 zonelists
Initializing CPU#0
Kernel command line: root=/dev/hda9 nosmp console=ttyS0,115200n8 console=tty0 init 1 memmap=exactmap memmap=640K@0K memmap=4404K@16384K memmap=60491K@21429K elfcorehdr=21428K
PID hash table entries: 512 (order: 9, 8192 bytes)
Detected 1685.910 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Unknown interrupt or fault at EIP 00000246 00000060 c13d6653 [*1]
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 59468k/81920k available (2561k kernel code, 5956k reserved, 1311k data, 220k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.

---
[1] c13d6653 is vfs_caches_init_early
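
As a cross-check of the log above, the memmap= options on the kernel command
line can be converted to byte ranges and compared with the "user-defined
physical RAM map" lines. This is a simplified sketch, not the kernel's own
parser: the helper names are made up for illustration and only the
size@start form with K/M/G suffixes is handled.

```python
import re

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def to_bytes(text):
    """Convert a size such as '4404K' to bytes (decimal digits + optional suffix)."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text)
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit, 1)


def usable_ranges(cmdline):
    """Return (start, end) byte ranges for every memmap=size@start option."""
    ranges = []
    for word in cmdline.split():
        if word.startswith("memmap=") and "@" in word:
            size, start = word[len("memmap="):].split("@")
            base = to_bytes(start)
            ranges.append((base, base + to_bytes(size)))
    return ranges


# memmap/elfcorehdr portion of the command line from the log above.
CMDLINE = ("memmap=exactmap memmap=640K@0K memmap=4404K@16384K "
           "memmap=60491K@21429K elfcorehdr=21428K")

for start, end in usable_ranges(CMDLINE):
    print(f"user: {start:016x} - {end:016x} (usable)")
```

The three ranges come out as 0x0-0xa0000, 0x1000000-0x144d000 and
0x14ed400-0x5000000, matching the "user:" lines in the log. elfcorehdr=21428K
sits 1K below the start of the last range.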

2005-04-27 19:23:48

by Randy.Dunlap

Subject: Re: [Fastboot] Re: Kdump Testing

On Tue, 26 Apr 2005 14:24:48 +0530
Vivek Goyal <[email protected]> wrote:

> >
> > 2.6.12-rc2-mm3 reboots vmlinux-recover-UP on panic.
> > (vmlinux-recover-SMP hangs during [early] reboot, but -UP
> > goes further....)
> >
> > (BTW, how does I do serial console from the second
> > kernel...? It has the drivers, but not the command
> > line info? TBD.)
> >
>
>
> While pre-loading the capture kernel using kexec, you can specify the command
> line options to second kernel using --append="". You must already be passing
> the root device. Add your serial console parameters as well, something like
> --append="console=ttyS0,38400"
>
>
> > vmlinux-recover-UP gets to this point, hand-written,
> > several lines missing:
> >
> > kfree_debugcheck: bad ptr c3dbffb0h. ( == %esi)
> > kernel BUG at <bad filename>:23128!
> > invalid operand: 0000 [#1]
> > DEBUG_PAGEALLOC
> > EIP is at kfree_debugcheck+0x45/0x50
> >
> > Stack dump shows lots of ext3 cache and inode functions...
> >
>
> Can you post a full serial console output of second kernel? That would help.

I did another test run, same kernels (both running and recovery).
The recovery kernel got a little further this time, still had
Badness and a BUG.

---

Kernel panic - not syncing: crashtest
Linux version 2.6.12-rc2-mm3 (rddunlap@gargoyle) (gcc version 3.3.3 (SuSE Linux)) #25 Tue Apr 26 17:52:39 PDT 2005
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000100 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
user-defined physical RAM map:
user: 0000000000000000 - 00000000000a0000 (usable)
user: 0000000001000000 - 000000000144d000 (usable)
user: 00000000014ed400 - 0000000005000000 (usable)
0MB HIGHMEM available.
80MB LOWMEM available.
DMI 2.3 present.
Allocating PCI resources starting at 05000000 (gap: 05000000:fb000000)
Built 1 zonelists
Initializing CPU#0
Kernel command line: root=/dev/hda9 nosmp console=ttyS0,115200n8 console=tty0 init 1 memmap=exactmap memmap=640K@0K memmap=4404K@16384K memmap=60491K@21429K elfcorehdr=21428K
PID hash table entries: 512 (order: 9, 8192 bytes)
Detected 1685.983 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Unknown interrupt or fault at EIP 00000246 00000060 c13d6653
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Memory: 59468k/81920k available (2561k kernel code, 5956k reserved, 1311k data, 220k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Mount-cache hash table entries: 512
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 256K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU: Intel(R) Xeon(TM) CPU 1.70GHz stepping 02
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
softlockup thread 0 started up.
NET: Registered protocol family 16
EISA bus registered
PCI: PCI BIOS revision 2.10 entry at 0xfb110, last bus=4
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
Linux Plug and Play Support v0.97 (c) Adam Belay
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Using IRQ router PIIX/ICH [8086/2440] at 0000:00:1f.0
fscache: general fs caching registered
CacheFS: general fs caching v0.1 registered
inotify device minor=63
Initializing Cryptographic API
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
lp: driver loaded but no devices found
Real Time Clock Driver v1.12
Non-volatile memory driver v1.2
Software Watchdog Timer: 0.07 initialized. soft_noboot=0 soft_margin=60 sec (nowayout= 0)
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected an Intel i860 Chipset.
agpgart: AGP aperture is 64M @ 0xe8000000
Hangcheck: starting hangcheck timer 0.5.0 (tick is 180 seconds, margin is 60 seconds).
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP]
parport0: irq 7 detected
lp0: using parport0 (polling).
lp0: console ready
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
loop: loaded (max 8 devices)
pktcdvd: v0.2.0a 2004-07-14 Jens Axboe ([email protected]) and [email protected]
Intel(R) PRO/1000 Network Driver - version 5.7.6-k2
Copyright (c) 1999-2004 Intel Corporation.
e100: Intel(R) PRO/100 Network Driver, 3.3.6-k2-NAPI
e100: Copyright(c) 1999-2004 Intel Corporation
PCI: Found IRQ 10 for device 0000:04:04.0
e100: eth0: e100_probe: addr 0xf4020000, irq 10, MAC addr 00:02:55:1A:35:D4
Linux video capture interface: v1.00
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH2: IDE controller at PCI slot 0000:00:1f.1
ICH2: chipset revision 4
ICH2: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
hda: ST3160023A, ATA DISK drive
hdb: ST3160023A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: LTN486S, ATAPI CD/DVD-ROM drive
hdd: SONY CD-RW CRX140E, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 1024KiB
hda: 312581808 sectors (160041 MB) w/8192KiB Cache, CHS=19457/255/63, UDMA(100)
hda: cache flushes supported
hda: hda1 < hda5 hda6 hda7 hda8 hda9 >
hdb: max request size: 1024KiB
hdb: 312581808 sectors (160041 MB) w/8192KiB Cache, CHS=19457/255/63, UDMA(100)
hdb: cache flushes supported
hdb: hdb1 hdb2 hdb3 hdb4
hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
hdd: ATAPI 32X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33)
PCI: Enabling device 0000:03:01.0 (0006 -> 0007)
PCI: Found IRQ 11 for device 0000:03:01.0
PCI: Sharing IRQ 11 with 0000:03:01.1
PCI: Enabling device 0000:03:01.1 (0006 -> 0007)
PCI: Found IRQ 11 for device 0000:03:01.1
PCI: Sharing IRQ 11 with 0000:03:01.0
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec aic7899 Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec aic7899 Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs

scsi2 : scsi_debug, version 1.75 [20050113], dev_size_mb=8, opts=0x0
Vendor: Linux Model: scsi_debug Rev: 0004
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 16384 512-byte hdwr sectors (8 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 16384 512-byte hdwr sectors (8 MB)
SCSI device sda: drive cache: write back
sda: unknown partition table
Attached scsi disk sda at scsi2, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi2, channel 0, id 0, lun 0, type 0
SCSI Media Changer driver v0.24
USB Universal Host Controller Interface driver v2.2
PCI: Found IRQ 11 for device 0000:00:1f.2
uhci_hcd 0000:00:1f.2: Intel Corporation 82801BA/BAM USB (Hub #1)
uhci_hcd 0000:00:1f.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:1f.2: irq 11, io base 0x0000b000
uhci_hcd 0000:00:1f.2: detected 2 ports
usb usb1: Product: Intel Corporation 82801BA/BAM USB (Hub #1)
usb usb1: Manufacturer: Linux 2.6.12-rc2-mm3 uhci_hcd
usb usb1: SerialNumber: 0000:00:1f.2
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.01:USB HID core driver
mice: PS/2 mouse device common for all mice
input: PC Speaker
i2c /dev entries driver
EISA: Probing bus 0 at eisa.0
Cannot allocate resource for EISA slot 4
Cannot allocate resource for EISA slot 5
EISA: Detected 0 cards.
Advanced Linux Sound Architecture Driver Version 1.0.9rc2 (Thu Mar 24 10:33:39 2005 UTC).
PCI: Found IRQ 11 for device 0000:00:1f.5
PCI: Sharing IRQ 11 with 0000:00:1f.3
input: AT Translated Set 2 keyboard on isa0060/serio0
intel8x0_measure_ac97_clock: measured 49559 usecs
intel8x0: clocking to 48000
ALSA device list:
#0: Intel 82801BA-ICH2 with AD1885 at 0xb800, irq 11
NET: Registered protocol family 26
NET: Registered protocol family 2
IP: routing cache hash table of 128 buckets, 4Kbytes
TCP established hash table entries: 4096 (order: 3, 32768 bytes)
TCP bind hash table entries: 4096 (order: 4, 114688 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
NET: Registered protocol family 1
NET: Registered protocol family 17
CacheFS: Wrong magic number on cache
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 220k freed
Adding 2104472k swap on /dev/hda7. Priority:42 extents:1
mismatch in kmem_cache_free: expected cache c168fc80, got c4daca80
c4daca80 is ext3_inode_cache.
c168fc80 is skbuff_head_cache.
Badness in cache_free_debugcheck at mm/slab.c:1917
[<c1003368>] dump_stack+0x16/0x18
[<c1041a94>] cache_free_debugcheck+0x88/0x1d5
[<c10424fd>] kmem_cache_free+0x26/0x65
[<c10a8c01>] ext3_destroy_inode+0x17/0x19
[<c10784c9>] destroy_inode+0x27/0x3d
[<c1078837>] dispose_list+0x60/0x178
[<c1078f81>] prune_icache+0x363/0x399
[<c1078fd0>] shrink_icache_memory+0x19/0x32
[<c1044dd7>] shrink_slab+0x104/0x172
[<c104641e>] try_to_free_pages+0xbe/0x16f
[<c103d9a0>] __alloc_pages+0x1d3/0x393
[<c104037c>] kmem_getpages+0x2d/0x7f
[<c1041869>] cache_grow+0x155/0x2a8
[<c1041f1f>] cache_alloc_refill+0x285/0x2c2
[<c10423c6>] kmem_cache_alloc+0x5d/0x77
[<c1075dac>] d_alloc+0x16/0x27a
[<c106b2b9>] real_lookup+0x40/0xc2
[<c106b68e>] do_lookup+0x41/0x75
[<c106c3a7>] __link_path_walk+0xce5/0x1066
[<c106c768>] link_path_walk+0x40/0xc7
[<c106ca87>] path_lookup+0xec/0xf7
[<c106cbc9>] __user_walk+0x28/0x42
[<c10667b3>] vfs_lstat+0x17/0x3f
[<c1066d1e>] sys_lstat64+0x13/0x29
[<c1002c5f>] sysenter_past_esp+0x54/0x75
slab error in cache_free_debugcheck(): cache `ext3_inode_cache': double free, or memory outside object was overwritten
[<c1003368>] dump_stack+0x16/0x18
[<c1041ad2>] cache_free_debugcheck+0xc6/0x1d5
[<c10424fd>] kmem_cache_free+0x26/0x65
[<c10a8c01>] ext3_destroy_inode+0x17/0x19
[<c10784c9>] destroy_inode+0x27/0x3d
[<c1078837>] dispose_list+0x60/0x178
[<c1078f81>] prune_icache+0x363/0x399
[<c1078fd0>] shrink_icache_memory+0x19/0x32
[<c1044dd7>] shrink_slab+0x104/0x172
[<c104641e>] try_to_free_pages+0xbe/0x16f
[<c103d9a0>] __alloc_pages+0x1d3/0x393
[<c104037c>] kmem_getpages+0x2d/0x7f
[<c1041869>] cache_grow+0x155/0x2a8
[<c1041f1f>] cache_alloc_refill+0x285/0x2c2
[<c10423c6>] kmem_cache_alloc+0x5d/0x77
[<c1075dac>] d_alloc+0x16/0x27a
[<c106b2b9>] real_lookup+0x40/0xc2
[<c106b68e>] do_lookup+0x41/0x75
[<c106c3a7>] __link_path_walk+0xce5/0x1066
[<c106c768>] link_path_walk+0x40/0xc7
[<c106ca87>] path_lookup+0xec/0xf7
[<c106cbc9>] __user_walk+0x28/0x42
[<c10667b3>] vfs_lstat+0x17/0x3f
[<c1066d1e>] sys_lstat64+0x13/0x29
[<c1002c5f>] sysenter_past_esp+0x54/0x75
c3d7afb0: redzone 1: 0x0, redzone 2: 0x0.
------------[ cut here ]------------
kernel BUG at <bad filename>:18422!
invalid operand: 0000 [#1]
DEBUG_PAGEALLOC
Modules linked in:
CPU: 0
EIP: 0060:[<c1041b46>] Not tainted VLI
EFLAGS: 00010002 (2.6.12-rc2-mm3)
EIP is at cache_free_debugcheck+0x13a/0x1d5
eax: c3d7a000 ebx: c3d7a000 ecx: 00001000 edx: 00000fb0
esi: c3d7afb0 edi: c4daca80 ebp: c2f73bb8 esp: c2f73bac
ds: 007b es: 007b ss: 0068
Process showconsole (pid: 1264, threadinfo=c2f72000 task=c2f68ac0)
Stack: c4d0fec4 c4daca80 c3d7bd44 c2f73be0 c10424fd c4daca80 c3d7bd44 c10a8c01
00000080 00000286 c3d7bddc c2f73c2c 00000080 c2f73bf0 c10a8c01 c4daca80
c3d7bd44 c2f73c00 c10784c9 c3d7bddc c3d7bddc c2f73c1c c1078837 c3d7bddc
Call Trace:
[<c100334a>] show_stack+0x7a/0x82
[<c1003453>] show_registers+0xe9/0x153
[<c100369f>] die+0x15c/0x23d
[<c1003a79>] do_invalid_op+0x90/0x97
[<c1002ed3>] error_code+0x4f/0x54
[<c10424fd>] kmem_cache_free+0x26/0x65
[<c10a8c01>] ext3_destroy_inode+0x17/0x19
[<c10784c9>] destroy_inode+0x27/0x3d
[<c1078837>] dispose_list+0x60/0x178
[<c1078f81>] prune_icache+0x363/0x399
[<c1078fd0>] shrink_icache_memory+0x19/0x32
[<c1044dd7>] shrink_slab+0x104/0x172
[<c104641e>] try_to_free_pages+0xbe/0x16f
[<c103d9a0>] __alloc_pages+0x1d3/0x393
[<c104037c>] kmem_getpages+0x2d/0x7f
[<c1041869>] cache_grow+0x155/0x2a8
[<c1041f1f>] cache_alloc_refill+0x285/0x2c2
[<c10423c6>] kmem_cache_alloc+0x5d/0x77
[<c1075dac>] d_alloc+0x16/0x27a
[<c106b2b9>] real_lookup+0x40/0xc2
[<c106b68e>] do_lookup+0x41/0x75
[<c106c3a7>] __link_path_walk+0xce5/0x1066
[<c106c768>] link_path_walk+0x40/0xc7
[<c106ca87>] path_lookup+0xec/0xf7
[<c106cbc9>] __user_walk+0x28/0x42
[<c10667b3>] vfs_lstat+0x17/0x3f
[<c1066d1e>] sys_lstat64+0x13/0x29
[<c1002c5f>] sysenter_past_esp+0x54/0x75
Code: e8 bc e4 ff ff 8b 55 10 89 10 58 5a 8b 5b 0c 89 f0 31 d2 8b 4f 34 29 d8 f7 f1 3b 47 3c 72 02 0f 0b 0f af c1 8d 04 18 39 c6 74 02 <0f> 0b f6 47 39 02 74 15 6a 05 57 57 e8 1d e4 ff ff 8d 04 30 89
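
One plausible reading of the register dump above, offered as a sketch rather
than a diagnosis: the "Code:" bytes look like a divide followed by a
multiply-and-compare, which would fit a check that the freed pointer lies on
an object boundary within its slab. The roles given to each register below
(esi as the freed pointer, ebx as the address of the slab's first object, ecx
as the object size) are assumptions made for illustration. The arithmetic
itself can be redone from the logged values:

```python
# Values copied from the register dump; the meaning attached to each register
# is an interpretation of the "Code:" bytes, not something the log states.
objp = 0xC3D7AFB0       # esi: pointer being freed (also shown on the redzone line)
slab_base = 0xC3D7A000  # ebx: assumed address of the first object in the slab
obj_size = 0x1000       # ecx: assumed per-object size


def object_index_check(objp, slab_base, obj_size):
    """Recompute the index of objp within its slab and the address it implies."""
    index, remainder = divmod(objp - slab_base, obj_size)
    expected = slab_base + index * obj_size
    return index, remainder, expected


index, remainder, expected = object_index_check(objp, slab_base, obj_size)
print(f"index={index} remainder={remainder:#x} expected={expected:#x}")
print("pointer on an object boundary:", expected == objp)
```

Under these assumptions the remainder is 0xfb0 and the expected address is
0xc3d7a000, which agree with the logged edx and eax values, so the pointer
would not sit on an object boundary of the cache it was checked against.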

2005-04-28 11:45:10

by Vivek Goyal

Subject: Re: [Fastboot] Re: Kdump Testing

On Wed, Apr 27, 2005 at 12:23:12PM -0700, Randy.Dunlap wrote:
> On Tue, 26 Apr 2005 14:24:48 +0530
> Vivek Goyal <[email protected]> wrote:
>
> > >
> > > 2.6.12-rc2-mm3 reboots vmlinux-recover-UP on panic.
> > > (vmlinux-recover-SMP hangs during [early] reboot, but -UP
> > > goes further....)
> > >
> > > (BTW, how does I do serial console from the second
> > > kernel...? It has the drivers, but not the command
> > > line info? TBD.)
> > >
> >
> >
> > While pre-loading the capture kernel using kexec, you can specify the command
> > line options to second kernel using --append="". You must already be passing
> > the root device. Add your serial console parameters as well, something like
> > --append="console=ttyS0,38400"
> >
> >
> > > vmlinux-recover-UP gets to this point, hand-written,
> > > several lines missing:
> > >
> > > kfree_debugcheck: bad ptr c3dbffb0h. ( == %esi)
> > > kernel BUG at <bad filename>:23128!
> > > invalid operand: 0000 [#1]
> > > DEBUG_PAGEALLOC
> > > EIP is at kfree_debugcheck+0x45/0x50
> > >
> > > Stack dump shows lots of ext3 cache and inode functions...
> > >
> >
> > Can you post a full serial console output of second kernel? That would help.
>
> I did another test run, same kernels (both running and recovery).
> The recovery kernel got a little further this time, still had
> Badness and a BUG.
>
> ---

Ok. I am also able to see this slab corruption occurring on my machine. I can
avoid the problem if I disable cachefs support.

In fact, I can reproduce the problem if I boot the capture kernel normally
through the BIOS with the command line "mem=64M". It looks like a generic
problem and not one associated with kexec/kdump. Cachefs might be doing some
corruption.


> CacheFS: Wrong magic number on cache
> EXT3-fs: INFO: recovery required on readonly filesystem.
> EXT3-fs: write access will be enabled during recovery.
> input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
> kjournald starting. Commit interval 5 seconds
> EXT3-fs: recovery complete.
> EXT3-fs: mounted filesystem with ordered data mode.
> VFS: Mounted root (ext3 filesystem) readonly.
> Freeing unused kernel memory: 220k freed
> Adding 2104472k swap on /dev/hda7. Priority:42 extents:1
> mismatch in kmem_cache_free: expected cache c168fc80, got c4daca80
> c4daca80 is ext3_inode_cache.
> c168fc80 is skbuff_head_cache.
> Badness in cache_free_debugcheck at mm/slab.c:1917
> [<c1003368>] dump_stack+0x16/0x18
> [<c1041a94>] cache_free_debugcheck+0x88/0x1d5
> [<c10424fd>] kmem_cache_free+0x26/0x65
> [<c10a8c01>] ext3_destroy_inode+0x17/0x19
> [<c10784c9>] destroy_inode+0x27/0x3d
> [<c1078837>] dispose_list+0x60/0x178
> [<c1078f81>] prune_icache+0x363/0x399
> [<c1078fd0>] shrink_icache_memory+0x19/0x32
> [<c1044dd7>] shrink_slab+0x104/0x172
> [<c104641e>] try_to_free_pages+0xbe/0x16f
> [<c103d9a0>] __alloc_pages+0x1d3/0x393
> [<c104037c>] kmem_getpages+0x2d/0x7f
> [<c1041869>] cache_grow+0x155/0x2a8
> [<c1041f1f>] cache_alloc_refill+0x285/0x2c2
> [<c10423c6>] kmem_cache_alloc+0x5d/0x77
> [<c1075dac>] d_alloc+0x16/0x27a
> [<c106b2b9>] real_lookup+0x40/0xc2
> [<c106b68e>] do_lookup+0x41/0x75
> [<c106c3a7>] __link_path_walk+0xce5/0x1066
> [<c106c768>] link_path_walk+0x40/0xc7
> [<c106ca87>] path_lookup+0xec/0xf7
> [<c106cbc9>] __user_walk+0x28/0x42
> [<c10667b3>] vfs_lstat+0x17/0x3f
> [<c1066d1e>] sys_lstat64+0x13/0x29
> [<c1002c5f>] sysenter_past_esp+0x54/0x75
> slab error in cache_free_debugcheck(): cache `ext3_inode_cache': double free, or memory outside object was overwritten
> [<c1003368>] dump_stack+0x16/0x18
> [<c1041ad2>] cache_free_debugcheck+0xc6/0x1d5
> [<c10424fd>] kmem_cache_free+0x26/0x65
> [<c10a8c01>] ext3_destroy_inode+0x17/0x19
> [<c10784c9>] destroy_inode+0x27/0x3d
> [<c1078837>] dispose_list+0x60/0x178
> [<c1078f81>] prune_icache+0x363/0x399
> [<c1078fd0>] shrink_icache_memory+0x19/0x32
> [<c1044dd7>] shrink_slab+0x104/0x172
> [<c104641e>] try_to_free_pages+0xbe/0x16f
> [<c103d9a0>] __alloc_pages+0x1d3/0x393
> [<c104037c>] kmem_getpages+0x2d/0x7f
> [<c1041869>] cache_grow+0x155/0x2a8
> [<c1041f1f>] cache_alloc_refill+0x285/0x2c2
> [<c10423c6>] kmem_cache_alloc+0x5d/0x77
> [<c1075dac>] d_alloc+0x16/0x27a
> [<c106b2b9>] real_lookup+0x40/0xc2
> [<c106b68e>] do_lookup+0x41/0x75
> [<c106c3a7>] __link_path_walk+0xce5/0x1066
> [<c106c768>] link_path_walk+0x40/0xc7
> [<c106ca87>] path_lookup+0xec/0xf7
> [<c106cbc9>] __user_walk+0x28/0x42
> [<c10667b3>] vfs_lstat+0x17/0x3f
> [<c1066d1e>] sys_lstat64+0x13/0x29
> [<c1002c5f>] sysenter_past_esp+0x54/0x75
> c3d7afb0: redzone 1: 0x0, redzone 2: 0x0.
> ------------[ cut here ]------------
> kernel BUG at <bad filename>:18422!
> invalid operand: 0000 [#1]
> DEBUG_PAGEALLOC
> Modules linked in:
> CPU: 0
> EIP: 0060:[<c1041b46>] Not tainted VLI
> EFLAGS: 00010002 (2.6.12-rc2-mm3)
> EIP is at cache_free_debugcheck+0x13a/0x1d5
> eax: c3d7a000 ebx: c3d7a000 ecx: 00001000 edx: 00000fb0
> esi: c3d7afb0 edi: c4daca80 ebp: c2f73bb8 esp: c2f73bac
> ds: 007b es: 007b ss: 0068
> Process showconsole (pid: 1264, threadinfo=c2f72000 task=c2f68ac0)
> Stack: c4d0fec4 c4daca80 c3d7bd44 c2f73be0 c10424fd c4daca80 c3d7bd44 c10a8c01
> 00000080 00000286 c3d7bddc c2f73c2c 00000080 c2f73bf0 c10a8c01 c4daca80
> c3d7bd44 c2f73c00 c10784c9 c3d7bddc c3d7bddc c2f73c1c c1078837 c3d7bddc
> Call Trace:
> [<c100334a>] show_stack+0x7a/0x82
> [<c1003453>] show_registers+0xe9/0x153
> [<c100369f>] die+0x15c/0x23d
> [<c1003a79>] do_invalid_op+0x90/0x97
> [<c1002ed3>] error_code+0x4f/0x54
> [<c10424fd>] kmem_cache_free+0x26/0x65
> [<c10a8c01>] ext3_destroy_inode+0x17/0x19
> [<c10784c9>] destroy_inode+0x27/0x3d
> [<c1078837>] dispose_list+0x60/0x178
> [<c1078f81>] prune_icache+0x363/0x399
> [<c1078fd0>] shrink_icache_memory+0x19/0x32
> [<c1044dd7>] shrink_slab+0x104/0x172
> [<c104641e>] try_to_free_pages+0xbe/0x16f
> [<c103d9a0>] __alloc_pages+0x1d3/0x393
> [<c104037c>] kmem_getpages+0x2d/0x7f
> [<c1041869>] cache_grow+0x155/0x2a8
> [<c1041f1f>] cache_alloc_refill+0x285/0x2c2
> [<c10423c6>] kmem_cache_alloc+0x5d/0x77
> [<c1075dac>] d_alloc+0x16/0x27a
> [<c106b2b9>] real_lookup+0x40/0xc2
> [<c106b68e>] do_lookup+0x41/0x75
> [<c106c3a7>] __link_path_walk+0xce5/0x1066
> [<c106c768>] link_path_walk+0x40/0xc7
> [<c106ca87>] path_lookup+0xec/0xf7
> [<c106cbc9>] __user_walk+0x28/0x42
> [<c10667b3>] vfs_lstat+0x17/0x3f
> [<c1066d1e>] sys_lstat64+0x13/0x29
> [<c1002c5f>] sysenter_past_esp+0x54/0x75
> Code: e8 bc e4 ff ff 8b 55 10 89 10 58 5a 8b 5b 0c 89 f0 31 d2 8b 4f 34 29 d8 f7 f1 3b 47 3c 72 02 0f 0b 0f af c1 8d 04 18 39 c6 74 02 <0f> 0b f6 47 39 02 74 15 6a 05 57 57 e8 1d e4 ff ff 8d 04 30 89
>

2005-04-28 16:11:32

by Randy.Dunlap

Subject: Re: [Fastboot] Re: Kdump Testing

On Thu, 28 Apr 2005 17:14:16 +0530
Vivek Goyal <[email protected]> wrote:

> > > Can you post a full serial console output of second kernel? That would help.
> >
> > I did another test run, same kernels (both running and recovery).
> > The recovery kernel got a little further this time, still had
> > Badness and a BUG.
> >
> > ---
>
> Ok. I am also able to see this slab corruption occurring on my machine. I can
> get away with the problem if I disable cachefs support.
>
> Infact, I can reproduce the problem if I boot capture kernel normally through
> BIOS with commandline "mem=64M". Looks like it is generic problem and not
> associated with kexec/kdump. Cachefs might be doing some corruption.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Wheeeeeeeeee. Great, we (I) can do without cachefs,
and when I do that, kexec + kdump works.
First time that I've seen kdump work. :)

-rw-r--r-- 1 root root 1.0G Apr 28 08:41 oldmem.0428
-r-------- 1 root root 960M Apr 28 08:36 vmcore.0428

My (crashing/panic) kernel is built without -g, but gdb
can still tell me this much:

(gdb) bt
#0 0xc010ef95 in crash_get_current_regs ()
#1 0x00000000 in ?? ()
#2 0xee821ea0 in ?? ()
#3 0xee821ea0 in ?? ()
#4 0xee821ea0 in ?? ()
#5 0x00000046 in ?? ()
#6 0x00000000 in ?? ()
#7 0x00000000 in ?? ()
#8 0x00000000 in ?? ()
#9 0xee82c000 in ?? ()
#10 0x00000000 in ?? ()
#11 0xc010ed38 in machine_kexec ()


Thanks for following up, tracking, working on this.

---
~Randy

2005-04-28 19:13:43

by Eric W. Biederman

Subject: Re: [Fastboot] Re: Kdump Testing

"Randy.Dunlap" <[email protected]> writes:

> On Thu, 28 Apr 2005 17:14:16 +0530
> Vivek Goyal <[email protected]> wrote:
>
> > > > Can you post a full serial console output of second kernel? That would
> > > > help.
> > >
> > > I did another test run, same kernels (both running and recovery).
> > > The recovery kernel got a little further this time, still had
> > > Badness and a BUG.
> > >
> > > ---
> >
> > Ok. I am also able to see this slab corruption occurring on my machine. I can
> > get away with the problem if I disable cachefs support.
> >
> > Infact, I can reproduce the problem if I boot capture kernel normally through
> > BIOS with commandline "mem=64M". Looks like it is generic problem and not
> > associated with kexec/kdump. Cachefs might be doing some corruption.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Wheeeeeeeeee. Great, we (I) can do without cachefs,
> and when I do that, kexec + kdump works.
> First time that I've seen kdump work. :)
>
> -rw-r--r-- 1 root root 1.0G Apr 28 08:41 oldmem.0428
> -r-------- 1 root root 960M Apr 28 08:36 vmcore.0428
>
> My (crashing/panic) kernel is built without -g, but gdb
> can still tell me this much:
>
> (gdb) bt
> #0 0xc010ef95 in crash_get_current_regs ()
> #1 0x00000000 in ?? ()
> #2 0xee821ea0 in ?? ()
> #3 0xee821ea0 in ?? ()
> #4 0xee821ea0 in ?? ()
> #5 0x00000046 in ?? ()
> #6 0x00000000 in ?? ()
> #7 0x00000000 in ?? ()
> #8 0x00000000 in ?? ()
> #9 0xee82c000 in ?? ()
> #10 0x00000000 in ?? ()
> #11 0xc010ed38 in machine_kexec ()
>
>
> Thanks for following up, tracking, working on this.

Congratulations everyone. The really good news is that when the
recovery kernel failed, it failed early enough that it did not make things
worse. It is good to see that prediction confirmed :)

Eric

2005-04-29 03:09:36

by Randy.Dunlap

Subject: [PATCH] Kdump docs.

On Thu, 28 Apr 2005 09:11:19 -0700 Randy.Dunlap wrote:

| Wheeeeeeeeee. Great, we (I) can do without cachefs,
| and when I do that, kexec + kdump works.
| First time that I've seen kdump work. :)


Vivek, Hari, Andrew-

Here's a patch to make Documentation/kdump.txt cleaner & clearer.

---

From: Randy Dunlap <[email protected]>

Cleanups and clear-ups for kdump doc:
typos, punctuation, 80 columns, examples.

Signed-off-by: Randy Dunlap <[email protected]>
---

Documentation/kdump.txt | 89 ++++++++++++++++++++++++++++--------------------
1 files changed, 52 insertions(+), 37 deletions(-)

diff -Naurp ./Documentation/kdump.txt~kdump_docco ./Documentation/kdump.txt
--- ./Documentation/kdump.txt~kdump_docco 2005-04-22 10:01:39.000000000 -0700
+++ ./Documentation/kdump.txt 2005-04-28 19:55:03.000000000 -0700
@@ -1,4 +1,4 @@
-Documentation for kdump - the kexec based crash dumping solution
+Documentation for kdump - the kexec-based crash dumping solution
================================================================

DESIGN
@@ -11,10 +11,10 @@ DMA from the first kernel does not corru

All the necessary information about Core image is encoded in ELF format and
stored in reserved area of memory before crash. Physical address of start of
-elf header is passed to new kernel through command line parameter elfcorehdr=.
+ELF header is passed to new kernel through command line parameter elfcorehdr=.

-On i386, first 640k of physical memory is needed to boot, irrespctive of where
-the kernel loads at. Hence, this region is backed up by kexec just before
+On i386, the first 640 KB of physical memory is needed to boot, irrespective
+of where the kernel loads. Hence, this region is backed up by kexec just before
rebooting into the new kernel.

In the second kernel, "old memory" can be accessed in two ways.
@@ -22,59 +22,72 @@ In the second kernel, "old memory" can b
- The first one is through a /dev/oldmem device interface. A capture utility
can read the device file and write out the memory in raw format. This is raw
dump of memory and analysis/capture tool should be intelligent enough to
- determine where to look for the right information. Elf headers (elfcorehdr=)
+ determine where to look for the right information. ELF headers (elfcorehdr=)
can become handy here.

- The second interface is through /proc/vmcore. This exports the dump as an ELF
format file which can be written out using any file copy command
(cp, scp, etc). Further, gdb can be used to perform limited debugging on
the dump file. This method ensures methods ensure that there is correct
- ordering of the dump pages (corresponding to the first 640k that has been
+ ordering of the dump pages (corresponding to the first 640 KB that has been
relocated).

SETUP
=====

-1) Obtain the appropriate -mm tree patch and apply it on to the vanilla
- kernel tree.
+1) Download and build the appropriate version of kexec-tools.

-2) Obtain appropriate version of kexec-tools.
+2) Download and build the appropriate (latest) kexec/kdump (-mm) kernel
+ patchset and apply it to the vanilla kernel tree.

-3) Two kernels need to be built in order to get this feature working.
+ Two kernels need to be built in order to get this feature working.

- First kernel:
- a) Enable "kexec system call" feature.
- b) Enable "sysfs file system support" (Pseudo filesystems).
- c) Boot into first kernel with command line "crashkernel=Y@X". Put
- appropriate values for X and Y. Y denotes, how much memory to reserve for
- second kernel, and X denotes at what physical address reserved memory
- section starts. For example, crashkernel=48M@16M.
-
- Second kernel:
- a) Enable "kernel crash dumps" feature.
- b) Specifiy a suitable value for "Physical address where the kernel is
- loaded". Typically this value should be same as X (See option c) above).
- c) Enable "/proc/vmcore support" (Optional).
-
- Note: Option a) and b) depend upon "Configure standard kernel feature
- (for small systems)".
- Option a) also depends on CONFIG_HIGHMEM.
- Both option a) and b) are under "Processor Types and Features"
+ A) First kernel:
+ a) Enable "kexec system call" feature (in Processor type and features).
+ CONFIG_KEXEC=y
+ b) This kernel's physical load address should be the default value of
+ 0x100000 (0x100000, 1 MB) (in Processor type and features).
+ CONFIG_PHYSICAL_START=0x100000
+ c) Enable "sysfs file system support" (in Pseudo filesystems).
+ CONFIG_SYSFS=y
+ d) Boot into first kernel with the command line parameter "crashkernel=Y@X".
+ Use appropriate values for X and Y. Y denotes how much memory to reserve
+ for the second kernel, and X denotes at what physical address the reserved
+ memory section starts. For example: "crashkernel=64M@16M".
+
+ B) Second kernel:
+ a) Enable "kernel crash dumps" feature (in Processor type and features).
+ CONFIG_CRASH_DUMP=y
+ b) Specify a suitable value for "Physical address where the kernel is
+ loaded" (in Processor type and features). Typically this value
+ should be same as X (See option b) above, e.g., 16 MB or 0x1000000.
+ CONFIG_PHYSICAL_START=0x1000000
+ c) Enable "/proc/vmcore support" (Optional, in Pseudo filesystems).
+ CONFIG_PROC_VMCORE=y
+
+ Note: Options a) and b) depend upon "Configure standard kernel features
+ (for small systems)" (under General setup).
+ Option a) also depends on CONFIG_HIGHMEM (under Processor
+ type and features).
+ Both option a) and b) are under "Processor type and features".

-3) Boot into the first kernel. You are now ready to try out kexec based crash
+3) Boot into the first kernel. You are now ready to try out kexec-based crash
dumps.

-4) Load the second kernel to be booted using
+4) Load the second kernel to be booted using:

kexec -p <second-kernel> --crash-dump --args-linux --append="root=<root-dev>
maxcpus=1 init 1"

Note: i) <second-kernel> has to be a vmlinux image. bzImage will not work,
as of now.
- ii) By default elf headers are stored in ELF32 format(for i386). This is
- sufficient to represent the physical memory up to 4GB. To store
- headers in ELF64 format, specifiy "--elf64-core-headers" on kexec
- command line additionally.
+ ii) By default ELF headers are stored in ELF32 format (for i386). This
+ is sufficient to represent the physical memory up to 4GB. To store
+ headers in ELF64 format, specifiy "--elf64-core-headers" on the
+ kexec command line additionally.
+ iii) For now (or until it is fixed), it's best to build the
+ second-kernel without multi-processor support, i.e., make it
+ a uniprocessor kernel.

5) System reboots into the second kernel when a panic occurs. A module can be
written to force the panic, for testing purposes.
@@ -83,14 +96,16 @@ SETUP

cp /proc/vmcore <dump-file>

- Dump can also be accessed as a /dev/oldmem device for a linear/raw view.
- To create the device, type
+ Dump memory can also be accessed as a /dev/oldmem device for a linear/raw
+ view. To create the device, type:

mknod /dev/oldmem c 1 12

Use "dd" with suitable options for count, bs and skip to access specific
portions of the dump.

+ Entire memory: dd if=/dev/oldmem of=oldmem.001
+
ANALYSIS
========

@@ -102,7 +117,7 @@ Limited analysis can be done using gdb o
Stack trace for the task on processor 0, register display, memory display
work fine.

-Note: gdb can not analyse core files generated in ELF64 format for i386.
+Note: gdb cannot analyse core files generated in ELF64 format for i386.

TODO
====
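
The "dd" advice in the doc (suitable options for count, bs and skip) can be sketched as below. This is only an illustration: the helper name oldmem_dd_cmd and the output file name are made up here and are not part of kexec-tools. It assumes GNU dd, where bs=1M means 1 MB blocks, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of kexec-tools): print the dd command that
# would copy LEN_MB megabytes, starting START_MB megabytes into /dev/oldmem.
# With bs=1M, both "skip" and "count" are measured in 1 MB blocks.
oldmem_dd_cmd() {
    start_mb=$1   # offset into old memory, in MB
    len_mb=$2     # amount to copy, in MB
    out=$3        # output file name (assumption: any writable path)
    echo "dd if=/dev/oldmem of=$out bs=1M skip=$start_mb count=$len_mb"
}

# Example with arbitrary values: 4 MB starting at the 128 MB mark.
oldmem_dd_cmd 128 4 oldmem-128M.bin
```

Review the printed command before running it in the second kernel; leaving out skip and count gives the "entire memory" form shown in the patch.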

2005-04-29 05:07:38

by Vivek Goyal

Subject: Re: [PATCH] Kdump docs.

Hi Randy,

> + A) First kernel:
> + a) Enable "kexec system call" feature (in Processor type and features).
> + CONFIG_KEXEC=y
> + b) This kernel's physical load address should be the default value of
> + 0x100000 (0x100000, 1 MB) (in Processor type and features).
> + CONFIG_PHYSICAL_START=0x100000
> + c) Enable "sysfs file system support" (in Pseudo filesystems).
> + CONFIG_SYSFS=y
> + d) Boot into first kernel with the command line parameter "crashkernel=Y@X".
> + Use appropriate values for X and Y. Y denotes how much memory to reserve
> + for the second kernel, and X denotes at what physical address the reserved
> + memory section starts. For example: "crashkernel=64M@16M".
> +
> + B) Second kernel:
> + a) Enable "kernel crash dumps" feature (in Processor type and features).
> + CONFIG_CRASH_DUMP=y
> + b) Specify a suitable value for "Physical address where the kernel is
> + loaded" (in Processor type and features). Typically this value
> + should be same as X (See option b) above, e.g., 16 MB or 0x1000000.

Should the above line read as follows?
"should be same as X (See option d) above."

This will make clear what X is and what the new value of
CONFIG_PHYSICAL_START should be.

Thanks for testing this out and providing clearer documentation.

Thanks
Vivek
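
The link between X in "crashkernel=Y@X" and the second kernel's CONFIG_PHYSICAL_START can be sketched as a small calculation. The function name kdump_params is hypothetical and used only for illustration. It assumes both values are given in whole megabytes, as in the doc's 64M@16M example.

```shell
#!/bin/sh
# Hypothetical helper: given the size Y and the start X (both in MB) of the
# reserved region, print the first kernel's boot parameter and the matching
# load address for the second kernel's .config.
kdump_params() {
    size_mb=$1    # Y: how much memory to reserve for the second kernel
    start_mb=$2   # X: physical address where the reserved region starts
    printf 'crashkernel=%sM@%sM\n' "$size_mb" "$start_mb"
    # Convert X from MB to bytes and print it in hex, as .config expects.
    printf 'CONFIG_PHYSICAL_START=0x%x\n' $((start_mb * 1024 * 1024))
}

# The example from the doc: reserve 64 MB starting at 16 MB.
kdump_params 64 16
```

For 64M@16M this prints crashkernel=64M@16M and CONFIG_PHYSICAL_START=0x1000000, the values used in the patch.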

2005-04-29 14:26:31

by Randy.Dunlap

Subject: Re: [Fastboot] Re: [PATCH] Kdump docs.

On Fri, 29 Apr 2005 10:37:29 +0530 Vivek Goyal wrote:

| Hi Randy,
|
| > + A) First kernel:
| > + a) Enable "kexec system call" feature (in Processor type and features).
| > + CONFIG_KEXEC=y
| > + b) This kernel's physical load address should be the default value of
| > + 0x100000 (0x100000, 1 MB) (in Processor type and features).
| > + CONFIG_PHYSICAL_START=0x100000
| > + c) Enable "sysfs file system support" (in Pseudo filesystems).
| > + CONFIG_SYSFS=y
| > + d) Boot into first kernel with the command line parameter "crashkernel=Y@X".
| > + Use appropriate values for X and Y. Y denotes how much memory to reserve
| > + for the second kernel, and X denotes at what physical address the reserved
| > + memory section starts. For example: "crashkernel=64M@16M".
| > +
| > + B) Second kernel:
| > + a) Enable "kernel crash dumps" feature (in Processor type and features).
| > + CONFIG_CRASH_DUMP=y
| > + b) Specify a suitable value for "Physical address where the kernel is
| > + loaded" (in Processor type and features). Typically this value
| > + should be same as X (See option b) above, e.g., 16 MB or 0x1000000.
|
| Should the above line read as follows?
| "should be same as X (See option d) above."

Yes, thanks for catching that. Now, how to update it...?

| This will make clear what X is and what the new value of
| CONFIG_PHYSICAL_START should be.


---
~Randy

2005-04-30 03:05:12

by Randy.Dunlap

Subject: [PATCH] Kdump doc. fix option typo.

On Fri, 29 Apr 2005 10:37:29 +0530 Vivek Goyal wrote:

| Should the above line read as follows?
| "should be same as X (See option d) above."
|
| This will make clear what X is and what the new value of
| CONFIG_PHYSICAL_START should be.


From: Randy Dunlap <[email protected]>

Fix one-letter typo of option b->d.

Signed-off-by: Randy Dunlap <[email protected]>
---

Documentation/kdump.txt | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)

diff -Naurp ./Documentation/kdump.txt~kdump_doc_fix_optionb ./Documentation/kdump.txt
--- ./Documentation/kdump.txt~kdump_doc_fix_optionb 2005-04-28 19:55:03.000000000 -0700
+++ ./Documentation/kdump.txt 2005-04-29 19:59:32.000000000 -0700
@@ -60,7 +60,7 @@ SETUP
CONFIG_CRASH_DUMP=y
b) Specify a suitable value for "Physical address where the kernel is
loaded" (in Processor type and features). Typically this value
- should be same as X (See option b) above, e.g., 16 MB or 0x1000000.
+ should be same as X (See option d) above, e.g., 16 MB or 0x1000000.
CONFIG_PHYSICAL_START=0x1000000
c) Enable "/proc/vmcore support" (Optional, in Pseudo filesystems).
CONFIG_PROC_VMCORE=y


---