2005-02-24 08:22:46

by Vivek Goyal

Subject: [PATCH] Fix for broken kexec on panic

Hi,

Kexec on panic is broken on i386 in 2.6.11-rc3-mm2 because of
re-organization of boot memory allocator initialization code. The
primary kernel does not boot if kexec is enabled and the crashkernel=X@Y
command line parameter is passed. After the re-organization, kexec tries
to call reserve_bootmem() before the boot memory allocator has been
initialized.
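
For readers unfamiliar with the crashkernel=X@Y syntax, here is a minimal
user-space sketch of how such a value splits into a size X and a base
address Y. The names parse_size() and parse_crashkernel() are hypothetical
and this is not the kernel's parser; it only assumes the usual K/M/G size
suffixes.

```c
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix (user-space stand-in,
 * not kernel code). On return *end points past what was consumed. */
static unsigned long long parse_size(const char *s, char **end)
{
    unsigned long long v = strtoull(s, end, 0);

    if (*end == s)
        return 0;               /* no digits at all */

    switch (**end) {
    case 'G': case 'g': v <<= 10; /* fall through */
    case 'M': case 'm': v <<= 10; /* fall through */
    case 'K': case 'k': v <<= 10; (*end)++; break;
    default: break;
    }
    return v;
}

/* Parse the value part of "crashkernel=X@Y" (hypothetical helper).
 * Returns 0 on success and -1 if the string is malformed. */
int parse_crashkernel(const char *arg,
                      unsigned long long *size,
                      unsigned long long *base)
{
    char *end;
    const char *p;

    *size = parse_size(arg, &end);
    if (end == arg || *end != '@')
        return -1;              /* missing size or missing '@' */

    p = end + 1;
    *base = parse_size(p, &end);
    if (end == p || *end != '\0')
        return -1;              /* missing base or trailing junk */

    return 0;
}
```

With this sketch, "64M@16M" would yield a 64 MB region starting at the
16 MB physical address.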

This patch fixes the problem. I have moved the call to
reserve_bootmem() for kexec, for both discontig and contig memory, into
the new setup_bootmem_allocator().
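
The ordering problem and the fix can be illustrated with a toy user-space
model. This is only a sketch, not the attached patch: struct bootmem_model
and the model_* and setup_broken() functions are hypothetical stand-ins,
and the real kernel fails to boot rather than returning an error code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the boot memory allocator: it only tracks whether it
 * has been initialized and remembers one reserved range. */
struct bootmem_model {
    bool initialized;
    unsigned long reserved_base;
    unsigned long reserved_size;
};

/* Model of reserve_bootmem(): only valid once the allocator is set up. */
static int model_reserve_bootmem(struct bootmem_model *bm,
                                 unsigned long base, unsigned long size)
{
    if (!bm->initialized) {
        fprintf(stderr, "reserve called before allocator init\n");
        return -1;
    }
    bm->reserved_base = base;
    bm->reserved_size = size;
    return 0;
}

/* Broken ordering: the crash kernel region is reserved first and the
 * allocator is initialized afterwards. */
int setup_broken(struct bootmem_model *bm,
                 unsigned long base, unsigned long size)
{
    int ret = model_reserve_bootmem(bm, base, size);

    bm->initialized = true;
    return ret;
}

/* Fixed ordering: the reservation lives inside the allocator setup
 * routine, after initialization has happened. */
int model_setup_bootmem_allocator(struct bootmem_model *bm,
                                  unsigned long base, unsigned long size)
{
    bm->initialized = true;
    if (size == 0)
        return 0;               /* no crashkernel= region requested */
    return model_reserve_bootmem(bm, base, size);
}
```

In the model, setup_broken() fails while model_setup_bootmem_allocator()
succeeds for the same region, which mirrors the behaviour described above.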

This patch has been generated against 2.6.11-rc4-mm1

Thanks
Vivek


Attachments:
kexec-reserve-bootmem-fix.patch (2.15 kB)

2005-02-24 09:13:38

by Andrew Morton

Subject: Re: [PATCH] Fix for broken kexec on panic

Vivek Goyal <[email protected]> wrote:
>
> Kexec on panic is broken on i386 in 2.6.11-rc3-mm2 because of
> re-organization of boot memory allocator initialization code.

OK...

Where are we up to with these patches, btw? Do you consider them
close-to-complete? Do you have a feel for what proportion of machines will
work correctly?

2005-02-24 12:19:04

by Maneesh Soni

Subject: Re: [Fastboot] Re: [PATCH] Fix for broken kexec on panic

On Thu, Feb 24, 2005 at 01:13:12AM -0800, Andrew Morton wrote:
> Vivek Goyal <[email protected]> wrote:
> >
> > Kexec on panic is broken on i386 in 2.6.11-rc3-mm2 because of
> > re-organization of boot memory allocator initialization code.
>
> OK...
>
> Where are we up to with these patches, btw? Do you consider them
> close-to-complete? Do you have a feel for what proportion of machines will
> work correctly?

After the rework of the kexec patches, very little kernel code is needed
for kdump, and most of the code is in the user space kexec-tools. The
changes needed in kexec-tools to load the crashdump kernel and generate
ELF headers for the x86 architecture are done and will be posted for
comments today by Vivek.

Currently the work remaining is to capture the old kernel's memory during
the second kernel's boot. There is some lack of consensus on whether this
functionality should go in kernel space (/proc/vmcore) or user space (a
separate utility which can be run from initrd). Before the last kexec
rework, kdump had the facility to do /proc/vmcore, and now it has to be
redone accordingly. Eric has already written some code to do it in user
space. We are evaluating both approaches and should arrive at a
conclusion asap.

Thanks
Maneesh

--
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: [email protected]
Phone: 91-80-25044990

2005-02-24 12:51:47

by Eric W. Biederman

Subject: Re: [Fastboot] Re: [PATCH] Fix for broken kexec on panic

Andrew Morton <[email protected]> writes:

> Vivek Goyal <[email protected]> wrote:
> >
> > Kexec on panic is broken on i386 in 2.6.11-rc3-mm2 because of
> > re-organization of boot memory allocator initialization code.
>
> OK...
>
> Where are we up to with these patches, btw?

Currently we are in the middle of a reorganization that places
more of the work in user space, making the solution more robust.

> Do you consider them close-to-complete?

If by complete you mean working code, yes.
Design-wise I believe we are complete, except for proving out
the pieces.

Off the top of my head the current todo looks like:
- kexec on panic user space preparation.
  Basically generating the ELF headers of a core dump before
  we crash.

- Moving ioapic setup on x86 and x86_64 into init_IRQs, where it
  belongs. Currently when using IOAPICs we use them for everything
  except calibrating the delay loop, where we assume the legacy
  interrupt controller is set up. When initiating kexec from
  panic() we don't do a clean shutdown, so that is not the case
  and SMP is broken.

Once those two pieces are in place and tested we should
be able to drop all of the crashdump-* patches, as well
as the crash_shutdown patches that touch the apics.
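
As a rough illustration of the first todo item above, here is a sketch of
building the ELF header and PT_LOAD program headers of a core dump from a
list of memory ranges. It is not the kexec-tools code. It assumes Linux's
<elf.h>, an ELF64 little-endian layout with EM_X86_64, and that the file
offset of each segment equals its physical address; struct mem_range and
build_core_headers() are hypothetical names, and the PT_NOTE segment a real
dump would carry is omitted.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one RAM range; 'end' is exclusive. */
struct mem_range {
    uint64_t start;
    uint64_t end;
};

/* Write an ELF64 core header plus one PT_LOAD entry per range into buf.
 * buf must be suitably aligned for Elf64_Ehdr. Returns the number of
 * bytes written, or 0 if buf is too small. */
size_t build_core_headers(void *buf, size_t buflen,
                          const struct mem_range *r, size_t n)
{
    size_t need = sizeof(Elf64_Ehdr) + n * sizeof(Elf64_Phdr);
    Elf64_Ehdr *eh;
    Elf64_Phdr *ph;
    size_t i;

    if (buflen < need)
        return 0;
    memset(buf, 0, need);

    eh = buf;
    ph = (Elf64_Phdr *)((char *)buf + sizeof(*eh));

    memcpy(eh->e_ident, ELFMAG, SELFMAG);
    eh->e_ident[EI_CLASS] = ELFCLASS64;
    eh->e_ident[EI_DATA] = ELFDATA2LSB;
    eh->e_ident[EI_VERSION] = EV_CURRENT;
    eh->e_type = ET_CORE;
    eh->e_machine = EM_X86_64;      /* assumption, see the text above */
    eh->e_version = EV_CURRENT;
    eh->e_phoff = sizeof(*eh);
    eh->e_ehsize = sizeof(*eh);
    eh->e_phentsize = sizeof(*ph);
    eh->e_phnum = (Elf64_Half)n;

    for (i = 0; i < n; i++) {
        ph[i].p_type = PT_LOAD;
        ph[i].p_flags = PF_R | PF_W | PF_X;
        ph[i].p_offset = r[i].start;    /* assumption: offset == paddr */
        ph[i].p_paddr = r[i].start;
        ph[i].p_filesz = r[i].end - r[i].start;
        ph[i].p_memsz = ph[i].p_filesz;
    }
    return need;
}
```

Because the headers are built while the system is still healthy, nothing
in this step has to run after the crash.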

Of course there will still be lots of pieces left to make the drivers
more robust, and to make things easier to use. But all of those
are evolutionary improvements, not core architecture things.

> Do you have a feel for what proportion of machines will
> work correctly?

Yes.

For the non-panic case I need to get ioapic virtual wire
setup working on ACPI systems. And I need to fix the bug where
ACPI uses interrupts on the shutdown path.

Most device drivers work without being touched.

So we should work fine on x86 and x86_64 systems
using either the legacy interrupt controller or IOAPICs.

So most x86 and x86_64 systems should work with no more
effort than I have described, and even more systems will
work as device drivers are fixed.

In addition there is nothing non-portable about the architecture,
although avoiding firmware calls on non-embedded ports can be a very
challenging modification to the boot process. So we should
be able to start picking up most machines on other architectures like
ppc, ppc64 and ia64 as soon as the ports are complete.

From a user perspective things are going to be rough for a while
as all the final kinks get worked out. But things will
be largely usable and we should be able to get usable
bug reports. Of course that is the level where we start seeing
a whole new class of bug reports :)

The biggest theoretical gotcha is systems with hot-plug memory,
as we are memorizing the memory map and storing it in a safe
place along with the recovery kernel before a crash occurs.

Basically we are in pretty good shape except for systems
like an SGI-Altix or an IBM-s390. Hot-plug memory, multi-terabyte
core dump sizes, and weird SMP architectures are problems that
we don't/won't cope with well, with only hot-plug memory requiring a
modification of responsibilities to handle.

Eric

2005-02-24 13:08:24

by Eric W. Biederman

Subject: Re: [Fastboot] Re: [PATCH] Fix for broken kexec on panic

Maneesh Soni <[email protected]> writes:

> On Thu, Feb 24, 2005 at 01:13:12AM -0800, Andrew Morton wrote:
> > Vivek Goyal <[email protected]> wrote:
> > >
> > > Kexec on panic is broken on i386 in 2.6.11-rc3-mm2 because of
> > > re-organization of boot memory allocator initialization code.
> >
> > OK...
> >
> > Where are we up to with these patches, btw? Do you consider them
> > close-to-complete? Do you have a feel for what proportion of machines will
> > work correctly?
>
> After the rework of kexec patches, there is very minimal kernel code needed
> for kdump and most of the code is in user space kexec-tools. The changes
> needed in kexec-tools to load the crashdump kernel and generate ELF headers,
> for x86 architecture are done and will be posted for comments today by Vivek.

Cool.

> Currently the work remaining is to capture the old-kernel memory during second
> kernel boot up. There is some lack of consensus whether this functionality
> should go in kernel-space (/proc/vmcore) or user-space (a separate utility
> which can be run from initrd). Before the last kexec rework, kdump had the
> facility to do /proc/vmcore and now it has to be re-done accordingly. There is
> some code already done by Eric to do it in user-space. We are evaluating both
> the approaches and should arrive at the conclusion asap.

Do you have a pointer to your user space kdump stuff? I have never
seen it.

How to configure this and the usability issues are interesting. There is
no fundamental reason the code needs to live in a ramdisk. We are back
in a fully functional kernel after all. In this case a
ramdisk/initramfs is useful for the same reason a ramdisk with a
rescue disk is useful. It is possible the normal root filesystem is
corrupt. A ramdisk allows you to have a known good copy of your
tools.

Eric

2005-02-24 13:41:32

by Maneesh Soni

Subject: Re: [Fastboot] Re: [PATCH] Fix for broken kexec on panic

On Thu, Feb 24, 2005 at 06:05:45AM -0700, Eric W. Biederman wrote:
> Maneesh Soni <[email protected]> writes:
>
> > On Thu, Feb 24, 2005 at 01:13:12AM -0800, Andrew Morton wrote:
> > > Vivek Goyal <[email protected]> wrote:
> > > >
> > > > Kexec on panic is broken on i386 in 2.6.11-rc3-mm2 because of
> > > > re-organization of boot memory allocator initialization code.
> > >
> > > OK...
> > >
> > > Where are we up to with these patches, btw? Do you consider them
> > > close-to-complete? Do you have a feel for what proportion of machines will
> > > work correctly?
> >
> > After the rework of kexec patches, there is very minimal kernel code needed
> > for kdump and most of the code is in user space kexec-tools. The changes
> > needed in kexec-tools to load the crashdump kernel and generate ELF headers,
> > for x86 architecture are done and will be posted for comments today by Vivek.
>
> Cool.
>
> > Currently the work remaining is to capture the old-kernel memory during second
> > kernel boot up. There is some lack of consensus whether this functionality
> > should go in kernel-space (/proc/vmcore) or user-space (a separate utility
> > which can be run from initrd). Before the last kexec rework, kdump had the
> > facility to do /proc/vmcore and now it has to be re-done accordingly. There is
> > some code already done by Eric to do it in user-space. We are evaluating both
> > the approaches and should arrive at the conclusion asap.
>
> Do you have a pointer to your user space kdump stuff? I have never
> seen it.

Actually I meant the user space utility done by you. IMO, the /proc/vmcore
option saves one from the hassle of a _new_ user space tool to configure /
capture crash dumps. Though a user-space tool in addition to /proc/vmcore
can also be useful in a badly crashed system.

> How to configure this and the usability issues are interesting. There is
> no fundamental reason the code needs to live in a ramdisk. We are back
> in a fully functional kernel after all. In this case a
> ramdisk/initramfs is useful for the same reason a ramdisk with a
> rescue disk is useful. It is possible the normal root filesystem is
> corrupt. A ramdisk allows you to have a known good copy of your
> tools.
>
> Eric

--
Maneesh Soni
Linux Technology Center,
IBM India Software Labs,
Bangalore, India
email: [email protected]
Phone: 91-80-25044990

2005-02-24 15:39:34

by Dave Hansen

Subject: Re: [PATCH] Fix for broken kexec on panic

On Thu, 2005-02-24 at 14:43 +0530, Vivek Goyal wrote:
> Kexec on panic is broken on i386 in 2.6.11-rc3-mm2 because of
> re-organization of boot memory allocator initialization code. Primary
> kernel does not boot if kexec is enabled and crashkernel=X@Y command
> line parameter is passed. After re-organization, kexec is trying to call
> reserve_bootmem before boot memory allocator has initialized.
>
> This patch fixes the problem. I have moved the call to
> reserve_bootmem() for kexec for both discontig and contig memory into
> new setup_bootmem_allocator().
>
> This patch has been generated against 2.6.11-rc4-mm1

Looks like a good change, especially since it reduces the total amount
of code (and the size of your patch).

Although, to make any potential merging easier, it is almost always
better to put those kinds of things in functions #ifdef'd in a header.
The fact that there are other #ifdefs in setup_bootmem_allocator() is a
partial excuse, but not a very good one. :)
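
The header idiom referred to above might look roughly like this. It is
only a sketch: CONFIG_KEXEC is defined by hand here instead of by the
kernel configuration, and reserve_crashkernel_region() and
setup_allocator_model() are hypothetical names, not functions from the
patch.

```c
/* Hand-defined for the example; a real kernel build gets this from its
 * configuration. */
#define CONFIG_KEXEC 1

/* ---- this part would live in a header ---- */
#ifdef CONFIG_KEXEC
/* Real version: would reserve the crash kernel region. Here it only
 * reports that it ran. */
static inline int reserve_crashkernel_region(void)
{
    return 1;
}
#else
/* Stub version: compiles away when kexec is not configured. */
static inline int reserve_crashkernel_region(void)
{
    return 0;
}
#endif

/* ---- this part is the caller, free of #ifdefs ---- */
int setup_allocator_model(void)
{
    /* ... allocator initialization would happen here ... */
    return reserve_crashkernel_region();
}
```

The caller reads the same whether or not the option is enabled, which is
what makes later merges easier.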

-- Dave