2007-09-09 08:39:45

by Bernhard Walle

[permalink] [raw]
Subject: [patch 1/5] Extended crashkernel command line

This is the generic part of the patch. It adds a parse_crashkernel() function
in kernel/kexec.c that is called by the architecture specific code that
actually reserves the memory. That function takes the whole command line and
looks itself for "crashkernel=" in it.

If there are multiple occurrences, then the last one is taken. The advantage
is that if you have a bootloader like lilo or elilo which allows you to append
a command line parameter but not to remove one (like in GRUB), then you can add
another crashkernel value for testing at the boot command line and this one
overwrites the command line in the configuration then.


Signed-off-by: Bernhard Walle <[email protected]>

---
include/linux/kexec.h | 2
kernel/kexec.c | 139 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 141 insertions(+)

--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -179,6 +179,8 @@ extern note_buf_t *crash_notes;
extern u32 vmcoreinfo_note[VMCOREINFO_NOTE_SIZE/4];
extern unsigned int vmcoreinfo_size;
extern unsigned int vmcoreinfo_max_size;
+int parse_crashkernel(char *cmdline, unsigned long long system_ram,
+ unsigned long long *crash_size, unsigned long long *crash_base);


#else /* !CONFIG_KEXEC */
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1146,6 +1146,145 @@ static int __init crash_notes_memory_ini
}
module_init(crash_notes_memory_init)

+
+/*
+ * parsing the "crashkernel" commandline
+ *
+ * this code is intended to be called from architecture specific code
+ */
+
+
+/*
+ * This function parses command lines in the format
+ *
+ * crashkernel=<ramsize-range>:<size>[,...][@<base>]
+ *
+ * The function returns 0 on success and -EINVAL on failure.
+ */
+static int parse_crashkernel_mem(char *cmdline,
+ unsigned long long *crash_size,
+ unsigned long long *crash_base,
+ unsigned long system_ram)
+{
+ char *cur = cmdline;
+
+ *crash_size = 0;
+ *crash_base = 0;
+
+ /* for each entry of the comma-separated list */
+ do {
+ unsigned long long start = 0, end = ULLONG_MAX;
+ unsigned long long size = -1;
+
+ /* get the start of the range */
+ start = memparse(cur, &cur);
+ if (*cur != '-') {
+ printk(KERN_WARNING "crashkernel: '-' expected\n");
+ return -EINVAL;
+ }
+ cur++;
+
+ /* if no ':' is here, than we read the end */
+ if (*cur != ':') {
+ end = memparse(cur, &cur);
+ if (end <= start) {
+ printk(KERN_WARNING "crashkernel: end <= start\n");
+ return -EINVAL;
+ }
+ }
+
+ if (*cur != ':') {
+ printk(KERN_WARNING "crashkernel: ':' expected\n");
+ return -EINVAL;
+ }
+ cur++;
+
+ size = memparse(cur, &cur);
+ if (size < 0) {
+ printk(KERN_WARNING "crashkernel: invalid size\n");
+ return -EINVAL;
+ }
+
+ /* match ? */
+ if (system_ram >= start && system_ram <= end) {
+ *crash_size = size;
+ break;
+ }
+ } while (*cur++ == ',');
+
+ if (*crash_size > 0) {
+ while (*cur != ' ' && *cur != '@')
+ cur++;
+ if (*cur == '@')
+ *crash_base = memparse(cur+1, &cur);
+ }
+
+ return 0;
+}
+
+/*
+ * That function parses "simple" (old) crashkernel command lines like
+ *
+ * crashkernel=size[@base]
+ *
+ * It returns 0 on success and -EINVAL on failure.
+ */
+static int parse_crashkernel_simple(char *cmdline,
+ unsigned long long *crash_size,
+ unsigned long long *crash_base)
+{
+ char *cur = cmdline;
+
+ *crash_size = memparse(cmdline, &cur);
+ if (cmdline == cur)
+ return -EINVAL;
+
+ if (*cur == '@')
+ *crash_base = memparse(cur+1, &cur);
+
+ return 0;
+}
+
+/*
+ * That function is the entry point for command line parsing and should be
+ * called from the arch-specific code.
+ */
+int parse_crashkernel(char *cmdline,
+ unsigned long long system_ram,
+ unsigned long long *crash_size,
+ unsigned long long *crash_base)
+{
+ char *p = cmdline, *ck_cmdline = NULL;
+ char *first_colon, *first_space;
+
+ /* find crashkernel and use the last one if there are more */
+ p = strstr(p, "crashkernel=");
+ while (p) {
+ ck_cmdline = p;
+ p = strstr(p+1, "crashkernel=");
+ }
+
+ if (!ck_cmdline)
+ return -EINVAL;
+
+ ck_cmdline += 12; /* strlen("crashkernel=") */
+
+ /*
+ * if the commandline contains a ':', then that's the extended
+ * syntax -- if not, it must be the classic syntax
+ */
+ first_colon = strchr(ck_cmdline, ':');
+ first_space = strchr(ck_cmdline, ' ');
+ if (first_colon && (!first_space || first_colon < first_space))
+ return parse_crashkernel_mem(ck_cmdline, crash_size,
+ crash_base, system_ram);
+ else
+ return parse_crashkernel_simple(ck_cmdline,
+ crash_size, crash_base);
+}
+
+
+
void crash_save_vmcoreinfo(void)
{
u32 *buf;

--


2007-09-11 06:26:57

by Vivek Goyal

[permalink] [raw]
Subject: Re: [patch 1/5] Extended crashkernel command line

On Sun, Sep 09, 2007 at 10:39:15AM +0200, Bernhard Walle wrote:

[..]
> +
> +/*
> + * parsing the "crashkernel" commandline
> + *
> + * this code is intended to be called from architecture specific code
> + */
> +
> +
> +/*
> + * This function parses command lines in the format
> + *
> + * crashkernel=<ramsize-range>:<size>[,...][@<base>]
> + *
> + * The function returns 0 on success and -EINVAL on failure.
> + */
> +static int parse_crashkernel_mem(char *cmdline,
> + unsigned long long *crash_size,
> + unsigned long long *crash_base,
> + unsigned long system_ram)
> +{
> + char *cur = cmdline;
> +
> + *crash_size = 0;
> + *crash_base = 0;
> +
> + /* for each entry of the comma-separated list */
> + do {
> + unsigned long long start = 0, end = ULLONG_MAX;
> + unsigned long long size = -1;
> +
> + /* get the start of the range */
> + start = memparse(cur, &cur);
> + if (*cur != '-') {
> + printk(KERN_WARNING "crashkernel: '-' expected\n");
> + return -EINVAL;
> + }
> + cur++;
> +
> + /* if no ':' is here, than we read the end */
> + if (*cur != ':') {
> + end = memparse(cur, &cur);
> + if (end <= start) {
> + printk(KERN_WARNING "crashkernel: end <= start\n");
> + return -EINVAL;
> + }
> + }
> +
> + if (*cur != ':') {
> + printk(KERN_WARNING "crashkernel: ':' expected\n");
> + return -EINVAL;
> + }
> + cur++;
> +
> + size = memparse(cur, &cur);
> + if (size < 0) {
> + printk(KERN_WARNING "crashkernel: invalid size\n");
> + return -EINVAL;
> + }
> +
> + /* match ? */
> + if (system_ram >= start && system_ram <= end) {
> + *crash_size = size;
> + break;
> + }
> + } while (*cur++ == ',');
> +
> + if (*crash_size > 0) {
> + while (*cur != ' ' && *cur != '@')
> + cur++;
> + if (*cur == '@')
> + *crash_base = memparse(cur+1, &cur);

"offset" seems to be optional in the new syntax. What happens if user does
not specify offset. I think crash_base will be set to zero and system will
try to reserve x amount of memory start at zero? That would fail?

I think we should add some intelligence for automatic selection of "offset"
if user has not specified one. Automatically choose a chunk of free memory.
This takes away the headache from user for selecting a right place. In fact
one "offset" might not be valid for all the systems. I remember, somebody
had reported that ACPI tables were mapped lower in the address space and
reserving memory for kdump had failed.

Choosing the "offset" automatically should help distributions where one
command line is supposed to work on all the systems.

Thanks
Vivek

2007-09-11 10:01:22

by Bernhard Walle

[permalink] [raw]
Subject: Re: [patch 1/5] Extended crashkernel command line

* Vivek Goyal <[email protected]> [2007-09-11 08:15]:
>
> "offset" seems to be optional in the new syntax. What happens if user does
> not specify offset. I think crash_base will be set to zero and system will
> try to reserve x amount of memory start at zero? That would fail?

That's handled in the architecture specific code -- because it's
different on each architecture and the architecture specific code does
memory reservation. IA64 already can handle this case (on IA64,
specifying 0 is the same than leaving out the base address, and that's
why I wanted to keep that semantics). I think it doesn't also make
sense on i386/x86_64 to choose 0 as real base address, because the
value below 1 MB is special for booting ...

> I think we should add some intelligence for automatic selection of "offset"
> if user has not specified one. Automatically choose a chunk of free memory.
> This takes away the headache from user for selecting a right place. In fact
> one "offset" might not be valid for all the systems. I remember, somebody
> had reported that ACPI tables were mapped lower in the address space and
> reserving memory for kdump had failed.

I think you're right -- with the relocatable kernel it makes sense to
have the kernel to decide where to reservate that memory. But that's
beyond of the scope of that patch series -- however, I can come up
with a solution, after that patch has been finished.


Thanks,
Bernhard

2007-09-11 13:14:55

by Olaf Dabrunz

[permalink] [raw]
Subject: Re: [patch 1/5] Extended crashkernel command line

On 09-Sep-07, Bernhard Walle wrote:
> If there are multiple occurrences, then the last one is taken. The advantage
> is that if you have a bootloader like lilo or elilo which allows you to append
> a command line parameter but not to remove one (like in GRUB), then you can add
> another crashkernel value for testing at the boot command line and this one
> overwrites the command line in the configuration then.

AFAIK it is the other way round: the lilo configuration-parameter
"append" appends the kernel parameters in the configuration to the
parameters given by the user on the kernel command line.

So the parameters set by the user during boot appear first on the
command line.

> Signed-off-by: Bernhard Walle <[email protected]>
>
> ---
> include/linux/kexec.h | 2
> kernel/kexec.c | 139 ++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 141 insertions(+)
>
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -179,6 +179,8 @@ extern note_buf_t *crash_notes;
> extern u32 vmcoreinfo_note[VMCOREINFO_NOTE_SIZE/4];
> extern unsigned int vmcoreinfo_size;
> extern unsigned int vmcoreinfo_max_size;
> +int parse_crashkernel(char *cmdline, unsigned long long system_ram,
> + unsigned long long *crash_size, unsigned long long *crash_base);
>
> [...]
>
> +int parse_crashkernel(char *cmdline,
> + unsigned long long system_ram,
> + unsigned long long *crash_size,
> + unsigned long long *crash_base)
> +{
> + char *p = cmdline, *ck_cmdline = NULL;
> + char *first_colon, *first_space;
> +
> + /* find crashkernel and use the last one if there are more */
> + p = strstr(p, "crashkernel=");
> + while (p) {
> + ck_cmdline = p;
> + p = strstr(p+1, "crashkernel=");
> + }
> +

So perhaps something like this instead:

+ /* find crashkernel and use the first one if there are more */
+ p = strstr(p, "crashkernel=");
+ ck_cmdline = p;

> + if (!ck_cmdline)
> + return -EINVAL;
> +
> + ck_cmdline += 12; /* strlen("crashkernel=") */
> +
> + /*
> + * if the commandline contains a ':', then that's the extended
> + * syntax -- if not, it must be the classic syntax
> + */
> + first_colon = strchr(ck_cmdline, ':');
> + first_space = strchr(ck_cmdline, ' ');
> + if (first_colon && (!first_space || first_colon < first_space))
>
> [...]

--
Olaf Dabrunz (od/odabrunz), SUSE Linux Products GmbH, Nürnberg

2007-09-11 15:28:19

by Lombard, David N

[permalink] [raw]
Subject: Re: [patch 1/5] Extended crashkernel command line

On Tue, Sep 11, 2007 at 03:14:45PM +0200, Olaf Dabrunz wrote:
> On 09-Sep-07, Bernhard Walle wrote:
> > If there are multiple occurrences, then the last one is taken. The advantage
> > is that if you have a bootloader like lilo or elilo which allows you to append
> > a command line parameter but not to remove one (like in GRUB), then you can add
> > another crashkernel value for testing at the boot command line and this one
> > overwrites the command line in the configuration then.
>
> AFAIK it is the other way round: the lilo configuration-parameter
> "append" appends the kernel parameters in the configuration to the
> parameters given by the user on the kernel command line.
>
> So the parameters set by the user during boot appear first on the
> command line.

Given the kernel itself processes the cmdline from front to back,
that suggests the last value is the correct value.


A survey of bootloaders shows differing behaviors:

syslinux et al.:
APPEND options...
Add one or more options to the kernel command line. These are
added both for automatic and manual boots. The options are
added at the very beginning of the kernel command line,
usually permitting explicitly entered kernel options to override
them. This is the equivalent of the LILO "append" option.

lilo:
append=<string>

Appends the options specified to the parameter line passed to
the kernel. This is typically used to specify hardware parameters
that can't be entirely auto-detected or for which probing may be
dangerous. Multiple kernel parameters are separated by a blank
space, and the string must be enclosed in double quotes. A local
append= appearing withing an image= section overrides any
global append= appearing in the top section of the configuration file.
Append= may be used only once per "image=" section.

[There's also a local addappend string that's appended to the append string]

elilo:
append=string Append a string of options to kernel command line
...
The user can specify a kernel and related kernel options using the image
label. Alternatively, the user can also specify a kernel file that is
not specified in the config file. In any case, some of the global options
(such as append) are always concatenated to whatever the user type.

Grub:
Interactive editing of the command line...

kboot
Config file entries precede user's values from keyboard



--
David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.

2007-09-11 17:21:49

by Bernhard Walle

[permalink] [raw]
Subject: Re: [patch 1/5] Extended crashkernel command line

* Lombard, David N <[email protected]> [2007-09-11 17:32]:
>
> lilo:
> append=<string>
>
> Appends the options specified to the parameter line passed to
> the kernel.

Given that lilo appends the user-specified command line, I think it's
the best to honor the last value. It has also been the previous
behaviour (execpt IA64).

Having a different behaviour if the last or the first value is honored
on different architectures is not what we want, IMO.


Thanks,
Bernhard

2007-09-12 11:23:18

by Vivek Goyal

[permalink] [raw]
Subject: Re: [patch 1/5] Extended crashkernel command line

On Tue, Sep 11, 2007 at 12:01:10PM +0200, Bernhard Walle wrote:
> * Vivek Goyal <[email protected]> [2007-09-11 08:15]:
> >
> > "offset" seems to be optional in the new syntax. What happens if user does
> > not specify offset. I think crash_base will be set to zero and system will
> > try to reserve x amount of memory start at zero? That would fail?
>
> That's handled in the architecture specific code -- because it's
> different on each architecture and the architecture specific code does
> memory reservation. IA64 already can handle this case (on IA64,
> specifying 0 is the same than leaving out the base address, and that's
> why I wanted to keep that semantics). I think it doesn't also make
> sense on i386/x86_64 to choose 0 as real base address, because the
> value below 1 MB is special for booting ...
>

Ok. I see IA64 is handling this case. But in current patchset, i386 and
x86_64 will try to reserve memory starting at zero? So we still got
to handle this case in i386 and x86_64?

Thanks
Vivek

2007-09-12 11:35:48

by Bernhard Walle

[permalink] [raw]
Subject: Re: [patch 1/5] Extended crashkernel command line

* Vivek Goyal <[email protected]> [2007-09-12 13:23]:
> On Tue, Sep 11, 2007 at 12:01:10PM +0200, Bernhard Walle wrote:
> > * Vivek Goyal <[email protected]> [2007-09-11 08:15]:
> > >
> > > "offset" seems to be optional in the new syntax. What happens if user does
> > > not specify offset. I think crash_base will be set to zero and system will
> > > try to reserve x amount of memory start at zero? That would fail?
> >
> > That's handled in the architecture specific code -- because it's
> > different on each architecture and the architecture specific code does
> > memory reservation. IA64 already can handle this case (on IA64,
> > specifying 0 is the same than leaving out the base address, and that's
> > why I wanted to keep that semantics). I think it doesn't also make
> > sense on i386/x86_64 to choose 0 as real base address, because the
> > value below 1 MB is special for booting ...
> >
>
> Ok. I see IA64 is handling this case. But in current patchset, i386 and
> x86_64 will try to reserve memory starting at zero? So we still got
> to handle this case in i386 and x86_64?

Yes, my fault. I need to replace

+ if (ret == 0 && crash_size > 0) {

with

+ if (ret == 0 && crash_size > 0 && crash_base > 0) {

I'll repost the whole patch with all the corrections when I finished
PPC64 and SH. (I'm not in office this week, that's why I'm a bit slow.)


Thanks,
Bernhard