2008-12-09 23:56:42

by Zachary Amsden

[permalink] [raw]
Subject: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

Patches backported into 2.6.27.4 caused a regression with VMI kernels
running on VMware which ends in a page fault during boot. I have a fix
which still allows DMI checks to be done early.

VMI initialiation can relocate the fixmap, causing early_ioremap
to malfunction if it is initialized before the relocation. The
ioremap area is low enough in virtual address space that no actual
collision occurs, however, because the pagetables for it were not
allocated under VMI mode, the pagetable updates are dropped by
the hypervisor as irrelevant, resulting in a crash on boot.

The best fix is perhaps to move early_ioremap_init() after vmi_init().
The only things done before VMI init are basic memory access, things
like collating the memory map, collecting boot CPUID capabilities, and
parsing the early command line options... which vmi_init needs.

Since this went back into 2.6.27, it needs to go to both 2.6.28 and
eventually to stable. I didn't add any comments or anything as there
could be some debate what the proper ordering should be. In case that
becomes an interesting discussion, there are two relevant facts in git
today:

1) no clients of early_ioremap occur before DMI.
2) VMI requires access to early boot params.

If any can suggest a better ordering, I am certainly open to that as
well.

Thanks,

Zach


Attachments:
x86-vmi-boot-ioremap-fix.patch (0.98 kB)

2008-12-10 00:47:18

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

On Tue, Dec 09, 2008 at 04:50:21PM -0800, Zachary Amsden wrote:
> Patches backported into 2.6.27.4 caused a regression with VMI kernels
> running on VMware which ends in a page fault during boot. I have a fix
> which still allows DMI checks to be done early.

Is this regression also present on 2.6.28-rc kernels as well?

thanks,

greg k-h

2008-12-10 01:15:24

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

On Tue, Dec 9, 2008 at 4:50 PM, Zachary Amsden <[email protected]> wrote:
> Patches backported into 2.6.27.4 caused a regression with VMI kernels
> running on VMware which ends in a page fault during boot. I have a fix
> which still allows DMI checks to be done early.
>
> The best fix is perhaps to move early_ioremap_init() after vmi_init().
> The only things done before VMI init are basic memory access, things
> like collating the memory map, collecting boot CPUID capabilities, and
> parsing the early command line options... which vmi_init needs.
>
> Since this went back into 2.6.27, it needs to go to both 2.6.28 and
> eventually to stable. I didn't add any comments or anything as there
> could be some debate what the proper ordering should be. In case that
> becomes an interesting discussion, there are two relevant facts in git
> today:
>
> 1) no clients of early_ioremap occur before DMI.
> 2) VMI requires access to early boot params.
>
> If any can suggest a better ordering, I am certainly open to that as
> well.

VMI initialiation can relocate the fixmap, causing early_ioremap
to malfunction if it is initialized before the relocation. The
ioremap area is low enough in virtual address space that no actual
collision occurs, however, because the pagetables for it were not
allocated under VMI mode, the pagetable updates are dropped by
the hypervisor as irrelevant, resulting in a crash on boot.

Signed-off-by: Zachary Amsden <[email protected]>

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 9d5674f..9627753 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -795,7 +795,6 @@ void __init setup_arch(char **cmdline_p)
#endif

early_cpu_init();
- early_ioremap_init();

ROOT_DEV = old_decode_dev(boot_params.hdr.root_dev);
screen_info = boot_params.screen_info;
@@ -888,6 +887,8 @@ void __init setup_arch(char **cmdline_p)
vmi_init();
#endif

+ early_ioremap_init();
+
/* after early param, so could get panic from serial */
reserve_early_setup_data();


you can not move that late,

parse_setup_data==>early_memremap==>__early_ioremap

YH

2008-12-10 01:53:22

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

Yinghai Lu wrote:
>
> VMI initialiation can relocate the fixmap, causing early_ioremap
> to malfunction if it is initialized before the relocation. The
> ioremap area is low enough in virtual address space that no actual
> collision occurs, however, because the pagetables for it were not
> allocated under VMI mode, the pagetable updates are dropped by
> the hypervisor as irrelevant, resulting in a crash on boot.
>

I have mentioned in the past that I think the very concept of relocating
the fixmap to be utterly braindead. Instead, I believe we should locate
it low in kernel space so it doesn't have to be relocated. It's
unfortunately a relatively large change.

-hpa

2008-12-10 06:36:22

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

On Tue, 2008-12-09 at 16:44 -0800, Greg KH wrote:
> On Tue, Dec 09, 2008 at 04:50:21PM -0800, Zachary Amsden wrote:
> > Patches backported into 2.6.27.4 caused a regression with VMI kernels
> > running on VMware which ends in a page fault during boot. I have a fix
> > which still allows DMI checks to be done early.
>
> Is this regression also present on 2.6.28-rc kernels as well?

Yes, I reproduced and fixed this on a git tree synced to mainline today.

Thanks,

Zach

2008-12-10 06:37:38

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

A. On Tue, 2008-12-09 at 17:48 -0800, H. Peter Anvin wrote:
> Yinghai Lu wrote:
> >
> > VMI initialiation can relocate the fixmap, causing early_ioremap
> > to malfunction if it is initialized before the relocation. The
> > ioremap area is low enough in virtual address space that no actual
> > collision occurs, however, because the pagetables for it were not
> > allocated under VMI mode, the pagetable updates are dropped by
> > the hypervisor as irrelevant, resulting in a crash on boot.
> >
>
> I have mentioned in the past that I think the very concept of relocating
> the fixmap to be utterly braindead. Instead, I believe we should locate
> it low in kernel space so it doesn't have to be relocated. It's
> unfortunately a relatively large change.

I agree, but a bit too late for 2.6.27 stable and 2.6.28-rc.

2008-12-10 06:42:29

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

On Tue, 2008-12-09 at 17:15 -0800, Yinghai Lu wrote:

> you can not move that late,
>
> parse_setup_data==>early_memremap==>__early_ioremap

Eww. Let me find a way to fix that. It may be as simple as detecting
the required fixmap relocation first, then initializing VMI later, when
we can parse options, but it will need testing...

Thanks for the quick response, I realize it is late in the 2.6.28-rc
phase here and I will send an updated fix tomorrow.

Zach

2008-12-10 09:10:49

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

Zachary Amsden wrote:
>>>
>> I have mentioned in the past that I think the very concept of relocating
>> the fixmap to be utterly braindead. Instead, I believe we should locate
>> it low in kernel space so it doesn't have to be relocated. It's
>> unfortunately a relatively large change.
>
> I agree, but a bit too late for 2.6.27 stable and 2.6.28-rc.
>

Of course, no argument there.

-hpa

2008-12-10 23:12:40

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

On Tue, 2008-12-09 at 17:15 -0800, Yinghai Lu wrote:

> you can not move that late,
>
> parse_setup_data==>early_memremap==>__early_ioremap

How does this look?


Attachments:
x86-vmi-boot-ioremap-fix-take-2.patch (2.39 kB)

2008-12-11 00:21:53

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

Zachary Amsden wrote:
> On Tue, 2008-12-09 at 17:15 -0800, Yinghai Lu wrote:
>
>> you can not move that late,
>>
>> parse_setup_data==>early_memremap==>__early_ioremap
>
> How does this look?
>

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 9d5674f..4c381cb 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -794,6 +794,11 @@ void __init setup_arch(char **cmdline_p)
printk(KERN_INFO "Command line: %s\n", boot_command_line);
#endif

+#ifdef CONFIG_VMI
+ /* VMI may relocate the fixmap; do this before touching ioremap area */
+ vmi_init();
+#endif
+
early_cpu_init();
early_ioremap_init();

@@ -880,12 +885,9 @@ void __init setup_arch(char **cmdline_p)
check_efer();
#endif

-#if defined(CONFIG_VMI) && defined(CONFIG_X86_32)
- /*
- * Must be before kernel pagetables are setup
- * or fixmap area is touched.
- */
- vmi_init();
+#if defined(CONFIG_VMI)
+ /* Must be before kernel pagetables are setup */
+ vmi_activate();
#endif

/* after early param, so could get panic from serial */
diff --git a/arch/x86/kernel/vmi_32.c b/arch/x86/kernel/vmi_32.c
index 8b6c393..22fd657 100644
--- a/arch/x86/kernel/vmi_32.c
+++ b/arch/x86/kernel/vmi_32.c
@@ -960,8 +960,6 @@ static inline int __init activate_vmi(void)

void __init vmi_init(void)
{
- unsigned long flags;
-
if (!vmi_rom)
probe_vmi_rom();
else
@@ -973,13 +971,21 @@ void __init vmi_init(void)

reserve_top_address(-vmi_rom->virtual_top);


it seems still have some problem.
you moved reserve_top_address before parse_parameter...

so
void __init reserve_top_address(unsigned long reserve)
{
BUG_ON(fixmaps_set > 0);
printk(KERN_INFO "Reserving virtual address space above 0x%08x\n",
(int)-reserve);
__FIXADDR_TOP = -reserve - PAGE_SIZE;
__VMALLOC_RESERVE += reserve;
}

/*
* vmalloc=size forces the vmalloc area to be exactly 'size'
* bytes. This can be used to increase (or decrease) the
* vmalloc area - the default is 128m.
*/
static int __init parse_vmalloc(char *arg)
{
if (!arg)
return -EINVAL;

/* Add VMALLOC_OFFSET to the parsed value due to vm area guard hole*/
__VMALLOC_RESERVE = memparse(arg, &arg) + VMALLOC_OFFSET;
return 0;
}
early_param("vmalloc", parse_vmalloc);

__VMALLOC_RESERVE will be overwriten by vmalloc=...

you may need to split reserve_top_address() to two functions...

YH

2008-12-11 03:45:35

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

On Wed, Dec 10, 2008 at 04:20:06PM -0800, Yinghai Lu wrote:
> Zachary Amsden wrote:
> > On Tue, 2008-12-09 at 17:15 -0800, Yinghai Lu wrote:
> >
> >> you can not move that late,
> >>
> >> parse_setup_data==>early_memremap==>__early_ioremap
> >
> > How does this look?
> >
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 9d5674f..4c381cb 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -794,6 +794,11 @@ void __init setup_arch(char **cmdline_p)
> printk(KERN_INFO "Command line: %s\n", boot_command_line);
> #endif
>
> +#ifdef CONFIG_VMI
> + /* VMI may relocate the fixmap; do this before touching ioremap area */
> + vmi_init();
> +#endif

Shouldn't the #ifdef not be needed here if the .h files are set up
properly for the vmi_init prototype? Please try to keep them out of .c
files wherever possible.

thanks,

greg k-h

2008-12-11 04:50:49

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27+ kernels

On Wed, 2008-12-10 at 16:20 -0800, Yinghai Lu wrote:
> it seems still have some problem.
> you moved reserve_top_address before parse_parameter...

I really only care about resetting fixmap_top. Adding more vmalloc
space to accomodate what we stole is just being nice... in practice this
should not be a problem.

> __VMALLOC_RESERVE will be overwriten by vmalloc=...
>
> you may need to split reserve_top_address() to two functions...

For now, misuse vmalloc=XXX at your own risk - why the need for it
anyway?

I agree, it should be cleaned up, as HPA suggests, we should move the
fixmap to a fixed place anyways, and have vmalloc area from the top of
linear memory down to the end of the physical memory mapping. But there
really isn't time to do anything better for 2.6.28.

Zach

2008-12-11 21:29:30

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27, 2.6.28 kernels

On Wed, 2008-12-10 at 19:31 -0800, Greg KH wrote:

> > +#ifdef CONFIG_VMI
> > + /* VMI may relocate the fixmap; do this before touching ioremap area */
> > + vmi_init();
> > +#endif
>
> Shouldn't the #ifdef not be needed here if the .h files are set up
> properly for the vmi_init prototype? Please try to keep them out of .c
> files wherever possible.

Yes, they should. Judging by setup.c though, you would think the
opposite... in any case I fixed it. Please apply - and yes, I tested
compile both ways.


Attachments:
x86-vmi-boot-ioremap-fix-take-3.patch (3.00 kB)

2008-12-11 21:46:27

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27, 2.6.28 kernels

On Thu, Dec 11, 2008 at 02:23:11PM -0800, Zachary Amsden wrote:
> On Wed, 2008-12-10 at 19:31 -0800, Greg KH wrote:
>
> > > +#ifdef CONFIG_VMI
> > > + /* VMI may relocate the fixmap; do this before touching ioremap area */
> > > + vmi_init();
> > > +#endif
> >
> > Shouldn't the #ifdef not be needed here if the .h files are set up
> > properly for the vmi_init prototype? Please try to keep them out of .c
> > files wherever possible.
>
> Yes, they should. Judging by setup.c though, you would think the
> opposite... in any case I fixed it. Please apply - and yes, I tested
> compile both ways.

> VMI initialiation can relocate the fixmap, causing early_ioremap
> to malfunction if it is initialized before the relocation.
> To fix this, VMI activation is split into two phases; the detection,
> which must happen before setting up ioremap, and the activation,
> which must happen after parsing early boot parameters.
>
> This fixes a crash on boot when VMI is enabled under VMware.
>
> Signed-off-by: Zachary Amsden <[email protected]>
>
> diff --git a/arch/x86/include/asm/vmi.h b/arch/x86/include/asm/vmi.h
> index b7c0dea..128958a 100644
> --- a/arch/x86/include/asm/vmi.h
> +++ b/arch/x86/include/asm/vmi.h
> @@ -223,9 +223,15 @@ struct pci_header {
> } __attribute__((packed));
>
> /* Function prototypes for bootstrapping */
> +#ifdef CONFIG_VMI
> extern void vmi_init(void);
> +extern void vmi_activate(void);
> extern void vmi_bringup(void);
> -extern void vmi_apply_boot_page_allocations(void);
> +#else
> +#define vmi_init()
> +#define vmi_activate()
> +#define vmi_bringup()
> +#endif

static inline please, don't use #defines for function prototypes, it's
not nice. See Andrew's previous rants about this for details :)

thanks,

greg k-h

2008-12-11 23:41:39

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27, 2.6.28 kernels

Greg KH wrote:
>> +#else
>> +#define vmi_init()
>> +#define vmi_activate()
>> +#define vmi_bringup()
>> +#endif
>
> static inline please, don't use #defines for function prototypes, it's
> not nice. See Andrew's previous rants about this for details :)

And if it is not possible, technically, for whatever reason, the proper
forms look like:

#define foo() ((void)0)
#define bar(x) ((void)(x))
#define baz(x,y) ((void)((x),(y)))

... which preserve side effects, even if they don't guarantee type safety.

-hpa

2008-12-12 04:43:51

by Zachary Amsden

[permalink] [raw]
Subject: Re: [PATCH] Fix VMI crash on boot in 2.6.27, 2.6.28 kernels

On Thu, 2008-12-11 at 13:45 -0800, Greg KH wrote:

>
> static inline please, don't use #defines for function prototypes, it's
> not nice. See Andrew's previous rants about this for details :)

Fixed, thanks.


Attachments:
x86-vmi-boot-ioremap-fix-take-4.patch (3.06 kB)