When specify 'reservetop=0xbadc0de' kernel parameter, the kernel will
stop booting due to a early_ioremap bug that relate to commit 8827247ff.
The root cause of boot failure problem is the value of 'slot_virt[i]'
was initialized in setup_arch->early_ioremap_init. But later in
setup_arch, the function 'parse_early_param' will modify 'FIXADDR_TOP'
when 'reservetop=0xbadc0de' being specified.
The simplest fix might be use __fix_to_virt(idx0) to get updated value
of 'FIXADDR_TOP' in '__early_ioremap' instead of reference old value
from slot_virt[slot] directly.
Changelog since v0:
-v1: When reservetop being handled then FIXADDR_TOP get adjusted, Hence
check prev_map then re-initialize slot_virt and PMD based on new
FIXADDR_TOP.
-v2: place fixup_early_ioremap hence call early_ioremap_init in
reserve_top_address to re-initialize slot_virt and corresponding PMD
when parse_reservetop
-v3: move fixup_early_ioremap out of reserve_top_address to make sure
other clients of reserve_top_address like xen/lguest won't broken
Signed-off-by: Liang Li <[email protected]>
Cc: Wang Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: Andrew Morton <[email protected]>
Acked-by: Jeremy Fitzhardinge <[email protected]>
Tested-by: Konrad Rzeszutek Wilk <[email protected]>
---
arch/x86/include/asm/io.h | 1 +
arch/x86/mm/ioremap.c | 15 +++++++++++++++
arch/x86/mm/pgtable_32.c | 1 +
3 files changed, 17 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index a1dcfa3..30a3e97 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -347,6 +347,7 @@ extern void __iomem *early_ioremap(resource_size_t phys_addr,
extern void __iomem *early_memremap(resource_size_t phys_addr,
unsigned long size);
extern void early_iounmap(void __iomem *addr, unsigned long size);
+extern void fixup_early_ioremap(void);
#define IO_SPACE_LIMIT 0xffff
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 5eb1ba7..e4ab706 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -448,6 +448,21 @@ static inline void __init early_clear_fixmap(enum fixed_addresses idx)
static void __iomem *prev_map[FIX_BTMAPS_SLOTS] __initdata;
static unsigned long prev_size[FIX_BTMAPS_SLOTS] __initdata;
+void __init fixup_early_ioremap(void)
+{
+ int i;
+ for (i = 0; i < FIX_BTMAPS_SLOTS; i++) {
+ if (prev_map[i])
+ break;
+ }
+
+ if (i < FIX_BTMAPS_SLOTS)
+ BUG_ON(1);
+
+ early_ioremap_init();
+ return;
+}
+
static int __init check_early_ioremap_leak(void)
{
int count = 0;
diff --git a/arch/x86/mm/pgtable_32.c b/arch/x86/mm/pgtable_32.c
index 1a8faf0..26eadaa 100644
--- a/arch/x86/mm/pgtable_32.c
+++ b/arch/x86/mm/pgtable_32.c
@@ -128,6 +128,7 @@ static int __init parse_reservetop(char *arg)
address = memparse(arg, &arg);
reserve_top_address(address);
+ fixup_early_ioremap();
return 0;
}
early_param("reservetop", parse_reservetop);
--
1.6.6
On 04/08/2010 05:43 PM, Liang Li wrote:
> When specify 'reservetop=0xbadc0de' kernel parameter, the kernel will
> stop booting due to a early_ioremap bug that relate to commit 8827247ff.
>
> The root cause of boot failure problem is the value of 'slot_virt[i]'
> was initialized in setup_arch->early_ioremap_init. But later in
> setup_arch, the function 'parse_early_param' will modify 'FIXADDR_TOP'
> when 'reservetop=0xbadc0de' being specified.
>
> The simplest fix might be use __fix_to_virt(idx0) to get updated value
> of 'FIXADDR_TOP' in '__early_ioremap' instead of reference old value
> from slot_virt[slot] directly.
>
While I guess this patch works OK, I have to say that I'm worried by the
need for it at all; it seems to be papering over a more serious
problem. reserve_top_address() is supposed to be called very early,
before anything has used or referenced FIXADDR_TOP. If we're seeing
problems with FIXADDR_TOP changing after it has been used, then it means
that reserve_top_address() is being called too late. Fixing that would
be the real fix.
J
> Changelog since v0:
>
> -v1: When reservetop being handled then FIXADDR_TOP get adjusted, Hence
> check prev_map then re-initialize slot_virt and PMD based on new
> FIXADDR_TOP.
>
> -v2: place fixup_early_ioremap hence call early_ioremap_init in
> reserve_top_address to re-initialize slot_virt and corresponding PMD
> when parse_reservetop
>
> -v3: move fixup_early_ioremap out of reserve_top_address to make sure
> other clients of reserve_top_address like xen/lguest won't broken
>
> Signed-off-by: Liang Li <[email protected]>
> Cc: Wang Chen <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Cc: Yinghai Lu <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Acked-by: Jeremy Fitzhardinge <[email protected]>
> Tested-by: Konrad Rzeszutek Wilk <[email protected]>
> ---
> arch/x86/include/asm/io.h | 1 +
> arch/x86/mm/ioremap.c | 15 +++++++++++++++
> arch/x86/mm/pgtable_32.c | 1 +
> 3 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> index a1dcfa3..30a3e97 100644
> --- a/arch/x86/include/asm/io.h
> +++ b/arch/x86/include/asm/io.h
> @@ -347,6 +347,7 @@ extern void __iomem *early_ioremap(resource_size_t phys_addr,
> extern void __iomem *early_memremap(resource_size_t phys_addr,
> unsigned long size);
> extern void early_iounmap(void __iomem *addr, unsigned long size);
> +extern void fixup_early_ioremap(void);
>
> #define IO_SPACE_LIMIT 0xffff
>
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index 5eb1ba7..e4ab706 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -448,6 +448,21 @@ static inline void __init early_clear_fixmap(enum fixed_addresses idx)
> static void __iomem *prev_map[FIX_BTMAPS_SLOTS] __initdata;
> static unsigned long prev_size[FIX_BTMAPS_SLOTS] __initdata;
>
> +void __init fixup_early_ioremap(void)
> +{
> + int i;
> + for (i = 0; i < FIX_BTMAPS_SLOTS; i++) {
> + if (prev_map[i])
> + break;
> + }
> +
> + if (i < FIX_BTMAPS_SLOTS)
> + BUG_ON(1);
> +
> + early_ioremap_init();
> + return;
> +}
> +
> static int __init check_early_ioremap_leak(void)
> {
> int count = 0;
> diff --git a/arch/x86/mm/pgtable_32.c b/arch/x86/mm/pgtable_32.c
> index 1a8faf0..26eadaa 100644
> --- a/arch/x86/mm/pgtable_32.c
> +++ b/arch/x86/mm/pgtable_32.c
> @@ -128,6 +128,7 @@ static int __init parse_reservetop(char *arg)
>
> address = memparse(arg, &arg);
> reserve_top_address(address);
> + fixup_early_ioremap();
> return 0;
> }
> early_param("reservetop", parse_reservetop);
>
On Thu, Apr 08, 2010 at 08:12:18PM -0700, Jeremy Fitzhardinge wrote:
> On 04/08/2010 05:43 PM, Liang Li wrote:
> > When specify 'reservetop=0xbadc0de' kernel parameter, the kernel will
> > stop booting due to a early_ioremap bug that relate to commit 8827247ff.
> >
> > The root cause of boot failure problem is the value of 'slot_virt[i]'
> > was initialized in setup_arch->early_ioremap_init. But later in
> > setup_arch, the function 'parse_early_param' will modify 'FIXADDR_TOP'
> > when 'reservetop=0xbadc0de' being specified.
> >
> > The simplest fix might be use __fix_to_virt(idx0) to get updated value
> > of 'FIXADDR_TOP' in '__early_ioremap' instead of reference old value
> > from slot_virt[slot] directly.
> >
>
> While I guess this patch works OK, I have to say that I'm worried by the
> need for it at all; it seems to be papering over a more serious
> problem. reserve_top_address() is supposed to be called very early,
> before anything has used or referenced FIXADDR_TOP. If we're seeing
> problems with FIXADDR_TOP changing after it has been used, then it means
> that reserve_top_address() is being called too late. Fixing that would
> be the real fix.
The ideal thing is FIXADDR_TOP should not be touched after
early_ioremap_init. The late call to reserve_top_address is from
parse_reservetop, aka when reservetop=0xabcd0000 being passed as kernel
commandline parameter. In setup_arch, the call sequence is:
setup_arch
-> early_ioremap_init
-> parse_early_param
-> parse_reservetop
->reserve_top_address
See, how could we solve the confliction better?
Best regards,
-Liang Li
>
> J
>
> > Changelog since v0:
> >
> > -v1: When reservetop being handled then FIXADDR_TOP get adjusted, Hence
> > check prev_map then re-initialize slot_virt and PMD based on new
> > FIXADDR_TOP.
> >
> > -v2: place fixup_early_ioremap hence call early_ioremap_init in
> > reserve_top_address to re-initialize slot_virt and corresponding PMD
> > when parse_reservetop
> >
> > -v3: move fixup_early_ioremap out of reserve_top_address to make sure
> > other clients of reserve_top_address like xen/lguest won't broken
> >
> > Signed-off-by: Liang Li <[email protected]>
> > Cc: Wang Chen <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Thomas Gleixner <[email protected]>
> > Cc: "H. Peter Anvin" <[email protected]>
> > Cc: Yinghai Lu <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Acked-by: Jeremy Fitzhardinge <[email protected]>
> > Tested-by: Konrad Rzeszutek Wilk <[email protected]>
> > ---
> > arch/x86/include/asm/io.h | 1 +
> > arch/x86/mm/ioremap.c | 15 +++++++++++++++
> > arch/x86/mm/pgtable_32.c | 1 +
> > 3 files changed, 17 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
> > index a1dcfa3..30a3e97 100644
> > --- a/arch/x86/include/asm/io.h
> > +++ b/arch/x86/include/asm/io.h
> > @@ -347,6 +347,7 @@ extern void __iomem *early_ioremap(resource_size_t phys_addr,
> > extern void __iomem *early_memremap(resource_size_t phys_addr,
> > unsigned long size);
> > extern void early_iounmap(void __iomem *addr, unsigned long size);
> > +extern void fixup_early_ioremap(void);
> >
> > #define IO_SPACE_LIMIT 0xffff
> >
> > diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> > index 5eb1ba7..e4ab706 100644
> > --- a/arch/x86/mm/ioremap.c
> > +++ b/arch/x86/mm/ioremap.c
> > @@ -448,6 +448,21 @@ static inline void __init early_clear_fixmap(enum fixed_addresses idx)
> > static void __iomem *prev_map[FIX_BTMAPS_SLOTS] __initdata;
> > static unsigned long prev_size[FIX_BTMAPS_SLOTS] __initdata;
> >
> > +void __init fixup_early_ioremap(void)
> > +{
> > + int i;
> > + for (i = 0; i < FIX_BTMAPS_SLOTS; i++) {
> > + if (prev_map[i])
> > + break;
> > + }
> > +
> > + if (i < FIX_BTMAPS_SLOTS)
> > + BUG_ON(1);
> > +
> > + early_ioremap_init();
> > + return;
> > +}
> > +
> > static int __init check_early_ioremap_leak(void)
> > {
> > int count = 0;
> > diff --git a/arch/x86/mm/pgtable_32.c b/arch/x86/mm/pgtable_32.c
> > index 1a8faf0..26eadaa 100644
> > --- a/arch/x86/mm/pgtable_32.c
> > +++ b/arch/x86/mm/pgtable_32.c
> > @@ -128,6 +128,7 @@ static int __init parse_reservetop(char *arg)
> >
> > address = memparse(arg, &arg);
> > reserve_top_address(address);
> > + fixup_early_ioremap();
> > return 0;
> > }
> > early_param("reservetop", parse_reservetop);
> >
On 04/08/2010 08:23 PM, Liang Li wrote:
> On Thu, Apr 08, 2010 at 08:12:18PM -0700, Jeremy Fitzhardinge wrote:
>
>> On 04/08/2010 05:43 PM, Liang Li wrote:
>>
>>> When specify 'reservetop=0xbadc0de' kernel parameter, the kernel will
>>> stop booting due to a early_ioremap bug that relate to commit 8827247ff.
>>>
>>> The root cause of boot failure problem is the value of 'slot_virt[i]'
>>> was initialized in setup_arch->early_ioremap_init. But later in
>>> setup_arch, the function 'parse_early_param' will modify 'FIXADDR_TOP'
>>> when 'reservetop=0xbadc0de' being specified.
>>>
>>> The simplest fix might be use __fix_to_virt(idx0) to get updated value
>>> of 'FIXADDR_TOP' in '__early_ioremap' instead of reference old value
>>> from slot_virt[slot] directly.
>>>
>>>
>> While I guess this patch works OK, I have to say that I'm worried by the
>> need for it at all; it seems to be papering over a more serious
>> problem. reserve_top_address() is supposed to be called very early,
>> before anything has used or referenced FIXADDR_TOP. If we're seeing
>> problems with FIXADDR_TOP changing after it has been used, then it means
>> that reserve_top_address() is being called too late. Fixing that would
>> be the real fix.
>>
> The ideal thing is FIXADDR_TOP should not be touched after
> early_ioremap_init. The late call to reserve_top_address is from
> parse_reservetop, aka when reservetop=0xabcd0000 being passed as kernel
> commandline parameter. In setup_arch, the call sequence is:
>
> setup_arch
> -> early_ioremap_init
> -> parse_early_param
> -> parse_reservetop
> ->reserve_top_address
>
> See, how could we solve the confliction better?
>
Well, the first question is "do we need the reservetop= kernel
parameter"? Zach added it, I think, so that VMI could be loaded
dynamically as a module. Given that VMI is deprecated anyway, I wonder
if we can just drop support for modular VMI and remove the reservetop=
kernel parameter. Or are there other uses for it?
J
On Thu, Apr 08, 2010 at 08:41:24PM -0700, Jeremy Fitzhardinge wrote:
> On 04/08/2010 08:23 PM, Liang Li wrote:
> > On Thu, Apr 08, 2010 at 08:12:18PM -0700, Jeremy Fitzhardinge wrote:
> >
> >> On 04/08/2010 05:43 PM, Liang Li wrote:
> >>
> >>> When specify 'reservetop=0xbadc0de' kernel parameter, the kernel will
> >>> stop booting due to a early_ioremap bug that relate to commit 8827247ff.
> >>>
> >>> The root cause of boot failure problem is the value of 'slot_virt[i]'
> >>> was initialized in setup_arch->early_ioremap_init. But later in
> >>> setup_arch, the function 'parse_early_param' will modify 'FIXADDR_TOP'
> >>> when 'reservetop=0xbadc0de' being specified.
> >>>
> >>> The simplest fix might be use __fix_to_virt(idx0) to get updated value
> >>> of 'FIXADDR_TOP' in '__early_ioremap' instead of reference old value
> >>> from slot_virt[slot] directly.
> >>>
> >>>
> >> While I guess this patch works OK, I have to say that I'm worried by the
> >> need for it at all; it seems to be papering over a more serious
> >> problem. reserve_top_address() is supposed to be called very early,
> >> before anything has used or referenced FIXADDR_TOP. If we're seeing
> >> problems with FIXADDR_TOP changing after it has been used, then it means
> >> that reserve_top_address() is being called too late. Fixing that would
> >> be the real fix.
> >>
> > The ideal thing is FIXADDR_TOP should not be touched after
> > early_ioremap_init. The late call to reserve_top_address is from
> > parse_reservetop, aka when reservetop=0xabcd0000 being passed as kernel
> > commandline parameter. In setup_arch, the call sequence is:
> >
> > setup_arch
> > -> early_ioremap_init
> > -> parse_early_param
> > -> parse_reservetop
> > ->reserve_top_address
> >
> > See, how could we solve the confliction better?
> >
>
> Well, the first question is "do we need the reservetop= kernel
> parameter"? Zach added it, I think, so that VMI could be loaded
> dynamically as a module. Given that VMI is deprecated anyway, I wonder
> if we can just drop support for modular VMI and remove the reservetop=
> kernel parameter. Or are there other uses for it?
Agree. We can remove the 'reservetop=' kernel parameter then the
problem dispear. :) But we don't have to.....
Regards,
-Liang Li