2007-05-16 08:23:32

by Jeremy Fitzhardinge

Subject: 2.6.22-rc1-mm1: boot failure under qemu

2.6.22-rc1-mm1 doesn't boot for me under qemu or kvm. Under qemu it
just hangs sullenly, but kvm gives a more useful dump:

(qemu) exception 13 (0)
rax 000000004050ffff rbx 0000000000009000 rcx 0000000000000000 rdx 0000000000007b00
rsi 000000000001fc05 rdi 0000000000040000 rsp 0000000000008f9a rbp 0000000000008100
r8 0000000000000000 r9 0000000000000000 r10 0000000000000000 r11 0000000000000000
r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
rip 0000000000001062 rflags 00033046
cs 9000 (00090000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds 9000 (00090000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es 8100 (00081000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss 9000 (00090000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs 9900 (00099000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs 9000 (00090000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr 0000 (30850000/00002088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt fa4e4/30
idt 0/3ff
cr0 60000010 cr2 0 cr3 0 cr4 0 cr8 0 efer 0

rip/eip 0x1062 seems to correspond to:

        else
                while (num != 0)
                        tmp[i++] = digits[do_div(num, base)];
1050: 66 89 f0 mov %esi,%eax
1053: 66 31 d2 xor %edx,%edx
1056: 66 f7 f5 div %ebp
1059: 66 89 c6 mov %eax,%esi
105c: 67 66 8b 44 24 28 addr32 mov 0x28(%esp),%eax
1062: 67 8a 14 10 addr32 mov (%eax,%edx,1),%dl
1066: 67 66 8b 44 24 2c addr32 mov 0x2c(%esp),%eax
106c: 67 88 54 04 3e addr32 mov %dl,0x3e(%esp,%eax,1)


0x1062 is in number (/home/jeremy/hg/xen/paravirt/linux/arch/i386/boot/printf.c:109).
104             i = 0;
105             if (num == 0)
106                     tmp[i++] = '0';
107             else
108                     while (num != 0)
109                             tmp[i++] = digits[do_div(num, base)];
110             if (i > precision)
111                     precision = i;
112             size -= precision;
113             if (!(type & (ZEROPAD + LEFT)))


I haven't tried booting on real hardware, but this is a definite
regression from the old setup code.

J


2007-05-16 15:47:32

by H. Peter Anvin

Subject: Re: 2.6.22-rc1-mm1: boot failure under qemu

Jeremy Fitzhardinge wrote:
> rax 000000004050ffff rbx 0000000000009000 rcx 0000000000000000 rdx 0000000000007b00
> rsi 000000000001fc05 rdi 0000000000040000 rsp 0000000000008f9a rbp 0000000000008100
> r8 0000000000000000 r9 0000000000000000 r10 0000000000000000 r11 0000000000000000
> r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
> rip 0000000000001062 rflags 00033046
> cs 9000 (00090000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
> ds 9000 (00090000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
> es 8100 (00081000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
> ss 9000 (00090000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
> fs 9900 (00099000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
> gs 9000 (00090000/0000ffff p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)

>
>         else
>                 while (num != 0)
>                         tmp[i++] = digits[do_div(num, base)];
> 1050: 66 89 f0 mov %esi,%eax
> 1053: 66 31 d2 xor %edx,%edx
> 1056: 66 f7 f5 div %ebp
> 1059: 66 89 c6 mov %eax,%esi
> 105c: 67 66 8b 44 24 28 addr32 mov 0x28(%esp),%eax
> 1062: 67 8a 14 10 addr32 mov (%eax,%edx,1),%dl
> 1066: 67 66 8b 44 24 2c addr32 mov 0x2c(%esp),%eax
> 106c: 67 88 54 04 3e addr32 mov %dl,0x3e(%esp,%eax,1)
>
>
> 0x1062 is in number (/home/jeremy/hg/xen/paravirt/linux/arch/i386/boot/printf.c:109).
> 104             i = 0;
> 105             if (num == 0)
> 106                     tmp[i++] = '0';
> 107             else
> 108                     while (num != 0)
> 109                             tmp[i++] = digits[do_div(num, base)];
> 110             if (i > precision)
> 111                     precision = i;
> 112             size -= precision;
> 113             if (!(type & (ZEROPAD + LEFT)))
>
>
> I haven't tried booting on real hardware, but this is a definite
> regression from the old setup code.
>

Hmmm...

There are a number of highly odd things about your dump, in particular,
%es == 0x8100 at this point, which means the constraint %cs == %ds ==
%es == %ss has been violated in this code; this should only happen
locally inside an assembly routine or asm() statement. Another bizarre
thing is that %ebp, which apparently is supposed to contain the base at
this point, is *also* set to 0x8100.

Finally, the total zinger is the flags -- VM RF IOPL=3. In real mode.
That's nuttier than Dick Cheney.

I have been using Qemu (as well as Bochs) to develop and test the code,
so obviously it Works For Me[TM]. Please describe the entry conditions
in more detail; in particular, what did you use to load the kernel?

Also, could you send me your .config and simulation image?

-hpa

2007-05-16 16:30:23

by Jeremy Fitzhardinge

Subject: Re: 2.6.22-rc1-mm1: boot failure under qemu

H. Peter Anvin wrote:
> There are a number of highly odd things about your dump, in particular,
> %es == 0x8100 at this point, which means the constraint %cs == %ds ==
> %es == %ss has been violated in this code; this should only happen
> locally inside an assembly routine or asm() statement. Another bizarre
> thing is that %ebp, which apparently is supposed to contain the base at
> this point, is *also* set to 0x8100.
>

Yeah, I noticed the segment state was all over the place.

> I have been using Qemu (as well as Bochs) to develop and test the code,
> so obviously it Works For Me[TM]. Please describe the entry conditions
> in more detail; in particular, what did you use to load the kernel?
>

qemu -kernel bzImage -hda filesystem.img

> Also, could you send me your .config and simulation image?
>

Yep, separate mail.

J

2007-05-16 18:00:54

by H. Peter Anvin

Subject: Re: 2.6.22-rc1-mm1: boot failure under qemu

Okay, I've established that this is a bug in the Qemu kernel loader: the
Qemu loader puts zero in the loadflags, which is wrong no matter how you
slice it.

I have checked in a workaround in the git.newsetup tree; the workaround
is to rely on a compile-time value for load low/load high instead of
looking at loadflags.

-hpa

2007-05-16 18:25:09

by Jeremy Fitzhardinge

Subject: Re: 2.6.22-rc1-mm1: boot failure under qemu

H. Peter Anvin wrote:
> Okay, I've established that this is a bug in the Qemu kernel loader: the
> Qemu loader puts zero in the loadflags, which is wrong no matter how you
> slice it.
>
> I have checked in a workaround in the git.newsetup tree; the workaround
> is to rely on a compile-time value for load low/load high instead of
> looking at loadflags.
>

Can you post a patch to try?

J

2007-05-16 18:53:16

by Nish Aravamudan

Subject: Re: 2.6.22-rc1-mm1: boot failure under qemu

On 5/16/07, Jeremy Fitzhardinge wrote:
> H. Peter Anvin wrote:
> > Okay, I've established that this is a bug in the Qemu kernel loader: the
> > Qemu loader puts zero in the loadflags, which is wrong no matter how you
> > slice it.
> >
> > I have checked in a workaround in the git.newsetup tree; the workaround
> > is to rely on a compile-time value for load low/load high instead of
> > looking at loadflags.
> >
>
> Can you post a patch to try?

You can snag it from gitweb:

http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-newsetup.git;a=commit;h=a1608be536b7e60362923c5bdc9f3ab3ddd27ee5

The patch itself:

http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-newsetup.git;a=commitdiff;h=a1608be536b7e60362923c5bdc9f3ab3ddd27ee5;hp=92d07d79f86a778f253001a7cf0758b49f39eb77

Thanks,
Nish

2007-05-16 19:08:07

by H. Peter Anvin

Subject: Re: 2.6.22-rc1-mm1: boot failure under qemu

diff --git a/arch/i386/boot/Makefile b/arch/i386/boot/Makefile
index 6792d09..a2b3f93 100644
--- a/arch/i386/boot/Makefile
+++ b/arch/i386/boot/Makefile
@@ -62,6 +62,7 @@ AFLAGS := $(CFLAGS) -D__ASSEMBLY__
$(obj)/zImage: IMAGE_OFFSET := 0x1000
$(obj)/zImage: EXTRA_AFLAGS := $(SVGA_MODE) $(RAMDISK)
$(obj)/bzImage: IMAGE_OFFSET := 0x100000
+$(obj)/bzImage: EXTRA_CFLAGS := -D__BIG_KERNEL__
$(obj)/bzImage: EXTRA_AFLAGS := $(SVGA_MODE) $(RAMDISK) -D__BIG_KERNEL__
$(obj)/bzImage: BUILDFLAGS := -b

diff --git a/arch/i386/boot/main.c b/arch/i386/boot/main.c
index 5f4d99d..873c777 100644
--- a/arch/i386/boot/main.c
+++ b/arch/i386/boot/main.c
@@ -112,6 +112,10 @@ void main(void)
if (boot_params.hdr.loadflags & CAN_USE_HEAP) {
heap_end = (char *)(boot_params.hdr.heap_end_ptr
+0x200-STACK_SIZE);
+ } else {
+ /* Boot protocol 2.00 only, no heap available */
+ puts("WARNING: Ancient bootloader, some functionality "
+ "may be limited!\n");
}

/* Make sure we have all the proper CPU support */
diff --git a/arch/i386/boot/memory.c b/arch/i386/boot/memory.c
index 8a82aa9..d7b250b 100644
--- a/arch/i386/boot/memory.c
+++ b/arch/i386/boot/memory.c
@@ -30,7 +30,7 @@ static int detect_memory_e820(void)
size = sizeof(struct e820entry);
id = SMAP;
asm("int $0x15; setc %0"
- : "=dm" (err), "+b" (next), "+d" (id), "+c" (size),
+ : "=am" (err), "+b" (next), "+d" (id), "+c" (size),
"=m" (*desc)
: "D" (desc), "a" (0xe820));

diff --git a/arch/i386/boot/pm.c b/arch/i386/boot/pm.c
index 1c586f1..7af65f9 100644
--- a/arch/i386/boot/pm.c
+++ b/arch/i386/boot/pm.c
@@ -41,12 +41,13 @@ static void realmode_switch_hook(void)
*/
static void move_kernel_around(void)
{
+ /* Note: rely on the compile-time option here rather than
+ the LOADED_HIGH flag. The Qemu kernel loader unconditionally
+ sets the loadflags to zero. */
+#ifndef __BIG_KERNEL__
u16 dst_seg, src_seg;
u32 syssize;

- if (boot_params.hdr.loadflags & LOADED_HIGH)
- return;
-
dst_seg = 0x1000 >> 4;
src_seg = 0x10000 >> 4;
syssize = boot_params.hdr.syssize; /* Size in 16-byte paragraps */
@@ -72,6 +73,7 @@ static void move_kernel_around(void)
dst_seg += paras;
src_seg += paras;
}
+#endif
}

/*



2007-05-17 00:47:27

by Jeremy Fitzhardinge

Subject: Re: 2.6.22-rc1-mm1: boot failure under qemu

H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>
>> H. Peter Anvin wrote:
>>
>>> Okay, I've established that this is a bug in the Qemu kernel loader: the
>>> Qemu loader puts zero in the loadflags, which is wrong no matter how you
>>> slice it.
>>>
>>> I have checked in a workaround in the git.newsetup tree; the workaround
>>> is to rely on a compile-time value for load low/load high instead of
>>> looking at loadflags.
>>>
>>>
>> Can you post a patch to try?
>>
>>
>
> Cumulative diff from -rc1-mm1.
>

Thanks, this works for me.

J