Jeremy Fitzhardinge wrote:
> Eric W. Biederman wrote:
>> I have several ideas on how we can make this work but first I have to
>> ask what is it that you are trying to accomplish?
>
> The requirements are:
>
> 1. the domain builder needs to get various information about the
> guest kernel by inspecting its ELF notes
Doesn't need to be ELF notes. The current (3.0.5+) domain builder has
pluggable binary parsers. Right now there are two: ELF (obviously ...)
and binary (with a multiboot-like header). Filling in information
such as virt_base is the parser's job, so a new bzImage parser added
to the domain builder could gather the needed information in a
completely different way ...
> That works OK for a kernel which is compiled to run under Xen and can't
> run in any other environment, but now that we can generate a single
> kernel which can run in any number of different environments, its
> unfortunate that we still need multiple variants of the kernel image.
Yep, although already much better than completely different kernels.
Most space of a typical distro kernel is modules which are shared even
with different kernel binaries.
> So, I have no problem in also building a boot protocol info structure,
> and passing that in %esi, so long as I can store a pointer to the
> Xen-specific info as well.
Yep, should work fine.
> I think I'd prefer to have the domain builder decompress/relocate the
> kernel from the bzImage and start it directly, rather than have it
> decompress/relocate itself,
I'd expect that to work better too.
> It depends
> on how well it can deal with having paging enabled and being in ring 1.
Xen's direct paging mode requires (leaf) page tables to be mapped
read-only, which makes page table manipulation a bit difficult. A Xen
guest has to care whenever the memory it maps is a page table. Native
doesn't.
Also switching to a completely different set of page tables isn't easy
under Xen. My xen guest kexec patches have to perform some interesting
tricks because of that ...
> Looks like it might just be a matter of starting up with "enough" memory
> mapped.
Doesn't solve the problem of having to switch from identity mapping to
the 0xc0000000 one ...
cheers,
Gerd
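
A minimal sketch of the pluggable-parser idea described above, assuming a
table of probe functions that each recognise one image format. All names
here (struct image_parser, probe_elf, probe_bzimage, identify_image) are
hypothetical and are not taken from the Xen domain builder; only the ELF
magic and the "HdrS" signature at offset 0x202 are real.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result structure: a real parser would fill in more. */
struct image_info {
	unsigned long virt_base;
};

/* Hypothetical parser interface, one entry per image format. */
struct image_parser {
	const char *name;
	/* Return 0 and fill *info if the image is recognised, else -1. */
	int (*probe)(const unsigned char *img, size_t len,
		     struct image_info *info);
};

static int probe_elf(const unsigned char *img, size_t len,
		     struct image_info *info)
{
	if (len < 4 || memcmp(img, "\177ELF", 4) != 0)
		return -1;
	info->virt_base = 0;	/* placeholder: would come from ELF notes */
	return 0;
}

static int probe_bzimage(const unsigned char *img, size_t len,
			 struct image_info *info)
{
	/* The "HdrS" signature sits at offset 0x202 of a bzImage. */
	if (len < 0x206 || memcmp(img + 0x202, "HdrS", 4) != 0)
		return -1;
	info->virt_base = 0;	/* placeholder: source is up to the parser */
	return 0;
}

static const struct image_parser parsers[] = {
	{ "elf", probe_elf },
	{ "bzimage", probe_bzimage },
};

/* Try each parser in turn; return the first matching name, or NULL. */
static const char *identify_image(const unsigned char *img, size_t len,
				  struct image_info *info)
{
	size_t i;

	for (i = 0; i < sizeof(parsers) / sizeof(parsers[0]); i++)
		if (parsers[i].probe(img, len, info) == 0)
			return parsers[i].name;
	return NULL;
}
```

The point of the table is that how virt_base is obtained stays private to
each probe function, so a bzImage parser need not use ELF notes at all.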
Gerd Hoffmann wrote:
> Doesn't need to be ELF notes. The current (3.0.5+) domain builder has
> pluggable binary parsers. Right now there are two: ELF (obviously
> ...) and binary (with a multiboot-like header). Filling the
> informations such as virt_base is a function of the parser, so when
> adding one more parser to the domain builder for bzImage kernels the
> parser could do something completely different to gather the needed
> information ...
True. But the plan is already to make bzImage an ELF file, so notes
would seem to be the best option. At worst, it could be ELF notes
wrapped in some other container, but that's not pretty.
>> That works OK for a kernel which is compiled to run under Xen and can't
>> run in any other environment, but now that we can generate a single
>> kernel which can run in any number of different environments, its
>> unfortunate that we still need multiple variants of the kernel image.
>
> Yep, although already much better than completely different kernels.
> Most space of a typical distro kernel is modules which are shared even
> with different kernel binaries.
Yep.
>> So, I have no problem in also building a boot protocol info structure,
>> and passing that in %esi, so long as I can store a pointer to the
>> Xen-specific info as well.
>
> Yep, should work fine.
>
>> I think I'd prefer to have the domain builder decompress/relocate the
>> kernel from the bzImage and start it directly, rather than have it
>> decompress/relocate itself,
>
> I'd expect that work better too.
>
>> It depends
>> on how well it can deal with having paging enabled and being in ring 1.
>
> Xen direct paging mode requiring (leaf) page tables being mapped
> read-only makes page table manipulation a bit difficult. Xen has to
> care whenever the memory it maps is a page table. Native hasn't.
>
> Also switching to a completely different set of page tables isn't easy
> under Xen. My xen guest kexec patches have to perform some intresting
> tricks because of that ...
Yeah, that's tricky. I ended up copying the Xen pagetable's pmd into
the kernel's so that they could share ptes. Making a completely new
pagetable means you need to update the RO state on both old and new.
>> Looks like it might just be a matter of starting up with "enough" memory
>> mapped.
>
> Doesn't solve the problem of having to switch from identity mapping to
> the 0xc0000000 one ...
Hm. That's right. Xen will boot a vmlinux with its pagetable
pre-constructed to map it at its virtual address. Going through bzImage
would mean it would be identity mapped, and someone early would need to
construct the virtual mapping.
But if the path is:
1. enter bzImage in 32-bit mode
2. decompress kernel
3. jump to startup_32
4. detect paravirt and choose appropriate backend
5. run Xen startup code
then the Xen startup code can construct the virtual mapping before going
on with the rest of the kernel boot - steps 1-4 can be run with identity
mapping.
J
Jeremy Fitzhardinge wrote:
>
> True. But the plan is already to make bzImage an ELF file, so notes
> would seem to be the best option. At worst, it could be ELF notes
> wrapped in some other container, but that's not pretty.
>
It's not going to happen. Too many boot loaders make assumptions about
ELF files which aren't really compatible; the entry conditions for an
ELF from a boot loader are pretty ill-defined, so I think this is a bad
idea.
At the very least, it shouldn't present the ELF magic number IMNSHO.
-hpa
H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>
>> True. But the plan is already to make bzImage an ELF file, so notes
>> would seem to be the best option. At worst, it could be ELF notes
>> wrapped in some other container, but that's not pretty.
>>
>>
>
> It's not going to happen. Too many boot loaders make assumptions about
> ELF files which aren't really compatible; the entry conditions for an
> ELF from a boot loader are pretty ill-defined, so I think this is a bad
> idea.
>
> At the very least, it shouldn't present the ELF magic number IMNSHO.
>
Hm, that's unfortunate. How about an ELF file wrapped in some other
container, so that we can easily extract a properly formed ELF file?
J
Jeremy Fitzhardinge wrote:
>
> Hm, that's unfortunate. How about an ELF file wrapped in some other
> container, so that we can easily extract a properly formed ELF file?
>
Effectively the same thing as changing the magic number. Note that the
format for bzImage is pretty rigid, and it would be *highly* undesirable
to muck that up.
-hpa
"H. Peter Anvin" <[email protected]> writes:
> Jeremy Fitzhardinge wrote:
>>
>> True. But the plan is already to make bzImage an ELF file, so notes
>> would seem to be the best option. At worst, it could be ELF notes
>> wrapped in some other container, but that's not pretty.
>>
>
> It's not going to happen. Too many boot loaders make assumptions about
> ELF files which aren't really compatible; the entry conditions for an
> ELF from a boot loader are pretty ill-defined, so I think this is a bad
> idea.
>
> At the very least, it shouldn't present the ELF magic number IMNSHO.
I agree that there are some issues.
However we need the information that is contained in ELF headers or
a semantic equivalent so we might as well play with the possibility.
There are two practical issues for ELF and bootloaders.
The first is virtual vs. physical addresses. In a bzImage header all
we will present will be physical addresses, so that isn't an
issue.
The other issue is the format of the arguments that the
executable expects. There seems to be zero consensus on this, so
bootloaders simply can't agree, and any bootloader that is
prepared to deal with kernels from different locations is going
to have to cope.
So I figure we keep our current calling conventions and have a
note saying that we are linux so the format can be auto-detected.
There are of course plenty of bootloaders that load whatever happens
to be their OS kernel however they managed to get ld to spit it out,
and there are some really weird things going on there. But that doesn't
matter because those bootloaders can make no pretense at being general
purpose.
There is a lot of future flexibility that comes from this in addition
to making x86 closer to the other architectures.
I do agree we need to tread carefully, but I have yet to hear about
any show-stopper bugs, and it works well enough that at least one major
distro has shipped a linux kernel bzImage with an ELF header.
So we won't do this casually, and if there are real problems we will
remove the ELF magic number.
Eric
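
As a rough illustration of auto-detection via a note, here is a sketch
that walks the contents of an ELF note segment looking for a note with a
given name. The function name find_note is hypothetical, and the note
name a kernel would actually carry is left to the caller; only the
Elf32_Nhdr layout and the 4-byte padding of name and descriptor come
from the ELF specification.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Round up to the 4-byte alignment used inside an ELF note segment. */
static size_t note_align(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Scan the raw bytes of a PT_NOTE segment for the first note whose name
 * (including its terminating NUL) matches.  Returns the byte offset of
 * the note header within seg, or -1 if no such note exists. */
static long find_note(const unsigned char *seg, size_t len, const char *name)
{
	size_t off = 0, want = strlen(name) + 1;

	while (off <= len && len - off >= sizeof(Elf32_Nhdr)) {
		Elf32_Nhdr nh;
		size_t rest, nsz, dsz;

		memcpy(&nh, seg + off, sizeof(nh));
		rest = len - off - sizeof(nh);

		/* Reject notes that claim more bytes than remain. */
		if (nh.n_namesz > rest || nh.n_descsz > rest)
			break;
		nsz = note_align(nh.n_namesz);
		dsz = note_align(nh.n_descsz);
		if (nsz > rest || dsz > rest - nsz)
			break;

		if (nh.n_namesz == want &&
		    memcmp(seg + off + sizeof(nh), name, want) == 0)
			return (long)off;

		off += sizeof(nh) + nsz + dsz;
	}
	return -1;
}
```

A loader could call this once per PT_NOTE segment and fall back to its
existing behaviour when the function returns -1.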
Eric W. Biederman wrote:
>
> So we won't do this casually and if it there are real problems we will
> remove the ELF magic number.
>
I think we can use an ELF-compatible format just fine, but it would make
more sense to use a non-ELF magic number from the start, instead of
signalling it with a note. Since bootloaders need to be aware anyway,
they can either detect this magic and treat it as a Linux calling
convention ELF image, or not detect it and treat it as a
bzImage. As a side benefit, we:
a) can use a magic number that contains a jump instruction (to keep the
non-bootsector happy);
b) get a proper Linux kernel magic number.
-hpa
"H. Peter Anvin" <[email protected]> writes:
> Eric W. Biederman wrote:
>>
>> So we won't do this casually and if it there are real problems we will
>> remove the ELF magic number.
>>
>
> I think we can use ELF-compatible format just fine, but it would make
> more sense to use a non-ELF magic number from the start, instead of
> signalling it with a note. Since bootloaders need to be aware, anyway,
> they can just detect this magic and treat is as an Linux calling
> convention ELF image, or they can not detect it, and treat it as a
> bzImage. As a side benefit, we:
>
> a) can use a magic number that contains a jump instruction (to keep the
> non-bootsector happy);
> b) get a proper Linux kernel magic number.
To the best of my knowledge I have already resolved both of those concerns,
in my current code.
Eric
H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
>
>> Hm, that's unfortunate. How about an ELF file wrapped in some other
>> container, so that we can easily extract a properly formed ELF file?
>>
>>
>
> Effectively the same thing as changing the magic number. Note that the
> format for bzImage is pretty rigid, and it would be *highly* undesirable
> to muck that up.
So the bzImage structure is currently:
1. old-style boot sector
2. old-style boot info, followed by 0xaa55 at the end of the sector
3. the HdrS boot param block
4. setup.S boot code
5. the self-decompressing kernel
If we make 5 actually an ELF file, containing properly formed Ehdr,
Phdrs (for all the mappings required), and the actual kernel
decompressor, relocator and compressed kernel data, then it would be
easy for the Xen domain builder to find that and use it as a basis for
loading. I think it would just require the bzImage boot param block to
contain an offset of the start of the ELF file. The contents of the ELF
file would be in a form where the normal boot code could just jump over
the ELF headers, directly into the segment data itself.
ie:
1. old-style boot sector
2. old-style boot info, followed by 0xaa55 at the end of the sector
3. the HdrS boot param block
4. setup.S boot code (jumps directly into 5.3)
5. 32-bit self-decompressing kernel:
    1. Ehdr
    2. Phdrs for all necessary mappings
    3. decompressor/relocator .text
    4. compressed kernel data
Does that sound reasonable?
J
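
To make the proposal concrete, here is a sketch of what a domain builder
might do once it knows where the embedded ELF file starts. How that
offset is communicated (for instance a new field in the boot param
block) is exactly the open question, so here it is just a parameter
named elf_off; check_embedded_elf is a hypothetical name. The sketch
assumes a 32-bit ELF and treats file offsets as relative to the Ehdr.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Check that a plausible 32-bit ELF header starts at elf_off inside the
 * bzImage buffer and that its program header table fits in what follows.
 * Returns 0 on success, -1 otherwise. */
static int check_embedded_elf(const unsigned char *bz, size_t len,
			      size_t elf_off)
{
	Elf32_Ehdr eh;
	size_t elf_len;

	if (elf_off > len || len - elf_off < sizeof(eh))
		return -1;
	elf_len = len - elf_off;
	memcpy(&eh, bz + elf_off, sizeof(eh));

	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS32)
		return -1;

	/* Offsets are taken relative to the Ehdr, not the bzImage, so the
	 * Phdr table has to fit inside the embedded file. */
	if (eh.e_phentsize != sizeof(Elf32_Phdr) ||
	    eh.e_phoff > elf_len ||
	    (size_t)eh.e_phnum * sizeof(Elf32_Phdr) > elf_len - eh.e_phoff)
		return -1;

	return 0;
}
```

If the check passes, bz + elf_off can be handed to an ordinary ELF
parser as if it were a standalone file.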
Jeremy Fitzhardinge wrote:
>
> So the bzImage structure is currently:
>
> 1. old-style boot sector
> 2. old-style boot info, followed by 0xaa55 at the end of the sector
> 3. the HdrS boot param block
> 4. setup.S boot code
> 5. the self-decompressing kernel
>
> If we make 5 actually an ELF file, containing properly formed Ehdr,
> Phdrs (for all the mappings required), and the actual kernel
> decompressor, relocator and compressed kernel data, then it would be
> easy for the Xen domain builder to find that and use it as a basis for
> loading. I think it would just require the bzImage boot param block to
> contain an offset of the start of the ELF file. The contents of the ELF
> file would be in a form where the normal boot code could just jump over
> the ELF headers, directly into the segment data itself.
>
> ie:
>
> 1. old-style boot sector
> 2. old-style boot info, followed by 0xaa55 at the end of the sector
> 3. the HdrS boot param block
> 4. setup.S boot code (jumps directly into 5.3)
> 5. 32-bit self-decompressing kernel:
> 1. Ehdr
> 2. Phdrs for all necessary mappings
> 3. decompressor/relocator .text
> 4. compressed kernel data
>
> Does that sound reasonable?
>
I don't know if that would break any programs that are currently
bypassing the setup. The existing setup protocol definitely allows
invoking an entry point which isn't 0x100000 (rather, the 32-bit
entrypoint is defined by code32_start); I'm not sure how Eric's
relocatable kernel patches (2.05 protocol) affect that, mostly because I
haven't seen any boot loaders which actually use it so I can't comment
on what their code looks like.
-hpa
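
For readers unfamiliar with the fields being discussed, this sketch
reads the relevant parts of the setup header from a bzImage buffer. The
offsets (setup_sects at 0x1f1, the 0xaa55 boot flag at 0x1fe, "HdrS" at
0x202, the protocol version at 0x206, code32_start at 0x214) follow the
documented x86 boot protocol; struct bz_header and parse_bz_header are
hypothetical names invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field offsets as documented for the x86 boot protocol. */
#define OFF_SETUP_SECTS   0x1f1
#define OFF_BOOT_FLAG     0x1fe
#define OFF_HEADER        0x202
#define OFF_VERSION       0x206
#define OFF_CODE32_START  0x214

/* Hypothetical summary structure; not a kernel type. */
struct bz_header {
	unsigned int setup_sects;	/* number of setup sectors */
	uint16_t version;		/* e.g. 0x0205 for protocol 2.05 */
	uint32_t code32_start;		/* 32-bit entry point */
	size_t pm_offset;		/* file offset of protected-mode code */
};

static uint16_t get_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *h if img looks like a bzImage, -1 otherwise. */
static int parse_bz_header(const unsigned char *img, size_t len,
			   struct bz_header *h)
{
	if (len < OFF_CODE32_START + 4)
		return -1;
	if (get_le16(img + OFF_BOOT_FLAG) != 0xaa55)
		return -1;
	if (memcmp(img + OFF_HEADER, "HdrS", 4) != 0)
		return -1;

	h->setup_sects = img[OFF_SETUP_SECTS];
	if (h->setup_sects == 0)
		h->setup_sects = 4;	/* 0 means the default of 4 */
	h->version = get_le16(img + OFF_VERSION);
	h->code32_start = get_le32(img + OFF_CODE32_START);
	/* One boot sector plus setup_sects setup sectors, 512 bytes each. */
	h->pm_offset = ((size_t)h->setup_sects + 1) * 512;
	return 0;
}
```

A loader that bypasses the setup code would load the bytes starting at
pm_offset and jump to code32_start.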
H. Peter Anvin wrote:
> I don't know if that would break any programs that are currently
> bypassing the setup. The existing setup protocol definitely allows
> invoking an entry point which isn't 0x100000 (rather, the 32-bit
> entrypoint is defined by code32_start); I'm not sure how Eric's
> relocatable kernel patches (2.05 protocol) affect that, mostly because I
> haven't seen any boot loaders which actually use it so I can't comment
> on what their code looks like.
Yes, I'd expect that code32_start would point into the ELF text
segment. You could align things so that the entrypoint is still
actually 0x100000, or bump it up a bit to fit the ELF headers. I have
to admit I don't quite understand how all that fits together at the moment.
J
On Wed, 2007-05-02 at 14:09 -0700, H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
> >
> > Hm, that's unfortunate. How about an ELF file wrapped in some other
> > container, so that we can easily extract a properly formed ELF file?
> >
>
> Effectively the same thing as changing the magic number. Note that the
> format for bzImage is pretty rigid, and it would be *highly* undesirable
> to muck that up.
To add some code to the debate, here's how lguest loads a bzImage (from
my draft documentation). Almost anything would be an improvement:
/* A bzImage, unlike an ELF file, is not meant to be loaded. You're
 * supposed to jump into it and it will unpack itself. We can't do that
 * because the Guest can't run the unpacking code, and adding features to
 * lguest kills puppies, so we don't want to.
 *
 * The bzImage is formed by putting the decompressing code in front of the
 * compressed kernel code. So we can simply scan through it looking for the
 * first "gzip" header, and start decompressing from there. */
static unsigned long load_bzimage(int fd, unsigned long *page_offset)
{
	unsigned char c;
	int state = 0;

	/* GZIP header is 0x1F 0x8B <method> <flags>... <compressed-by>. */
	while (read(fd, &c, 1) == 1) {
		switch (state) {
		case 0:
			if (c == 0x1F)
				state++;
			break;
		case 1:
			if (c == 0x8B)
				state++;
			else
				state = 0;
			break;
		case 2 ... 8:
			state++;
			break;
		case 9:
			/* Seek back to the start of the gzip header. */
			lseek(fd, -10, SEEK_CUR);
			/* One final check: "compressed under UNIX". */
			if (c != 0x03)
				state = -1;
			else
				return unpack_bzimage(fd, page_offset);
		}
	}
	errx(1, "Could not find kernel in bzImage");
}
/* Unfortunately the entire ELF image isn't compressed: the segments
 * which need loading are extracted and compressed raw. This denies us the
 * information we need to make a fully-general loader. */
static unsigned long unpack_bzimage(int fd, unsigned long *page_offset)
{
	gzFile f;
	int ret, len = 0;
	/* A bzImage always gets loaded at physical address 1M. This is
	 * actually configurable as CONFIG_PHYSICAL_START, but as the comment
	 * there says, "Don't change this unless you know what you are doing".
	 * Indeed. */
	void *img = (void *)0x100000;

	/* gzdopen takes our file descriptor (carefully placed at the start of
	 * the GZIP header we found) and returns a gzFile. */
	f = gzdopen(fd, "rb");
	/* Unfortunately, if we made a mistake and it wasn't really a gzip
	 * header, it will still read the file, but directly without
	 * decompressing it. For us, that's a misfeature. */
	if (gzdirect(f))
		errx(1, "did not find correct gzip header");

	/* We read it into memory in 64k chunks until we hit the end. */
	while ((ret = gzread(f, img + len, 65536)) > 0)
		len += ret;
	if (ret < 0)
		err(1, "reading image from bzImage");

	verbose("Unpacked size %i addr %p\n", len, img);

	/* Without the ELF header, we can't tell virtual-physical gap. This is
	 * CONFIG_PAGE_OFFSET, and people do actually change it. Fortunately,
	 * I have a clever way of figuring it out from the code itself. */
	*page_offset = intuit_page_offset(img, len);

	/* Entry is physical address: convert to virtual */
	return (unsigned long)img + *page_offset;
}
/* Prepare to be SHOCKED and AMAZED. And possibly a trifle nauseated.
 *
 * We know that CONFIG_PAGE_OFFSET sets what virtual address the kernel expects
 * to be. We don't know what that option was, but we can figure it out
 * approximately by looking at the addresses in the code. I chose the common
 * case of reading a memory location into the %eax register:
 *
 *	movl <some-address>, %eax
 *
 * This gets encoded as five bytes: "0xA1 <4-byte-address>". For example,
 * "0xA1 0x18 0x60 0x47 0xC0" reads the address 0xC0476018 into %eax.
 *
 * In this example we can guess that the kernel was compiled with
 * CONFIG_PAGE_OFFSET set to 0xC0000000 (it's always a round number). If the
 * kernel were larger than 16MB, we might see 0xC1 addresses show up, but our
 * kernel isn't that bloated yet.
 *
 * Unfortunately, x86 has variable-length instructions, so finding this
 * particular instruction properly involves writing a disassembler. Instead,
 * we rely on statistics. We look for "0xA1" and tally the different bytes
 * which occur 4 bytes later (the "0xC0" in our example above). When one of
 * those bytes appears four times, we can be reasonably confident that it
 * forms the start of CONFIG_PAGE_OFFSET.
 *
 * This is amazingly reliable. */
static unsigned long intuit_page_offset(unsigned char *img, unsigned long len)
{
	unsigned int i, possibilities[256] = { 0 };

	for (i = 0; i + 4 < len; i++) {
		/* mov 0xXXXXXXXX,%eax */
		if (img[i] == 0xA1 && ++possibilities[img[i+4]] > 3)
			return (unsigned long)img[i+4] << 24;
	}
	errx(1, "could not determine page offset");
}
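
The statistical trick above is easy to try outside lguest. The following
is a standalone variant of the same heuristic, written for this note
rather than taken from lguest: the name guess_page_offset is
hypothetical, and it returns 0 instead of calling errx() so that it can
be exercised on a synthetic buffer.

```c
#include <stddef.h>

/* Tally the byte found 4 bytes after each 0xA1 opcode.  The first value
 * seen four times is taken as the top byte of the page offset.
 * Returns 0 if no value reaches that count. */
static unsigned long guess_page_offset(const unsigned char *img, size_t len)
{
	unsigned int counts[256] = { 0 };
	size_t i;

	for (i = 0; i + 4 < len; i++) {
		/* mov 0xXXXXXXXX,%eax is encoded as 0xA1 plus 4 address bytes */
		if (img[i] == 0xA1 && ++counts[img[i + 4]] > 3)
			return (unsigned long)img[i + 4] << 24;
	}
	return 0;
}
```

Feeding it a buffer holding four copies of the example encoding
"0xA1 0x18 0x60 0x47 0xC0" yields 0xC0000000, while three copies are not
enough to reach the threshold.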
On Wed, May 02, 2007 at 02:59:11PM -0700, H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
> >
> > So the bzImage structure is currently:
> >
> > 1. old-style boot sector
> > 2. old-style boot info, followed by 0xaa55 at the end of the sector
> > 3. the HdrS boot param block
> > 4. setup.S boot code
> > 5. the self-decompressing kernel
> >
> > If we make 5 actually an ELF file, containing properly formed Ehdr,
> > Phdrs (for all the mappings required), and the actual kernel
> > decompressor, relocator and compressed kernel data, then it would be
> > easy for the Xen domain builder to find that and use it as a basis for
> > loading. I think it would just require the bzImage boot param block to
> > contain an offset of the start of the ELF file. The contents of the ELF
> > file would be in a form where the normal boot code could just jump over
> > the ELF headers, directly into the segment data itself.
> >
> > ie:
> >
> > 1. old-style boot sector
> > 2. old-style boot info, followed by 0xaa55 at the end of the sector
> > 3. the HdrS boot param block
> > 4. setup.S boot code (jumps directly into 5.3)
> > 5. 32-bit self-decompressing kernel:
> > 1. Ehdr
> > 2. Phdrs for all necessary mappings
> > 3. decompressor/relocator .text
> > 4. compressed kernel data
> >
> > Does that sound reasonable?
> >
>
> I don't know if that would break any programs that are currently
> bypassing the setup.
I think the kexec bzImage loader will break. It bypasses the setup code and
jumps directly to the code that follows the setup sectors (the decompressor).
> The existing setup protocol definitely allows
> invoking an entry point which isn't 0x100000 (rather, the 32-bit
> entrypoint is defined by code32_start); I'm not sure how Eric's
> relocatable kernel patches (2.05 protocol) affect that, mostly because I
> haven't seen any boot loaders which actually use it so I can't comment
> on what their code looks like.
With the relocatable patches, if a boot loader decides to load the protected
mode component at a non-1MB address, then it has to modify code32_start to
reflect the new location of the protected mode code.
Thanks
Vivek
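
As a sketch of that boot-loader step: the function below patches
code32_start in an in-memory copy of the setup header when the
protected-mode code is loaded somewhere other than 1MB. The function
name set_code32_start is hypothetical. The offsets used (version at
0x206, code32_start at 0x214, kernel_alignment at 0x230 and
relocatable_kernel at 0x234, the last two from protocol 2.05) are as I
understand them from the boot protocol documentation; the policy of
refusing unaligned or non-relocatable loads is an assumption of this
example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t rd16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void wr32(unsigned char *p, uint32_t v)
{
	p[0] = v & 0xff;
	p[1] = (v >> 8) & 0xff;
	p[2] = (v >> 16) & 0xff;
	p[3] = (v >> 24) & 0xff;
}

/* Record addr as the 32-bit entry point.  Loading away from 1MB is only
 * accepted for a 2.05+ relocatable kernel at a suitably aligned address.
 * Returns 0 on success, -1 if the request is refused. */
static int set_code32_start(unsigned char *hdr, size_t len, uint32_t addr)
{
	if (len < 0x235 || memcmp(hdr + 0x202, "HdrS", 4) != 0)
		return -1;

	if (addr != 0x100000) {
		uint32_t align = rd32(hdr + 0x230);	/* kernel_alignment */

		if (rd16(hdr + 0x206) < 0x0205)		/* protocol version */
			return -1;
		if (hdr[0x234] == 0)			/* relocatable_kernel */
			return -1;
		if (align == 0 || addr % align != 0)
			return -1;
	}

	wr32(hdr + 0x214, addr);			/* code32_start */
	return 0;
}
```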
Vivek Goyal writes:
> On Wed, May 02, 2007 at 02:59:11PM -0700, H. Peter Anvin wrote:
>> Jeremy Fitzhardinge wrote:
>> >
>> > So the bzImage structure is currently:
>> >
>> > 1. old-style boot sector
>> > 2. old-style boot info, followed by 0xaa55 at the end of the sector
>> > 3. the HdrS boot param block
>> > 4. setup.S boot code
>> > 5. the self-decompressing kernel
>> >
>> > If we make 5 actually an ELF file, containing properly formed Ehdr,
>> > Phdrs (for all the mappings required), and the actual kernel
>> > decompressor, relocator and compressed kernel data, then it would be
>> > easy for the Xen domain builder to find that and use it as a basis for
>> > loading. I think it would just require the bzImage boot param block to
>> > contain an offset of the start of the ELF file. The contents of the ELF
>> > file would be in a form where the normal boot code could just jump over
>> > the ELF headers, directly into the segment data itself.
>> >
>> > ie:
>> >
>> > 1. old-style boot sector
>> > 2. old-style boot info, followed by 0xaa55 at the end of the sector
>> > 3. the HdrS boot param block
>> > 4. setup.S boot code (jumps directly into 5.3)
>> > 5. 32-bit self-decompressing kernel:
>> > 1. Ehdr
>> > 2. Phdrs for all necessary mappings
>> > 3. decompressor/relocator .text
>> > 4. compressed kernel data
>> >
>> > Does that sound reasonable?
>> >
>>
>> I don't know if that would break any programs that are currently
>> bypassing the setup.
I think everything will break, unless we make 5.1 and 5.2
into 4.2 and 4.3 in the above design.
> I think kexec bzImage loader will break. It bypasses the setup code and
> directly jumps to the code present after setup sectors(decompressor).
Quite likely. The boot sector, except for a handful of bytes, actually
goes unused, so we can put extra header information there. I actually
have patches for placing an ELF header there.
If we wanted to do an ELF header in the middle we would have to put
it at the end of the setup sectors rather than the beginning of the
raw protected mode kernel image.
>> The existing setup protocol definitely allows
>> invoking an entry point which isn't 0x100000 (rather, the 32-bit
>> entrypoint is defined by code32_start); I'm not sure how Eric's
>> relocatable kernel patches (2.05 protocol) affect that, mostly because I
>> haven't seen any boot loaders which actually use it so I can't comment
>> on what their code looks like.
>
> With relocatable patches, if a boot loader decides to load protected mode
> component at non-1MB address, then it shall have to modify code32_start to
> reflect the new location of protected mode code.
Yes. And this aspect of the relocatable kernel is all Vivek.
Eric
Eric W. Biederman wrote:
> Vivek Goyal writes:
>
>
>> On Wed, May 02, 2007 at 02:59:11PM -0700, H. Peter Anvin wrote:
>>
>>> Jeremy Fitzhardinge wrote:
>>>
>>>> So the bzImage structure is currently:
>>>>
>>>> 1. old-style boot sector
>>>> 2. old-style boot info, followed by 0xaa55 at the end of the sector
>>>> 3. the HdrS boot param block
>>>> 4. setup.S boot code
>>>> 5. the self-decompressing kernel
>>>>
>>>> If we make 5 actually an ELF file, containing properly formed Ehdr,
>>>> Phdrs (for all the mappings required), and the actual kernel
>>>> decompressor, relocator and compressed kernel data, then it would be
>>>> easy for the Xen domain builder to find that and use it as a basis for
>>>> loading. I think it would just require the bzImage boot param block to
>>>> contain an offset of the start of the ELF file. The contents of the ELF
>>>> file would be in a form where the normal boot code could just jump over
>>>> the ELF headers, directly into the segment data itself.
>>>>
>>>> ie:
>>>>
>>>> 1. old-style boot sector
>>>> 2. old-style boot info, followed by 0xaa55 at the end of the sector
>>>> 3. the HdrS boot param block
>>>> 4. setup.S boot code (jumps directly into 5.3)
>>>> 5. 32-bit self-decompressing kernel:
>>>> 1. Ehdr
>>>> 2. Phdrs for all necessary mappings
>>>> 3. decompressor/relocator .text
>>>> 4. compressed kernel data
>>>>
>>>> Does that sound reasonable?
>>>>
>>>>
>>> I don't know if that would break any programs that are currently
>>> bypassing the setup.
>>>
>
> I think everything will break, unless we make 5.1 and 5.2
> into 4.2 and 4.3. In the above design.
>
>
>> I think kexec bzImage loader will break. It bypasses the setup code and
>> directly jumps to the code present after setup sectors(decompressor).
>>
>
> Quite likely. The boot sector except for a handful of bytes actually
> goes unused so we can put extra header information there, I actually
> have patches for placing an ELF header there.
OK, whatever you think will work. But I do think it should be a proper
ELF file with a correct magic number, so that you can just point an ELF
file parser at it and have it work (which means, of course, that all the
file offsets are offsets from the start of the Ehdr, rather than from
the start of the bzImage).
You haven't specifically commented on using the Phdrs as a way of
specifying the mappings required for decompression and early kernel
execution. It seems pretty natural to me, but I guess that raises the
general question of what execution environment the kernel can expect to
find itself in, and which modes of booting will actually enable paging
and establish any kinds of mapping at all.
In the Xen case, it's obviously the domain builder who creates the
mappings, and we can easily implement p != v mappings. But when booting
native, presumably paging is off at this stage, and only identity maps
can be implemented. I guess the rough rule is that if paging is enabled
on entry, the kernel should expect all the bzImage mappings to be in
place, but if paging is off, well, the question is moot.
J
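
One way to picture "Phdrs as the list of required mappings" is a loader
that walks the program headers and asks whether an identity map would
satisfy them. This is only a sketch under that reading: the function
name phdrs_identity_only is hypothetical, and it looks at PT_LOAD
entries of a 32-bit ELF only.

```c
#include <elf.h>
#include <stddef.h>

/* Returns 1 if every PT_LOAD segment has p_vaddr == p_paddr (so an
 * identity map, or no paging at all, is enough), 0 if at least one
 * segment needs a p != v mapping, and -1 if nothing is loadable. */
static int phdrs_identity_only(const Elf32_Phdr *ph, size_t n)
{
	size_t i;
	int seen_load = 0;

	for (i = 0; i < n; i++) {
		if (ph[i].p_type != PT_LOAD)
			continue;
		seen_load = 1;
		if (ph[i].p_vaddr != ph[i].p_paddr)
			return 0;
	}
	return seen_load ? 1 : -1;
}
```

A Xen-style builder that can create arbitrary mappings would not need
this test, whereas a native boot path with paging off could only honour
images for which it returns 1.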
Jeremy Fitzhardinge writes:
> OK, whatever you think will work. But I do think it should be a proper
> ELF file with a correct magic number, so that you can just point an ELF
> file parser at it and have it work (which means, of course, that all the
> file offsets are offsets from the start of the Ehdr, rather than from
> the start of the bzImage).
Yes. I guess in this context, I am generally for building the ELF
headers by hand instead of with a linker script, because then we
know exactly what is happening and can ensure everything is just so.
> You haven't specifically commented on using the Phdrs as a way of
> specifying the mappings required for decompression and early kernel
> execution. It seems pretty natural to me, but I guess that raises the
> general question of what execution environment the kernel can expect to
> find itself in, and which modes of booting will actually enable paging
> and establish any kinds of mapping at all.
Sorry for not being clear. I have been expecting to do this for years;
it is one of the reasons I keep coming back to putting an ELF header
on the bzImage. arch/x86_64/kernel already does this to some extent,
as it has to set up some identity page mappings for itself in the
case where it has to do the switch from real to protected mode itself.
> In the Xen case, its obviously the domain builder who creates the
> mappings, and we can easily implement p != v mappings. But when booting
> native, presumably paging is off at this stage, and only identity maps
> can be implemented. I guess the rough rule is that if paging is enabled
> on entry, the kernel should expect all the bzImage mappings to be in
> place, but if paging is off, well, the question is moot.
Right. Except that there is a bit of a catch-22 in the
para-virtualized environments of setting up the page tables, and I'm not
at all certain what the gain of setting up p != v mappings is.
Having just written some C code that runs fairly successfully in p !=
v on arch/i386, I'm not too concerned. arch/x86_64 ought to work with
a similar level of effort, although the expectations there are a little
different. So while I don't necessarily consider running in p != v
when compiled to run at v to be general, it should work for the cases we
are interested in. Setting up the page tables for arch/x86_64 will
be more interesting.
Part of what I find compelling about this is that our initial page tables
for linux have always had more going on than the virtual addresses
just being at a constant offset from the physical addresses, so
the actions of the current domain builders have me concerned that they
may be violating some early linux booting assumptions and are
currently just getting lucky. Moving the page table setup code into
the kernel removes that dependency from the domain builders.
Eric
Eric W. Biederman wrote:
> Yes. I guess in this context, I am generally for building the ELF
> headers by hand instead of with a linker script, because then we
> know exactly what is happening and can ensure everything is just so.
>
Yes, it seems easiest - particularly given how flaky binutils can get
when you really try to control its ELF generation.
> Sorry, for not being clear I have been expecting to do this for years,
> it is one of the reasons I keep coming back to putting an ELF header
> on the bzImage.
>
OK. It seems obvious, but I just wanted to make sure ;)
>> In the Xen case, its obviously the domain builder who creates the
>> mappings, and we can easily implement p != v mappings. But when booting
>> native, presumably paging is off at this stage, and only identity maps
>> can be implemented. I guess the rough rule is that if paging is enabled
>> on entry, the kernel should expect all the bzImage mappings to be in
>> place, but if paging is off, well, the question is moot.
>>
>
> Right. Except that there is a bit of a catch 22 in the
> para-virtualized environments of setting up the page tables, I'm not
> at all certain what the gain of setting up p != v mappings are.
>
Well, that's more or less it. If the decompressor ends up jumping to
startup_32, and that immediately goes into xen_start_kernel(), then
we're still running on the initial bzImage p=v pagetables. At the
moment, the domain builder maps the kernel's vmlinux to the vaddrs
in its Phdrs, so there's no need to do any more boot-time pagetable
manipulation. If we come out of bzImage with only identity mappings,
then obviously the Xen case will need to do the same pagetable setup as
the native case - which is good so long as we can work out how to share
the code to do so.
For i386, it looks like this will be tricky because at this point:
* we're not running at the linked address, so C code will be tricky
and non-standard
* we need to deal with multiple hypervisors and their constraints on
what can be in a pagetable
* we could be running with no paging, or paging in either non-PAE or
PAE modes
Writing some code which can deal with all of those at once will be an
interesting exercise.
> Part of what I find compelling about this is our initial page tables
> for linux have always had more going on than the virtual addresses
> just being at a constant offset from of the physical addresses, so
> the actions of the current domain builders have me concerned that they
> may be violating some early linux booting assumptions and are
> currently just getting lucky. Moving the page table setup code into
> the kernel removes that dependency from the domain builders.
The nice thing about having the domain builder create the pagetables is
that it turns it from a tricky bootstrap problem into a relatively easy
job. The main thing is that the domain builder can create a scaffolding
pagetable which is enough to get everything started. Once you have that
in place, it's pretty easy to update it to set precisely the right bits
in the ptes, etc.
It also means that the path for Xen vs native will be more similar,
because the bzImage code won't need to deal with pagetable setup at all:
for native it won't matter, and for Xen it has already been done. It
only matters once we hit the 32-bit kernel-proper code, and we diverge
at that point anyway.
J
On 5/2/07, Eric W. Biederman <[email protected]> wrote:
> Vivek Goyal <[email protected]> writes:
>
> > On Wed, May 02, 2007 at 02:59:11PM -0700, H. Peter Anvin wrote:
> >> Jeremy Fitzhardinge wrote:
> >> >
> >> > So the bzImage structure is currently:
> >> >
> >> > 1. old-style boot sector
> >> > 2. old-style boot info, followed by 0xaa55 at the end of the sector
> >> > 3. the HdrS boot param block
> >> > 4. setup.S boot code
> >> > 5. the self-decompressing kernel
> >> >
Eric,
With the latest changes that make vmlinux elf64 and make bzImage
switch to 64-bit long mode, the kernel started via kexec cannot get
a VGA console, but the serial console works well. I wonder if
setup.S is skipped in bzImage via the kexec path.
Or did I miss something?
#!/bin/bash
./kexec -t bzImage -l bzImage_2.6.22_k8.1 --command-line="apic=debug
acpi_dbg_level=0x00000007 pci=routeirq snd-hda-intel.enable_msi=1
ramdisk_size=65536 root=/dev/ram0 rw ip=dhcp console=tty0
console=ttyS0,9600n8" --ramdisk=mydisk8_x86_64.gz
YH
yhlu <[email protected]> writes:
> Eric,
>
> With the latest changes that make vmlinux elf64 and make bzImage
> switch to 64-bit long mode, the kernel started via kexec cannot get
> a VGA console, but the serial console works well. I wonder if
> setup.S is skipped in bzImage via the kexec path.
Yes. setup.S has always been skipped by bzImage via the kexec path
unless you explicitly tell /sbin/kexec to use the 16bit entry point.
Is not having a VGA console a new thing, or is it something you just noticed?
Eric
On Tue, May 08, 2007 at 09:41:09AM -0700, yhlu wrote:
> On 5/2/07, Eric W. Biederman <[email protected]> wrote:
> >Vivek Goyal <[email protected]> writes:
> >
> >> On Wed, May 02, 2007 at 02:59:11PM -0700, H. Peter Anvin wrote:
> >>> Jeremy Fitzhardinge wrote:
> >>> >
> >>> > So the bzImage structure is currently:
> >>> >
> >>> > 1. old-style boot sector
> >>> > 2. old-style boot info, followed by 0xaa55 at the end of the sector
> >>> > 3. the HdrS boot param block
> >>> > 4. setup.S boot code
> >>> > 5. the self-decompressing kernel
> >>> >
>
> Eric,
>
> With the latest changes that make vmlinux elf64 and make bzImage
> switch to 64-bit long mode, the kernel started via kexec cannot get
> a VGA console, but the serial console works well. I wonder if
> setup.S is skipped in bzImage via the kexec path.
>
> Or did I miss something?
>
Hi,
setup.S is never executed while doing kexec (unless somebody chooses to
do a real mode entry) and these patches don't change this behaviour.
Tomorrow I will test VGA behaviour on my machine. Are you using some
special frame buffer mode etc?
Thanks
Vivek
On 5/8/07, Eric W. Biederman <[email protected]> wrote:
> Yes. setup.S has always been skipped by bzImage via the kexec path
> unless you explicitly tell /sbin/kexec to use the 16bit entry point.
>
> Is not having a VGA console a new thing, or is it something you just noticed?
>
> Eric
>
Before the changes, it worked well.
With --real-mode, the machine resets.
With --reset-vga, I get
Kernel alive
kernel direct mapping tables up to 100000000 @ 8000-d000
on the VGA monitor.
YH
On 5/8/07, Vivek Goyal <[email protected]> wrote:
> setup.S is never executed while doing kexec (unless somebody chooses to
> do a real mode entry) and these patches don't change this behaviour.
>
> Tomorrow I will test VGA behaviour on my machine. Are you using some
> special frame buffer mode etc?
>
I disabled the FB in the kernel.
YH
Eric,
I tried to load vmlinux with kexec and got
Ramdisks not supported with generic elf arguments
So I used mkelfImage with my patch (converting elf64 to elf32) to make
another elf32 image, loaded it with kexec, and could not get a VGA
console either. The serial console works well.
The mkelfImage 2.7 patch is at
http://72.14.253.104/search?q=cache:fuxOvFw3ZIIJ:lists.osdl.org/pipermail/fastboot/attachments/20061108/009064a6/attachment.obj+mkelfImage+2.7+patch&hl=en&ct=clnk&cd=4&gl=us
So the problem is not bzImage related, but somewhere in vmlinux.
YH
yhlu <[email protected]> writes:
> Eric,
>
> I tried to load vmlinux with kexec and got
> Ramdisks not supported with generic elf arguments
>
> So I used mkelfImage with my patch (converting elf64 to elf32) to make
> another elf32 image, loaded it with kexec, and could not get a VGA
> console either. The serial console works well.
>
> The mkelfImage 2.7 patch is at
> http://72.14.253.104/search?q=cache:fuxOvFw3ZIIJ:lists.osdl.org/pipermail/fastboot/attachments/20061108/009064a6/attachment.obj+mkelfImage+2.7+patch&hl=en&ct=clnk&cd=4&gl=us
>
> So the problem is not bzImage related, but somewhere in vmlinux.
Odd. Is it specifically these patches?
Or is it just the recent kernel from Linus?
You might try a git-bisect, or if it is just these patches
walking through them one-by-one.
Eric
On 5/8/07, Eric W. Biederman <[email protected]> wrote:
> You might try a git-bisect, or if it is just these patches
> walking through them one-by-one.
f82af20e1a028e16b9bb11da081fa1148d40fa6a is first bad commit
commit f82af20e1a028e16b9bb11da081fa1148d40fa6a
Author: Gerd Hoffmann <[email protected]>
Date: Wed May 2 19:27:19 2007 +0200
[PATCH] x86-64: ignore vgacon if hardware not present
Avoid trying to set up vgacon if there's no vga hardware present.
Signed-off-by: Jeremy Fitzhardinge <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
Cc: Alan <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
YH
yhlu wrote:
> On 5/8/07, Eric W. Biederman <[email protected]> wrote:
>> You might try a git-bisect, or if it is just these patches
>> walking through them one-by-one.
>
> f82af20e1a028e16b9bb11da081fa1148d40fa6a is first bad commit
> commit f82af20e1a028e16b9bb11da081fa1148d40fa6a
> Author: Gerd Hoffmann <[email protected]>
> Date: Wed May 2 19:27:19 2007 +0200
>
> [PATCH] x86-64: ignore vgacon if hardware not present
>
> Avoid trying to set up vgacon if there's no vga hardware present.
>
> Signed-off-by: Jeremy Fitzhardinge <[email protected]>
> Signed-off-by: Rusty Russell <[email protected]>
> Signed-off-by: Andi Kleen <[email protected]>
> Cc: Alan <[email protected]>
> Acked-by: Ingo Molnar <[email protected]>
Interesting. I haven't really been following this thread, but doesn't
it mean something isn't being initialized properly if this patch makes a
difference?
J
Jeremy Fitzhardinge wrote:
>
> Interesting. I haven't really been following this thread, but doesn't
> it mean something isn't being initialized properly if this patch makes a
> difference?
>
Specifically boot_params.screen_info isn't being properly set up by the
caller.
-hpa
On 5/8/07, H. Peter Anvin <[email protected]> wrote:
> Jeremy Fitzhardinge wrote:
> Specifically boot_params.screen_info isn't being properly set up by the
> caller.
will setup real_mode_data in kexec path?
YH
yhlu wrote:
> On 5/8/07, H. Peter Anvin <[email protected]> wrote:
>> Jeremy Fitzhardinge wrote:
>> Specifically boot_params.screen_info isn't being properly set up by the
>> caller.
>
> will setup real_mode_data in kexec path?
-ENOPARSE
-hpa
"H. Peter Anvin" <[email protected]> writes:
> yhlu wrote:
>> On 5/8/07, H. Peter Anvin <[email protected]> wrote:
>>> Jeremy Fitzhardinge wrote:
>>> Specifically boot_params.screen_info isn't being properly set up by the
>>> caller.
>>
>> will setup real_mode_data in kexec path?
>
> -ENOPARSE
I believe YH is asking how we setup real_mode_data in /sbin/kexec.
The setup is:
> real_mode->orig_x = 0;
> real_mode->orig_y = 0;
> real_mode->orig_video_page = 0;
> real_mode->orig_video_mode = 0;
> real_mode->orig_video_cols = 80;
> real_mode->orig_video_lines = 25;
> real_mode->orig_video_ega_bx = 0;
> real_mode->orig_video_isVGA = 1;
> real_mode->orig_video_points = 16;
Silly but generally safe.
More relevant, because the code is in the kernel, we have:
arch/arm/kernel/setup.c:
> struct screen_info screen_info = {
> .orig_video_lines = 30,
> .orig_video_cols = 80,
> .orig_video_mode = 0,
> .orig_video_ega_bx = 0,
> .orig_video_isVGA = 1,
> .orig_video_points = 8
> };
arch/alpha/kernel/sys_sio.c:
> /* The AlphaBook1 has LCD video fixed at 800x600,
> 37 rows and 100 cols. */
> screen_info.orig_y = 37;
> screen_info.orig_video_cols = 100;
> screen_info.orig_video_lines = 37;
I expect I can find a few more examples where we specify
video_cols and video_lines but we use video_mode == 0.
Going farther mode 0x00 is a BIOS 40x25 mode. So the patch below is
not always safe even if we boot the bzImage. It is just highly
unlikely anyone would start the kernel in 40x25 text mode.
Therefore I expect the test should test several additional
fields, in particular video lines and columns before we
decide that we have an uninitialized screen_info and give up.
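A minimal sketch of such a test follows. The struct is a hypothetical cut-down stand-in for struct screen_info (only the three fields being tested), and the function name is made up for illustration; it is not taken from any posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the kernel's struct screen_info;
 * only the fields this check looks at are kept. */
struct screen_info_subset {
	uint8_t orig_video_mode;
	uint8_t orig_video_cols;
	uint8_t orig_video_lines;
};

/* Treat screen_info as uninitialized only when mode, columns and lines
 * are all zero.  Mode 0 alone is not enough: /sbin/kexec and some
 * architectures pass mode 0 together with a real text geometry. */
static bool screen_info_uninitialized(const struct screen_info_subset *si)
{
	assert(si != NULL);
	return si->orig_video_mode == 0 &&
	       si->orig_video_cols == 0 &&
	       si->orig_video_lines == 0;
}
```

With the /sbin/kexec values quoted above (mode 0, 80 columns, 25 lines) this returns false, so a console would still be set up; only an all-zero structure is rejected.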
> commit f82af20e1a028e16b9bb11da081fa1148d40fa6a
> Author: Gerd Hoffmann <[email protected]>
> Date: Wed May 2 19:27:19 2007 +0200
>
> [PATCH] x86-64: ignore vgacon if hardware not present
>
> Avoid trying to set up vgacon if there's no vga hardware present.
>
> Signed-off-by: Jeremy Fitzhardinge <[email protected]>
> Signed-off-by: Rusty Russell <[email protected]>
> Signed-off-by: Andi Kleen <[email protected]>
> Cc: Alan <[email protected]>
> Acked-by: Ingo Molnar <[email protected]>
>
> diff --git a/drivers/video/console/vgacon.c b/drivers/video/console/vgacon.c
> index 91a2078..3e67c34 100644
> --- a/drivers/video/console/vgacon.c
> +++ b/drivers/video/console/vgacon.c
> @@ -371,7 +371,8 @@ static const char *vgacon_startup(void)
> }
>
> /* VGA16 modes are not handled by VGACON */
> - if ((ORIG_VIDEO_MODE == 0x0D) || /* 320x200/4 */
> + if ((ORIG_VIDEO_MODE == 0x00) || /* SCREEN_INFO not initialized */
> + (ORIG_VIDEO_MODE == 0x0D) || /* 320x200/4 */
> (ORIG_VIDEO_MODE == 0x0E) || /* 640x200/4 */
> (ORIG_VIDEO_MODE == 0x10) || /* 640x350/4 */
> (ORIG_VIDEO_MODE == 0x12) || /* 640x480/4 */
Eric
Eric W. Biederman wrote:
>
> I expect I can find a few more examples where we specify
> video_cols and video_lines but we use video_mode == 0.
>
> Going farther mode 0x00 is a BIOS 40x25 mode. So the patch below is
> not always safe even if we boot the bzImage. It is just highly
> unlikely anyone would start the kernel in 40x25 text mode.
>
Mode 0x00 is, at least theoretically, BIOS 40x25 *grayscale*; this mode
(and mode 0x02 which is the same thing in 80x25) were as far as I know
only ever used with composite monitors off CGA cards, i.e. functionally
never. Actual monochrome monitors used mode 0x07.
-hpa
On 5/8/07, Eric W. Biederman <[email protected]> wrote:
> "H. Peter Anvin" <[email protected]> writes:
> I believe YH is asking how we setup real_mode_data in /sbin/kexec.
pxelinux:
SCREEN_INFO.orig_video_mode = 3
SCREEN_INFO.orig_x = 0
SCREEN_INFO.orig_y = 24
x86_boot_params[] :
0000: 00 18 ff ff 08 00 03 50 8c c8 03 00 8e c0 19 01
0010: 10 00 7c fb fc be 31 00 ac 20 c0 74 09 b4 0e bb
0020: 07 00 cd 10 eb f2 31 c0 cd 16 cd 19 ea f0 ff 00
0030: f0 44 69 72 65 63 74 20 62 6f 6f 74 15 00 10 00
current kexec:
SCREEN_INFO.orig_video_mode = 0
SCREEN_INFO.orig_x = 0
SCREEN_INFO.orig_y = 3
x86_boot_params[] :
0000: 00 03 00 fc 00 00 00 50 8c c8 00 00 8e c0 19 01
0010: 10 00 7c fb fc be 31 00 ac 20 c0 74 09 b4 0e bb
0020: 3f a3 00 16 eb f2 31 c0 cd 16 cd 19 ea f0 ff 00
0030: f0 44 69 72 65 63 74 20 62 6f 6f 74 15 00 20 00
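For anyone reading the dumps by hand, the first bytes can be decoded as sketched below. This assumes the usual struct screen_info field offsets at the start of the boot parameter block (they agree with the orig_x, orig_y and orig_video_mode values printed above); the struct and function names are illustrative only, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of interest from the start of the x86 boot parameter block.
 * Illustrative names; the offsets are an assumption stated in the text. */
struct decoded_screen_info {
	uint8_t  orig_x;            /* offset 0x00 */
	uint8_t  orig_y;            /* offset 0x01 */
	uint8_t  orig_video_mode;   /* offset 0x06 */
	uint8_t  orig_video_cols;   /* offset 0x07 */
	uint8_t  orig_video_lines;  /* offset 0x0e */
	uint8_t  orig_video_isVGA;  /* offset 0x0f */
	uint16_t orig_video_points; /* offset 0x10, little-endian */
};

/* Decode the screen_info fields from a raw dump of at least 0x12 bytes. */
static struct decoded_screen_info decode_screen_info(const uint8_t *p, size_t len)
{
	struct decoded_screen_info si;

	assert(p != NULL && len >= 0x12);
	si.orig_x            = p[0x00];
	si.orig_y            = p[0x01];
	si.orig_video_mode   = p[0x06];
	si.orig_video_cols   = p[0x07];
	si.orig_video_lines  = p[0x0e];
	si.orig_video_isVGA  = p[0x0f];
	si.orig_video_points = (uint16_t)(p[0x10] | (p[0x11] << 8));
	return si;
}
```

Applied to the two dumps, the only decoded video fields that differ are orig_y and orig_video_mode: both report 80 columns, 25 lines, isVGA set and 16-point characters, but kexec leaves the mode at 0 where pxelinux reports 3.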
YH
"H. Peter Anvin" <[email protected]> writes:
> Eric W. Biederman wrote:
>>
>> I expect I can find a few more examples where we specify
>> video_cols and video_lines but we use video_mode == 0.
>>
>> Going farther mode 0x00 is a BIOS 40x25 mode. So the patch below is
>> not always safe even if we boot the bzImage. It is just highly
>> unlikely anyone would start the kernel in 40x25 text mode.
>>
>
> Mode 0x00 is, at least theoretically, BIOS 40x25 *grayscale*; this mode
> (and mode 0x02 which is the same thing in 80x25) were as far as I know
> only ever used with composite monitors off CGA cards, i.e. functionally
> never. Actual monochrome monitors used mode 0x07.
I agree. We are not at all likely to see it in practice, even
if my memory is correct and VGA cards and non-monochrome CGA
cards supported that mode.
That doesn't mean checking for 0x00 is sufficient to detect
an uninitialized struct screen_info, or a lack of a video screen.
We have in-kernel historical precedent for using 0x00 as just meaning
a text mode. I'm fairly certain that if I looked more closely I could
find this convention of using 0x00 to mean a text mode on ia64, mips,
and ppc, in addition to the instances I found on alpha and arm.
Since the whole point is to detect the case where we don't have
a screen at all it makes sense to check several additional variables
and make certain that they are all 0. Agreed?
Eric
On Tue, May 08, 2007 at 11:51:35AM -0700, yhlu wrote:
> Eric,
>
> i tried to load vmlinux with kexec and got
> Ramdisks not supported with generic elf arguments
>
This message generally appears if you did not specify --args-linux
on kexec command line while loading vmlinux.
Thanks
Vivek
On 5/8/07, Vivek Goyal <[email protected]> wrote:
> On Tue, May 08, 2007 at 11:51:35AM -0700, yhlu wrote:
> This message generally appears if you did not specify --args-linux
> on kexec command line while loading vmlinux.
>
Besides elf-x86_64, do I still need --args-linux to pass something? And how
do I get it to load the ramdisk?
YH
On 5/8/07, Eric W. Biederman <[email protected]> wrote:
> Since the whole point is to detect the case where we don't have
> a screen at all it makes sense to check several additional variables
> and make certain that they are all 0. Agreed?
We need a good way to find out whether a VGA console is supported.
YH
yhlu <[email protected]> writes:
> On 5/8/07, Vivek Goyal <[email protected]> wrote:
>> On Tue, May 08, 2007 at 11:51:35AM -0700, yhlu wrote:
>> This message generally appears if you did not specify --args-linux
>> on kexec command line while loading vmlinux.
>>
> Besides elf-x86_64, do I still need --args-linux to pass something? And how
> do I get it to load the ramdisk?
Same arguments just use --args-linux.
Basically the calling convention needs to be specified because
there isn't a universal one, and /sbin/kexec can't yet detect
vmlinux is linux.
Eric
yhlu wrote:
> On 5/8/07, Eric W. Biederman <[email protected]> wrote:
>> Since the whole point is to detect the case where we don't have
>> a screen at all it makes sense to check several additional variables
>> and make certain that they are all 0. Agreed?
>
> We need a good way to find out whether a VGA console is supported.
There really isn't one, at least not given the current data structure;
the data structure has an "isVGA" flag, but if that is 0 it's supposed
to mean CGA/MDA/HGC/EGA, as opposed to VGA...
-hpa
H. Peter Anvin wrote:
> yhlu wrote:
>> On 5/8/07, Eric W. Biederman <[email protected]> wrote:
>>> Since the whole point is to detect the case where we don't have
>>> a screen at all it makes sense to check several additional variables
>>> and make certain that they are all 0. Agreed?
>> We need a good way to find out whether a VGA console is supported.
>
> There really isn't one, at least not given the current data structure;
> the data structure has an "isVGA" flag, but if that is 0 it's supposed
> to mean CGA/MDA/HGC/EGA, as opposed to VGA...
Of course, one could argue that since all of those were obsolete by the
time Linux was first created, that it probably doesn't matter and that
isVGA == 0 pretty much means the more obvious thing.
MDA/HGC stuck around for quite a while for debugging, since it was
non-conflicting with VGA, but even if it is, the reason people put it in
their system is to have something that the OS doesn't readily see.
-hpa
On 5/8/07, H. Peter Anvin <[email protected]> wrote:
> H. Peter Anvin wrote:
> Of course, one could argue that since all of those were obsolete by the
> time Linux was first created, that it probably doesn't matter and that
> isVGA == 0 pretty much means the more obvious thing.
>
> MDA/HGC stuck around for quite a while for debugging, since it was
> non-conflicting with VGA, but even if it is, the reason people put it in
> their system is to have something that the OS doesn't readily see.
>
So the kexec tools need to scan the PCI device list to find out how
to set real_mode.isVGA and orig_video_mode, and also need to parse the
command line for VGA console settings.
YH
yhlu wrote:
> On 5/8/07, H. Peter Anvin <[email protected]> wrote:
>> H. Peter Anvin wrote:
>> Of course, one could argue that since all of those were obsolete by the
>> time Linux was first created, that it probably doesn't matter and that
>> isVGA == 0 pretty much means the more obvious thing.
>>
>> MDA/HGC stuck around for quite a while for debugging, since it was
>> non-conflicting with VGA, but even if it is, the reason people put it in
>> their system is to have something that the OS doesn't readily see.
>>
> So the kexec tools need to scan the PCI device list to find out how
> to set real_mode.isVGA and orig_video_mode, and also need to parse the
> command line for VGA console settings.
A better way, probably, would be for the kernel to export the
boot_params structure so kexec can reuse it.
-hpa
yhlu wrote:
> So the kexec tools need to scan the PCI device list to find out how
> to set real_mode.isVGA and orig_video_mode, and also need to parse the
> command line for VGA console settings.
BTW, welcome to the hell of bypassing setup.
-hpa
Refine SCREEN_INFO sanity check for vgacon initialization.
Checking only the video mode field to see whether SCREEN_INFO is
initialized is not enough; in some cases it is zero although
a VGA card is present. Let's additionally check cols and lines.
Signed-off-by: Gerd Hoffmann <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Alan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Eric W. Biederman <[email protected]>
---
drivers/video/console/vgacon.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
Index: vanilla-2.6.21-git11/drivers/video/console/vgacon.c
===================================================================
--- vanilla-2.6.21-git11.orig/drivers/video/console/vgacon.c
+++ vanilla-2.6.21-git11/drivers/video/console/vgacon.c
@@ -368,9 +368,14 @@ static const char *vgacon_startup(void)
#endif
}
+ /* SCREEN_INFO initialized? */
+ if ((ORIG_VIDEO_MODE == 0) &&
+ (ORIG_VIDEO_LINES == 0) &&
+ (ORIG_VIDEO_COLS == 0))
+ goto no_vga;
+
/* VGA16 modes are not handled by VGACON */
- if ((ORIG_VIDEO_MODE == 0x00) || /* SCREEN_INFO not initialized */
- (ORIG_VIDEO_MODE == 0x0D) || /* 320x200/4 */
+ if ((ORIG_VIDEO_MODE == 0x0D) || /* 320x200/4 */
(ORIG_VIDEO_MODE == 0x0E) || /* 640x200/4 */
(ORIG_VIDEO_MODE == 0x10) || /* 640x350/4 */
(ORIG_VIDEO_MODE == 0x12) || /* 640x480/4 */
"H. Peter Anvin" <[email protected]> writes:
> yhlu wrote:
>> So the kexec tools need to scan the PCI device list to find out how
>> to set real_mode.isVGA and orig_video_mode, and also need to parse the
>> command line for VGA console settings.
>
> BTW, welcome to the hell of bypassing setup.
Well, in this case things are so very much better than attempting to
use setup.S that it isn't a real option.
In general BIOS calls just don't work reliably after linux has
been running for a while.
As for YH's point it does look like there are a few ways
to poke the linux kernel to see what is happening.
We can look in /proc/ioports and see what has reserved
the video resources. That should give us a reasonable
estimate of the video adapter. We can do an ioctl to
the console and see how many lines and columns we have.
Reusing boot_params could be nice but if we have the information
available in other ways digging it out that way is quite possibly
better.
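The two probes mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: TIOCGWINSZ is taken as the console ioctl for lines and columns, the legacy VGA register window is taken to be ports 0x3c0-0x3df, and both function names are hypothetical rather than part of /sbin/kexec.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/ioctl.h>

/* Ask the terminal behind fd for its text geometry.
 * Returns 0 on success, -1 if fd is not a terminal or the ioctl fails. */
static int console_geometry(int fd, unsigned *lines, unsigned *cols)
{
	struct winsize ws;

	assert(lines != NULL && cols != NULL);
	if (ioctl(fd, TIOCGWINSZ, &ws) != 0)
		return -1;
	*lines = ws.ws_row;
	*cols = ws.ws_col;
	return 0;
}

/* Parse one /proc/ioports style line such as "  03c0-03df : vga+".
 * Returns 1 and copies the owner name if the range lies inside the
 * legacy VGA register window, 0 otherwise. */
static int ioports_vga_owner(const char *line, char *owner, size_t owner_len)
{
	unsigned long start, end;
	char name[64];

	assert(line != NULL && owner != NULL && owner_len > 0);
	if (sscanf(line, " %lx-%lx : %63[^\n]", &start, &end, name) != 3)
		return 0;
	if (start < 0x3c0 || end > 0x3df)
		return 0;
	snprintf(owner, owner_len, "%s", name);
	return 1;
}
```

A caller would open the console device, pass the descriptor to console_geometry(), and feed each line of /proc/ioports to ioports_vga_owner() until one matches. Parent ranges that merely contain the VGA window (a PCI bus entry, for example) are deliberately not treated as the owner.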
Eric
Gerd Hoffmann <[email protected]> writes:
> Hi,
>
>> Since the whole point is to detect the case where we don't have
>> a screen at all it makes sense to check several additional variables
>> and make certain that they are all 0. Agreed?
>
> Like in the attached patch?
Looks good to me.
> cheers,
> Gerd
> Refine SCREEN_INFO sanity check for vgacon initialization.
>
> Checking only the video mode field to see whether SCREEN_INFO is
> initialized is not enough; in some cases it is zero although
> a VGA card is present. Let's additionally check cols and lines.
>
> Signed-off-by: Gerd Hoffmann <[email protected]>
> Cc: Rusty Russell <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Alan <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Eric W. Biederman <[email protected]>
> ---
> drivers/video/console/vgacon.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> Index: vanilla-2.6.21-git11/drivers/video/console/vgacon.c
> ===================================================================
> --- vanilla-2.6.21-git11.orig/drivers/video/console/vgacon.c
> +++ vanilla-2.6.21-git11/drivers/video/console/vgacon.c
> @@ -368,9 +368,14 @@ static const char *vgacon_startup(void)
> #endif
> }
>
> + /* SCREEN_INFO initialized? */
> + if ((ORIG_VIDEO_MODE == 0) &&
> + (ORIG_VIDEO_LINES == 0) &&
> + (ORIG_VIDEO_COLS == 0))
> + goto no_vga;
> +
> /* VGA16 modes are not handled by VGACON */
> - if ((ORIG_VIDEO_MODE == 0x00) || /* SCREEN_INFO not initialized */
> - (ORIG_VIDEO_MODE == 0x0D) || /* 320x200/4 */
> + if ((ORIG_VIDEO_MODE == 0x0D) || /* 320x200/4 */
> (ORIG_VIDEO_MODE == 0x0E) || /* 640x200/4 */
> (ORIG_VIDEO_MODE == 0x10) || /* 640x350/4 */
> (ORIG_VIDEO_MODE == 0x12) || /* 640x480/4 */
On 5/9/07, Eric W. Biederman <[email protected]> wrote:
> "H. Peter Anvin" <[email protected]> writes:
> We can look in /proc/ioports and see what has reserved
> the video resources. That should give us a reasonable
> estimate of the video adapter. We can do an ioctl to
> the console and see how many lines and columns we have.
>
> Reusing boot_params could be nice but if we have the information
> available in other ways digging it out that way is quite possibly
> better.
Another path:
LinuxBIOS+elfboot+payload, where the payload is an ELF
(vmlinux+initrd) compressed via lzma,
and then use kexec to boot the final production kernel.
We don't need to use boot_params from the first tiny kernel.
YH
[email protected] wrote:
>
> Well, in this case things are so very much better than attempting to
> use setup.S that it isn't a real option.
>
Obviously not, but it was more of a comment on the apparent assumption
that doing so would be simple.
-hpa
"H. Peter Anvin" <[email protected]> writes:
> Obviously not, but it was more of a comment on the apparent assumption
> that doing so would be simple.
Reasonable comment then.
Eric
On 5/9/07, Gerd Hoffmann <[email protected]> wrote:
> Refine SCREEN_INFO sanity check for vgacon initialization.
>
> Checking only the video mode field to see whether SCREEN_INFO is
> initialized is not enough; in some cases it is zero although
> a VGA card is present. Let's additionally check cols and lines.
>
> Signed-off-by: Gerd Hoffmann <[email protected]>
> Cc: Rusty Russell <[email protected]>
> Cc: Andi Kleen <[email protected]>
> Cc: Alan <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Eric W. Biederman <[email protected]>
> ---
> drivers/video/console/vgacon.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> Index: vanilla-2.6.21-git11/drivers/video/console/vgacon.c
> ===================================================================
> --- vanilla-2.6.21-git11.orig/drivers/video/console/vgacon.c
> +++ vanilla-2.6.21-git11/drivers/video/console/vgacon.c
> @@ -368,9 +368,14 @@ static const char *vgacon_startup(void)
> #endif
> }
>
> + /* SCREEN_INFO initialized? */
> + if ((ORIG_VIDEO_MODE == 0) &&
> + (ORIG_VIDEO_LINES == 0) &&
> + (ORIG_VIDEO_COLS == 0))
> + goto no_vga;
> +
> /* VGA16 modes are not handled by VGACON */
> - if ((ORIG_VIDEO_MODE == 0x00) || /* SCREEN_INFO not initialized */
> - (ORIG_VIDEO_MODE == 0x0D) || /* 320x200/4 */
> + if ((ORIG_VIDEO_MODE == 0x0D) || /* 320x200/4 */
> (ORIG_VIDEO_MODE == 0x0E) || /* 640x200/4 */
> (ORIG_VIDEO_MODE == 0x10) || /* 640x350/4 */
> (ORIG_VIDEO_MODE == 0x12) || /* 640x480/4 */
>
it works.
YH