2013-03-05 08:15:21

by Robin Holt

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On Thu, Feb 28, 2013 at 03:09:34PM -0800, H. Peter Anvin wrote:
> On 02/28/2013 03:02 PM, Yinghai Lu wrote:
> >
> > Good, kexec is not only one.
> >
> > ELILO has one.
> > 5 ELILO
> >
> > We need switch/case to have different fields set for it.
> >
>
> I wouldn't consider that "good", but it's a bloody good thing we have
> bootloader IDs.

What is the planned course of action here. With this commit applied,
I still cannot boot. Is there something you are waiting for from me?

Thanks,
Robin


2013-03-05 15:22:21

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

Yes, please do the analysis I asked for.

Robin Holt <[email protected]> wrote:

>On Thu, Feb 28, 2013 at 03:09:34PM -0800, H. Peter Anvin wrote:
>> On 02/28/2013 03:02 PM, Yinghai Lu wrote:
>> >
>> > Good, kexec is not only one.
>> >
>> > ELILO has one.
>> > 5 ELILO
>> >
>> > We need switch/case to have different fields set for it.
>> >
>>
>> I wouldn't consider that "good", but it's a bloody good thing we have
>> bootloader IDs.
>
>What is the planned course of action here. With this commit applied,
>I still cannot boot. Is there something you are waiting for from me?
>
>Thanks,
>Robin

--
Sent from my mobile phone. Please excuse brevity and lack of formatting.

2013-03-05 19:12:11

by Yinghai Lu

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On Tue, Mar 5, 2013 at 7:22 AM, H. Peter Anvin <[email protected]> wrote:
> Yes, please do the analysis I asked for.

it uses first 2 pages in bzImage to bootparams.

then update the fields with ===> X

struct boot_params {
struct screen_info screen_info; /* 0x000 */ ===> X
struct apm_bios_info apm_bios_info; /* 0x040 */ ===> X
__u8 _pad2[4]; /* 0x054 */
__u64 tboot_addr; /* 0x058 */
struct ist_info ist_info; /* 0x060 */
__u8 _pad3[16]; /* 0x070 */
__u8 hd0_info[16]; /* obsolete! */ /* 0x080 */ ===> X
__u8 hd1_info[16]; /* obsolete! */ /* 0x090 */ ===> X
struct sys_desc_table sys_desc_table; /* 0x0a0 */ ===> X
struct olpc_ofw_header olpc_ofw_header; /* 0x0b0 */
__u32 ext_ramdisk_image; /* 0x0c0 */
__u32 ext_ramdisk_size; /* 0x0c4 */
__u32 ext_cmd_line_ptr; /* 0x0c8 */
__u8 _pad4[116]; /* 0x0cc */
struct edid_info edid_info; /* 0x140 */
struct efi_info efi_info; /* 0x1c0 */ ===> X
__u32 alt_mem_k; /* 0x1e0 */ ===> X
__u32 scratch; /* Scratch field! */ /* 0x1e4 */
__u8 e820_entries; /* 0x1e8 */ ===> X
__u8 eddbuf_entries; /* 0x1e9 */
__u8 edd_mbr_sig_buf_entries; /* 0x1ea */
__u8 kbd_status; /* 0x1eb */
__u8 _pad5[3]; /* 0x1ec */
/*
* The sentinel is set to a nonzero value (0xff) in header.S.
*
* A bootloader is supposed to only take setup_header and put
* it into a clean boot_params buffer. If it turns out that
* it is clumsy or too generous with the buffer, it most
* probably will pick up the sentinel variable too. The fact
* that this variable then is still 0xff will let kernel
* know that some variables in boot_params are invalid and
* kernel should zero out certain portions of boot_params.
*/
__u8 sentinel; /* 0x1ef */
__u8 _pad6[1]; /* 0x1f0 */
struct setup_header hdr; /* setup header */ /* 0x1f1 */ ===> X
__u8 _pad7[0x290-0x1f1-sizeof(struct setup_header)];
__u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX]; /* 0x290 */
struct e820entry e820_map[E820MAX]; /* 0x2d0 */ ===> X
__u8 _pad8[48]; /* 0xcd0 */
struct edd_info eddbuf[EDDMAXNR]; /* 0xd00 */
__u8 _pad9[276]; /* 0xeec */

so sentinel will be kept as 0xff, so efi_info get cleared by kernel...

Attached patches should avoid the clearing of efi_info when elilo is used.

Do we need to clear edd and pad2 and pad3 for elilo?

Thanks

Yinghai


Attachments:
fix_elilo.patch (925.00 B)

2013-03-05 19:52:31

by Robin Holt

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

That fixed it for me.

Can you help me understand why sentinel is non-zero? It looks to me
like 3.14 allocates 16kB plus strlen of the command line, zeros it,
and then proceeds to fill in fields, some differing from what is in the
boot_params structure. That said, it looks like the sentinel field
should remain 0. I am still trying to understand, but if this patch
makes it in, I am happy.

Thanks,
Robin

On Tue, Mar 05, 2013 at 11:12:08AM -0800, Yinghai Lu wrote:
> On Tue, Mar 5, 2013 at 7:22 AM, H. Peter Anvin <[email protected]> wrote:
> > Yes, please do the analysis I asked for.
>
> it uses first 2 pages in bzImage to bootparams.
>
> then update the fields with ===> X
>
> struct boot_params {
> struct screen_info screen_info; /* 0x000 */ ===> X
> struct apm_bios_info apm_bios_info; /* 0x040 */ ===> X
> __u8 _pad2[4]; /* 0x054 */
> __u64 tboot_addr; /* 0x058 */
> struct ist_info ist_info; /* 0x060 */
> __u8 _pad3[16]; /* 0x070 */
> __u8 hd0_info[16]; /* obsolete! */ /* 0x080 */ ===> X
> __u8 hd1_info[16]; /* obsolete! */ /* 0x090 */ ===> X
> struct sys_desc_table sys_desc_table; /* 0x0a0 */ ===> X
> struct olpc_ofw_header olpc_ofw_header; /* 0x0b0 */
> __u32 ext_ramdisk_image; /* 0x0c0 */
> __u32 ext_ramdisk_size; /* 0x0c4 */
> __u32 ext_cmd_line_ptr; /* 0x0c8 */
> __u8 _pad4[116]; /* 0x0cc */
> struct edid_info edid_info; /* 0x140 */
> struct efi_info efi_info; /* 0x1c0 */ ===> X
> __u32 alt_mem_k; /* 0x1e0 */ ===> X
> __u32 scratch; /* Scratch field! */ /* 0x1e4 */
> __u8 e820_entries; /* 0x1e8 */ ===> X
> __u8 eddbuf_entries; /* 0x1e9 */
> __u8 edd_mbr_sig_buf_entries; /* 0x1ea */
> __u8 kbd_status; /* 0x1eb */
> __u8 _pad5[3]; /* 0x1ec */
> /*
> * The sentinel is set to a nonzero value (0xff) in header.S.
> *
> * A bootloader is supposed to only take setup_header and put
> * it into a clean boot_params buffer. If it turns out that
> * it is clumsy or too generous with the buffer, it most
> * probably will pick up the sentinel variable too. The fact
> * that this variable then is still 0xff will let kernel
> * know that some variables in boot_params are invalid and
> * kernel should zero out certain portions of boot_params.
> */
> __u8 sentinel; /* 0x1ef */
> __u8 _pad6[1]; /* 0x1f0 */
> struct setup_header hdr; /* setup header */ /* 0x1f1 */ ===> X
> __u8 _pad7[0x290-0x1f1-sizeof(struct setup_header)];
> __u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX]; /* 0x290 */
> struct e820entry e820_map[E820MAX]; /* 0x2d0 */ ===> X
> __u8 _pad8[48]; /* 0xcd0 */
> struct edd_info eddbuf[EDDMAXNR]; /* 0xd00 */
> __u8 _pad9[276]; /* 0xeec */
>
> so sentinel will be kept as 0xff, so efi_info get cleared by kernel...
>
> Attached patches should avoid the clearing of efi_info when elilo is used.
>
> Do we need to clear edd and pad2 and pad3 for elilo?
>
> Thanks
>
> Yinghai

2013-03-05 20:15:04

by Yinghai Lu

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On Tue, Mar 5, 2013 at 11:52 AM, Robin Holt <[email protected]> wrote:
> That fixed it for me.
>
> Can you help me understand why sentinel is non-zero? It looks to me
> like 3.14 allocates 16kB plus strlen of the command line, zeros it,
> and then proceeds to fill in fields, some differing from what is in the
> boot_params structure. That said, it looks like the sentinel field
> should remain 0. I am still trying to understand, but if this patch
> makes it in, I am happy.

sentinel is out of setup_header.
it is 0x1ef. setup_header is from 0x1f1.

it will be 0xff from arch/x86/boot/header.S for bzImage.

.section ".header", "a"
.globl sentinel
sentinel: .byte 0xff, 0xff /* Used to detect broken loaders */

elilo first copy first page, and get 0x1f1 for setup code sector number.
then it copies all setup code ( include setup header).

after that sysdeps_create_boot_params will copy first two pages to bp...
CopyMem(bp, param_start, 0x2000);

in that case, it will keep all fields in boot_params around setup_header as
with setup code...thus 0x1ef will be 0xff.

elilo need attached fix.

Thanks

Yinghai


Attachments:
fix_boot_params.patch (1.38 kB)

2013-03-05 20:22:34

by Robin Holt

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On Tue, Mar 05, 2013 at 12:14:56PM -0800, Yinghai Lu wrote:
> On Tue, Mar 5, 2013 at 11:52 AM, Robin Holt <[email protected]> wrote:
> > That fixed it for me.
> >
> > Can you help me understand why sentinel is non-zero? It looks to me
> > like 3.14 allocates 16kB plus strlen of the command line, zeros it,
> > and then proceeds to fill in fields, some differing from what is in the
> > boot_params structure. That said, it looks like the sentinel field
> > should remain 0. I am still trying to understand, but if this patch
> > makes it in, I am happy.
>
> sentinel is out of setup_header.
> it is 0x1ef. setup_header is from 0x1f1.
>
> it will be 0xff from arch/x86/boot/header.S for bzImage.
>
> .section ".header", "a"
> .globl sentinel
> sentinel: .byte 0xff, 0xff /* Used to detect broken loaders */
>
> elilo first copy first page, and get 0x1f1 for setup code sector number.
> then it copies all setup code ( include setup header).
>
> after that sysdeps_create_boot_params will copy first two pages to bp...
> CopyMem(bp, param_start, 0x2000);

Had not gotten that far in my understanding of elilo. Now I understand.

Robin

2013-03-06 16:53:52

by Josh Boyer

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On Tue, Mar 5, 2013 at 2:12 PM, Yinghai Lu <[email protected]> wrote:
> On Tue, Mar 5, 2013 at 7:22 AM, H. Peter Anvin <[email protected]> wrote:
>> Yes, please do the analysis I asked for.
>
> it uses first 2 pages in bzImage to bootparams.
>
> then update the fields with ===> X
>
> struct boot_params {
> struct screen_info screen_info; /* 0x000 */ ===> X
> struct apm_bios_info apm_bios_info; /* 0x040 */ ===> X
> __u8 _pad2[4]; /* 0x054 */
> __u64 tboot_addr; /* 0x058 */
> struct ist_info ist_info; /* 0x060 */
> __u8 _pad3[16]; /* 0x070 */
> __u8 hd0_info[16]; /* obsolete! */ /* 0x080 */ ===> X
> __u8 hd1_info[16]; /* obsolete! */ /* 0x090 */ ===> X
> struct sys_desc_table sys_desc_table; /* 0x0a0 */ ===> X
> struct olpc_ofw_header olpc_ofw_header; /* 0x0b0 */
> __u32 ext_ramdisk_image; /* 0x0c0 */
> __u32 ext_ramdisk_size; /* 0x0c4 */
> __u32 ext_cmd_line_ptr; /* 0x0c8 */
> __u8 _pad4[116]; /* 0x0cc */
> struct edid_info edid_info; /* 0x140 */
> struct efi_info efi_info; /* 0x1c0 */ ===> X
> __u32 alt_mem_k; /* 0x1e0 */ ===> X
> __u32 scratch; /* Scratch field! */ /* 0x1e4 */
> __u8 e820_entries; /* 0x1e8 */ ===> X
> __u8 eddbuf_entries; /* 0x1e9 */
> __u8 edd_mbr_sig_buf_entries; /* 0x1ea */
> __u8 kbd_status; /* 0x1eb */
> __u8 _pad5[3]; /* 0x1ec */
> /*
> * The sentinel is set to a nonzero value (0xff) in header.S.
> *
> * A bootloader is supposed to only take setup_header and put
> * it into a clean boot_params buffer. If it turns out that
> * it is clumsy or too generous with the buffer, it most
> * probably will pick up the sentinel variable too. The fact
> * that this variable then is still 0xff will let kernel
> * know that some variables in boot_params are invalid and
> * kernel should zero out certain portions of boot_params.
> */
> __u8 sentinel; /* 0x1ef */
> __u8 _pad6[1]; /* 0x1f0 */
> struct setup_header hdr; /* setup header */ /* 0x1f1 */ ===> X
> __u8 _pad7[0x290-0x1f1-sizeof(struct setup_header)];
> __u32 edd_mbr_sig_buffer[EDD_MBR_SIG_MAX]; /* 0x290 */
> struct e820entry e820_map[E820MAX]; /* 0x2d0 */ ===> X
> __u8 _pad8[48]; /* 0xcd0 */
> struct edd_info eddbuf[EDDMAXNR]; /* 0xd00 */
> __u8 _pad9[276]; /* 0xeec */
>
> so sentinel will be kept as 0xff, so efi_info get cleared by kernel...
>
> Attached patches should avoid the clearing of efi_info when elilo is used.
>
> Do we need to clear edd and pad2 and pad3 for elilo?

I don't think this is limited to elilo. I have a UEFI machine booting
with grub2 that also fails to boot because of this patch. I was in the
middle of bisecting when I found this thread and if I revert 5dcd14ecd4
the machine boots again. Put that commit back in and it doesn't. We've
had three other reports in Fedora of similar cases.

I discussed this with Peter Jones this morning. He was looking into what
grub2 does for boot_params and it seems to be read-modify-write instead
of clearing the whole thing. (CC'd Peter now.)

The patch for elilo probably works, but if grub2 is hitting this then I'm
curious if most bootloaders will. I'll finish my bisect just to be extra
sure, but something probably needs to be done in a more generic fashion
here.

josh

2013-03-06 17:26:35

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On 03/06/2013 08:53 AM, Josh Boyer wrote:
>
> I don't think this is limited to elilo. I have a UEFI machine booting
> with grub2 that also fails to boot because of this patch. I was in the
> middle of bisecting when I found this thread and if I revert 5dcd14ecd4
> the machine boots again. Put that commit back in and it doesn't. We've
> had three other reports in Fedora of similar cases.
>
> I discussed this with Peter Jones this morning. He was looking into what
> grub2 does for boot_params and it seems to be read-modify-write instead
> of clearing the whole thing. (CC'd Peter now.)
>
> The patch for elilo probably works, but if grub2 is hitting this then I'm
> curious if most bootloaders will. I'll finish my bisect just to be extra
> sure, but something probably needs to be done in a more generic fashion
> here.
>

Come to think about it...

The EFI field actually has a magic, unless just about all the rest of
them, so clearing it is probabilistically redundant; in other words we
almost certainly should exclude it from the clearing, unconditionally.

Does the "elilo" patch made unconditional work for you?

-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2013-03-06 17:36:35

by Josh Boyer

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On Wed, Mar 6, 2013 at 12:26 PM, H. Peter Anvin <[email protected]> wrote:
> On 03/06/2013 08:53 AM, Josh Boyer wrote:
>>
>> I don't think this is limited to elilo. I have a UEFI machine booting
>> with grub2 that also fails to boot because of this patch. I was in the
>> middle of bisecting when I found this thread and if I revert 5dcd14ecd4
>> the machine boots again. Put that commit back in and it doesn't. We've
>> had three other reports in Fedora of similar cases.
>>
>> I discussed this with Peter Jones this morning. He was looking into what
>> grub2 does for boot_params and it seems to be read-modify-write instead
>> of clearing the whole thing. (CC'd Peter now.)
>>
>> The patch for elilo probably works, but if grub2 is hitting this then I'm
>> curious if most bootloaders will. I'll finish my bisect just to be extra
>> sure, but something probably needs to be done in a more generic fashion
>> here.
>>
>
> Come to think about it...
>
> The EFI field actually has a magic, unless just about all the rest of
> them, so clearing it is probabilistically redundant; in other words we
> almost certainly should exclude it from the clearing, unconditionally.
>
> Does the "elilo" patch made unconditional work for you?

Something like this?

Index: linux-2.6/arch/x86/include/asm/bootparam_utils.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/bootparam_utils.h
+++ linux-2.6/arch/x86/include/asm/bootparam_utils.h
@@ -20,8 +20,11 @@ static void sanitize_boot_params(struct
if (boot_params->sentinel) {
/*fields in boot_params are not valid, clear them */
memset(&boot_params->olpc_ofw_header, 0,
- (char *)&boot_params->alt_mem_k -
+ (char *)&boot_params->efi_info -
(char *)&boot_params->olpc_ofw_header);
memset(&boot_params->kbd_status, 0,
(char *)&boot_params->hdr -
(char *)&boot_params->kbd_status);

I can try that in a bit.

josh

2013-03-06 17:37:39

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On 03/06/2013 09:36 AM, Josh Boyer wrote:
>
> Something like this?
>
> Index: linux-2.6/arch/x86/include/asm/bootparam_utils.h
> ===================================================================
> --- linux-2.6.orig/arch/x86/include/asm/bootparam_utils.h
> +++ linux-2.6/arch/x86/include/asm/bootparam_utils.h
> @@ -20,8 +20,11 @@ static void sanitize_boot_params(struct
> if (boot_params->sentinel) {
> /*fields in boot_params are not valid, clear them */
> memset(&boot_params->olpc_ofw_header, 0,
> - (char *)&boot_params->alt_mem_k -
> + (char *)&boot_params->efi_info -
> (char *)&boot_params->olpc_ofw_header);
> memset(&boot_params->kbd_status, 0,
> (char *)&boot_params->hdr -
> (char *)&boot_params->kbd_status);
>
> I can try that in a bit.
>

Right.

-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2013-03-06 18:00:42

by Peter Jones

[permalink] [raw]
Subject: [PATCH] Be explicit about what the x86 0x020c boot parameter version requires.

This should help avoid making the incorrect change in non-compliant
bootloaders.

Signed-off-by: Peter Jones <[email protected]>
---
Documentation/x86/boot.txt | 5 +++--
arch/x86/include/asm/bootparam_utils.h | 7 +++++++
2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
index 3840b6f..72702db 100644
--- a/Documentation/x86/boot.txt
+++ b/Documentation/x86/boot.txt
@@ -1110,7 +1110,8 @@ firmware, 'table' is the EFI system table - these are the first two
arguments of the "handoff state" as described in section 2.3 of the
UEFI specification. 'bp' is the boot loader-allocated boot params.

-The boot loader *must* fill out the following fields in bp,
+The boot loader *must* zero the entirity of bp, and then fill out the
+following fields:

o hdr.code32_start
o hdr.cmd_line_ptr
@@ -1118,4 +1119,4 @@ The boot loader *must* fill out the following fields in bp,
o hdr.ramdisk_image (if applicable)
o hdr.ramdisk_size (if applicable)

-All other fields should be zero.
+All other fields should remain zero.
diff --git a/arch/x86/include/asm/bootparam_utils.h b/arch/x86/include/asm/bootparam_utils.h
index 5b5e9cb..b4e8aa8 100644
--- a/arch/x86/include/asm/bootparam_utils.h
+++ b/arch/x86/include/asm/bootparam_utils.h
@@ -17,6 +17,13 @@
*/
static void sanitize_boot_params(struct boot_params *boot_params)
{
+ /* Note: do not simply clear this field. The purpose of this field is
+ * to guarantee compliance with the x86 boot spec located in
+ * Documentation/x86/boot.txt . That spec says that the *whole*
+ * structure should be cleared. If you're having an issue because
+ * the sentinel is set, you need to change the whole structure to be
+ * cleared, not this (or any other) indidivual field.
+ */
if (boot_params->sentinel) {
/*fields in boot_params are not valid, clear them */
memset(&boot_params->olpc_ofw_header, 0,
--
1.8.1.4

2013-03-06 20:40:15

by Josh Boyer

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

On Wed, Mar 6, 2013 at 12:37 PM, H. Peter Anvin <[email protected]> wrote:
> On 03/06/2013 09:36 AM, Josh Boyer wrote:
>>
>> Something like this?
>>
>> Index: linux-2.6/arch/x86/include/asm/bootparam_utils.h
>> ===================================================================
>> --- linux-2.6.orig/arch/x86/include/asm/bootparam_utils.h
>> +++ linux-2.6/arch/x86/include/asm/bootparam_utils.h
>> @@ -20,8 +20,11 @@ static void sanitize_boot_params(struct
>> if (boot_params->sentinel) {
>> /*fields in boot_params are not valid, clear them */
>> memset(&boot_params->olpc_ofw_header, 0,
>> - (char *)&boot_params->alt_mem_k -
>> + (char *)&boot_params->efi_info -
>> (char *)&boot_params->olpc_ofw_header);
>> memset(&boot_params->kbd_status, 0,
>> (char *)&boot_params->hdr -
>> (char *)&boot_params->kbd_status);
>>
>> I can try that in a bit.
>>
>
> Right.

OK. Doing that on top of 3.9-rc1 results in me having a booting machine
again. I'd suggest we get that into upstream quick-ish.

josh

2013-03-06 20:44:07

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Revert commit 5dcd14ecd4 - breaks EFI boot with SLES11 elilo.efi

Yeah. I'll queue it up today and send to Linus in the next batch.

Josh Boyer <[email protected]> wrote:

>On Wed, Mar 6, 2013 at 12:37 PM, H. Peter Anvin <[email protected]> wrote:
>> On 03/06/2013 09:36 AM, Josh Boyer wrote:
>>>
>>> Something like this?
>>>
>>> Index: linux-2.6/arch/x86/include/asm/bootparam_utils.h
>>> ===================================================================
>>> --- linux-2.6.orig/arch/x86/include/asm/bootparam_utils.h
>>> +++ linux-2.6/arch/x86/include/asm/bootparam_utils.h
>>> @@ -20,8 +20,11 @@ static void sanitize_boot_params(struct
>>> if (boot_params->sentinel) {
>>> /*fields in boot_params are not valid, clear them */
>>> memset(&boot_params->olpc_ofw_header, 0,
>>> - (char *)&boot_params->alt_mem_k -
>>> + (char *)&boot_params->efi_info -
>>> (char *)&boot_params->olpc_ofw_header);
>>> memset(&boot_params->kbd_status, 0,
>>> (char *)&boot_params->hdr -
>>> (char *)&boot_params->kbd_status);
>>>
>>> I can try that in a bit.
>>>
>>
>> Right.
>
>OK. Doing that on top of 3.9-rc1 results in me having a booting
>machine
>again. I'd suggest we get that into upstream quick-ish.
>
>josh

--
Sent from my mobile phone. Please excuse brevity and lack of formatting.

2013-03-07 04:31:14

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] Be explicit about what the x86 0x020c boot parameter version requires.

On 03/06/2013 10:00 AM, Peter Jones wrote:
> This should help avoid making the incorrect change in non-compliant
> bootloaders.
>
> Signed-off-by: Peter Jones <[email protected]>
> ---
> Documentation/x86/boot.txt | 5 +++--
> arch/x86/include/asm/bootparam_utils.h | 7 +++++++
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
> index 3840b6f..72702db 100644
> --- a/Documentation/x86/boot.txt
> +++ b/Documentation/x86/boot.txt
> @@ -1110,7 +1110,8 @@ firmware, 'table' is the EFI system table - these are the first two
> arguments of the "handoff state" as described in section 2.3 of the
> UEFI specification. 'bp' is the boot loader-allocated boot params.
>
> -The boot loader *must* fill out the following fields in bp,
> +The boot loader *must* zero the entirity of bp, and then fill out the
> +following fields:
>
> o hdr.code32_start
> o hdr.cmd_line_ptr
> @@ -1118,4 +1119,4 @@ The boot loader *must* fill out the following fields in bp,
> o hdr.ramdisk_image (if applicable)
> o hdr.ramdisk_size (if applicable)
>

Wait a bloody minute here... I seem to have managed to miss something big.

Matt, should we not be copying the setup part of the structure just as
we do for the normal 32/64-bit protocol? If not, why not?


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

Subject: [tip:x86/urgent] x86: Don' t clear efi_info even if the sentinel hits

Commit-ID: 2e604c0f19dcdd433b3863ffc3da9bc0787ca765
Gitweb: http://git.kernel.org/tip/2e604c0f19dcdd433b3863ffc3da9bc0787ca765
Author: Josh Boyer <[email protected]>
AuthorDate: Wed, 6 Mar 2013 20:23:30 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Wed, 6 Mar 2013 20:23:30 -0800

x86: Don't clear efi_info even if the sentinel hits

When boot_params->sentinel is set, all we really know is that some
undefined set of fields in struct boot_params contain garbage. In the
particular case of efi_info, however, there is a private magic for
that substructure, so it is generally safe to leave it even if the
bootloader is broken.

kexec (for which we did the initial analysis) did not initialize this
field, but of course all the EFI bootloaders do, and most EFI
bootloaders are broken in this respect (and should be fixed.)

Reported-by: Robin Holt <[email protected]>
Link: http://lkml.kernel.org/r/CA%2B5PVA51-FT14p4CRYKbicykugVb=PiaEycdQ57CK2km_OQuRQ@mail.gmail.com
Tested-by: Josh Boyer <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/bootparam_utils.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/bootparam_utils.h b/arch/x86/include/asm/bootparam_utils.h
index 5b5e9cb..ff808ef 100644
--- a/arch/x86/include/asm/bootparam_utils.h
+++ b/arch/x86/include/asm/bootparam_utils.h
@@ -14,13 +14,15 @@
* analysis of kexec-tools; if other broken bootloaders initialize a
* different set of fields we will need to figure out how to disambiguate.
*
+ * Note: efi_info is commonly left uninitialized, but that field has a
+ * private magic, so it is better to leave it unchanged.
*/
static void sanitize_boot_params(struct boot_params *boot_params)
{
if (boot_params->sentinel) {
/*fields in boot_params are not valid, clear them */
memset(&boot_params->olpc_ofw_header, 0,
- (char *)&boot_params->alt_mem_k -
+ (char *)&boot_params->efi_info -
(char *)&boot_params->olpc_ofw_header);
memset(&boot_params->kbd_status, 0,
(char *)&boot_params->hdr -

Subject: [tip:x86/urgent] x86, doc: Be explicit about what the x86 struct boot_params requires

Commit-ID: 3c4aff6b9a183b4f24eb7b8dd6c8a92cdba3bc75
Gitweb: http://git.kernel.org/tip/3c4aff6b9a183b4f24eb7b8dd6c8a92cdba3bc75
Author: Peter Jones <[email protected]>
AuthorDate: Wed, 6 Mar 2013 13:00:23 -0500
Committer: H. Peter Anvin <[email protected]>
CommitDate: Wed, 6 Mar 2013 20:34:43 -0800

x86, doc: Be explicit about what the x86 struct boot_params requires

If the sentinel triggers, we do not want the boot loader authors to
just poke it and make the error go away, we want them to actually fix
the problem.

This should help avoid making the incorrect change in non-compliant
bootloaders.

[ hpa: dropped the Documentation/x86/boot.txt hunk pending
clarifications ]

Signed-off-by: Peter Jones <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/bootparam_utils.h | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/bootparam_utils.h b/arch/x86/include/asm/bootparam_utils.h
index ff808ef..653668d 100644
--- a/arch/x86/include/asm/bootparam_utils.h
+++ b/arch/x86/include/asm/bootparam_utils.h
@@ -19,8 +19,22 @@
*/
static void sanitize_boot_params(struct boot_params *boot_params)
{
+ /*
+ * IMPORTANT NOTE TO BOOTLOADER AUTHORS: do not simply clear
+ * this field. The purpose of this field is to guarantee
+ * compliance with the x86 boot spec located in
+ * Documentation/x86/boot.txt . That spec says that the
+ * *whole* structure should be cleared, after which only the
+ * portion defined by struct setup_header (boot_params->hdr)
+ * should be copied in.
+ *
+ * If you're having an issue because the sentinel is set, you
+ * need to change the whole structure to be cleared, not this
+ * (or any other) individual field, or you will soon have
+ * problems again.
+ */
if (boot_params->sentinel) {
- /*fields in boot_params are not valid, clear them */
+ /* fields in boot_params are left uninitialized, clear them */
memset(&boot_params->olpc_ofw_header, 0,
(char *)&boot_params->efi_info -
(char *)&boot_params->olpc_ofw_header);

2013-03-07 08:39:37

by Matt Fleming

[permalink] [raw]
Subject: Re: [PATCH] Be explicit about what the x86 0x020c boot parameter version requires.

On Wed, 2013-03-06 at 20:31 -0800, H. Peter Anvin wrote:
> On 03/06/2013 10:00 AM, Peter Jones wrote:
> > This should help avoid making the incorrect change in non-compliant
> > bootloaders.
> >
> > Signed-off-by: Peter Jones <[email protected]>
> > ---
> > Documentation/x86/boot.txt | 5 +++--
> > arch/x86/include/asm/bootparam_utils.h | 7 +++++++
> > 2 files changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
> > index 3840b6f..72702db 100644
> > --- a/Documentation/x86/boot.txt
> > +++ b/Documentation/x86/boot.txt
> > @@ -1110,7 +1110,8 @@ firmware, 'table' is the EFI system table - these are the first two
> > arguments of the "handoff state" as described in section 2.3 of the
> > UEFI specification. 'bp' is the boot loader-allocated boot params.
> >
> > -The boot loader *must* fill out the following fields in bp,
> > +The boot loader *must* zero the entirity of bp, and then fill out the
> > +following fields:
> >
> > o hdr.code32_start
> > o hdr.cmd_line_ptr
> > @@ -1118,4 +1119,4 @@ The boot loader *must* fill out the following fields in bp,
> > o hdr.ramdisk_image (if applicable)
> > o hdr.ramdisk_size (if applicable)
> >
>
> Wait a bloody minute here... I seem to have managed to miss something big.
>
> Matt, should we not be copying the setup part of the structure just as
> we do for the normal 32/64-bit protocol? If not, why not?

The structure that contains the hdr.version? Yeah, we should be copying
that.