Due to limited space left in the setup header it was decided to
introduce the setup_header2. Its role is to communicate Linux kernel
supported features to the boot loader. Starting from now this is the
primary way to communicate things to the boot loader.
Suggested-by: H. Peter Anvin <[email protected]>
Signed-off-by: Daniel Kiper <[email protected]>
Reviewed-by: Ross Philipson <[email protected]>
Reviewed-by: Eric Snowberg <[email protected]>
---
I know that setup_header2 is not the best name. There were some
alternatives proposed like setup_header_extra, setup_header_addendum,
setup_header_more, ext_setup_header, extended_setup_header, extended_header
and extended_setup. Sadly, I am not happy with any of them. So,
leaving setup_header2 as is but still looking for better name.
Probably shorter == better...
---
Documentation/x86/boot.txt | 49 ++++++++++++++++++++++++++++++++
arch/x86/boot/Makefile | 2 +-
arch/x86/boot/compressed/Makefile | 4 +--
arch/x86/boot/compressed/setup_header2.S | 12 ++++++++
arch/x86/boot/header.S | 3 +-
arch/x86/boot/tools/build.c | 8 ++++++
arch/x86/include/uapi/asm/bootparam.h | 1 +
7 files changed, 75 insertions(+), 4 deletions(-)
create mode 100644 arch/x86/boot/compressed/setup_header2.S
diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
index f4c2a97bfdbd..ff10c6116662 100644
--- a/Documentation/x86/boot.txt
+++ b/Documentation/x86/boot.txt
@@ -61,6 +61,22 @@ Protocol 2.12: (Kernel 3.8) Added the xloadflags field and extension fields
to struct boot_params for loading bzImage and ramdisk
above 4G in 64bit.
+Protocol 2.14: BURNT BY INCORRECT COMMIT ae7e1238e68f2a472a125673ab506d49158c1889
+ (x86/boot: Add ACPI RSDP address to setup_header)
+ DO NOT USE!!! ASSUME SAME AS 2.13.
+
+Protocol 2.15: (Kernel 5.2) Added the setup_header2.
+
+Note: The protocol version number should be changed only if the setup header
+ is changed. There is no need to update the version number if boot_params
+ or setup_header2 are changed. Additionally, it is recommended to use
+ xloadflags (in this case the protocol version number should not be
+ updated either) or setup_header2 to communicate supported Linux kernel
+ features to the boot loader. Due to very limited space available in
+ the original setup header every update to it should be considered
+ with great care. Starting from the protocol 2.15 the primary way to
+ communicate things to the boot loader is the setup_header2.
+
**** MEMORY LAYOUT
The traditional memory map for the kernel loader, used for Image or
@@ -197,6 +213,7 @@ Offset Proto Name Meaning
0258/8 2.10+ pref_address Preferred loading address
0260/4 2.10+ init_size Linear memory required during initialization
0264/4 2.11+ handover_offset Offset of handover entry point
+0268/4 2.15+ setup_header2_offset Offset of the setup_header2
(1) For backwards compatibility, if the setup_sects field contains 0, the
real value is 4.
@@ -744,6 +761,38 @@ Offset/size: 0x264/4
See EFI HANDOVER PROTOCOL below for more details.
+Field name: setup_header2_offset
+Type: read
+Offset/size: 0x268/4
+Protocol: 2.15+
+
+ This field is the offset from the beginning of the kernel image to the
+ setup_header2. It is embedded in the Linux image in the uncompressed
+ protected mode region.
+
+
+**** THE SETUP_HEADER2
+
+Due to limited space left in the setup header it was decided to introduce
+the setup_header2. Its role is to communicate Linux kernel supported features
+to the boot loader. All fields of the setup_header2 are read only from the
+boot loader point of view. The setup_header2 is supported starting from the
+boot protocol version 2.15.
+
+
+**** DETAILS OF THE SETUP_HEADER2 FIELDS
+
+Field name: header
+Offset/size: 0x0000/4
+
+ Contains the magic number "hDR2" (0x68445232).
+
+Field name: size
+Offset/size: 0x0004/4
+
+ This field contains the size of the setup_header2 including setup_header2.header.
+ It should be used by the boot loader to detect supported fields in the setup_header2.
+
**** THE IMAGE CHECKSUM
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
index e2839b5c246c..c11b57da90f6 100644
--- a/arch/x86/boot/Makefile
+++ b/arch/x86/boot/Makefile
@@ -87,7 +87,7 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
-sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|input_data\|_end\|_ehead\|_text\|z_.*\)$$/\#define ZO_\2 0x\1/p'
+sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|setup_header2\|input_data\|_end\|_ehead\|_text\|z_.*\)$$/\#define ZO_\2 0x\1/p'
quiet_cmd_zoffset = ZOFFSET $@
cmd_zoffset = $(NM) $< | sed -n $(sed-zoffset) > $@
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 6b84afdd7538..c12ccc2bd923 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -72,8 +72,8 @@ $(obj)/../voffset.h: vmlinux FORCE
$(obj)/misc.o: $(obj)/../voffset.h
-vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
- $(obj)/string.o $(obj)/cmdline.o $(obj)/error.o \
+vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/setup_header2.o $(obj)/head_$(BITS).o \
+ $(obj)/misc.o $(obj)/string.o $(obj)/cmdline.o $(obj)/error.o \
$(obj)/piggy.o $(obj)/cpuflags.o
vmlinux-objs-$(CONFIG_EARLY_PRINTK) += $(obj)/early_serial_console.o
diff --git a/arch/x86/boot/compressed/setup_header2.S b/arch/x86/boot/compressed/setup_header2.S
new file mode 100644
index 000000000000..0b3963296825
--- /dev/null
+++ b/arch/x86/boot/compressed/setup_header2.S
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+ .section ".rodata.setup_header2", "a"
+
+ .global setup_header2
+
+setup_header2:
+ /* Header. */
+ .ascii "hDR2"
+ /* Size. */
+ .long setup_header2_end - setup_header2
+setup_header2_end:
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index 850b8762e889..72387b1e359c 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -300,7 +300,7 @@ _start:
# Part 2 of the header, from the old setup.S
.ascii "HdrS" # header signature
- .word 0x020d # header version number (>= 0x0105)
+ .word 0x020f # header version number (>= 0x0105)
# or else old loadlin-1.5 will fail)
.globl realmode_swtch
realmode_swtch: .word 0, 0 # default_switch, SETUPSEG
@@ -557,6 +557,7 @@ pref_address: .quad LOAD_PHYSICAL_ADDR # preferred load addr
init_size: .long INIT_SIZE # kernel initialization size
handover_offset: .long 0 # Filled in by build.c
+setup_header2_offset: .long 0 # Filled in by build.c
# End of setup header #####################################################
diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c
index a93d44e58f9c..7fc9425a0fc9 100644
--- a/arch/x86/boot/tools/build.c
+++ b/arch/x86/boot/tools/build.c
@@ -56,6 +56,7 @@ u8 buf[SETUP_SECT_MAX*512];
unsigned long efi32_stub_entry;
unsigned long efi64_stub_entry;
unsigned long efi_pe_entry;
+unsigned long setup_header2;
unsigned long startup_64;
/*----------------------------------------------------------------------*/
@@ -289,6 +290,10 @@ static inline int reserve_pecoff_reloc_section(int c)
}
#endif /* CONFIG_EFI_STUB */
+static void setup_header2_offset_update(void)
+{
+ put_unaligned_le32(setup_header2, &buf[0x268]);
+}
/*
* Parse zoffset.h and find the entry points. We could just #include zoffset.h
@@ -321,6 +326,7 @@ static void parse_zoffset(char *fname)
PARSE_ZOFS(p, efi32_stub_entry);
PARSE_ZOFS(p, efi64_stub_entry);
PARSE_ZOFS(p, efi_pe_entry);
+ PARSE_ZOFS(p, setup_header2);
PARSE_ZOFS(p, startup_64);
p = strchr(p, '\n');
@@ -410,6 +416,8 @@ int main(int argc, char ** argv)
efi_stub_entry_update();
+ setup_header2_offset_update();
+
crc = partial_crc32(buf, i, crc);
if (fwrite(buf, 1, i, dest) != i)
die("Writing setup failed");
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index 60733f137e9a..77b48c7a23b4 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -86,6 +86,7 @@ struct setup_header {
__u64 pref_address;
__u32 init_size;
__u32 handover_offset;
+ __u32 setup_header2_offset;
} __attribute__((packed));
struct sys_desc_table {
--
2.11.0
On 5/24/19 2:55 AM, Daniel Kiper wrote:
> Due to limited space left in the setup header it was decided to
> introduce the setup_header2. Its role is to communicate Linux kernel
> supported features to the boot loader. Starting from now this is the
> primary way to communicate things to the boot loader.
>
> Suggested-by: H. Peter Anvin <[email protected]>
> Signed-off-by: Daniel Kiper <[email protected]>
> Reviewed-by: Ross Philipson <[email protected]>
> Reviewed-by: Eric Snowberg <[email protected]>
> ---
> I know that setup_header2 is not the best name. There were some
> alternatives proposed like setup_header_extra, setup_header_addendum,
> setup_header_more, ext_setup_header, extended_setup_header, extended_header
> and extended_setup. Sadly, I am not happy with any of them. So,
> leaving setup_header2 as is but still looking for better name.
> Probably shorter == better...
I would say kernel_info. The relationships between the headers are analogous
to the various data sections:
setup_header = .data
boot_params/setup_data = .bss
What is missing from the above list? That's right:
kernel_info = .rodata
We have been (ab)using .data for things that could go into .rodata or .bss for
a long time, for lack of alternatives and -- especially early on -- intertia.
Also, the BIOS stub is responsible for creating boot_params, so it isn't
available to a BIOS-based loader (setup_data is, though.)
setup_header is permanently limited to 144 bytes due to the reach of the
2-byte jump field, which doubles as a length field for the structure, combined
with the size of the "hole" in struct boot_params that a protected-mode loader
or the BIOS stub has to copy it into. It is currently 119 bytes long, which
leaves us with 25 very precious bytes. This isn't something that can be fixed
without revising the boot protocol entirely, breaking backwards compatibility.
boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
by adding setup_data entries. It cannot be used to communicate properties of
the kernel image, because it is .bss and has no image-provided content.
kernel_info solves this by providing an extensible place for information about
the kernel image. It is readonly, because the kernel cannot rely on a
bootloader copying its contents anywhere, but that is OK; if it becomes
necessary it can still contain data items that an enabled bootloader would be
expected to copy into a setup_data chunk.
^ The above or some variant thereof may be a good thing to put both in your
patch comments as well as in the boot protocol documentation.
While we are making a change that bumps the version number anyway, there is
another change I would like to make to the boot protocol which we might as
well do at the same time. setup_data is a bit awkward to use for extremely
large data objects, both because the setup_data header has to be adjacent to
the data object, and because it has a 32-bit length field. However, it is
important that intermediate stages of the boot process have a way to identify
which chunks of memory are occupied by kernel data.
Thus I think we should introduce a uniform way to specify such indirect data.
We define a new setup_data type we can maybe call SETUP_INDIRECT; a
SETUP_INDIRECT data item would be an array of structures of the form:
struct setup_indirect {
__u32 type;
__u32 reserved; /* Reserved, must be set to zero */
__u64 len;
__u64 addr;
};
... where type is itself simply a SETUP_* type -- although we probably don't
want to let it be SETUP_INDIRECT itself since making it a tree structure could
require a lot of stack space in something that needs to parse it, and stack
space can be limited in boot contexts.
This would be particularly useful for having SETUP_INITRAMFS, if it becomes
desirable to allow the kernel to parse a non-contiguous set of memory regions
for the initramfs.
It might be a good idea to immediately start out struct kernel_info with
either a high mark or a bitmask of SETUP_* types that the kernel supports. A
bitmask would be more flexible, but would need provisions to be grown in the
future.
Which leads me to yet another thought.
We probably want to make the contents of kernel_info a bit more structured to
allow for content that may need to be extended in the future, or is inherently
variable length (like strings.)
This would lend itself to a structure such as:
- Magic number
- Length of total structure
... followed by a list of data chunks, each prefixed by a length field. The
first data chunk would be the main (root) structure; other data structures are
pointed to from the root structure using offsets from the beginning of the
structure (the magic number field.)
As an implementation detail, strings can of course be "pooled" into a single
data chunk as long as they are zero-terminated.
I have intentionally avoided specifying a type field for each data chunk;
history shows that it is generally a bad idea to have multiple ways to derive
the same information, as different implementations will do it differently,
resulting in bugs when things change.
-hpa
I am working on new version of patches but I have some concerns. Please
look below for more details...
On Thu, Jun 06, 2019 at 03:06:30PM -0700, H. Peter Anvin wrote:
> On 5/24/19 2:55 AM, Daniel Kiper wrote:
> > Due to limited space left in the setup header it was decided to
> > introduce the setup_header2. Its role is to communicate Linux kernel
> > supported features to the boot loader. Starting from now this is the
> > primary way to communicate things to the boot loader.
> >
> > Suggested-by: H. Peter Anvin <[email protected]>
> > Signed-off-by: Daniel Kiper <[email protected]>
> > Reviewed-by: Ross Philipson <[email protected]>
> > Reviewed-by: Eric Snowberg <[email protected]>
> > ---
> > I know that setup_header2 is not the best name. There were some
> > alternatives proposed like setup_header_extra, setup_header_addendum,
> > setup_header_more, ext_setup_header, extended_setup_header, extended_header
> > and extended_setup. Sadly, I am not happy with any of them. So,
> > leaving setup_header2 as is but still looking for better name.
> > Probably shorter == better...
>
> I would say kernel_info. The relationships between the headers are analogous
> to the various data sections:
>
> setup_header = .data
> boot_params/setup_data = .bss
>
> What is missing from the above list? That's right:
>
> kernel_info = .rodata
>
> We have been (ab)using .data for things that could go into .rodata or .bss for
> a long time, for lack of alternatives and -- especially early on -- intertia.
> Also, the BIOS stub is responsible for creating boot_params, so it isn't
> available to a BIOS-based loader (setup_data is, though.)
>
> setup_header is permanently limited to 144 bytes due to the reach of the
> 2-byte jump field, which doubles as a length field for the structure, combined
> with the size of the "hole" in struct boot_params that a protected-mode loader
> or the BIOS stub has to copy it into. It is currently 119 bytes long, which
> leaves us with 25 very precious bytes. This isn't something that can be fixed
> without revising the boot protocol entirely, breaking backwards compatibility.
>
> boot_params proper is limited to 4096 bytes, but can be arbitrarily extended
> by adding setup_data entries. It cannot be used to communicate properties of
> the kernel image, because it is .bss and has no image-provided content.
>
> kernel_info solves this by providing an extensible place for information about
> the kernel image. It is readonly, because the kernel cannot rely on a
> bootloader copying its contents anywhere, but that is OK; if it becomes
> necessary it can still contain data items that an enabled bootloader would be
> expected to copy into a setup_data chunk.
>
> ^ The above or some variant thereof may be a good thing to put both in your
> patch comments as well as in the boot protocol documentation.
Will do...
> While we are making a change that bumps the version number anyway, there is
> another change I would like to make to the boot protocol which we might as
> well do at the same time. setup_data is a bit awkward to use for extremely
> large data objects, both because the setup_data header has to be adjacent to
> the data object, and because it has a 32-bit length field. However, it is
> important that intermediate stages of the boot process have a way to identify
> which chunks of memory are occupied by kernel data.
Is not possible to "identify which chunks of memory are occupied by
kernel data" today? I think that it is. So, this does not look like
a valid point for change. Am I missing something?
> Thus I think we should introduce a uniform way to specify such indirect data.
> We define a new setup_data type we can maybe call SETUP_INDIRECT; a
> SETUP_INDIRECT data item would be an array of structures of the form:
OK.
> struct setup_indirect {
> __u32 type;
> __u32 reserved; /* Reserved, must be set to zero */
> __u64 len;
> __u64 addr;
> };
>
> ... where type is itself simply a SETUP_* type -- although we probably don't
> want to let it be SETUP_INDIRECT itself since making it a tree structure could
> require a lot of stack space in something that needs to parse it, and stack
> space can be limited in boot contexts.
Yeah...
> This would be particularly useful for having SETUP_INITRAMFS, if it becomes
> desirable to allow the kernel to parse a non-contiguous set of memory regions
> for the initramfs.
OK.
> It might be a good idea to immediately start out struct kernel_info with
> either a high mark or a bitmask of SETUP_* types that the kernel supports. A
High mark seems better for me here.
> bitmask would be more flexible, but would need provisions to be grown in the
> future.
Yep.
Anyway, I agree with the idea but I am not sure it makes sense to
introduce something which does not have users right now. Does it?
> Which leads me to yet another thought.
>
> We probably want to make the contents of kernel_info a bit more structured to
> allow for content that may need to be extended in the future, or is inherently
> variable length (like strings.)
>
> This would lend itself to a structure such as:
>
> - Magic number
> - Length of total structure
My first proposal have both fields...
> ... followed by a list of data chunks, each prefixed by a length field. The
> first data chunk would be the main (root) structure; other data structures are
I am not sure that I understand the idea of the main (root) structure.
Should it point to itself?
> pointed to from the root structure using offsets from the beginning of the
> structure (the magic number field.)
Sounds nice but I think that it is an overkill in many simple cases,
e.g. look at MLE entry point. In case of MLE entry point we do not need
nothing fancy. So, I think that this filed should be a member of the
kernel_info which points directly to new MLE entry point. I can agree
that we can use the mechanism mentioned by you above in more complicated
cases and it can be described in Documentation/x86/boot.rst. But I would
not enforce it for everything. Just for the sake of simplicity.
> As an implementation detail, strings can of course be "pooled" into a single
> data chunk as long as they are zero-terminated.
OK.
> I have intentionally avoided specifying a type field for each data chunk;
> history shows that it is generally a bad idea to have multiple ways to derive
> the same information, as different implementations will do it differently,
> resulting in bugs when things change.
Make sense for me.
So, it seems to me that we have to have at least two patches:
- one introducing the kernel_info structure,
- another introducing the setup_indirect and bumping the
boot protocol version number.
This thing has at least one cons: first patch introduces a new boot
protocol feature but it is not reflected in the protocol version change.
Not perfect but I do not think that we should bump the version number in
first patch. do you have better idea?
Daniel