2020-01-07 19:46:10

by Arvind Sankar

[permalink] [raw]
Subject: [PATCH 1/3] x86/boot/compressed/32: Simplify calculation of output address

Condense the calculation of decompressed kernel start a little.

Signed-off-by: Arvind Sankar <[email protected]>
---
arch/x86/boot/compressed/head_32.S | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index f2dfd6d083ef..1cc55c79d1d0 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -240,11 +240,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
/* push arguments for extract_kernel: */
pushl $z_output_len /* decompressed length, end of relocs */

- movl BP_init_size(%esi), %eax
- subl $_end, %eax
- movl %ebx, %ebp
- subl %eax, %ebp
- pushl %ebp /* output address */
+ leal _end(%ebx), %eax
+ subl BP_init_size(%esi), %eax
+ pushl %eax /* output address */

pushl $z_input_len /* input_len */
leal input_data(%ebx), %eax
--
2.24.1


2020-01-07 19:46:50

by Arvind Sankar

[permalink] [raw]
Subject: [PATCH 3/3] x86/boot/compressed/64: Use 32-bit move for z_output_len

z_output_len is a 32-bit quantity, it's enough to use a 32-bit move to
load it.

Signed-off-by: Arvind Sankar <[email protected]>
---
arch/x86/boot/compressed/head_64.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index edd29340bcfd..17139c130ac9 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -531,7 +531,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
leaq input_data(%rip), %rdx /* input_data */
movl $z_input_len, %ecx /* input_len */
movq %rbp, %r8 /* output target address */
- movq $z_output_len, %r9 /* decompressed length, end of relocs */
+ movl $z_output_len, %r9d /* decompressed length, end of relocs */
call extract_kernel /* returns kernel location in %rax */
popq %rsi

--
2.24.1

2020-01-07 19:46:59

by Arvind Sankar

[permalink] [raw]
Subject: [PATCH 2/3] x86/boot/compressed/64: Use leal to initialize boot stack pointer

It's shorter, and it's what we use in every other place, so make it
consistent.

Signed-off-by: Arvind Sankar <[email protected]>
---
arch/x86/boot/compressed/head_64.S | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 58a512e33d8d..edd29340bcfd 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -81,9 +81,7 @@ SYM_FUNC_START(startup_32)
subl $1b, %ebp

/* setup a stack and make sure cpu supports long mode. */
- movl $boot_stack_end, %eax
- addl %ebp, %eax
- movl %eax, %esp
+ leal boot_stack_end(%ebp), %esp

call verify_cpu
testl %eax, %eax
--
2.24.1

Subject: [tip: x86/asm] x86/boot: Simplify calculation of output address

The following commit has been merged into the x86/asm branch of tip:

Commit-ID: 183ef7adf4ed638ac0fb0c3c9a71fc00e8512b61
Gitweb: https://git.kernel.org/tip/183ef7adf4ed638ac0fb0c3c9a71fc00e8512b61
Author: Arvind Sankar <[email protected]>
AuthorDate: Tue, 07 Jan 2020 14:44:34 -05:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Thu, 23 Jan 2020 11:58:43 +01:00

x86/boot: Simplify calculation of output address

Condense the calculation of decompressed kernel start a little.

Committer notes:

before:

ebp = ebx - (init_size - _end)

after:

eax = (ebx + _end) - init_size

where in both ebx contains the temporary address the kernel is moved to
for in-place decompression.

The before and after difference in register state is %eax and %ebp
but that is immaterial because the compressed image is not built with
-mregparm, i.e., all arguments of the following extract_kernel() call
are passed on the stack.

Signed-off-by: Arvind Sankar <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/head_32.S | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index f2dfd6d..1cc55c7 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -240,11 +240,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
/* push arguments for extract_kernel: */
pushl $z_output_len /* decompressed length, end of relocs */

- movl BP_init_size(%esi), %eax
- subl $_end, %eax
- movl %ebx, %ebp
- subl %eax, %ebp
- pushl %ebp /* output address */
+ leal _end(%ebx), %eax
+ subl BP_init_size(%esi), %eax
+ pushl %eax /* output address */

pushl $z_input_len /* input_len */
leal input_data(%ebx), %eax

2020-02-11 17:20:08

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] x86/boot/compressed/64: Use 32-bit move for z_output_len

On Tue, Jan 07, 2020 at 02:44:36PM -0500, Arvind Sankar wrote:
> z_output_len is a 32-bit quantity,

It took me a while to figure out why that is, with Michael's help.
Please write in the commit message why it is a 32-bit quantity so that
it is clear to readers.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-02-11 21:03:59

by Arvind Sankar

[permalink] [raw]
Subject: [PATCH v2] x86/boot: Use 32-bit (zero-extended) move for z_output_len

z_output_len is the size of the decompressed payload (i.e. vmlinux +
vmlinux.relocs) and is generated as an unsigned 32-bit quantity by
mkpiggy.c.

The current movq $z_output_len, %r9 instruction generates a
sign-extended move to %r9. Using movl $z_output_len, %r9d will instead
zero-extend into %r9, which is appropriate for an unsigned 32-bit
quantity. This is also what we already do for z_input_len, the size of
the compressed payload.

Signed-off-by: Arvind Sankar <[email protected]>
---
v2: Improve commit message

arch/x86/boot/compressed/head_64.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 1f1f6c8139b3..03369246a4ff 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -484,7 +484,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
leaq input_data(%rip), %rdx /* input_data */
movl $z_input_len, %ecx /* input_len */
movq %rbp, %r8 /* output target address */
- movq $z_output_len, %r9 /* decompressed length, end of relocs */
+ movl $z_output_len, %r9d /* decompressed length, end of relocs */
call extract_kernel /* returns kernel location in %rax */
popq %rsi

--
2.24.1

2020-02-11 21:33:06

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] x86/boot: Use 32-bit (zero-extended) move for z_output_len

On Tue, Feb 11, 2020 at 12:33:33PM -0500, Arvind Sankar wrote:
> z_output_len is the size of the decompressed payload (i.e. vmlinux +
> vmlinux.relocs) and is generated as an unsigned 32-bit quantity by
> mkpiggy.c.
>
> The current movq $z_output_len, %r9 instruction generates a
> sign-extended move to %r9. Using movl $z_output_len, %r9d will instead
> zero-extend into %r9, which is appropriate for an unsigned 32-bit
> quantity. This is also what we already do for z_input_len, the size of
> the compressed payload.

Yes, thanks.

What I'll also add to this is the fact that

init_size: .long INIT_SIZE # kernel initialization size

where z_output_len participates in through INIT_SIZE is a 32-bit
quantity determined by the ".long" so even if something made
z_output_len bigger than 32-bit by explicitly using MOVABS so that it
builds fine, you'd still get:

arch/x86/boot/header.S: Assembler messages:
arch/x86/boot/header.S:568: Warning: value 0x10103b000 truncated to 0x103b000

as a warning.

Btw, while poking at this, we found out that the MOV really remains
MOV and not MOVABS if gas doesn't know what the quantity behind the
z_output_len symbol is, as it is a symbol assignment. Which, AFAIU, with
ELF64 objects, it should be using 8-byte quantities in the symbol table
to accommodate such assignments. But for some reason it doesn't.

Anyway, Michael can correct me if I'm still imprecise here.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-02-11 21:39:19

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v2] x86/boot: Use 32-bit (zero-extended) move for z_output_len

On Tue, Feb 11, 2020 at 07:15:59PM +0100, Borislav Petkov wrote:
> On Tue, Feb 11, 2020 at 12:33:33PM -0500, Arvind Sankar wrote:
> > z_output_len is the size of the decompressed payload (i.e. vmlinux +
> > vmlinux.relocs) and is generated as an unsigned 32-bit quantity by
> > mkpiggy.c.
> >
> > The current movq $z_output_len, %r9 instruction generates a
> > sign-extended move to %r9. Using movl $z_output_len, %r9d will instead
> > zero-extend into %r9, which is appropriate for an unsigned 32-bit
> > quantity. This is also what we already do for z_input_len, the size of
> > the compressed payload.
>
> Yes, thanks.
>
> What I'll also add to this is the fact that
>
> init_size: .long INIT_SIZE # kernel initialization size
>
> where z_output_len participates in through INIT_SIZE is a 32-bit
> quantity determined by the ".long" so even if something made
> z_output_len bigger than 32-bit by explicitly using MOVABS so that it
> builds fine, you'd still get:
>
> arch/x86/boot/header.S: Assembler messages:
> arch/x86/boot/header.S:568: Warning: value 0x10103b000 truncated to 0x103b000
>
> as a warning.

Yes, this is just a "neatening" patch to use a more appropriate
instruction. There would have to be a lot of work to allow kernels to be
bigger than 2Gb, they're currently limited to at most 1Gb (or even 0.5Gb
if KASLR is disabled) by KERNEL_IMAGE_SIZE definition in
asm/page_64_types.h and there are checks in the linker script and a
bunch of other places, so the decompressed length can't be much bigger
than that.

>
> Btw, while poking at this, we found out that the MOV really remains
> MOV and not MOVABS if gas doesn't know what the quantity behind the
> z_output_len symbol is, as it is a symbol assignment. Which, AFAIU, with
> ELF64 objects, it should be using 8-byte quantities in the symbol table
> to accommodate such assignments. But for some reason it doesn't.
>
> Anyway, Michael can correct me if I'm still imprecise here.

For absolute symbols that it sees in the same file it does use 64-bit
immediate move for movq if necessary, but otherwise seems to need the
explicit opcode.

>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

2020-02-12 09:47:55

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v2] x86/boot: Use 32-bit (zero-extended) move for z_output_len

On Tue, Feb 11, 2020 at 02:27:39PM -0500, Arvind Sankar wrote:
> Yes, this is just a "neatening" patch to use a more appropriate
> instruction. There would have to be a lot of work to allow kernels to be
> bigger than 2Gb,

And yet we're bloating up, right into that limit... ;-\

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Subject: [tip: x86/boot] x86/boot/compressed/64: Use LEA to initialize boot stack pointer

The following commit has been merged into the x86/boot branch of tip:

Commit-ID: 48bfdb9deffdc6b683feb25e15f4f26aac503501
Gitweb: https://git.kernel.org/tip/48bfdb9deffdc6b683feb25e15f4f26aac503501
Author: Arvind Sankar <[email protected]>
AuthorDate: Tue, 07 Jan 2020 14:44:35 -05:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 12 Feb 2020 11:11:06 +01:00

x86/boot/compressed/64: Use LEA to initialize boot stack pointer

It's shorter, and it's what is used in every other place, so make it
consistent.

Signed-off-by: Arvind Sankar <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/head_64.S | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 1f1f6c8..d1220de 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -81,9 +81,7 @@ SYM_FUNC_START(startup_32)
subl $1b, %ebp

/* setup a stack and make sure cpu supports long mode. */
- movl $boot_stack_end, %eax
- addl %ebp, %eax
- movl %eax, %esp
+ leal boot_stack_end(%ebp), %esp

call verify_cpu
testl %eax, %eax

Subject: [tip: x86/boot] x86/boot/compressed/64: Use 32-bit (zero-extended) MOV for z_output_len

The following commit has been merged into the x86/boot branch of tip:

Commit-ID: a86255fe5258714e1f7c1bdfe95f08e4d098d450
Gitweb: https://git.kernel.org/tip/a86255fe5258714e1f7c1bdfe95f08e4d098d450
Author: Arvind Sankar <[email protected]>
AuthorDate: Tue, 11 Feb 2020 12:33:33 -05:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Wed, 12 Feb 2020 11:15:31 +01:00

x86/boot/compressed/64: Use 32-bit (zero-extended) MOV for z_output_len

z_output_len is the size of the decompressed payload (i.e. vmlinux +
vmlinux.relocs) and is generated as an unsigned 32-bit quantity by
mkpiggy.c.

The current

movq $z_output_len, %r9

instruction generates a sign-extended move to %r9. Using

movl $z_output_len, %r9d

will instead zero-extend into %r9, which is appropriate for an unsigned
32-bit quantity. This is also what is already done for z_input_len, the
size of the compressed payload.

[ bp:

Also, z_output_len cannot be a 64-bit quantity because it participates
in:

init_size: .long INIT_SIZE # kernel initialization size

through INIT_SIZE which is a 32-bit quantity determined by the .long
directive (vs .quad for 64-bit). Furthermore, if it really must be a
64-bit quantity, then the insn must be MOVABS which can accommodate a
64-bit immediate and which the toolchain does not generate automatically.
]

Signed-off-by: Arvind Sankar <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/boot/compressed/head_64.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index d1220de..68f31c4 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -482,7 +482,7 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
leaq input_data(%rip), %rdx /* input_data */
movl $z_input_len, %ecx /* input_len */
movq %rbp, %r8 /* output target address */
- movq $z_output_len, %r9 /* decompressed length, end of relocs */
+ movl $z_output_len, %r9d /* decompressed length, end of relocs */
call extract_kernel /* returns kernel location in %rax */
popq %rsi