2024-01-29 18:07:52

by Ard Biesheuvel

[permalink] [raw]
Subject: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer

From: Ard Biesheuvel <[email protected]>

Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
primary startup sequence sets the code segment register (CS) to __KERNEL_CS
before calling into the startup code shared between primary and
secondary boot.

This means a simple indirect call is sufficient here.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
arch/x86/kernel/head_64.S | 35 ++------------------
1 file changed, 3 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d4918d03efb4..4017a49d7b76 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
movq %r15, %rdi

.Ljump_to_C_code:
- /*
- * Jump to run C code and to be on a real kernel address.
- * Since we are running on identity-mapped space we have to jump
- * to the full 64bit address, this is only possible as indirect
- * jump. In addition we need to ensure %cs is set so we make this
- * a far return.
- *
- * Note: do not change to far jump indirect with 64bit offset.
- *
- * AMD does not support far jump indirect with 64bit offset.
- * AMD64 Architecture Programmer's Manual, Volume 3: states only
- * JMP FAR mem16:16 FF /5 Far jump indirect,
- * with the target specified by a far pointer in memory.
- * JMP FAR mem16:32 FF /5 Far jump indirect,
- * with the target specified by a far pointer in memory.
- *
- * Intel64 does support 64bit offset.
- * Software Developer Manual Vol 2: states:
- * FF /5 JMP m16:16 Jump far, absolute indirect,
- * address given in m16:16
- * FF /5 JMP m16:32 Jump far, absolute indirect,
- * address given in m16:32.
- * REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
- * address given in m16:64.
- */
- pushq $.Lafter_lret # put return address on stack for unwinder
xorl %ebp, %ebp # clear frame pointer
- movq initial_code(%rip), %rax
- pushq $__KERNEL_CS # set correct cs
- pushq %rax # target address in negative space
- lretq
-.Lafter_lret:
- ANNOTATE_NOENDBR
+ ANNOTATE_RETPOLINE_SAFE
+ callq *initial_code(%rip)
+ int3
SYM_CODE_END(secondary_startup_64)

#include "verify_cpu.S"
--
2.43.0.429.g432eaa2c6b-goog



2024-01-31 13:45:53

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer

On Mon, Jan 29, 2024 at 07:05:06PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <[email protected]>
>
> Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
> primary startup sequence sets the code segment register (CS) to __KERNEL_CS
> before calling into the startup code shared between primary and
> secondary boot.
>
> This means a simple indirect call is sufficient here.
>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> arch/x86/kernel/head_64.S | 35 ++------------------
> 1 file changed, 3 insertions(+), 32 deletions(-)
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index d4918d03efb4..4017a49d7b76 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> movq %r15, %rdi
>
> .Ljump_to_C_code:
> - /*
> - * Jump to run C code and to be on a real kernel address.
> - * Since we are running on identity-mapped space we have to jump
> - * to the full 64bit address, this is only possible as indirect
> - * jump. In addition we need to ensure %cs is set so we make this
> - * a far return.
> - *
> - * Note: do not change to far jump indirect with 64bit offset.
> - *
> - * AMD does not support far jump indirect with 64bit offset.
> - * AMD64 Architecture Programmer's Manual, Volume 3: states only
> - * JMP FAR mem16:16 FF /5 Far jump indirect,
> - * with the target specified by a far pointer in memory.
> - * JMP FAR mem16:32 FF /5 Far jump indirect,
> - * with the target specified by a far pointer in memory.
> - *
> - * Intel64 does support 64bit offset.
> - * Software Developer Manual Vol 2: states:
> - * FF /5 JMP m16:16 Jump far, absolute indirect,
> - * address given in m16:16
> - * FF /5 JMP m16:32 Jump far, absolute indirect,
> - * address given in m16:32.
> - * REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> - * address given in m16:64.
> - */
> - pushq $.Lafter_lret # put return address on stack for unwinder
> xorl %ebp, %ebp # clear frame pointer
> - movq initial_code(%rip), %rax
> - pushq $__KERNEL_CS # set correct cs
> - pushq %rax # target address in negative space
> - lretq
> -.Lafter_lret:
> - ANNOTATE_NOENDBR
> + ANNOTATE_RETPOLINE_SAFE
> + callq *initial_code(%rip)
> + int3
> SYM_CODE_END(secondary_startup_64)
>
> #include "verify_cpu.S"

objtool doesn't like it yet:

vmlinux.o: warning: objtool: verify_cpu+0x0: stack state mismatch: cfa1=4+8 cfa2=-1+0

Once we've solved this, I'll take this one even now - very nice cleanup!

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-01-31 13:59:06

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer

On Wed, 31 Jan 2024 at 14:45, Borislav Petkov <[email protected]> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:06PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <[email protected]>
> >
> > Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
> > primary startup sequence sets the code segment register (CS) to __KERNEL_CS
> > before calling into the startup code shared between primary and
> > secondary boot.
> >
> > This means a simple indirect call is sufficient here.
> >
> > Signed-off-by: Ard Biesheuvel <[email protected]>
> > ---
> > arch/x86/kernel/head_64.S | 35 ++------------------
> > 1 file changed, 3 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > index d4918d03efb4..4017a49d7b76 100644
> > --- a/arch/x86/kernel/head_64.S
> > +++ b/arch/x86/kernel/head_64.S
> > @@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> > movq %r15, %rdi
> >
> > .Ljump_to_C_code:
> > - /*
> > - * Jump to run C code and to be on a real kernel address.
> > - * Since we are running on identity-mapped space we have to jump
> > - * to the full 64bit address, this is only possible as indirect
> > - * jump. In addition we need to ensure %cs is set so we make this
> > - * a far return.
> > - *
> > - * Note: do not change to far jump indirect with 64bit offset.
> > - *
> > - * AMD does not support far jump indirect with 64bit offset.
> > - * AMD64 Architecture Programmer's Manual, Volume 3: states only
> > - * JMP FAR mem16:16 FF /5 Far jump indirect,
> > - * with the target specified by a far pointer in memory.
> > - * JMP FAR mem16:32 FF /5 Far jump indirect,
> > - * with the target specified by a far pointer in memory.
> > - *
> > - * Intel64 does support 64bit offset.
> > - * Software Developer Manual Vol 2: states:
> > - * FF /5 JMP m16:16 Jump far, absolute indirect,
> > - * address given in m16:16
> > - * FF /5 JMP m16:32 Jump far, absolute indirect,
> > - * address given in m16:32.
> > - * REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> > - * address given in m16:64.
> > - */
> > - pushq $.Lafter_lret # put return address on stack for unwinder
> > xorl %ebp, %ebp # clear frame pointer
> > - movq initial_code(%rip), %rax
> > - pushq $__KERNEL_CS # set correct cs
> > - pushq %rax # target address in negative space
> > - lretq
> > -.Lafter_lret:
> > - ANNOTATE_NOENDBR
> > + ANNOTATE_RETPOLINE_SAFE
> > + callq *initial_code(%rip)
> > + int3
> > SYM_CODE_END(secondary_startup_64)
> >
> > #include "verify_cpu.S"
>
> objtool doesn't like it yet:
>
> vmlinux.o: warning: objtool: verify_cpu+0x0: stack state mismatch: cfa1=4+8 cfa2=-1+0
>
> Once we've solved this, I'll take this one even now - very nice cleanup!
>

s/int3/RET seems to do the trick.

As long as there is an instruction that follows the callq, the
unwinder will see secondary_startup_64 at the base of the call stack.
We never return here anyway.

2024-01-31 14:09:06

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer

On Wed, 31 Jan 2024 at 14:57, Ard Biesheuvel <[email protected]> wrote:
>
> On Wed, 31 Jan 2024 at 14:45, Borislav Petkov <[email protected]> wrote:
> >
> > On Mon, Jan 29, 2024 at 07:05:06PM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <[email protected]>
> > >
> > > Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
> > > primary startup sequence sets the code segment register (CS) to __KERNEL_CS
> > > before calling into the startup code shared between primary and
> > > secondary boot.
> > >
> > > This means a simple indirect call is sufficient here.
> > >
> > > Signed-off-by: Ard Biesheuvel <[email protected]>
> > > ---
> > > arch/x86/kernel/head_64.S | 35 ++------------------
> > > 1 file changed, 3 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > > index d4918d03efb4..4017a49d7b76 100644
> > > --- a/arch/x86/kernel/head_64.S
> > > +++ b/arch/x86/kernel/head_64.S
> > > @@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> > > movq %r15, %rdi
> > >
> > > .Ljump_to_C_code:
> > > - /*
> > > - * Jump to run C code and to be on a real kernel address.
> > > - * Since we are running on identity-mapped space we have to jump
> > > - * to the full 64bit address, this is only possible as indirect
> > > - * jump. In addition we need to ensure %cs is set so we make this
> > > - * a far return.
> > > - *
> > > - * Note: do not change to far jump indirect with 64bit offset.
> > > - *
> > > - * AMD does not support far jump indirect with 64bit offset.
> > > - * AMD64 Architecture Programmer's Manual, Volume 3: states only
> > > - * JMP FAR mem16:16 FF /5 Far jump indirect,
> > > - * with the target specified by a far pointer in memory.
> > > - * JMP FAR mem16:32 FF /5 Far jump indirect,
> > > - * with the target specified by a far pointer in memory.
> > > - *
> > > - * Intel64 does support 64bit offset.
> > > - * Software Developer Manual Vol 2: states:
> > > - * FF /5 JMP m16:16 Jump far, absolute indirect,
> > > - * address given in m16:16
> > > - * FF /5 JMP m16:32 Jump far, absolute indirect,
> > > - * address given in m16:32.
> > > - * REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> > > - * address given in m16:64.
> > > - */
> > > - pushq $.Lafter_lret # put return address on stack for unwinder
> > > xorl %ebp, %ebp # clear frame pointer
> > > - movq initial_code(%rip), %rax
> > > - pushq $__KERNEL_CS # set correct cs
> > > - pushq %rax # target address in negative space
> > > - lretq
> > > -.Lafter_lret:
> > > - ANNOTATE_NOENDBR
> > > + ANNOTATE_RETPOLINE_SAFE
> > > + callq *initial_code(%rip)
> > > + int3
> > > SYM_CODE_END(secondary_startup_64)
> > >
> > > #include "verify_cpu.S"
> >
> > objtool doesn't like it yet:
> >
> > vmlinux.o: warning: objtool: verify_cpu+0x0: stack state mismatch: cfa1=4+8 cfa2=-1+0
> >
> > Once we've solved this, I'll take this one even now - very nice cleanup!
> >
>
> s/int3/RET seems to do the trick.
>

or ud2, even better,

2024-01-31 16:42:47

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer

On Wed, Jan 31, 2024 at 03:07:50PM +0100, Ard Biesheuvel wrote:
> > s/int3/RET seems to do the trick.
> >
> or ud2, even better,

Yap, that does it. And yes, we don't return here. I guess objtool
complains because

"7. file: warning: objtool: func()+0x5c: stack state mismatch

The instruction's frame pointer state is inconsistent, depending on
which execution path was taken to reach the instruction.

...

Another possibility is that the code has some asm or inline asm which
does some unusual things to the stack or the frame pointer. In such
cases it's probably appropriate to use the unwind hint macros in
asm/unwind_hints.h.
"

Lemme test this one a bit on my machines and queue it.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2024-01-31 18:24:06

by tip-bot2 for Jacob Pan

[permalink] [raw]
Subject: [tip: x86/boot] x86/startup_64: Drop long return to initial_code pointer

The following commit has been merged into the x86/boot branch of tip:

Commit-ID: 15675706241887ed7fdad9e91f4bf977b9896d0f
Gitweb: https://git.kernel.org/tip/15675706241887ed7fdad9e91f4bf977b9896d0f
Author: Ard Biesheuvel <[email protected]>
AuthorDate: Mon, 29 Jan 2024 19:05:06 +01:00
Committer: Borislav Petkov (AMD) <[email protected]>
CommitterDate: Wed, 31 Jan 2024 18:31:21 +01:00

x86/startup_64: Drop long return to initial_code pointer

Since

866b556efa12 ("x86/head/64: Install startup GDT")

the primary startup sequence sets the code segment register (CS) to
__KERNEL_CS before calling into the startup code shared between primary
and secondary boot.

This means a simple indirect call is sufficient here.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/kernel/head_64.S | 35 +++--------------------------------
1 file changed, 3 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d4918d0..bfbac50 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
movq %r15, %rdi

.Ljump_to_C_code:
- /*
- * Jump to run C code and to be on a real kernel address.
- * Since we are running on identity-mapped space we have to jump
- * to the full 64bit address, this is only possible as indirect
- * jump. In addition we need to ensure %cs is set so we make this
- * a far return.
- *
- * Note: do not change to far jump indirect with 64bit offset.
- *
- * AMD does not support far jump indirect with 64bit offset.
- * AMD64 Architecture Programmer's Manual, Volume 3: states only
- * JMP FAR mem16:16 FF /5 Far jump indirect,
- * with the target specified by a far pointer in memory.
- * JMP FAR mem16:32 FF /5 Far jump indirect,
- * with the target specified by a far pointer in memory.
- *
- * Intel64 does support 64bit offset.
- * Software Developer Manual Vol 2: states:
- * FF /5 JMP m16:16 Jump far, absolute indirect,
- * address given in m16:16
- * FF /5 JMP m16:32 Jump far, absolute indirect,
- * address given in m16:32.
- * REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
- * address given in m16:64.
- */
- pushq $.Lafter_lret # put return address on stack for unwinder
xorl %ebp, %ebp # clear frame pointer
- movq initial_code(%rip), %rax
- pushq $__KERNEL_CS # set correct cs
- pushq %rax # target address in negative space
- lretq
-.Lafter_lret:
- ANNOTATE_NOENDBR
+ ANNOTATE_RETPOLINE_SAFE
+ callq *initial_code(%rip)
+ ud2
SYM_CODE_END(secondary_startup_64)

#include "verify_cpu.S"