2016-03-02 11:20:14

by Borislav Petkov

[permalink] [raw]
Subject: [RFC PATCH] x86: Make sure verify_cpu has a good stack

From: Borislav Petkov <[email protected]>

04633df0c43d ("x86/cpu: Call verify_cpu() after having entered long mode too")
added the call to verify_cpu() for sanitizing CPU configuration.

The latter uses the stack minimally and it can happen that we land in
startup_64() directly from a 64-bit bootloader. Then we want to use our
own, known good stack.

Do that.

APs don't need this as the trampoline sets up a stack for them.

Reported-by: Tom Lendacky <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/head_64.S | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 22fbf9df61bb..d60a044c2fdc 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -64,6 +64,10 @@ startup_64:
* tables and then reload them.
*/

+ /* Setup a stack for verify_cpu */
+ movq stack_start - __START_KERNEL_map, %rsp
+ subq $__START_KERNEL_map, %rsp
+
/* Sanitize CPU configuration */
call verify_cpu

--
2.3.5


2016-03-02 16:10:39

by Mika Penttilä

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On 02.03.2016 13:20, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> 04633df0c43d ("x86/cpu: Call verify_cpu() after having entered long mode too")
> added the call to verify_cpu() for sanitizing CPU configuration.
>
> The latter uses the stack minimally and it can happen that we land in
> startup_64() directly from a 64-bit bootloader. Then we want to use our
> own, known good stack.
>
> Do that.
>
> APs don't need this as the trampoline sets up a stack for them.
>
> Reported-by: Tom Lendacky <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/kernel/head_64.S | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 22fbf9df61bb..d60a044c2fdc 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -64,6 +64,10 @@ startup_64:
> * tables and then reload them.
> */
>
> + /* Setup a stack for verify_cpu */
> + movq stack_start - __START_KERNEL_map, %rsp
> + subq $__START_KERNEL_map, %rsp
> +


You subtract __START_KERNEL_map twice ?

--Mika

2016-03-02 16:15:41

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 05:55:14PM +0200, Mika Penttilä wrote:
> > + /* Setup a stack for verify_cpu */
> > + movq stack_start - __START_KERNEL_map, %rsp
> > + subq $__START_KERNEL_map, %rsp
> > +
>
> You subtract __START_KERNEL_map twice ?

Yes. That's not very obvious and it took me a while. I probably should
add a comment.

Want to stare at it a little bit more and try to figure it out or should
I explain?

:-)

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 16:22:33

by Brian Gerst

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 2, 2016 at 6:20 AM, Borislav Petkov <[email protected]> wrote:
> From: Borislav Petkov <[email protected]>
>
> 04633df0c43d ("x86/cpu: Call verify_cpu() after having entered long mode too")
> added the call to verify_cpu() for sanitizing CPU configuration.
>
> The latter uses the stack minimally and it can happen that we land in
> startup_64() directly from a 64-bit bootloader. Then we want to use our
> own, known good stack.
>
> Do that.
>
> APs don't need this as the trampoline sets up a stack for them.
>
> Reported-by: Tom Lendacky <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/kernel/head_64.S | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 22fbf9df61bb..d60a044c2fdc 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -64,6 +64,10 @@ startup_64:
> * tables and then reload them.
> */
>
> + /* Setup a stack for verify_cpu */
> + movq stack_start - __START_KERNEL_map, %rsp

This should be: movq stack_start(%rip), %rsp

> + subq $__START_KERNEL_map, %rsp

It would be better to add the offset to the initializer for
stack_start instead of adjusting it at runtime. That would require
moving the existing load of stack_start from the common path to the
secondary startup, which probably isn't a bad thing as it wouldn't
depend on the trampoline stack anymore.

--
Brian Gerst

2016-03-02 16:25:38

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 11:22:30AM -0500, Brian Gerst wrote:
> This should be: movq stack_start(%rip), %rsp

No it wouldn't. That doesn't work.

> > + subq $__START_KERNEL_map, %rsp
>
> It would be better to add the offset to the initializer for
> stack_start instead of adjusting it at runtime. That would require
> moving the existing load of stack_start from the common path to the
> secondary startup,

This is the BSP we're talking about - no secondary startup. We want to run
verify_cpu as early as possible.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 16:38:26

by Mika Penttilä

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack


On 02.03.2016 18:15, Borislav Petkov wrote:
> On Wed, Mar 02, 2016 at 05:55:14PM +0200, Mika Penttilä wrote:
>>> + /* Setup a stack for verify_cpu */
>>> + movq stack_start - __START_KERNEL_map, %rsp
>>> + subq $__START_KERNEL_map, %rsp
>>> +
>> You subtract __START_KERNEL_map twice ?
> Yes. That's not very obvious and it took me a while. I probably should
> add a comment.
>
> Want to stare at it a little bit more and try to figure it out or should
> I explain?
>
> :-)
>

I actually looked at it a while too...

The
movq stack_start - __START_KERNEL_map, %rsp

turns into (objdump disassembly)

mov 0x0,%rsp

with relocation
0000000000000004 R_X86_64_32S stack_start+0x0000000080000000

Now stack_start is at ffffffff81ef3380, so the relocation gives 1ef3380 which would be correct, so why the
second subq ?

You may explain :)

--Mika

2016-03-02 16:55:23

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 06:38:15PM +0200, Mika Penttilä wrote:
> I actually looked at it a while too...
>
> The
> movq stack_start - __START_KERNEL_map, %rsp
>
> turns into (objdump disassembly)
>
> mov 0x0,%rsp
>
> with relocation
> 0000000000000004 R_X86_64_32S stack_start+0x0000000080000000
>
> Now stack_start is at ffffffff81ef3380, so the relocation gives 1ef3380 which would be correct, so why the
> second subq ?
>
> You may explain :)

Here it is :-)

$ readelf -a vmlinux | grep stack_start
70526: ffffffff81cbabf8 0 NOTYPE GLOBAL DEFAULT 14 stack_start

0xffffffff81cbabf8 - __START_KERNEL_map =
0xffffffff81cbabf8 - 0xffffffff80000000 =
0x1cbabf8

(gdb) x/x 0x1cbabf8
0x1cbabf8: 0xffffffff81c03ff8

(You don't need gdb for that - you can hexdump or objdump vmlinux).

Now stack_start is:

GLOBAL(stack_start)
.quad init_thread_union+THREAD_SIZE-8

which is

$ readelf -a vmlinux | grep init_thread_union
82491: ffffffff81c00000 16384 OBJECT GLOBAL DEFAULT 14 init_thread_union

so init_thread_union+THREAD_SIZE-8 = 0xffffffff81c00000 + 4*4096-8 = 0xffffffff81c03ff8

So you have to subtract __START_KERNEL_map again because it has there a
virtual address and we haven't enabled paging yet:

0xffffffff81c03ff8 - 0xffffffff80000000 = 0x1c03ff8.

Makes sense?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 17:44:22

by Mika Penttilä

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack



On 02.03.2016 18:55, Borislav Petkov wrote:
> On Wed, Mar 02, 2016 at 06:38:15PM +0200, Mika Penttilä wrote:
>> I actually looked at it a while too...
>>
>> The
>> movq stack_start - __START_KERNEL_map, %rsp
>>
>> turns into (objdump disassembly)
>>
>> mov 0x0,%rsp
>>
>> with relocation
>> 0000000000000004 R_X86_64_32S stack_start+0x0000000080000000
>>
>> Now stack_start is at ffffffff81ef3380, so the relocation gives 1ef3380 which would be correct, so why the
>> second subq ?
>>
>> You may explain :)
> Here it is :-)
>
> $ readelf -a vmlinux | grep stack_start
> 70526: ffffffff81cbabf8 0 NOTYPE GLOBAL DEFAULT 14 stack_start
>
> 0xffffffff81cbabf8 - __START_KERNEL_map =
> 0xffffffff81cbabf8 - 0xffffffff80000000 =
> 0x1cbabf8
>
> (gdb) x/x 0x1cbabf8
> 0x1cbabf8: 0xffffffff81c03ff8
>
> (You don't need gdb for that - you can hexdump or objdump vmlinux).
>
> Now stack_start is:
>
> GLOBAL(stack_start)
> .quad init_thread_union+THREAD_SIZE-8
>
> which is
>
> $ readelf -a vmlinux | grep init_thread_union
> 82491: ffffffff81c00000 16384 OBJECT GLOBAL DEFAULT 14 init_thread_union
>
> so init_thread_union+THREAD_SIZE-8 = 0xffffffff81c00000 + 4*4096-8 = 0xffffffff81c03ff8
>
> So you have to subtract __START_KERNEL_map again because it has there a
> virtual address and we haven't enabled paging yet:
>
> 0xffffffff81c03ff8 - 0xffffffff80000000 = 0x1c03ff8.
>
> Makes sense?
>
Ah missed completely that stack_start is effectively a pointer to stack..

Thanks,
Mika

2016-03-02 17:54:04

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On March 2, 2016 8:25:30 AM PST, Borislav Petkov <[email protected]> wrote:
>On Wed, Mar 02, 2016 at 11:22:30AM -0500, Brian Gerst wrote:
>> This should be: movq stack_start(%rip), %rsp
>
>No it wouldn't. That doesn't work.
>
>> > + subq $__START_KERNEL_map, %rsp
>>
>> It would be better to add the offset to the initializer for
>> stack_start instead of adjusting it at runtime. That would require
>> moving the existing load of stack_start from the common path to the
>> secondary startup,
>
>This is the BSP we're talking about - no secondary startup. We want to
>run
>verify_cpu as early as possible.

Please explain why we can't use rip-relative addressing in some form...
--
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.

2016-03-02 18:16:05

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 09:53:28AM -0800, H. Peter Anvin wrote:
> Please explain why we can't use rip-relative addressing in some form...

We *can* do almost what Brian suggested:

movq stack_start(%rip), %rsp
subq $__START_KERNEL_map, %rsp

But we still have to subtract __START_KERNEL_map.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 18:25:55

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On March 2, 2016 10:15:56 AM PST, Borislav Petkov <[email protected]> wrote:
>On Wed, Mar 02, 2016 at 09:53:28AM -0800, H. Peter Anvin wrote:
>> Please explain why we can't use rip-relative addressing in some
>form...
>
>We *can* do almost what Brian suggested:
>
> movq stack_start(%rip), %rsp
> subq $__START_KERNEL_map, %rsp
>
>But we still have to subtract __START_KERNEL_map.

Obviously.
--
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.

2016-03-02 18:39:36

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On 03/02/16 10:15, Borislav Petkov wrote:
> On Wed, Mar 02, 2016 at 09:53:28AM -0800, H. Peter Anvin wrote:
>> Please explain why we can't use rip-relative addressing in some form...
>
> We *can* do almost what Brian suggested:
>
> movq stack_start(%rip), %rsp
> subq $__START_KERNEL_map, %rsp
>
> But we still have to subtract __START_KERNEL_map.
>

Well, we definitely should use %rip-relative addressing if we can.

However, even so I believe this breaks if the kernel is loaded anywhere
but its default load address. I think we need to do something like:

movq stack_start(%rip), %rax
leaq __START_KERNEL_map(%rip), %rdx
subq %rdx, %rax
movq %rax, %rsp

The use of temporary registers avoids clobbering a valid stack pointer
for even a single instruction if we are given one.

-hpa


2016-03-02 19:51:00

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 10:39:05AM -0800, H. Peter Anvin wrote:
> Well, we definitely should use %rip-relative addressing if we can.

Right you are.

> However, even so I believe this breaks if the kernel is loaded anywhere
> but its default load address. I think we need to do something like:
>
> movq stack_start(%rip), %rax
> leaq __START_KERNEL_map(%rip), %rdx
> subq %rdx, %rax
> movq %rax, %rsp
>
> The use of temporary registers avoids clobbering a valid stack pointer
> for even a single instruction if we are given one.

Yeah, we should be prudent and make this as sturdy as possible. I did this:

CONFIG_PHYSICAL_START=0x100beef

and it aligned startup_64 up to ffffffff82000000. It seems to boot fine
in kvm. But better safe than sorry.

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 20:46:30

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 08:50:53PM +0100, Borislav Petkov wrote:
> But better safe than sorry.

I got this, it looks good when I'm single-stepping through it with gdb
and it boots fine in kvm. I'll run it on baremetal tomorrow:

/*
* Setup stack for verify_cpu(): make sure we don't clobber a valid
* stack pointer by using temporary registers.
*/
movq stack_start(%rip), %rax
movq $__START_KERNEL_map, %rdx
subq %rdx, %rax
movq %rax, %rsp

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 21:35:42

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On 03/02/16 11:50, Borislav Petkov wrote:
> On Wed, Mar 02, 2016 at 10:39:05AM -0800, H. Peter Anvin wrote:
>> Well, we definitely should use %rip-relative addressing if we can.
>
> Right you are.
>
>> However, even so I believe this breaks if the kernel is loaded anywhere
>> but its default load address. I think we need to do something like:
>>
>> movq stack_start(%rip), %rax
>> leaq __START_KERNEL_map(%rip), %rdx
>> subq %rdx, %rax
>> movq %rax, %rsp
>>
>> The use of temporary registers avoids clobbering a valid stack pointer
>> for even a single instruction if we are given one.
>
> Yeah, we should be prudent and make this as sturdy as possible. I did this:
>
> CONFIG_PHYSICAL_START=0x100beef
>
> and it aligned startup_64 up to ffffffff82000000. It seems to boot fine
> in kvm. But better safe than sorry.
>

You're not actually testing anything as the real issue is what happens
with a relocating bootloader. That's okay; I think we can be pretty
sure the above works by inspection.

-hpa


2016-03-02 21:47:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 01:35:09PM -0800, H. Peter Anvin wrote:
> You're not actually testing anything as the real issue is what happens
> with a relocating bootloader.

Hmm, how would that relocation happen so that va - __START_KERNEL_map
doesn't give pa?

Or do you mean something else with "relocating bootloader"? Do you know
of one which does that?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 21:59:38

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On 03/02/16 13:46, Borislav Petkov wrote:
> On Wed, Mar 02, 2016 at 01:35:09PM -0800, H. Peter Anvin wrote:
>> You're not actually testing anything as the real issue is what happens
>> with a relocating bootloader.
>
> Hmm, how would that relocation happen so that va - __START_KERNEL_map
> doesn't give pa?
>
> Or do you mean something else with "relocating bootloader"? Do you know
> of one which does that?
>

A relocating bootloader is one that doesn't load the kernel at
CONFIG_PHYSICAL_ADDRESS. The EFI stub is one example.

__START_KERNEL_map is not relocated. On x86-64 we do relocation by
pointing the page tables at a different address.

So I really think we need this to be a leaq, so we take a nonstandard
load address into consideration.

-hpa

2016-03-02 22:10:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 01:54:50PM -0800, H. Peter Anvin wrote:
> A relocating bootloader is one that doesn't load the kernel at
> CONFIG_PHYSICAL_ADDRESS. The EFI stub is one example.
>
> __START_KERNEL_map is not relocated. On x86-64 we do relocation by
> pointing the page tables at a different address.
>
> So I really think we need this to be a leaq, so we take a nonstandard
> load address into consideration.

Hmm, but __START_KERNEL_map is a simple macro:

#define __START_KERNEL_map _AC(0xffffffff80000000, UL)

Ok, I think you want to do something like this for stack_start too:

/*
* Compute the delta between the address I am compiled to run at and the
* address I am actually running at.
*/
leaq _text(%rip), %rbp
subq $_text - __START_KERNEL_map, %rbp
...

in the normal case %rbp is 0, of course.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 22:12:21

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On 03/02/16 14:09, Borislav Petkov wrote:
> On Wed, Mar 02, 2016 at 01:54:50PM -0800, H. Peter Anvin wrote:
>> A relocating bootloader is one that doesn't load the kernel at
>> CONFIG_PHYSICAL_ADDRESS. The EFI stub is one example.
>>
>> __START_KERNEL_map is not relocated. On x86-64 we do relocation by
>> pointing the page tables at a different address.
>>
>> So I really think we need this to be a leaq, so we take a nonstandard
>> load address into consideration.
>
> Hmm, but __START_KERNEL_map is a simple macro:
>
> #define __START_KERNEL_map _AC(0xffffffff80000000, UL)

That should not be a problem.
>
> Ok, I think you want to do something like this for stack_start too:
>
> /*
> * Compute the delta between the address I am compiled to run at and the
> * address I am actually running at.
> */
> leaq _text(%rip), %rbp
> subq $_text - __START_KERNEL_map, %rbp
> ...
>
> in the normal case %rbp is 0, of course.
>

Not sure if we need a reference to _text here.

-hpa

2016-03-02 22:28:49

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 02:11:51PM -0800, H. Peter Anvin wrote:
> Not sure if we need a reference to _text here.

Ah, so stack_start is in .ref.data, I guess we can add a __ref_data
marker in vmlinux.lds.S to denote the start of the .ref.data section and
use that for calculating the delta...

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-02 22:33:24

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On March 2, 2016 2:28:42 PM PST, Borislav Petkov <[email protected]> wrote:
>On Wed, Mar 02, 2016 at 02:11:51PM -0800, H. Peter Anvin wrote:
>> Not sure if we need a reference to _text here.
>
>Ah, so stack_start is in .ref.data, I guess we can add a __ref_data
>marker in vmlinux.lds.S to denote the start of the .ref.data section
>and
>use that for calculating the delta...

I'm trying to think of any reason why we couldn't simply have a symbol at the top of the initial stack? Then a simple leaq would suffice; this is for the BSP after all.
--
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.

2016-03-02 22:40:55

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 02:32:54PM -0800, H. Peter Anvin wrote:
> I'm trying to think of any reason why we couldn't simply have a symbol
> at the top of the initial stack? Then a simple leaq would suffice;
> this is for the BSP after all.

That should be simpler. And we do games like that already in the trampoline:

# Setup stack
movl $rm_stack_end, %esp

...

GLOBAL(rm_stack)
.space 2048
GLOBAL(rm_stack_end)

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-03 00:13:05

by Yinghai Lu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 2, 2016 at 2:32 PM, H. Peter Anvin <[email protected]> wrote:
>
> I'm trying to think of any reason why we couldn't simply have a symbol at the top of the initial stack? Then a simple leaq would suffice; this is for the BSP after all.

Why do we need to call verify_cpu in arch/x86/kernel/head_64.S aka the
vmlinux again ?

Is that already called in arch/x86/boot/compressed/head_64.S?

Or Tom is using 64bit bootloader that use vmlinux instead of bzImage?

Thanks

Yinghai

2016-03-03 01:00:30

by Yinghai Lu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 2, 2016 at 4:13 PM, Yinghai Lu <[email protected]> wrote:
> On Wed, Mar 2, 2016 at 2:32 PM, H. Peter Anvin <[email protected]> wrote:
>>
>> I'm trying to think of any reason why we couldn't simply have a symbol at the top of the initial stack? Then a simple leaq would suffice; this is for the BSP after all.
>
> Why do we need to call verify_cpu in arch/x86/kernel/head_64.S aka the
> vmlinux again ?
>
> Is that already called in arch/x86/boot/compressed/head_64.S?

that calling is from startup_32, so may add another calling in startup_64,
so can avoid calling from arch/x86/kernel/head_64.S

>
> Or Tom is using 64bit bootloader that use vmlinux instead of bzImage?

2016-03-03 02:50:24

by Yinghai Lu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 2, 2016 at 5:00 PM, Yinghai Lu <[email protected]> wrote:
> On Wed, Mar 2, 2016 at 4:13 PM, Yinghai Lu <[email protected]> wrote:
>> On Wed, Mar 2, 2016 at 2:32 PM, H. Peter Anvin <[email protected]> wrote:
>>>
>>> I'm trying to think of any reason why we couldn't simply have a symbol at the top of the initial stack? Then a simple leaq would suffice; this is for the BSP after all.
>>
>> Why do we need to call verify_cpu in arch/x86/kernel/head_64.S aka the
>> vmlinux again ?
>>
>> Is that already called in arch/x86/boot/compressed/head_64.S?
>
> that calling is from startup_32, so may add another calling in startup_64,
> so can avoid calling from arch/x86/kernel/head_64.S
>

at the same time, the "call verify_cpu" in
arch/x86/kernel/head_64.s::secondary_startup_64()
is not needed.
As APs already go through
arch/x86/realmode/rm/trampoline_64.S::trampoline_start(), and it
already
call verify_cpu in 16bit mode.

Thanks

Yinghai

2016-03-03 12:28:48

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Wed, Mar 02, 2016 at 02:32:54PM -0800, H. Peter Anvin wrote:
> I'm trying to think of any reason why we couldn't simply have a symbol
> at the top of the initial stack? Then a simple leaq would suffice;
> this is for the BSP after all.

How about something like this:

---
From: Borislav Petkov <[email protected]>
Date: Sun, 28 Feb 2016 21:35:44 +0100
Subject: [PATCH -v2] x86/asm: Make sure verify_cpu() has a good stack
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

04633df0c43d ("x86/cpu: Call verify_cpu() after having entered long mode too")
added the call to verify_cpu() for sanitizing CPU configuration.

The latter uses the stack minimally and it can happen that we land in
startup_64() directly from a 64-bit bootloader. Then we want to use our
own, known good stack.

Do that.

APs don't need this as the trampoline sets up a stack for them.

Reported-by: Tom Lendacky <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Mika Penttilä <[email protected]>
---
arch/x86/kernel/head_64.S | 3 +++
include/asm-generic/vmlinux.lds.h | 4 +++-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 22fbf9df61bb..968d6408b887 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -64,6 +64,9 @@ startup_64:
* tables and then reload them.
*/

+ /* Setup stack for verify_cpu(). */
+ leaq (__end_init_task - 8)(%rip), %rsp
+
/* Sanitize CPU configuration */
call verify_cpu

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 772c784ba763..cba2a26628fc 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -246,7 +246,9 @@

#define INIT_TASK_DATA(align) \
. = ALIGN(align); \
- *(.data..init_task)
+ VMLINUX_SYMBOL(__start_init_task) = .; \
+ *(.data..init_task) \
+ VMLINUX_SYMBOL(__end_init_task) = .;

/*
* Read only Data
--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-03 15:26:43

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On March 3, 2016 4:28:36 AM PST, Borislav Petkov <[email protected]> wrote:
>On Wed, Mar 02, 2016 at 02:32:54PM -0800, H. Peter Anvin wrote:
>> I'm trying to think of any reason why we couldn't simply have a
>symbol
>> at the top of the initial stack? Then a simple leaq would suffice;
>> this is for the BSP after all.
>
>How about something like this:
>
>---
>From: Borislav Petkov <[email protected]>
>Date: Sun, 28 Feb 2016 21:35:44 +0100
>Subject: [PATCH -v2] x86/asm: Make sure verify_cpu() has a good stack
>MIME-Version: 1.0
>Content-Type: text/plain; charset=UTF-8
>Content-Transfer-Encoding: 8bit
>
>04633df0c43d ("x86/cpu: Call verify_cpu() after having entered long
>mode too")
>added the call to verify_cpu() for sanitizing CPU configuration.
>
>The latter uses the stack minimally and it can happen that we land in
>startup_64() directly from a 64-bit bootloader. Then we want to use our
>own, known good stack.
>
>Do that.
>
>APs don't need this as the trampoline sets up a stack for them.
>
>Reported-by: Tom Lendacky <[email protected]>
>Signed-off-by: Borislav Petkov <[email protected]>
>Cc: Brian Gerst <[email protected]>
>Cc: "H. Peter Anvin" <[email protected]>
>Cc: Mika Penttilä <[email protected]>
>---
> arch/x86/kernel/head_64.S | 3 +++
> include/asm-generic/vmlinux.lds.h | 4 +++-
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
>diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
>index 22fbf9df61bb..968d6408b887 100644
>--- a/arch/x86/kernel/head_64.S
>+++ b/arch/x86/kernel/head_64.S
>@@ -64,6 +64,9 @@ startup_64:
> * tables and then reload them.
> */
>
>+ /* Setup stack for verify_cpu(). */
>+ leaq (__end_init_task - 8)(%rip), %rsp
>+
> /* Sanitize CPU configuration */
> call verify_cpu
>
>diff --git a/include/asm-generic/vmlinux.lds.h
>b/include/asm-generic/vmlinux.lds.h
>index 772c784ba763..cba2a26628fc 100644
>--- a/include/asm-generic/vmlinux.lds.h
>+++ b/include/asm-generic/vmlinux.lds.h
>@@ -246,7 +246,9 @@
>
> #define INIT_TASK_DATA(align) \
> . = ALIGN(align); \
>- *(.data..init_task)
>+ VMLINUX_SYMBOL(__start_init_task) = .; \
>+ *(.data..init_task) \
>+ VMLINUX_SYMBOL(__end_init_task) = .;
>
> /*
> * Read only Data

Why -8?
--
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.

2016-03-03 16:30:03

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Thu, Mar 03, 2016 at 07:26:06AM -0800, H. Peter Anvin wrote:
> Why -8?

GLOBAL(stack_start)
.quad init_thread_union+THREAD_SIZE-8
^^^

But I don't see why it needed the -8 then. It came with a conglomerate
dump in 2002:

commit af53c7a2c81399b805b6d4eff887401a5e50feef
Author: Andi Kleen <[email protected]>
Date: Fri Apr 19 20:23:17 2002 -0700

[PATCH] x86-64 architecture specific sync for 2.5.8


- /* Setup the first kernel stack (this instruction is modified by smpboot) */
- .byte 0x48, 0xb8 /* movq *init_rsp,%rax */
-init_rsp:
- .quad init_thread_union+THREAD_SIZE
- movq %rax, %rsp

...

-
- /* SMP bootup changes this */
+ /* SMP bootup changes these two */
.globl initial_code
initial_code:
.quad x86_64_start_kernel
+ .globl init_rsp
+init_rsp:
+ .quad init_thread_union+THREAD_SIZE-8
+
---

But since we decrement first and then copy to stack ptr when we push, I
don't see why we need the -8.

Do you have a better clue?

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-03 20:22:39

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On 03/03/16 08:29, Borislav Petkov wrote:
> On Thu, Mar 03, 2016 at 07:26:06AM -0800, H. Peter Anvin wrote:
>> Why -8?
>
> GLOBAL(stack_start)
> .quad init_thread_union+THREAD_SIZE-8
> ^^^
>
> But I don't see why it needed the -8 then. It came with a conglomerate
> dump in 2002:
>
> commit af53c7a2c81399b805b6d4eff887401a5e50feef
> Author: Andi Kleen <[email protected]>
> Date: Fri Apr 19 20:23:17 2002 -0700
>
> [PATCH] x86-64 architecture specific sync for 2.5.8
>
>
> - /* Setup the first kernel stack (this instruction is modified by smpboot) */
> - .byte 0x48, 0xb8 /* movq *init_rsp,%rax */
> -init_rsp:
> - .quad init_thread_union+THREAD_SIZE
> - movq %rax, %rsp
>
> ...
>
> -
> - /* SMP bootup changes this */
> + /* SMP bootup changes these two */
> .globl initial_code
> initial_code:
> .quad x86_64_start_kernel
> + .globl init_rsp
> +init_rsp:
> + .quad init_thread_union+THREAD_SIZE-8
> +
> ---
>
> But since we decrement first and then copy to stack ptr when we push, I
> don't see why we need the -8.
>
> Do you have a better clue?
>

The only thing I can think of is that the -8 creates a null pointer that
terminates a stack trace.

-hpa

2016-03-03 20:54:47

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Thu, Mar 03, 2016 at 12:22:06PM -0800, H. Peter Anvin wrote:
> The only thing I can think of is that the -8 creates a null pointer that
> terminates a stack trace.

Probably not needed anymore as print_context_stack()->valid_stack_ptr()
in dumpstack.c look at the stack boundaries instead of checking for
NULL...

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-03 21:23:28

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On 03/03/16 12:54, Borislav Petkov wrote:
> On Thu, Mar 03, 2016 at 12:22:06PM -0800, H. Peter Anvin wrote:
>> The only thing I can think of is that the -8 creates a null pointer that
>> terminates a stack trace.
>
> Probably not needed anymore as print_context_stack()->valid_stack_ptr()
> in dumpstack.c look at the stack boundaries instead of checking for
> NULL...
>

Sure, but it could still affect kgdb or what not. That's the only
reason I can see for -8 though.

-hpa

2016-03-03 21:38:23

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Thu, Mar 03, 2016 at 01:22:59PM -0800, H. Peter Anvin wrote:
> reason I can see for -8 though.

... and keeping the -8 won't hurt us or anything. I'll add a small
comment about it.

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

2016-03-04 01:18:36

by Yinghai Lu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Thu, Mar 3, 2016 at 4:28 AM, Borislav Petkov <[email protected]> wrote:
>
> 04633df0c43d ("x86/cpu: Call verify_cpu() after having entered long mode too")
> added the call to verify_cpu() for sanitizing CPU configuration.
>
> The latter uses the stack minimally and it can happen that we land in
> startup_64() directly from a 64-bit bootloader. Then we want to use our
> own, known good stack.
>
> Do that.
>
> APs don't need this as the trampoline sets up a stack for them.

Even more than that. For AP verify_cpu already get called in trampoline.

arch/x86/realmode/rm/trampoline_64.S::trampoline_start().

So you remove verify_cpu calling in secondary_startup_64.

Yinghai

2016-03-04 02:25:46

by Yinghai Lu

[permalink] [raw]
Subject: Re: [RFC PATCH] x86: Make sure verify_cpu has a good stack

On Thu, Mar 3, 2016 at 4:28 AM, Borislav Petkov <[email protected]> wrote:
> From: Borislav Petkov <[email protected]>
> Date: Sun, 28 Feb 2016 21:35:44 +0100
> Subject: [PATCH -v2] x86/asm: Make sure verify_cpu() has a good stack

> 04633df0c43d ("x86/cpu: Call verify_cpu() after having entered long mode too")
> added the call to verify_cpu() for sanitizing CPU configuration.
>
> The latter uses the stack minimally and it can happen that we land in
> startup_64() directly from a 64-bit bootloader. Then we want to use our
> own, known good stack.
>
> Reported-by: Tom Lendacky <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
> Cc: Brian Gerst <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Cc: Mika Penttilä <[email protected]>
> ---
> arch/x86/kernel/head_64.S | 3 +++
> include/asm-generic/vmlinux.lds.h | 4 +++-
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 22fbf9df61bb..968d6408b887 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -64,6 +64,9 @@ startup_64:
> * tables and then reload them.
> */
>
> + /* Setup stack for verify_cpu(). */
> + leaq (__end_init_task - 8)(%rip), %rsp
> +
> /* Sanitize CPU configuration */
> call verify_cpu
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 772c784ba763..cba2a26628fc 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -246,7 +246,9 @@
>
> #define INIT_TASK_DATA(align) \
> . = ALIGN(align); \
> - *(.data..init_task)
> + VMLINUX_SYMBOL(__start_init_task) = .; \
> + *(.data..init_task) \
> + VMLINUX_SYMBOL(__end_init_task) = .;
>
> /*
> * Read only Data
> --
> 2.3.5

I would suggest moving down verify_cpu calling after offset is
calcuated, like following instead of adding __end_init_task.

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 22fbf9d..e8c8085 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -64,9 +64,6 @@ startup_64:
* tables and then reload them.
*/

- /* Sanitize CPU configuration */
- call verify_cpu
-
/*
* Compute the delta between the address I am compiled to run at and the
* address I am actually running at.
@@ -74,6 +71,14 @@ startup_64:
leaq _text(%rip), %rbp
subq $_text - __START_KERNEL_map, %rbp

+ /* Setup stack for verify_cpu() */
+ movq stack_start(%rip), %rsp
+ subq $__START_KERNEL_map, %rsp
+ addq %rbp, %rsp
+
+ /* Sanitize CPU configuration */
+ call verify_cpu
+
/* Is the address not 2M aligned? */
testl $~PMD_PAGE_MASK, %ebp
jnz bad_address