From: Andi Kleen <[email protected]>
With the latest tip x86/pti I get oopses when booting
a 64bit VM in qemu with RETPOLINE/gcc7 and PTI enabled.
The following patch fixes it for me. Something doesn't
seem to work with ALTERNATIVE_2. It adds only a few bytes
more code, so seems acceptable.
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/include/asm/nospec-branch.h | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 8ddf8513550e..eea6340a0abb 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -46,9 +46,8 @@
*/
.macro JMP_NOSPEC reg:req
#ifdef CONFIG_RETPOLINE
- ALTERNATIVE_2 __stringify(jmp *\reg), \
- __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \
- __stringify(lfence; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
+ ALTERNATIVE "", __stringify(lfence; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
+ ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE
#else
jmp *\reg
#endif
--
2.14.3
On Tue, 9 Jan 2018, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> With the latest tip x86/pti I get oopses when booting
> a 64bit VM in qemu with RETPOLINE/gcc7 and PTI enabled.
>
> The following patch fixes it for me. Something doesn't
> seem to work with ALTERNATIVE_2. It adds only a few bytes
> more code, so seems acceptable.
Can you please figure out what goes wrong. David has tested this with his
compiler and I really don't want to 'fix' this just because some random
compiler version fails to build it.
Thanks,
tglx
On Tue, Jan 9, 2018 at 4:31 PM, Andi Kleen <[email protected]> wrote:
>
> The following patch fixes it for me. Something doesn't
> seem to work with ALTERNATIVE_2. It adds only a few bytes
> more code, so seems acceptable.
Ugh. It's kind of stupid, though.
Why is the code sequence not simply:
ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP \reg),
X86_FEATURE_RETPOLINE
ie make that X86_FEATURE_RETPOLINE_AMD _only_ emit the "lfence", and
simply fall through to what will be the "jmp *\reg" of the
non-RETPOLINE version.
Then just make sure X86_FEATURE_RETPOLINE_AMD disables X86_FEATURE_RETPOLINE.
That is both simpler and smaller, no?
Linus
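For reference, here is roughly how JMP_NOSPEC would read with the layout Linus suggests. This is a sketch of the proposal only, not the patch that was eventually applied, and it assumes the feature-bit handling discussed below, i.e. that an AMD-only configuration ends up with just the lfence in front of the plain indirect jmp:

.macro JMP_NOSPEC reg:req
#ifdef CONFIG_RETPOLINE
        /* AMD: patch in only the lfence, then fall through to the jmp below */
        ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
        /* generic: replace the plain indirect jmp with the retpoline thunk */
        ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE
#else
        jmp *\reg
#endif
.endm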
On Tue, 9 Jan 2018, Linus Torvalds wrote:
> On Tue, Jan 9, 2018 at 4:31 PM, Andi Kleen <[email protected]> wrote:
> >
> > The following patch fixes it for me. Something doesn't
> > seem to work with ALTERNATIVE_2. It adds only a few bytes
> > more code, so seems acceptable.
>
> Ugh. It's kind of stupid, though.
>
> Why is the code sequence not simply:
>
> ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
> ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP \reg),
> X86_FEATURE_RETPOLINE
>
> ie make that X86_FEATURE_RETPOLINE_AMD _only_ emit the "lfence", and
> simply fall through to what will be the "jmp *\reg" of the
> non-RETPOLINE version.
>
> Then just make sure X86_FEATURE_RETPOLINE_AMD disables X86_FEATURE_RETPOLINE.
>
> That is both simpler and smaller, no?
Duh, yes.
On 1/9/2018 6:40 PM, Thomas Gleixner wrote:
> On Tue, 9 Jan 2018, Linus Torvalds wrote:
>
>> On Tue, Jan 9, 2018 at 4:31 PM, Andi Kleen <[email protected]> wrote:
>>>
>>> The following patch fixes it for me. Something doesn't
>>> seem to work with ALTERNATIVE_2. It adds only a few bytes
>>> more code, so seems acceptable.
>>
>> Ugh. It's kind of stupid, though.
>>
>> Why is the code sequence not simply:
>>
>> ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
>> ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP \reg),
>> X86_FEATURE_RETPOLINE
>>
>> ie make that X86_FEATURE_RETPOLINE_AMD _only_ emit the "lfence", and
>> simply fall through to what will be the "jmp *\reg" of the
>> non-RETPOLINE version.
>>
>> Then just make sure X86_FEATURE_RETPOLINE_AMD disables X86_FEATURE_RETPOLINE.
I think there are areas that rely on X86_FEATURE_RETPOLINE being set
even if X86_FEATURE_RETPOLINE_AMD is set. For example, line 261 in
arch/x86/entry/entry_32.S is only checking for X86_FEATURE_RETPOLINE.
Thanks,
Tom
>>
>> That is both simpler and smaller, no?
>
> Duh, yes.
>
On Tue, 9 Jan 2018, Tom Lendacky wrote:
> On 1/9/2018 6:40 PM, Thomas Gleixner wrote:
> > On Tue, 9 Jan 2018, Linus Torvalds wrote:
> >
> >> On Tue, Jan 9, 2018 at 4:31 PM, Andi Kleen <[email protected]> wrote:
> >>>
> >>> The following patch fixes it for me. Something doesn't
> >>> seem to work with ALTERNATIVE_2. It adds only a few bytes
> >>> more code, so seems acceptable.
> >>
> >> Ugh. It's kind of stupid, though.
> >>
> >> Why is the code sequence not simply:
> >>
> >> ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
> >> ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP \reg),
> >> X86_FEATURE_RETPOLINE
> >>
> >> ie make that X86_FEATURE_RETPOLINE_AMD _only_ emit the "lfence", and
> >> simply fall through to what will be the "jmp *\reg" of the
> >> non-RETPOLINE version.
> >>
> >> Then just make sure X86_FEATURE_RETPOLINE_AMD disables X86_FEATURE_RETPOLINE.
>
> I think there are areas that rely on X86_FEATURE_RETPOLINE being set
> even if X86_FEATURE_RETPOLINE_AMD is set. For example, line 261 in
> arch/x86/entry/entry_32.S is only checking for X86_FEATURE_RETPOLINE.
That should be fixable.
Thanks,
tglx
> Then just make sure X86_FEATURE_RETPOLINE_AMD disables X86_FEATURE_RETPOLINE.
>
> That is both simpler and smaller, no?
Yes that works, and is clearly better/simpler.
Tested-by: Andi Kleen <[email protected]>
Thomas, I assume you will fix it up, or let me know if I should
send another patch.
-Andi
On Tue, 9 Jan 2018, Andi Kleen wrote:
> > Then just make sure X86_FEATURE_RETPOLINE_AMD disables X86_FEATURE_RETPOLINE.
> >
> > That is both simpler and smaller, no?
>
> Yes that works, and is clearly better/simpler.
>
> Tested-by: Andi Kleen <[email protected]>
>
> Thomas, I assume you will fix it up, or let me know if I should
> send another patch.
I'd appreciate a patch as I'm almost falling asleep. Can you please have a
look at the usage sites of the feature bits as well?
Thanks,
tglx
On Tue, Jan 09, 2018 at 06:45:34PM -0600, Tom Lendacky wrote:
> On 1/9/2018 6:40 PM, Thomas Gleixner wrote:
> > On Tue, 9 Jan 2018, Linus Torvalds wrote:
> >
> >> On Tue, Jan 9, 2018 at 4:31 PM, Andi Kleen <[email protected]> wrote:
> >>>
> >>> The following patch fixes it for me. Something doesn't
> >>> seem to work with ALTERNATIVE_2. It adds only a few bytes
> >>> more code, so seems acceptable.
> >>
> >> Ugh. It's kind of stupid, though.
> >>
> >> Why is the code sequence not simply:
> >>
> >> ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
> >> ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP \reg),
> >> X86_FEATURE_RETPOLINE
> >>
> >> ie make that X86_FEATURE_RETPOLINE_AMD _only_ emit the "lfence", and
> >> simply fall through to what will be the "jmp *\reg" of the
> >> non-RETPOLINE version.
> >>
> >> Then just make sure X86_FEATURE_RETPOLINE_AMD disables X86_FEATURE_RETPOLINE.
>
> I think there are areas that rely on X86_FEATURE_RETPOLINE being set
> even if X86_FEATURE_RETPOLINE_AMD is set. For example, line 261 in
> arch/x86/entry/entry_32.S is only checking for X86_FEATURE_RETPOLINE.
I audited the different places. They all seem ok.
I assume you don't need FILL_RETURN_BUFFER on AMD. If that's not the case,
let me know and we can add an X86_FEATURE_RETPOLINE_COMMON.
-Andi
> > I think there are areas that rely on X86_FEATURE_RETPOLINE being set
> > even if X86_FEATURE_RETPOLINE_AMD is set. For example, line 261 in
> > arch/x86/entry/entry_32.S is only checking for X86_FEATURE_RETPOLINE.
>
> I audited the different places. They all seem ok.
Actually 32bit is not ok. For 32bit we need COMMON. So adding it.
-Andi
On Tue, 9 Jan 2018, Andi Kleen wrote:
> > > I think there are areas that rely on X86_FEATURE_RETPOLINE being set
> > > even if X86_FEATURE_RETPOLINE_AMD is set. For example, line 261 in
> > > arch/x86/entry/entry_32.S is only checking for X86_FEATURE_RETPOLINE.
> >
> > I audited the different places. They all seem ok.
>
> Actually 32bit is not ok. For 32bit we need COMMON. So adding it.
Please keep the current X86_FEATURE_RETPOLINE as the common one and add a
new X86_FEATURE_RETPOLINE_GENERIC which is selected by the generic/intel
side. Little more churn, but clearer.
Thanks,
tglx
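As a sketch of what that split could look like in cpufeatures.h (the bit positions below are placeholders and X86_FEATURE_RETPOLINE_GENERIC is only the name proposed here, not necessarily what was merged):

/* placeholder word/bit numbers, for illustration only */
#define X86_FEATURE_RETPOLINE           ( 7*32+12) /* common: some retpoline mitigation active */
#define X86_FEATURE_RETPOLINE_GENERIC   ( 7*32+13) /* generic/Intel retpoline thunk */
#define X86_FEATURE_RETPOLINE_AMD       ( 7*32+14) /* AMD lfence-based variant */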
On Wed, Jan 10, 2018 at 02:50:12AM +0100, Thomas Gleixner wrote:
> On Tue, 9 Jan 2018, Andi Kleen wrote:
>
> > > > I think there are areas that rely on X86_FEATURE_RETPOLINE being set
> > > > even if X86_FEATURE_RETPOLINE_AMD is set. For example, line 261 in
> > > > arch/x86/entry/entry_32.S is only checking for X86_FEATURE_RETPOLINE.
> > >
> > > I audited the different places. They all seem ok.
> >
> > Actually 32bit is not ok. For 32bit we need COMMON. So adding it.
>
> Please keep the current X86_FEATURE_RETPOLINE as the common one and add a
> new X86_FEATURE_RETPOLINE_GENERIC which is selected by the generic/intel
> side. Little more churn, but clearer.
Ok, I just sent v2 without that. Will do a v3.
-Andi
On Tue, 2018-01-09 at 16:39 -0800, Linus Torvalds wrote:
> On Tue, Jan 9, 2018 at 4:31 PM, Andi Kleen <[email protected]>
> wrote:
> >
> >
> > The following patch fixes it for me. Something doesn't
> > seem to work with ALTERNATIVE_2. It adds only a few bytes
> > more code, so seems acceptable.
> Ugh. It's kind of stupid, though.
>
> Why is the code sequence not simply:
>
> ALTERNATIVE "", "lfence", X86_FEATURE_RETPOLINE_AMD
> ALTERNATIVE __stringify(jmp *\reg), __stringify(RETPOLINE_JMP
> \reg),
> X86_FEATURE_RETPOLINE
>
> ie make that X86_FEATURE_RETPOLINE_AMD _only_ emit the "lfence", and
> simply fall through to what will be the "jmp *\reg" of the
> non-RETPOLINE version.
>
> Then just make sure X86_FEATURE_RETPOLINE_AMD disables
> X86_FEATURE_RETPOLINE.
>
> That is both simpler and smaller, no?
Not smaller, as the lfence *could* have gone in all the space left by
turning the whole retpoline thing into a single 'jmp'. But I'll
certainly give you simpler. I'll retest and merge Andi's latest patches
for that; thanks.
I'd really like to know what went wrong though. Did we merge Borislav's
attempt to peek at jumps inside alternatives, perchance? Will take a
look...
On Tue, 2018-01-09 at 17:30 -0800, Andi Kleen wrote:
> I assume you don't need FILL_RETURN_BUFFER on AMD. If that's not the case,
> let me know and we can add an X86_FEATURE_RETPOLINE_COMMON.
FWIW the AMD doc I have here (Tom, is that public now?) does say we
should fill the RSB. That's a minor tweak s/GENERIC/COMMON/ in your
latest patch set I think, so I'll fix that up.
Thanks.
On Wed, 2018-01-10 at 07:15 +0000, David Woodhouse wrote:
> I'd really like to know what went wrong though. Did we merge Borislav's
> attempt to peek at jumps inside alternatives, perchance? Will take a
> look...
Ah, it only happens if I run in KVM, not with Qemu's CPU; that's why it
didn't show up in my testing. And it seems to have been introduced by
the addition of '.align 16' at the start of the RETPOLINE_JMP macro. So
this fixes it:
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -15,7 +15,6 @@
* invocation below less ugly.
*/
.macro RETPOLINE_JMP reg:req
- .align 16
call .Ldo_rop_\@
.Lspec_trap_\@:
pause
Borislav? Want to take a look at that please?
I'm going to go ahead and apply Andi's patches anyway. Once I admit
that I shoved the .align in there to precisely match up with Paul's
sequence, and since Linus got me to inline the thing at all the call
sites instead of using the out-of-line thunk, I don't think an
objection of "but it adds some extra bytes for the separate lfence" is
going to work... :)
But actually, I think I may also rip out the .align there. Because
inside alternatives it doesn't make a lot of sense. It may have aligned
the code by padding with NOPs in its *original* location, but once it
gets copied into the place it's executed from, the alignment could be
different.
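A minimal sketch of why that is (illustrative only, not the kernel's actual apply_alternatives() code): the replacement bytes are copied over the original instruction as-is, so whatever alignment they had at their source address is lost at the destination.

#include <linux/string.h>
#include <linux/types.h>

static void patch_site(u8 *orig_insn, const u8 *replacement, unsigned int len)
{
        /*
         * orig_insn sits at whatever alignment the call site happens to
         * have; the copy cannot re-align it, so NOP padding emitted for
         * ".align" in the altinstr section buys nothing here.
         */
        memcpy(orig_insn, replacement, len);
}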
From: Borislav Petkov <[email protected]>
Date: Wed, 10 Jan 2018 12:14:07 +0100
We check only the first byte whether it is a NOP but if David Woodhouse
wants to do some crazy experiments with slapping NOPs in front of the
payload and having the actual instructions after it, this "optimized"
test breaks. :-)
Make sure we scan all bytes before we decide to optimize the NOPs in
there.
Reported-by: David Woodhouse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: [email protected]
---
arch/x86/kernel/alternative.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 3344d3382e91..78932b283915 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -344,9 +344,11 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
{
unsigned long flags;
+ int i;
- if (instr[0] != 0x90)
- return;
+ for (i = 0; i < a->padlen; i++)
+ if (instr[i] != 0x90)
+ return;
local_irq_save(flags);
add_nops(instr + (a->instrlen - a->padlen), a->padlen);
--
2.13.0
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Wed, 2018-01-10 at 12:28 +0100, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
> Date: Wed, 10 Jan 2018 12:14:07 +0100
>
> We check only the first byte whether it is a NOP but if David Woodhouse
> wants to do some crazy experiments with slapping NOPs in front of the
> payload and having the actual instructions after it, this "optimized"
> test breaks. :-)
:)
Thank you.
Tested-by: David Woodhouse <[email protected]>
That fixed and understood, I shall remove the offending NOPs anyway,
because aligning instructions in the *altinstr* section is entirely
pointless as they might still not be aligned when they get copied into
place anyway.
On Wed, Jan 10, 2018 at 11:36:41AM +0000, David Woodhouse wrote:
> That fixed and understood, I shall remove the offending NOPs anyway,
> because aligning instructions in the *altinstr* section is entirely
> pointless as they might still not be aligned when they get copied into
> place anyway.
Yap. It was still a good experiment because we found this shortcoming!
:-)
Thx.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Wed, 2018-01-10 at 12:45 +0100, Borislav Petkov wrote:
> On Wed, Jan 10, 2018 at 11:36:41AM +0000, David Woodhouse wrote:
> >
> > That fixed and understood, I shall remove the offending NOPs anyway,
> > because aligning instructions in the *altinstr* section is entirely
> > pointless as they might still not be aligned when they get copied into
> > place anyway.
>
> Yap. It was still a good experiment because we found this shortcoming!
Don't suppose you want to make the alignment actually *work*? :)
Paul, how much of a win was it really? And not just in a microbenchmark
of the retpoline itself, saying "look Ma! This goes faster if I make it
take three times as many icache lines!", but overall system benchmarks?
:)
On Wed, Jan 10, 2018 at 11:49:55AM +0000, David Woodhouse wrote:
> Don't suppose you want to make the alignment actually *work*? :)
I can try but only if it is really worth it. If we don't see it in
measurements, then it is wasted time and energy and added needless code
complexity.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Wed, 2018-01-10 at 12:57 +0100, Borislav Petkov wrote:
> On Wed, Jan 10, 2018 at 11:49:55AM +0000, David Woodhouse wrote:
> > Don't suppose you want to make the alignment actually *work*? :)
>
> I can try but only if it is really worth it. If we don't see it in
> measurements, then it is wasted time and energy and added needless
> code complexity.
Agreed. Hence the question to Paul.
Commit-ID: 612e8e9350fd19cae6900cf36ea0c6892d1a0dca
Gitweb: https://git.kernel.org/tip/612e8e9350fd19cae6900cf36ea0c6892d1a0dca
Author: Borislav Petkov <[email protected]>
AuthorDate: Wed, 10 Jan 2018 12:28:16 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Wed, 10 Jan 2018 19:36:22 +0100
x86/alternatives: Fix optimize_nops() checking
The alternatives code only checks whether the first byte is a NOP, but
NOPs in front of the payload, with the actual instructions after them,
break the "optimized" test.
Make sure to scan all bytes before deciding to optimize the NOPs in there.
Reported-by: David Woodhouse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Paul Turner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/alternative.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 3344d33..e0b97e4 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -344,9 +344,12 @@ done:
static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
{
unsigned long flags;
+ int i;
- if (instr[0] != 0x90)
- return;
+ for (i = 0; i < a->padlen; i++) {
+ if (instr[i] != 0x90)
+ return;
+ }
local_irq_save(flags);
add_nops(instr + (a->instrlen - a->padlen), a->padlen);
On Wed, Jan 10, 2018 at 3:28 AM, Borislav Petkov <[email protected]> wrote:
>
> Make sure we scan all bytes before we decide to optimize the NOPs in
> there.
Can we also add compile-time checking (presumably in objtool, but who
knows) that there are no relocations in the alternative section?
Because that was the other "oops, this really doesn't work with
altinstructions" issue, wasn't it?
Linus
On Wed, 10 Jan 2018, Linus Torvalds wrote:
> On Wed, Jan 10, 2018 at 3:28 AM, Borislav Petkov <[email protected]> wrote:
> >
> > Make sure we scan all bytes before we decide to optimize the NOPs in
> > there.
>
> Can we also add compile-time checking (presumably in objtool, but who
> knows) that there are no relocations in the alternative section?
Cc'ing the overlor^Haded objtool wizard
> Because that was the other "oops, this really doesn't work with
> altinstructions" issue, wasn't it?
Yes.
Thanks,
tglx
On Wed, Jan 10, 2018 at 11:38:36AM -0800, Linus Torvalds wrote:
> Can we also add compile-time checking (presumably in objtool, but who
> knows) that there are no relocations in the alternative section?
Well, so we do fail the build if the jmp's offset doesn't fit in s32:
Makefile:996: recipe for target 'vmlinux' failed
arch/x86/kernel/cpu/amd.o:(.altinstr_replacement+0x4): relocation truncated to fit: R_X86_64_32 against `.altinstr_replacement'
make: *** [vmlinux] Error 1
I did this:
alternative("", "xor %%rdi, %%rdi; .byte 0xe9; .long 2f; 2: jmp startup_64", X86_FEATURE_K8);
And I was thinking whether it would make sense to beef up
recompute_jump() to be smart enough to massage the JMPs into jumping to
the right place.
Because in the example above, I'm basically jumping to the label 2: but
since the .altinstr_replacement is so far away, it fails the build due
to the reloc.
Even though it shouldn't have because when we patch, that label is
immediately following the JMP. And probably it wouldn't need a reloc
either. Whether that sequence makes sense is another story...
Anyway, I asked a gcc person and we'll see what we could do.
Thx.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Wed, Jan 10, 2018 at 08:55:40PM +0100, Thomas Gleixner wrote:
> On Wed, 10 Jan 2018, Linus Torvalds wrote:
>
> > On Wed, Jan 10, 2018 at 3:28 AM, Borislav Petkov <[email protected]> wrote:
> > >
> > > Make sure we scan all bytes before we decide to optimize the NOPs in
> > > there.
> >
> > Can we also add compile-time checking (presumably in objtool, but who
> > knows) that there are no relocations in the alternative section?
>
> Cc'ing the overlor^Haded objtool wizard
>
> > Because that was the other "oops, this really doesn't work with
> > altinstructions" issue, wasn't it?
I think .altinstruction relocations *do* work if they're for the first
instruction, and it's a jump or a call. There's some alternatives code
which adjusts the jump/call offset in that case, and there are some
users of alternatives who rely on that.
I think Boris had a patch floating around to add an instruction decoder
to alternatives, so you can do a call/jmp anywhere.
--
Josh
On Wed, 2018-01-10 at 14:15 -0600, Josh Poimboeuf wrote:
> On Wed, Jan 10, 2018 at 08:55:40PM +0100, Thomas Gleixner wrote:
> > On Wed, 10 Jan 2018, Linus Torvalds wrote:
> >
> > > On Wed, Jan 10, 2018 at 3:28 AM, Borislav Petkov <[email protected]> wrote:
> > > >
> > > > Make sure we scan all bytes before we decide to optimize the NOPs in
> > > > there.
> > >
> > > Can we also add compile-time checking (presumably in objtool, but who
> > > knows) that there are no relocations in the alternative section?
> >
> > Cc'ing the overlor^Haded objtool wizard
> >
> > > Because that was the other "oops, this really doesn't work with
> > > altinstructions" issue, wasn't it?
>
> I think .altinstruction relocations *do* work if they're for the first
> instruction, and it's a jump or a call. There's some alternatives code
> which adjusts the jump/call offset in that case, and there are some
> users of alternatives who rely on that.
>
> I think Boris had a patch floating around to add an instruction decoder
> to alternatives, so you can do a call/jmp anywhere.
Strictly speaking, it's not about the ELF relocs. Those get processed
in advance, and the altinstructions already have them applied.
What Borislav had was a patch to process the actual branch
instructions, then frob their targets by {&altinstr - &oldinstr} to
make them work again... except it only worked by chance for targets
*within* the altinstr.
On the whole I think we're better off not touching that right now for
fear that it will introduce new bugs. But yes, it *should* be fixed....
just not this week please ;)
On Wed, Jan 10, 2018 at 12:05 PM, Borislav Petkov <[email protected]> wrote:
>
> I did this:
>
> alternative("", "xor %%rdi, %%rdi; .byte 0xe9; .long 2f; 2: jmp startup_64", X86_FEATURE_K8);
No, that's not valid. That could never work anyway. The ".long 2f"
would be the absolute address in the alternative section, but opcode
E9 takes a relative 32-bit offset.
So the error you get isn't because relocations wouldn't work in
alternatives, it's because you tried to fit an absolute 64-bit value
in a 32-bit relocation, and it was wrong _regardless_ of which section
you were in.
Linus
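For illustration only (not code from the thread): a hand-encoded near jump has to carry the displacement from the end of the instruction to the target, rather than the target's address:

        .byte 0xe9                      /* jmp rel32 opcode */
        .long startup_64 - (. + 4)      /* PC-relative displacement to the target */
        /*
         * Even in this form, placing it in .altinstr_replacement would
         * resolve the displacement for the replacement's link-time
         * address, not for the site it is later copied to, which is the
         * trap discussed in the rest of the thread.
         */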
On Wed, Jan 10, 2018 at 12:15 PM, Josh Poimboeuf <[email protected]> wrote:
>
> I think .altinstruction relocations *do* work if they're for the first
> instruction, and it's a jump or a call.
Yes - for the alternative that is in-line - not in the "altinstruction" section.
Because then the alternative is in the right spot at link-time already.
But the "altinstruction" section definitely should not have
relocations. I guess you could hack them up by hand by explicitly
trying to take the difference between the non-altinstruction and the
altinstruction into account, but it would be error-prone and fragile
as hell.
> I think Boris had a patch floating around to add an instruction decoder
> to alternatives, so you can do a call/jmp anywhere.
.. and no, we're not doing that. Christ.
People, we need to try to be *robust* here. That's doubly (triply!)
true of things like altinstructions where people - very much by design
- won't even *test* the alternatives very much, because very much by
design the altinstructions are only used on certain architectures or
in certain situations.
And we almost certainly don't actually _need_ relocations. But we
need to protect against the "oops, I didn't realize" issue, exactly
because testing won't actually catch the odd cases.
Because we don't want to be in the situation where some random poor
user hits it because they have an old CPU that no developer has, and
then the relocation will basically do completely random things.
Imagine just how crazy that would be to debug. You'd be basically
executing insane code, and looking at the sources - or even the
binaries - it would _look_ completely sane.
Linus
On Wed, 2018-01-10 at 21:33 +0100, Peter Zijlstra wrote:
> On Wed, Jan 10, 2018 at 12:26:25PM -0800, Linus Torvalds wrote:
> > Imagine just how crazy that would be to debug. You'd be basically
> > executing insane code, and looking at the sources - or even the
> > binaries - it would _look_ completely sane.
>
> Been there done that.. we have too much self modifying code for that
> not to have been needed.
>
> Use gdb on /proc/kcore and disassemble self to see the _real_ code.
Or qemu -d in_asm to see what the code *was* at the time the CPU hit
it... :)
On Wed, Jan 10, 2018 at 12:33 PM, Peter Zijlstra <[email protected]> wrote:
>
> Use gdb on /proc/kcore and disassemble self to see the _real_ code.
Yes, because that works so well when you get these random reports of
crazy behavior that simply doesn't ever trigger on any machine you
have.
> But yes, tricky stuff. Not arguing we need relocations in alternatives,
> just saying debugging them (and static keys and others) is great fun.
Yes. It's definitely a good challenge, and can be _very_ satisfying
when you then finally figure it out.
I'm not sure it's worth the pain before that point, though.
So I think we should strive to make our alternatives code just catch
some of the more subtle errors early instead.
Linus
On Wed, Jan 10, 2018 at 12:20:40PM -0800, Linus Torvalds wrote:
> No, that's not valid. That could never work anyway. The ".long 2f"
> would be the absolute address in the alternative section, but opcode
> E9 takes a relative 32-bit offset.
Ah, right, doh. I remember now. We used to do those jmps by computing the
relative offset:
From: 090a3f615524c3f75d09fdb37f15ea1868d79f7e
- .section .altinstr_replacement,"ax"
-1: .byte 0xeb /* jmp <disp8> */
- .byte (copy_page_rep - copy_page) - (2f - 1b) /* offset */
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Wed, Jan 10, 2018 at 12:26:25PM -0800, Linus Torvalds wrote:
> On Wed, Jan 10, 2018 at 12:15 PM, Josh Poimboeuf <[email protected]> wrote:
> >
> > I think .altinstruction relocations *do* work if they're for the first
> > instruction, and it's a jump or a call.
>
> Yes - for the alternative that is in-line - not in the "altinstruction" section.
>
> Because then the alternative is in the right spot at link-time already.
>
> But the "altinstruction" section definitely should not have
> relocations.
I misspoke, it's really .altinstr_replacement which has the replacement
instructions. And it has a bunch of relocations:
Relocation section [ 8] '.rela.altinstr_replacement' for section [ 7] '.altinstr_replacement' at offset 0x14439710 contains 355 entries:
> I guess you could hack them up by hand by explicitly
> trying to take the difference between the non-altinstruction and the
> altinstruction into account, but it would be error-prone and fragile
> as hell.
apply_alternatives() already does that today. It actually seems pretty
solid, except for the whole "only works on the first instruction" thing.
> > I think Boris had a patch floating around to add an instruction decoder
> > to alternatives, so you can do a call/jmp anywhere.
>
> .. and no, we're not doing that. Christ.
>
> People, we need to try to be *robust* here. That's doubly (triply!)
> true of things like altinstructions where people - very much by design
> - won't even *test* the alternatives very much, because very much by
> design the altinstructions are only used on certain architectures or
> in certain situations.
>
> And we almost certainly don't actuially _need_ relocations. But we
> need to protect against the "oops, I didn't realize" issue, exactly
> because testing won't actually catch the odd cases.
If we need objtool to detect them, it's certainly possible. But maybe I
missed the previous discussion -- what's the, um, alternative to
relocations, when we have calls and jumps being patched in?
> Because we don't want to be in the situation where some random poor
> user hits it because they have an old CPU that no developer has, and
> then the relocation will basically do completely random things.
>
> Imagine just how crazy that would be to debug. You'd be basically
> executing insane code, and looking at the sources - or even the
> binaries - it would _look_ completely sane.
Well, I think we already made that deal with the devil when we added
alternatives/paravirt/smp_locks/jump_labels/kprobes/ftrace/bpf, etc.
--
Josh
On Wed, Jan 10, 2018 at 12:26:25PM -0800, Linus Torvalds wrote:
> > I think Boris had a patch floating around to add an instruction decoder
> > to alternatives, so you can do a call/jmp anywhere.
>
> .. and no, we're not doing that. Christ.
>
> People, we need to try to be *robust* here. That's doubly (triply!)
> true of things like altinstructions where people - very much by design
> - won't even *test* the alternatives very much, because very much by
> design the altinstructions are only used on certain architectures or
> in certain situations.
Ok, so the problem was: how to fixup jumps which are not the first
instruction which is being replaced but a following one in the
instruction bytes with which we replace.
I used the insn decoder to get insn boundaries so that I can know
whether bytes 0xeb or 0xe9 are the actual JMP opcode and not some other
bytes from the stream.
So how do you suggest I do that without the decoder? I still need some
sort of parsing to find out where the boundaries are...
Thx.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
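For reference, the kind of boundary scan being described here, using the kernel's x86 instruction decoder. The decoder calls (struct insn, insn_init(), insn_get_length()) are the real in-kernel API; the loop around them is an illustrative sketch, not Borislav's actual patch:

#include <asm/insn.h>
#include <linux/types.h>

static int count_replacement_jmps(u8 *repl, unsigned int len)
{
        struct insn insn;
        u8 *p = repl;
        int njmps = 0;

        while (p < repl + len) {
                /* decode exactly one instruction starting at p */
                insn_init(&insn, p, repl + len - p, 1 /* 64-bit mode */);
                insn_get_length(&insn);
                if (!insn.length)
                        break;          /* undecodable bytes, give up */

                /* only now is a 0xeb/0xe9 byte known to start a real JMP */
                if (insn.opcode.bytes[0] == 0xeb || insn.opcode.bytes[0] == 0xe9)
                        njmps++;        /* its displacement would need fixing up */

                p += insn.length;
        }
        return njmps;
}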
On Wed, Jan 10, 2018 at 12:55 PM, Borislav Petkov <[email protected]> wrote:
>
> Ok, so the problem was: how to fixup jumps which are not the first
> instruction which is being replaced but a following one in the
> instruction bytes with which we replace.
What jumps do you have that need to be fixed up?
I really think we should avoid having things like that.
Any jumps *within* the alternatives should have been handled by the
assembler already.
And jumps between the alternatives and other places? Why do they exist?
Linus
On Wed, 2018-01-10 at 13:05 -0800, Linus Torvalds wrote:
> On Wed, Jan 10, 2018 at 12:55 PM, Borislav Petkov <[email protected]>
> wrote:
> >
> > Ok, so the problem was: how to fixup jumps which are not the first
> > instruction which is being replaced but a following one in the
> > instruction bytes with which we replace.
>
> What jumps do you have that need to be fixed up?
>
> I really think we should avoid having things like that.
>
> Any jumps *within* the alternatives should have been handled by the
> assembler already.
>
> And jumps between the alternatives and other places? Why do they
> exist?
There are a few of the form 'call *somefunc'.
The existing code handles them not by virtue of the relocs, as I said,
but by a simple delta of the old and new location of the instruction.
But it only does so for the *first* instruction of the altinstr, if it
happens to be a (4-byte?) branch.
Right now for retpoline I am just studiously avoiding doing anything
that the alternatives mechanism isn't going to get right, or might
change in future. I think ;)
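For context, that existing first-instruction fix-up boils down to re-basing the branch's relative displacement by the distance between the replacement's home and the patch site, along these lines (a simplified sketch, not the exact apply_alternatives() code, which stages the bytes in a buffer first):

#include <linux/string.h>
#include <linux/types.h>

static void patch_with_branch_fixup(u8 *site, u8 *repl, unsigned int len)
{
        memcpy(site, repl, len);

        /*
         * A rel32 CALL (0xe8) in the replacement was resolved against the
         * replacement's own address; executed from 'site' its displacement
         * must be shifted by the distance between the two locations. Only
         * the first instruction is considered, which is the limitation
         * being discussed.
         */
        if (len >= 5 && site[0] == 0xe8)
                *(s32 *)(site + 1) += repl - site;
}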
On Wed, Jan 10, 2018 at 01:05:08PM -0800, Linus Torvalds wrote:
> Any jumps *within* the alternatives should have been handled by the
> assembler already.
>
> And jumps between the alternatives and other places? Why do they exist?
Yap, I say so too, the alternatives handle everything just fine but then
Woodhouse appeared... :-)
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
On Wed, Jan 10, 2018 at 1:08 PM, David Woodhouse <[email protected]> wrote:
>
> There are a few of the form 'call *somefunc'.
>
> The existing code handles them not by virtue of the relocs, as I said,
> but by a simple delta of the old and new location of the instruction.
>
> But it only does so for the *first* instruction of the altinstr, if it
> happens to be a (4-byte?) branch.
Ugh. That's nasty.
Wouldn't it be much better to simply do it as part of relocation instead?
Linus
On Wed, Jan 10, 2018 at 01:11:48PM -0800, Linus Torvalds wrote:
> On Wed, Jan 10, 2018 at 1:08 PM, David Woodhouse <[email protected]> wrote:
> >
> > There are a few of the form 'call *somefunc'.
> >
> > The existing code handles them not by virtue of the relocs, as I said,
> > but by a simple delta of the old and new location of the instruction.
> >
> > But it only does so for the *first* instruction of the altinstr, if it
> > happens to be a (4-byte?) branch.
>
> Ugh. That's nasty.
>
> Wouldn't it be much better to simply do it as part of relocation instead?
Not possible, the relocations are applied during vmlinux linking.
--
Josh
On Wed, Jan 10, 2018 at 1:11 PM, Linus Torvalds
<[email protected]> wrote:
>
> Wouldn't it be much better to simply do it as part of relocation instead?
.. except we only do real relocation for modules, and depend on the
linker doing everything for us (handle_relocations() at load-time) I
think.
So it's somewhat more involved surgery. Which explains the hack.
Nasty.
Linus
On Wed, Jan 10, 2018 at 01:17:45PM -0800, Linus Torvalds wrote:
> On Wed, Jan 10, 2018 at 1:11 PM, Linus Torvalds
> <[email protected]> wrote:
> >
> > Wouldn't it be much better to simply do it as part of relocation instead?
>
> .. except we only do real relocation for modules, and depend on the
> linker doing everything for us (handle_relocations() at load-time) I
> think.
>
> So it's somewhat more involved surgery. Which explains the hack.
>
> Nasty.
Right. With KASLR, the relocations seem to be resolved by
handle_relocations(), but without KASLR, they're resolved in vmlinux
linking.
--
Josh
On Wed, Jan 10, 2018 at 12:26:25PM -0800, Linus Torvalds wrote:
> Imagine just how crazy that would be to debug. You'd be basically
> executing insane code, and looking at the sources - or even the
> binaries - it would _look_ completely sane.
Been there done that.. we have too much self modifying code for that
not to have been needed.
Use gdb on /proc/kcore and disassemble self to see the _real_ code.
But yes, tricky stuff. Not arguing we need relocations in alternatives,
just saying debugging them (and static keys and others) is great fun.