"movw %ds,%cx" insn needs a 0x66 prefix, while "movw %ds,%ecx" does not.
The difference is that latter form (on 64-bit CPUs) overwrites
entire %ecx, not only its lower half.
But subsequent code doesn't depend on the value of upper
half of %ecx, we can safely use the shorter insn.
The new code is also faster than old one - now we don't depend on old value
of %ecx, but this code fragment is not performance-critical.
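For reference, the two encodings in 64-bit mode (a sketch; bytes as
emitted by a typical GNU assembler):

	66 8c d9	movw  %ds,%cx	# 3 bytes: needs 0x66 operand-size prefix
	8c d9		movl  %ds,%ecx	# 2 bytes: no prefix

Both use the same opcode, 8C /r (MOV from segment register); only the
operand-size prefix differs.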
Signed-off-by: Denys Vlasenko <[email protected]>
CC: Linus Torvalds <[email protected]>
CC: Steven Rostedt <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: Andy Lutomirski <[email protected]>
CC: Oleg Nesterov <[email protected]>
CC: Frederic Weisbecker <[email protected]>
CC: Alexei Starovoitov <[email protected]>
CC: Will Drewry <[email protected]>
CC: Kees Cook <[email protected]>
CC: [email protected]
CC: [email protected]
---
Changes in v2: clarified that the 32-bit form overwrites the entire %ecx
*on 64-bit CPUs*. Rumor has it old 486s/P5s exist which don't do that,
or write garbage into the upper half.
arch/x86/kernel/entry_64.S | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 62b4c5f..ef2651d 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1195,19 +1195,19 @@ ENTRY(xen_failsafe_callback)
/*CFI_REL_OFFSET ds,DS*/
CFI_REL_OFFSET r11,8
CFI_REL_OFFSET rcx,0
- movw %ds,%cx
+ movl %ds,%ecx
cmpw %cx,0x10(%rsp)
CFI_REMEMBER_STATE
jne 1f
- movw %es,%cx
+ movl %es,%ecx
cmpw %cx,0x18(%rsp)
jne 1f
- movw %fs,%cx
+ movl %fs,%ecx
cmpw %cx,0x20(%rsp)
jne 1f
- movw %gs,%cx
+ movl %gs,%ecx
cmpw %cx,0x28(%rsp)
jne 1f
/* All segments match their saved values => Category 2 (Bad IRET). */
movq (%rsp),%rcx
CFI_RESTORE rcx
movq 8(%rsp),%r11
--
1.8.1.4
On 05/14/2015 11:07 AM, Denys Vlasenko wrote:
> "movw %ds,%cx" insn needs a 0x66 prefix, while "movw %ds,%ecx" does not.
> The difference is that latter form (on 64-bit CPUs) overwrites
> entire %ecx, not only its lower half.
>
> But subsequent code doesn't depend on the value of upper
> half of %ecx, we can safely use the shorter insn.
>
> The new code is also faster than old one - now we don't depend on old value
> of %ecx, but this code fragment is not performance-critical.
This is still misleading. On P6 or later CPUs, not just 64-bit ones, this
zeroes the upper half, whereas on older CPUs it is undefined. Which is
still fine, but if we are going to make such a minor change we should
get it correct.
-hpa
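To make hpa's point concrete, a hypothetical test sequence (the comments
reflect the behavior described above, not a documented guarantee):

	movl	$0xffffffff, %ecx
	movl	%ds, %ecx	# P6 and later: bits 31:16 of %ecx are zeroed
				# 386/486/P5:   bits 31:16 are undefined
	shrl	$16, %ecx	# yields 0 on P6+, unpredictable on older CPUs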
x86/asm/entry/64: Use shorter MOVs from segmers registers
^^^^^^^
Trivial typo in the summary (should be "segment")?
Jeff