2015-06-08 08:35:23

by Ingo Molnar

Subject: [PATCH 0/4] x86: Untangle and standardize x86 system call entry point names

This series does the following renames:

system_call (32) -> entry_INT80_32
system_call (64) -> entry_SYSCALL_64
ia32_cstar_target -> entry_SYSCALL_compat
ia32_syscall -> entry_INT80_compat
ia32_sysenter_target (32) -> entry_SYSENTER_32
ia32_sysenter_target (64) -> entry_SYSENTER_compat

As can be seen from that list alone, the naming was a mess:

- system_call() had two distinct uses, depending on
bitness: INT80 entry on 32-bit, SYSCALL entry on 64-bit.

- ia32_sysenter_target likewise used a single name for two entry
points with different semantics: native SYSENTER on 32-bit kernels
and compat SYSENTER on 64-bit kernels.

- 'ia32' in a generic x86 name makes no sense, neither does 'cstar'.

It was so confusing that even the x86 documentation got it wrong:

"- ia32_syscall, ia32_sysenter: syscall and sysenter from 32-bit"

In reality ia32_syscall is an INT80 entry.

The new naming scheme is simple, coherent and unambiguous in any context:

entry_MNEMONIC_qualifier

where:

- 'MNEMONIC' is one of INT80, SYSCALL or SYSENTER
- 'qualifier' is one of _32, _64 or _compat.

Plus while at it I've done some cleanups to the native 32-bit entry code
as well.

Thanks,

Ingo

====================================>
Ingo Molnar (4):
x86/asm/entry: Rename compat syscall entry points
x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat
x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32
x86/asm/entry/32: Clean up entry_32.S

Documentation/x86/entry_64.txt | 4 +-
arch/x86/entry/entry_32.S | 1149 +++++++++++++++++++++++++++++-----------------------------
arch/x86/entry/entry_64.S | 10 +-
arch/x86/entry/entry_64_compat.S | 12 +-
arch/x86/entry/syscall_32.c | 6 +-
arch/x86/include/asm/proto.h | 10 +-
arch/x86/kernel/asm-offsets_64.c | 2 +-
arch/x86/kernel/cpu/common.c | 8 +-
arch/x86/kernel/traps.c | 7 +-
arch/x86/xen/xen-asm_64.S | 6 +-
10 files changed, 607 insertions(+), 607 deletions(-)

--
2.1.4


2015-06-08 08:35:50

by Ingo Molnar

Subject: [PATCH 1/4] x86/asm/entry: Rename compat syscall entry points

Rename the following system call entry points:

ia32_cstar_target -> entry_SYSCALL_compat
ia32_syscall -> entry_INT80_compat

The generic naming scheme for x86 system call entry points is:

entry_MNEMONIC_qualifier

where 'qualifier' is one of _32, _64 or _compat.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
Documentation/x86/entry_64.txt | 4 ++--
arch/x86/entry/entry_64_compat.S | 8 ++++----
arch/x86/entry/syscall_32.c | 6 +++---
arch/x86/include/asm/proto.h | 4 ++--
arch/x86/kernel/asm-offsets_64.c | 2 +-
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kernel/traps.c | 2 +-
arch/x86/xen/xen-asm_64.S | 2 +-
8 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/Documentation/x86/entry_64.txt b/Documentation/x86/entry_64.txt
index 9132b86176a3..33884d156125 100644
--- a/Documentation/x86/entry_64.txt
+++ b/Documentation/x86/entry_64.txt
@@ -18,10 +18,10 @@ The IDT vector assignments are listed in arch/x86/include/asm/irq_vectors.h.

- system_call: syscall instruction from 64-bit code.

- - ia32_syscall: int 0x80 from 32-bit or 64-bit code; compat syscall
+ - entry_INT80_compat: int 0x80 from 32-bit or 64-bit code; compat syscall
either way.

- - ia32_syscall, ia32_sysenter: syscall and sysenter from 32-bit
+ - entry_INT80_compat, ia32_sysenter: syscall and sysenter from 32-bit
code

- interrupt: An array of entries. Every IDT vector that doesn't
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 9558dacf32b9..8058892fb5ff 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -288,7 +288,7 @@ ENDPROC(ia32_sysenter_target)
* path below. We set up a complete hardware stack frame to share code
* with the int 0x80 path.
*/
-ENTRY(ia32_cstar_target)
+ENTRY(entry_SYSCALL_compat)
/*
* Interrupts are off on entry.
* We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
@@ -409,7 +409,7 @@ ENTRY(ia32_cstar_target)

RESTORE_EXTRA_REGS
jmp cstar_do_call
-END(ia32_cstar_target)
+END(entry_SYSCALL_compat)

ia32_badarg:
ASM_CLAC
@@ -445,7 +445,7 @@ END(ia32_cstar_target)
* Assumes it is only called from user space and entered with interrupts off.
*/

-ENTRY(ia32_syscall)
+ENTRY(entry_INT80_compat)
/*
* Interrupts are off on entry.
* We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
@@ -511,7 +511,7 @@ ENTRY(ia32_syscall)
movl %eax, %eax /* zero extension */
RESTORE_EXTRA_REGS
jmp ia32_do_call
-END(ia32_syscall)
+END(entry_INT80_compat)

.macro PTREGSCALL label, func
ALIGN
diff --git a/arch/x86/entry/syscall_32.c b/arch/x86/entry/syscall_32.c
index 3777189c4a19..e398d033673f 100644
--- a/arch/x86/entry/syscall_32.c
+++ b/arch/x86/entry/syscall_32.c
@@ -10,7 +10,7 @@
#else
#define SYM(sym, compat) sym
#define ia32_sys_call_table sys_call_table
-#define __NR_ia32_syscall_max __NR_syscall_max
+#define __NR_entry_INT80_compat_max __NR_syscall_max
#endif

#define __SYSCALL_I386(nr, sym, compat) extern asmlinkage void SYM(sym, compat)(void) ;
@@ -23,11 +23,11 @@ typedef asmlinkage void (*sys_call_ptr_t)(void);

extern asmlinkage void sys_ni_syscall(void);

-__visible const sys_call_ptr_t ia32_sys_call_table[__NR_ia32_syscall_max+1] = {
+__visible const sys_call_ptr_t ia32_sys_call_table[__NR_entry_INT80_compat_max+1] = {
/*
* Smells like a compiler bug -- it doesn't work
* when the & below is removed.
*/
- [0 ... __NR_ia32_syscall_max] = &sys_ni_syscall,
+ [0 ... __NR_entry_INT80_compat_max] = &sys_ni_syscall,
#include <asm/syscalls_32.h>
};
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index a90f8972dad5..7d2961a231f1 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -8,8 +8,8 @@
void system_call(void);
void syscall_init(void);

-void ia32_syscall(void);
-void ia32_cstar_target(void);
+void entry_INT80_compat(void);
+void entry_SYSCALL_compat(void);
void ia32_sysenter_target(void);

void x86_configure_nx(void);
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index dcaab87da629..599afcf0005f 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -66,7 +66,7 @@ int main(void)
DEFINE(__NR_syscall_max, sizeof(syscalls_64) - 1);
DEFINE(NR_syscalls, sizeof(syscalls_64));

- DEFINE(__NR_ia32_syscall_max, sizeof(syscalls_ia32) - 1);
+ DEFINE(__NR_entry_INT80_compat_max, sizeof(syscalls_ia32) - 1);
DEFINE(IA32_NR_syscalls, sizeof(syscalls_ia32));

return 0;
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 6bec0b55863e..f0b85c401014 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1207,7 +1207,7 @@ void syscall_init(void)
wrmsrl(MSR_LSTAR, system_call);

#ifdef CONFIG_IA32_EMULATION
- wrmsrl(MSR_CSTAR, ia32_cstar_target);
+ wrmsrl(MSR_CSTAR, entry_SYSCALL_compat);
/*
* This only works on Intel CPUs.
* On AMD CPUs these MSRs are 32-bit, CPU truncates MSR_IA32_SYSENTER_EIP.
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 5e0791f9d3dc..edf97986a53d 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -992,7 +992,7 @@ void __init trap_init(void)
set_bit(i, used_vectors);

#ifdef CONFIG_IA32_EMULATION
- set_system_intr_gate(IA32_SYSCALL_VECTOR, ia32_syscall);
+ set_system_intr_gate(IA32_SYSCALL_VECTOR, entry_INT80_compat);
set_bit(IA32_SYSCALL_VECTOR, used_vectors);
#endif

diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 04529e620559..3c43c03a499c 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -122,7 +122,7 @@ ENDPROC(xen_syscall_target)
/* 32-bit compat syscall target */
ENTRY(xen_syscall32_target)
undo_xen_syscall
- jmp ia32_cstar_target
+ jmp entry_SYSCALL_compat
ENDPROC(xen_syscall32_target)

/* 32-bit compat sysenter target */
--
2.1.4

2015-06-08 08:35:32

by Ingo Molnar

Subject: [PATCH 2/4] x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat

So the SYSENTER instruction is pretty quirky, with different behavior
depending on bitness and CPU maker.

Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
in both cases.

Split the name into its two uses:

ia32_sysenter_target (32) -> entry_SYSENTER_32
ia32_sysenter_target (64) -> entry_SYSENTER_compat

As per the generic naming scheme for x86 system call entry points:

entry_MNEMONIC_qualifier

where 'qualifier' is one of _32, _64 or _compat.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/entry/entry_32.S | 10 +++++-----
arch/x86/entry/entry_64_compat.S | 4 ++--
arch/x86/include/asm/proto.h | 3 ++-
arch/x86/kernel/cpu/common.c | 4 ++--
arch/x86/xen/xen-asm_64.S | 2 +-
5 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 0ac73de925d1..a65f46c3b8e1 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -307,7 +307,7 @@ END(resume_kernel)
the vsyscall page. See vsyscall-sysentry.S, which defines the symbol. */

# sysenter call handler stub
-ENTRY(ia32_sysenter_target)
+ENTRY(entry_SYSENTER_32)
movl TSS_sysenter_sp0(%esp),%esp
sysenter_past_esp:
/*
@@ -412,7 +412,7 @@ ENTRY(ia32_sysenter_target)
.popsection
_ASM_EXTABLE(1b,2b)
PTGS_TO_GS_EX
-ENDPROC(ia32_sysenter_target)
+ENDPROC(entry_SYSENTER_32)

# system call handler stub
ENTRY(system_call)
@@ -1135,7 +1135,7 @@ END(page_fault)

ENTRY(debug)
ASM_CLAC
- cmpl $ia32_sysenter_target,(%esp)
+ cmpl $entry_SYSENTER_32,(%esp)
jne debug_stack_correct
FIX_STACK 12, debug_stack_correct, debug_esp_fix_insn
debug_stack_correct:
@@ -1165,7 +1165,7 @@ ENTRY(nmi)
popl %eax
je nmi_espfix_stack
#endif
- cmpl $ia32_sysenter_target,(%esp)
+ cmpl $entry_SYSENTER_32,(%esp)
je nmi_stack_fixup
pushl %eax
movl %esp,%eax
@@ -1176,7 +1176,7 @@ ENTRY(nmi)
cmpl $(THREAD_SIZE-20),%eax
popl %eax
jae nmi_stack_correct
- cmpl $ia32_sysenter_target,12(%esp)
+ cmpl $entry_SYSENTER_32,12(%esp)
je nmi_debug_stack_check
nmi_stack_correct:
pushl %eax
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 8058892fb5ff..59840e33d203 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -57,7 +57,7 @@ ENDPROC(native_usergs_sysret32)
* path below. We set up a complete hardware stack frame to share code
* with the int 0x80 path.
*/
-ENTRY(ia32_sysenter_target)
+ENTRY(entry_SYSENTER_compat)
/*
* Interrupts are off on entry.
* We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
@@ -256,7 +256,7 @@ ENTRY(ia32_sysenter_target)

RESTORE_EXTRA_REGS
jmp sysenter_do_call
-ENDPROC(ia32_sysenter_target)
+ENDPROC(entry_SYSENTER_compat)

/*
* 32-bit SYSCALL instruction entry.
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 7d2961a231f1..83a7f8227949 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -10,7 +10,8 @@ void syscall_init(void);

void entry_INT80_compat(void);
void entry_SYSCALL_compat(void);
-void ia32_sysenter_target(void);
+void entry_SYSENTER_32(void);
+void entry_SYSENTER_compat(void);

void x86_configure_nx(void);
void x86_report_nx(void);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f0b85c401014..b2ae7cec33ca 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1026,7 +1026,7 @@ void enable_sep_cpu(void)
(unsigned long)tss + offsetofend(struct tss_struct, SYSENTER_stack),
0);

- wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)ia32_sysenter_target, 0);
+ wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)entry_SYSENTER_32, 0);

out:
put_cpu();
@@ -1216,7 +1216,7 @@ void syscall_init(void)
*/
wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)__KERNEL_CS);
wrmsrl_safe(MSR_IA32_SYSENTER_ESP, 0ULL);
- wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)ia32_sysenter_target);
+ wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat);
#else
wrmsrl(MSR_CSTAR, ignore_sysret);
wrmsrl_safe(MSR_IA32_SYSENTER_CS, (u64)GDT_ENTRY_INVALID_SEG);
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 3c43c03a499c..ccac1b1e6e93 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -128,7 +128,7 @@ ENDPROC(xen_syscall32_target)
/* 32-bit compat sysenter target */
ENTRY(xen_sysenter_target)
undo_xen_syscall
- jmp ia32_sysenter_target
+ jmp entry_SYSENTER_compat
ENDPROC(xen_sysenter_target)

#else /* !CONFIG_IA32_EMULATION */
--
2.1.4

2015-06-08 08:35:39

by Ingo Molnar

Subject: [PATCH 3/4] x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32

The 'system_call' entry points differ starkly between native 32-bit and 64-bit
kernels: on 32-bit kernels it defines the INT 0x80 entry point, while on
64-bit it's the SYSCALL entry point.

This is pretty confusing when looking at generic code, and it also obscures
the nature of the entry point at the assembly level.

So untangle this by splitting the name into its two uses:

system_call (32) -> entry_INT80_32
system_call (64) -> entry_SYSCALL_64

As per the generic naming scheme for x86 system call entry points:

entry_MNEMONIC_qualifier

where 'qualifier' is one of _32, _64 or _compat.

Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/entry/entry_32.S | 4 ++--
arch/x86/entry/entry_64.S | 10 +++++-----
arch/x86/include/asm/proto.h | 5 +++--
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kernel/traps.c | 5 ++---
arch/x86/xen/xen-asm_64.S | 2 +-
6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index a65f46c3b8e1..d59461032625 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -415,7 +415,7 @@ ENTRY(entry_SYSENTER_32)
ENDPROC(entry_SYSENTER_32)

# system call handler stub
-ENTRY(system_call)
+ENTRY(entry_INT80_32)
ASM_CLAC
pushl %eax # save orig_eax
SAVE_ALL
@@ -508,7 +508,7 @@ ENTRY(iret_exc)
lss (%esp), %esp /* switch to espfix segment */
jmp restore_nocheck
#endif
-ENDPROC(system_call)
+ENDPROC(entry_INT80_32)

# perform work that needs to be done immediately before resumption
ALIGN
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 4cf3dd36aa0d..e1852c407155 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -137,7 +137,7 @@ ENDPROC(native_usergs_sysret64)
* with them due to bugs in both AMD and Intel CPUs.
*/

-ENTRY(system_call)
+ENTRY(entry_SYSCALL_64)
/*
* Interrupts are off on entry.
* We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
@@ -149,7 +149,7 @@ ENTRY(system_call)
* after the swapgs, so that it can do the swapgs
* for the guest and jump here on syscall.
*/
-GLOBAL(system_call_after_swapgs)
+GLOBAL(entry_SYSCALL_64_after_swapgs)

movq %rsp,PER_CPU_VAR(rsp_scratch)
movq PER_CPU_VAR(cpu_current_top_of_stack),%rsp
@@ -182,7 +182,7 @@ GLOBAL(system_call_after_swapgs)

testl $_TIF_WORK_SYSCALL_ENTRY, ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
jnz tracesys
-system_call_fastpath:
+entry_SYSCALL_64_fastpath:
#if __SYSCALL_MASK == ~0
cmpq $__NR_syscall_max,%rax
#else
@@ -246,7 +246,7 @@ GLOBAL(system_call_after_swapgs)
jnz tracesys_phase2 /* if needed, run the slow path */
RESTORE_C_REGS_EXCEPT_RAX /* else restore clobbered regs */
movq ORIG_RAX(%rsp), %rax
- jmp system_call_fastpath /* and return to the fast path */
+ jmp entry_SYSCALL_64_fastpath /* and return to the fast path */

tracesys_phase2:
SAVE_EXTRA_REGS
@@ -411,7 +411,7 @@ GLOBAL(int_with_check)
opportunistic_sysret_failed:
SWAPGS
jmp restore_c_regs_and_iret
-END(system_call)
+END(entry_SYSCALL_64)


.macro FORK_LIKE func
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 83a7f8227949..a4a77286cb1d 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -5,11 +5,12 @@

/* misc architecture specific prototypes */

-void system_call(void);
void syscall_init(void);

-void entry_INT80_compat(void);
+void entry_SYSCALL_64(void);
void entry_SYSCALL_compat(void);
+void entry_INT80_32(void);
+void entry_INT80_compat(void);
void entry_SYSENTER_32(void);
void entry_SYSENTER_compat(void);

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index b2ae7cec33ca..914be4bbc2e5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1204,7 +1204,7 @@ void syscall_init(void)
* set CS/DS but only a 32bit target. LSTAR sets the 64bit rip.
*/
wrmsrl(MSR_STAR, ((u64)__USER32_CS)<<48 | ((u64)__KERNEL_CS)<<32);
- wrmsrl(MSR_LSTAR, system_call);
+ wrmsrl(MSR_LSTAR, entry_SYSCALL_64);

#ifdef CONFIG_IA32_EMULATION
wrmsrl(MSR_CSTAR, entry_SYSCALL_compat);
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index edf97986a53d..001ddac221a1 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -72,8 +72,7 @@ gate_desc debug_idt_table[NR_VECTORS] __page_aligned_bss;
#else
#include <asm/processor-flags.h>
#include <asm/setup.h>
-
-asmlinkage int system_call(void);
+#include <asm/proto.h>
#endif

/* Must be page-aligned because the real IDT is used in a fixmap. */
@@ -997,7 +996,7 @@ void __init trap_init(void)
#endif

#ifdef CONFIG_X86_32
- set_system_trap_gate(IA32_SYSCALL_VECTOR, &system_call);
+ set_system_trap_gate(IA32_SYSCALL_VECTOR, entry_INT80_32);
set_bit(IA32_SYSCALL_VECTOR, used_vectors);
#endif

diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index ccac1b1e6e93..f22667abf7b9 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -114,7 +114,7 @@ RELOC(xen_sysret32, 1b+1)
/* Normal 64-bit system call target */
ENTRY(xen_syscall_target)
undo_xen_syscall
- jmp system_call_after_swapgs
+ jmp entry_SYSCALL_64_after_swapgs
ENDPROC(xen_syscall_target)

#ifdef CONFIG_IA32_EMULATION
--
2.1.4

2015-06-08 08:36:22

by Ingo Molnar

Subject: [PATCH 4/4] x86/asm/entry/32: Clean up entry_32.S

Make the 32-bit syscall entry code a bit more readable:

- use consistent assembly coding style similar to entry_64.S

- remove old comments that are not true anymore

- eliminate whitespace noise

- use consistent vertical spacing

- fix various comments

No code changed:

# arch/x86/entry/entry_32.o:

text data bss dec hex filename
6025 0 0 6025 1789 entry_32.o.before
6025 0 0 6025 1789 entry_32.o.after

md5:
f3fa16b2b0dca804f052deb6b30ba6cb entry_32.o.before.asm
f3fa16b2b0dca804f052deb6b30ba6cb entry_32.o.after.asm

Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/entry/entry_32.S | 1141 ++++++++++++++++++++++++++++++++---------------------------------
1 file changed, 570 insertions(+), 571 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index d59461032625..edd7aadfacfa 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1,23 +1,12 @@
/*
+ * Copyright (C) 1991,1992 Linus Torvalds
*
- * Copyright (C) 1991, 1992 Linus Torvalds
- */
-
-/*
- * entry.S contains the system-call and fault low-level handling routines.
- * This also contains the timer-interrupt handler, as well as all interrupts
- * and faults that can result in a task-switch.
- *
- * NOTE: This code handles signal-recognition, which happens every time
- * after a timer-interrupt and after each system call.
- *
- * I changed all the .align's to 4 (16 byte alignment), as that's faster
- * on a 486.
+ * entry_32.S contains the system-call and low-level fault and trap handling routines.
*
* Stack layout in 'syscall_exit':
- * ptrace needs to have all regs on the stack.
- * if the order here is changed, it needs to be
- * updated in fork.c:copy_process, signal.c:do_signal,
+ * ptrace needs to have all registers on the stack.
+ * If the order here is changed, it needs to be
+ * updated in fork.c:copy_process(), signal.c:do_signal(),
* ptrace.c and ptrace.h
*
* 0(%esp) - %ebx
@@ -37,8 +26,6 @@
* 38(%esp) - %eflags
* 3C(%esp) - %oldesp
* 40(%esp) - %oldss
- *
- * "current" is in register %ebx during any slow entries.
*/

#include <linux/linkage.h>
@@ -61,11 +48,11 @@
/* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this. */
#include <linux/elf-em.h>
#define AUDIT_ARCH_I386 (EM_386|__AUDIT_ARCH_LE)
-#define __AUDIT_ARCH_LE 0x40000000
+#define __AUDIT_ARCH_LE 0x40000000

#ifndef CONFIG_AUDITSYSCALL
-#define sysenter_audit syscall_trace_entry
-#define sysexit_audit syscall_exit_work
+# define sysenter_audit syscall_trace_entry
+# define sysexit_audit syscall_exit_work
#endif

.section .entry.text, "ax"
@@ -84,16 +71,16 @@
*/

#ifdef CONFIG_PREEMPT
-#define preempt_stop(clobbers) DISABLE_INTERRUPTS(clobbers); TRACE_IRQS_OFF
+# define preempt_stop(clobbers) DISABLE_INTERRUPTS(clobbers); TRACE_IRQS_OFF
#else
-#define preempt_stop(clobbers)
-#define resume_kernel restore_all
+# define preempt_stop(clobbers)
+# define resume_kernel restore_all
#endif

.macro TRACE_IRQS_IRET
#ifdef CONFIG_TRACE_IRQFLAGS
- testl $X86_EFLAGS_IF,PT_EFLAGS(%esp) # interrupts off?
- jz 1f
+ testl $X86_EFLAGS_IF, PT_EFLAGS(%esp) # interrupts off?
+ jz 1f
TRACE_IRQS_ON
1:
#endif
@@ -112,10 +99,10 @@

/* unfortunately push/pop can't be no-op */
.macro PUSH_GS
- pushl $0
+ pushl $0
.endm
.macro POP_GS pop=0
- addl $(4 + \pop), %esp
+ addl $(4 + \pop), %esp
.endm
.macro POP_GS_EX
.endm
@@ -135,119 +122,119 @@
#else /* CONFIG_X86_32_LAZY_GS */

.macro PUSH_GS
- pushl %gs
+ pushl %gs
.endm

.macro POP_GS pop=0
-98: popl %gs
+98: popl %gs
.if \pop <> 0
add $\pop, %esp
.endif
.endm
.macro POP_GS_EX
.pushsection .fixup, "ax"
-99: movl $0, (%esp)
- jmp 98b
+99: movl $0, (%esp)
+ jmp 98b
.popsection
- _ASM_EXTABLE(98b,99b)
+ _ASM_EXTABLE(98b, 99b)
.endm

.macro PTGS_TO_GS
-98: mov PT_GS(%esp), %gs
+98: mov PT_GS(%esp), %gs
.endm
.macro PTGS_TO_GS_EX
.pushsection .fixup, "ax"
-99: movl $0, PT_GS(%esp)
- jmp 98b
+99: movl $0, PT_GS(%esp)
+ jmp 98b
.popsection
- _ASM_EXTABLE(98b,99b)
+ _ASM_EXTABLE(98b, 99b)
.endm

.macro GS_TO_REG reg
- movl %gs, \reg
+ movl %gs, \reg
.endm
.macro REG_TO_PTGS reg
- movl \reg, PT_GS(%esp)
+ movl \reg, PT_GS(%esp)
.endm
.macro SET_KERNEL_GS reg
- movl $(__KERNEL_STACK_CANARY), \reg
- movl \reg, %gs
+ movl $(__KERNEL_STACK_CANARY), \reg
+ movl \reg, %gs
.endm

-#endif /* CONFIG_X86_32_LAZY_GS */
+#endif /* CONFIG_X86_32_LAZY_GS */

.macro SAVE_ALL
cld
PUSH_GS
- pushl %fs
- pushl %es
- pushl %ds
- pushl %eax
- pushl %ebp
- pushl %edi
- pushl %esi
- pushl %edx
- pushl %ecx
- pushl %ebx
- movl $(__USER_DS), %edx
- movl %edx, %ds
- movl %edx, %es
- movl $(__KERNEL_PERCPU), %edx
- movl %edx, %fs
+ pushl %fs
+ pushl %es
+ pushl %ds
+ pushl %eax
+ pushl %ebp
+ pushl %edi
+ pushl %esi
+ pushl %edx
+ pushl %ecx
+ pushl %ebx
+ movl $(__USER_DS), %edx
+ movl %edx, %ds
+ movl %edx, %es
+ movl $(__KERNEL_PERCPU), %edx
+ movl %edx, %fs
SET_KERNEL_GS %edx
.endm

.macro RESTORE_INT_REGS
- popl %ebx
- popl %ecx
- popl %edx
- popl %esi
- popl %edi
- popl %ebp
- popl %eax
+ popl %ebx
+ popl %ecx
+ popl %edx
+ popl %esi
+ popl %edi
+ popl %ebp
+ popl %eax
.endm

.macro RESTORE_REGS pop=0
RESTORE_INT_REGS
-1: popl %ds
-2: popl %es
-3: popl %fs
+1: popl %ds
+2: popl %es
+3: popl %fs
POP_GS \pop
.pushsection .fixup, "ax"
-4: movl $0, (%esp)
- jmp 1b
-5: movl $0, (%esp)
- jmp 2b
-6: movl $0, (%esp)
- jmp 3b
+4: movl $0, (%esp)
+ jmp 1b
+5: movl $0, (%esp)
+ jmp 2b
+6: movl $0, (%esp)
+ jmp 3b
.popsection
- _ASM_EXTABLE(1b,4b)
- _ASM_EXTABLE(2b,5b)
- _ASM_EXTABLE(3b,6b)
+ _ASM_EXTABLE(1b, 4b)
+ _ASM_EXTABLE(2b, 5b)
+ _ASM_EXTABLE(3b, 6b)
POP_GS_EX
.endm

ENTRY(ret_from_fork)
- pushl %eax
- call schedule_tail
+ pushl %eax
+ call schedule_tail
GET_THREAD_INFO(%ebp)
- popl %eax
- pushl $0x0202 # Reset kernel eflags
+ popl %eax
+ pushl $0x0202 # Reset kernel eflags
popfl
- jmp syscall_exit
+ jmp syscall_exit
END(ret_from_fork)

ENTRY(ret_from_kernel_thread)
- pushl %eax
- call schedule_tail
+ pushl %eax
+ call schedule_tail
GET_THREAD_INFO(%ebp)
- popl %eax
- pushl $0x0202 # Reset kernel eflags
+ popl %eax
+ pushl $0x0202 # Reset kernel eflags
popfl
- movl PT_EBP(%esp),%eax
- call *PT_EBX(%esp)
- movl $0,PT_EAX(%esp)
- jmp syscall_exit
+ movl PT_EBP(%esp), %eax
+ call *PT_EBX(%esp)
+ movl $0, PT_EAX(%esp)
+ jmp syscall_exit
ENDPROC(ret_from_kernel_thread)

/*
@@ -264,62 +251,65 @@ ENDPROC(ret_from_kernel_thread)
ret_from_intr:
GET_THREAD_INFO(%ebp)
#ifdef CONFIG_VM86
- movl PT_EFLAGS(%esp), %eax # mix EFLAGS and CS
- movb PT_CS(%esp), %al
- andl $(X86_EFLAGS_VM | SEGMENT_RPL_MASK), %eax
+ movl PT_EFLAGS(%esp), %eax # mix EFLAGS and CS
+ movb PT_CS(%esp), %al
+ andl $(X86_EFLAGS_VM | SEGMENT_RPL_MASK), %eax
#else
/*
* We can be coming here from child spawned by kernel_thread().
*/
- movl PT_CS(%esp), %eax
- andl $SEGMENT_RPL_MASK, %eax
+ movl PT_CS(%esp), %eax
+ andl $SEGMENT_RPL_MASK, %eax
#endif
- cmpl $USER_RPL, %eax
- jb resume_kernel # not returning to v8086 or userspace
+ cmpl $USER_RPL, %eax
+ jb resume_kernel # not returning to v8086 or userspace

ENTRY(resume_userspace)
LOCKDEP_SYS_EXIT
- DISABLE_INTERRUPTS(CLBR_ANY) # make sure we don't miss an interrupt
- # setting need_resched or sigpending
- # between sampling and the iret
+ DISABLE_INTERRUPTS(CLBR_ANY) # make sure we don't miss an interrupt
+ # setting need_resched or sigpending
+ # between sampling and the iret
TRACE_IRQS_OFF
- movl TI_flags(%ebp), %ecx
- andl $_TIF_WORK_MASK, %ecx # is there any work to be done on
- # int/exception return?
- jne work_pending
- jmp restore_all
+ movl TI_flags(%ebp), %ecx
+ andl $_TIF_WORK_MASK, %ecx # is there any work to be done on
+ # int/exception return?
+ jne work_pending
+ jmp restore_all
END(ret_from_exception)

#ifdef CONFIG_PREEMPT
ENTRY(resume_kernel)
DISABLE_INTERRUPTS(CLBR_ANY)
need_resched:
- cmpl $0,PER_CPU_VAR(__preempt_count)
- jnz restore_all
- testl $X86_EFLAGS_IF,PT_EFLAGS(%esp) # interrupts off (exception path) ?
- jz restore_all
- call preempt_schedule_irq
- jmp need_resched
+ cmpl $0, PER_CPU_VAR(__preempt_count)
+ jnz restore_all
+ testl $X86_EFLAGS_IF, PT_EFLAGS(%esp) # interrupts off (exception path) ?
+ jz restore_all
+ call preempt_schedule_irq
+ jmp need_resched
END(resume_kernel)
#endif

-/* SYSENTER_RETURN points to after the "sysenter" instruction in
- the vsyscall page. See vsyscall-sysentry.S, which defines the symbol. */
+/*
+ * SYSENTER_RETURN points to after the SYSENTER instruction
+ * in the vsyscall page. See vsyscall-sysentry.S, which defines
+ * the symbol.
+ */

- # sysenter call handler stub
+ # SYSENTER call handler stub
ENTRY(entry_SYSENTER_32)
- movl TSS_sysenter_sp0(%esp),%esp
+ movl TSS_sysenter_sp0(%esp), %esp
sysenter_past_esp:
/*
* Interrupts are disabled here, but we can't trace it until
* enough kernel state to call TRACE_IRQS_OFF can be called - but
* we immediately enable interrupts at that point anyway.
*/
- pushl $__USER_DS
- pushl %ebp
+ pushl $__USER_DS
+ pushl %ebp
pushfl
- orl $X86_EFLAGS_IF, (%esp)
- pushl $__USER_CS
+ orl $X86_EFLAGS_IF, (%esp)
+ pushl $__USER_CS
/*
* Push current_thread_info()->sysenter_return to the stack.
* A tiny bit of offset fixup is necessary: TI_sysenter_return
@@ -328,9 +318,9 @@ ENTRY(entry_SYSENTER_32)
* TOP_OF_KERNEL_STACK_PADDING takes us to the top of the stack;
* and THREAD_SIZE takes us to the bottom.
*/
- pushl ((TI_sysenter_return) - THREAD_SIZE + TOP_OF_KERNEL_STACK_PADDING + 4*4)(%esp)
+ pushl ((TI_sysenter_return) - THREAD_SIZE + TOP_OF_KERNEL_STACK_PADDING + 4*4)(%esp)

- pushl %eax
+ pushl %eax
SAVE_ALL
ENABLE_INTERRUPTS(CLBR_NONE)

@@ -338,132 +328,134 @@ ENTRY(entry_SYSENTER_32)
* Load the potential sixth argument from user stack.
* Careful about security.
*/
- cmpl $__PAGE_OFFSET-3,%ebp
- jae syscall_fault
+ cmpl $__PAGE_OFFSET-3, %ebp
+ jae syscall_fault
ASM_STAC
-1: movl (%ebp),%ebp
+1: movl (%ebp), %ebp
ASM_CLAC
- movl %ebp,PT_EBP(%esp)
- _ASM_EXTABLE(1b,syscall_fault)
+ movl %ebp, PT_EBP(%esp)
+ _ASM_EXTABLE(1b, syscall_fault)

GET_THREAD_INFO(%ebp)

- testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags(%ebp)
- jnz sysenter_audit
+ testl $_TIF_WORK_SYSCALL_ENTRY, TI_flags(%ebp)
+ jnz sysenter_audit
sysenter_do_call:
- cmpl $(NR_syscalls), %eax
- jae sysenter_badsys
- call *sys_call_table(,%eax,4)
+ cmpl $(NR_syscalls), %eax
+ jae sysenter_badsys
+ call *sys_call_table(, %eax, 4)
sysenter_after_call:
- movl %eax,PT_EAX(%esp)
+ movl %eax, PT_EAX(%esp)
LOCKDEP_SYS_EXIT
DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_OFF
- movl TI_flags(%ebp), %ecx
- testl $_TIF_ALLWORK_MASK, %ecx
- jnz sysexit_audit
+ movl TI_flags(%ebp), %ecx
+ testl $_TIF_ALLWORK_MASK, %ecx
+ jnz sysexit_audit
sysenter_exit:
/* if something modifies registers it must also disable sysexit */
- movl PT_EIP(%esp), %edx
- movl PT_OLDESP(%esp), %ecx
- xorl %ebp,%ebp
+ movl PT_EIP(%esp), %edx
+ movl PT_OLDESP(%esp), %ecx
+ xorl %ebp, %ebp
TRACE_IRQS_ON
-1: mov PT_FS(%esp), %fs
+1: mov PT_FS(%esp), %fs
PTGS_TO_GS
ENABLE_INTERRUPTS_SYSEXIT

#ifdef CONFIG_AUDITSYSCALL
sysenter_audit:
- testl $(_TIF_WORK_SYSCALL_ENTRY & ~_TIF_SYSCALL_AUDIT),TI_flags(%ebp)
- jnz syscall_trace_entry
- /* movl PT_EAX(%esp), %eax already set, syscall number: 1st arg to audit */
- movl PT_EBX(%esp), %edx /* ebx/a0: 2nd arg to audit */
- /* movl PT_ECX(%esp), %ecx already set, a1: 3nd arg to audit */
- pushl PT_ESI(%esp) /* a3: 5th arg */
- pushl PT_EDX+4(%esp) /* a2: 4th arg */
- call __audit_syscall_entry
- popl %ecx /* get that remapped edx off the stack */
- popl %ecx /* get that remapped esi off the stack */
- movl PT_EAX(%esp),%eax /* reload syscall number */
- jmp sysenter_do_call
+ testl $(_TIF_WORK_SYSCALL_ENTRY & ~_TIF_SYSCALL_AUDIT), TI_flags(%ebp)
+ jnz syscall_trace_entry
+ /* movl PT_EAX(%esp), %eax already set, syscall number: 1st arg to audit */
+ movl PT_EBX(%esp), %edx /* ebx/a0: 2nd arg to audit */
+ /* movl PT_ECX(%esp), %ecx already set, a1: 3nd arg to audit */
+ pushl PT_ESI(%esp) /* a3: 5th arg */
+ pushl PT_EDX+4(%esp) /* a2: 4th arg */
+ call __audit_syscall_entry
+ popl %ecx /* get that remapped edx off the stack */
+ popl %ecx /* get that remapped esi off the stack */
+ movl PT_EAX(%esp), %eax /* reload syscall number */
+ jmp sysenter_do_call

sysexit_audit:
- testl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), %ecx
- jnz syscall_exit_work
+ testl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), %ecx
+ jnz syscall_exit_work
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_ANY)
- movl %eax,%edx /* second arg, syscall return value */
- cmpl $-MAX_ERRNO,%eax /* is it an error ? */
- setbe %al /* 1 if so, 0 if not */
- movzbl %al,%eax /* zero-extend that */
- call __audit_syscall_exit
+ movl %eax, %edx /* second arg, syscall return value */
+ cmpl $-MAX_ERRNO, %eax /* is it an error ? */
+ setbe %al /* 1 if so, 0 if not */
+ movzbl %al, %eax /* zero-extend that */
+ call __audit_syscall_exit
DISABLE_INTERRUPTS(CLBR_ANY)
TRACE_IRQS_OFF
- movl TI_flags(%ebp), %ecx
- testl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), %ecx
- jnz syscall_exit_work
- movl PT_EAX(%esp),%eax /* reload syscall return value */
- jmp sysenter_exit
+ movl TI_flags(%ebp), %ecx
+ testl $(_TIF_ALLWORK_MASK & ~_TIF_SYSCALL_AUDIT), %ecx
+ jnz syscall_exit_work
+ movl PT_EAX(%esp), %eax /* reload syscall return value */
+ jmp sysenter_exit
#endif

-.pushsection .fixup,"ax"
-2: movl $0,PT_FS(%esp)
- jmp 1b
+.pushsection .fixup, "ax"
+2: movl $0, PT_FS(%esp)
+ jmp 1b
.popsection
- _ASM_EXTABLE(1b,2b)
+ _ASM_EXTABLE(1b, 2b)
PTGS_TO_GS_EX
ENDPROC(entry_SYSENTER_32)

# system call handler stub
ENTRY(entry_INT80_32)
ASM_CLAC
- pushl %eax # save orig_eax
+ pushl %eax # save orig_eax
SAVE_ALL
GET_THREAD_INFO(%ebp)
- # system call tracing in operation / emulation
- testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags(%ebp)
- jnz syscall_trace_entry
- cmpl $(NR_syscalls), %eax
- jae syscall_badsys
+ # system call tracing in operation / emulation
+ testl $_TIF_WORK_SYSCALL_ENTRY, TI_flags(%ebp)
+ jnz syscall_trace_entry
+ cmpl $(NR_syscalls), %eax
+ jae syscall_badsys
syscall_call:
- call *sys_call_table(,%eax,4)
+ call *sys_call_table(, %eax, 4)
syscall_after_call:
- movl %eax,PT_EAX(%esp) # store the return value
+ movl %eax, PT_EAX(%esp) # store the return value
syscall_exit:
LOCKDEP_SYS_EXIT
- DISABLE_INTERRUPTS(CLBR_ANY) # make sure we don't miss an interrupt
- # setting need_resched or sigpending
- # between sampling and the iret
+ DISABLE_INTERRUPTS(CLBR_ANY) # make sure we don't miss an interrupt
+ # setting need_resched or sigpending
+ # between sampling and the iret
TRACE_IRQS_OFF
- movl TI_flags(%ebp), %ecx
- testl $_TIF_ALLWORK_MASK, %ecx # current->work
- jnz syscall_exit_work
+ movl TI_flags(%ebp), %ecx
+ testl $_TIF_ALLWORK_MASK, %ecx # current->work
+ jnz syscall_exit_work

restore_all:
TRACE_IRQS_IRET
restore_all_notrace:
#ifdef CONFIG_X86_ESPFIX32
- movl PT_EFLAGS(%esp), %eax # mix EFLAGS, SS and CS
- # Warning: PT_OLDSS(%esp) contains the wrong/random values if we
- # are returning to the kernel.
- # See comments in process.c:copy_thread() for details.
- movb PT_OLDSS(%esp), %ah
- movb PT_CS(%esp), %al
- andl $(X86_EFLAGS_VM | (SEGMENT_TI_MASK << 8) | SEGMENT_RPL_MASK), %eax
- cmpl $((SEGMENT_LDT << 8) | USER_RPL), %eax
- je ldt_ss # returning to user-space with LDT SS
+ movl PT_EFLAGS(%esp), %eax # mix EFLAGS, SS and CS
+ /*
+ * Warning: PT_OLDSS(%esp) contains the wrong/random values if we
+ * are returning to the kernel.
+ * See comments in process.c:copy_thread() for details.
+ */
+ movb PT_OLDSS(%esp), %ah
+ movb PT_CS(%esp), %al
+ andl $(X86_EFLAGS_VM | (SEGMENT_TI_MASK << 8) | SEGMENT_RPL_MASK), %eax
+ cmpl $((SEGMENT_LDT << 8) | USER_RPL), %eax
+ je ldt_ss # returning to user-space with LDT SS
#endif
restore_nocheck:
- RESTORE_REGS 4 # skip orig_eax/error_code
+ RESTORE_REGS 4 # skip orig_eax/error_code
irq_return:
INTERRUPT_RETURN
-.section .fixup,"ax"
-ENTRY(iret_exc)
- pushl $0 # no error code
- pushl $do_iret_error
- jmp error_code
+.section .fixup, "ax"
+ENTRY(iret_exc )
+ pushl $0 # no error code
+ pushl $do_iret_error
+ jmp error_code
.previous
- _ASM_EXTABLE(irq_return,iret_exc)
+ _ASM_EXTABLE(irq_return, iret_exc)

#ifdef CONFIG_X86_ESPFIX32
ldt_ss:
@@ -476,8 +468,8 @@ ENTRY(iret_exc)
* is still available to implement the setting of the high
* 16-bits in the INTERRUPT_RETURN paravirt-op.
*/
- cmpl $0, pv_info+PARAVIRT_enabled
- jne restore_nocheck
+ cmpl $0, pv_info+PARAVIRT_enabled
+ jne restore_nocheck
#endif

/*
@@ -492,21 +484,23 @@ ENTRY(iret_exc)
* a base address that matches for the difference.
*/
#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page) + (GDT_ENTRY_ESPFIX_SS * 8)
- mov %esp, %edx /* load kernel esp */
- mov PT_OLDESP(%esp), %eax /* load userspace esp */
- mov %dx, %ax /* eax: new kernel esp */
- sub %eax, %edx /* offset (low word is 0) */
+ mov %esp, %edx /* load kernel esp */
+ mov PT_OLDESP(%esp), %eax /* load userspace esp */
+ mov %dx, %ax /* eax: new kernel esp */
+ sub %eax, %edx /* offset (low word is 0) */
shr $16, %edx
- mov %dl, GDT_ESPFIX_SS + 4 /* bits 16..23 */
- mov %dh, GDT_ESPFIX_SS + 7 /* bits 24..31 */
- pushl $__ESPFIX_SS
- pushl %eax /* new kernel esp */
- /* Disable interrupts, but do not irqtrace this section: we
+ mov %dl, GDT_ESPFIX_SS + 4 /* bits 16..23 */
+ mov %dh, GDT_ESPFIX_SS + 7 /* bits 24..31 */
+ pushl $__ESPFIX_SS
+ pushl %eax /* new kernel esp */
+ /*
+ * Disable interrupts, but do not irqtrace this section: we
* will soon execute iret and the tracer was already set to
- * the irqstate after the iret */
+ * the irqstate after the IRET:
+ */
DISABLE_INTERRUPTS(CLBR_EAX)
- lss (%esp), %esp /* switch to espfix segment */
- jmp restore_nocheck
+ lss (%esp), %esp /* switch to espfix segment */
+ jmp restore_nocheck
#endif
ENDPROC(entry_INT80_32)

@@ -514,93 +508,93 @@ ENDPROC(entry_INT80_32)
ALIGN
work_pending:
testb $_TIF_NEED_RESCHED, %cl
- jz work_notifysig
+ jz work_notifysig
work_resched:
- call schedule
+ call schedule
LOCKDEP_SYS_EXIT
- DISABLE_INTERRUPTS(CLBR_ANY) # make sure we don't miss an interrupt
- # setting need_resched or sigpending
- # between sampling and the iret
+ DISABLE_INTERRUPTS(CLBR_ANY) # make sure we don't miss an interrupt
+ # setting need_resched or sigpending
+ # between sampling and the iret
TRACE_IRQS_OFF
- movl TI_flags(%ebp), %ecx
- andl $_TIF_WORK_MASK, %ecx # is there any work to be done other
- # than syscall tracing?
- jz restore_all
+ movl TI_flags(%ebp), %ecx
+ andl $_TIF_WORK_MASK, %ecx # is there any work to be done other
+ # than syscall tracing?
+ jz restore_all
testb $_TIF_NEED_RESCHED, %cl
- jnz work_resched
+ jnz work_resched

-work_notifysig: # deal with pending signals and
- # notify-resume requests
+work_notifysig: # deal with pending signals and
+ # notify-resume requests
#ifdef CONFIG_VM86
- testl $X86_EFLAGS_VM, PT_EFLAGS(%esp)
- movl %esp, %eax
- jnz work_notifysig_v86 # returning to kernel-space or
- # vm86-space
+ testl $X86_EFLAGS_VM, PT_EFLAGS(%esp)
+ movl %esp, %eax
+ jnz work_notifysig_v86 # returning to kernel-space or
+ # vm86-space
1:
#else
- movl %esp, %eax
+ movl %esp, %eax
#endif
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
- movb PT_CS(%esp), %bl
+ movb PT_CS(%esp), %bl
andb $SEGMENT_RPL_MASK, %bl
cmpb $USER_RPL, %bl
- jb resume_kernel
- xorl %edx, %edx
- call do_notify_resume
- jmp resume_userspace
+ jb resume_kernel
+ xorl %edx, %edx
+ call do_notify_resume
+ jmp resume_userspace

#ifdef CONFIG_VM86
ALIGN
work_notifysig_v86:
- pushl %ecx # save ti_flags for do_notify_resume
- call save_v86_state # %eax contains pt_regs pointer
- popl %ecx
- movl %eax, %esp
- jmp 1b
+ pushl %ecx # save ti_flags for do_notify_resume
+ call save_v86_state # %eax contains pt_regs pointer
+ popl %ecx
+ movl %eax, %esp
+ jmp 1b
#endif
END(work_pending)

# perform syscall exit tracing
ALIGN
syscall_trace_entry:
- movl $-ENOSYS,PT_EAX(%esp)
- movl %esp, %eax
- call syscall_trace_enter
+ movl $-ENOSYS, PT_EAX(%esp)
+ movl %esp, %eax
+ call syscall_trace_enter
/* What it returned is what we'll actually use. */
- cmpl $(NR_syscalls), %eax
- jnae syscall_call
- jmp syscall_exit
+ cmpl $(NR_syscalls), %eax
+ jnae syscall_call
+ jmp syscall_exit
END(syscall_trace_entry)

# perform syscall exit tracing
ALIGN
syscall_exit_work:
- testl $_TIF_WORK_SYSCALL_EXIT, %ecx
- jz work_pending
+ testl $_TIF_WORK_SYSCALL_EXIT, %ecx
+ jz work_pending
TRACE_IRQS_ON
- ENABLE_INTERRUPTS(CLBR_ANY) # could let syscall_trace_leave() call
- # schedule() instead
- movl %esp, %eax
- call syscall_trace_leave
- jmp resume_userspace
+ ENABLE_INTERRUPTS(CLBR_ANY) # could let syscall_trace_leave() call
+ # schedule() instead
+ movl %esp, %eax
+ call syscall_trace_leave
+ jmp resume_userspace
END(syscall_exit_work)

syscall_fault:
ASM_CLAC
GET_THREAD_INFO(%ebp)
- movl $-EFAULT,PT_EAX(%esp)
- jmp resume_userspace
+ movl $-EFAULT, PT_EAX(%esp)
+ jmp resume_userspace
END(syscall_fault)

syscall_badsys:
- movl $-ENOSYS,%eax
- jmp syscall_after_call
+ movl $-ENOSYS, %eax
+ jmp syscall_after_call
END(syscall_badsys)

sysenter_badsys:
- movl $-ENOSYS,%eax
- jmp sysenter_after_call
+ movl $-ENOSYS, %eax
+ jmp sysenter_after_call
END(sysenter_badsys)

.macro FIXUP_ESPFIX_STACK
@@ -613,24 +607,24 @@ END(sysenter_badsys)
*/
#ifdef CONFIG_X86_ESPFIX32
/* fixup the stack */
- mov GDT_ESPFIX_SS + 4, %al /* bits 16..23 */
- mov GDT_ESPFIX_SS + 7, %ah /* bits 24..31 */
+ mov GDT_ESPFIX_SS + 4, %al /* bits 16..23 */
+ mov GDT_ESPFIX_SS + 7, %ah /* bits 24..31 */
shl $16, %eax
- addl %esp, %eax /* the adjusted stack pointer */
- pushl $__KERNEL_DS
- pushl %eax
- lss (%esp), %esp /* switch to the normal stack segment */
+ addl %esp, %eax /* the adjusted stack pointer */
+ pushl $__KERNEL_DS
+ pushl %eax
+ lss (%esp), %esp /* switch to the normal stack segment */
#endif
.endm
.macro UNWIND_ESPFIX_STACK
#ifdef CONFIG_X86_ESPFIX32
- movl %ss, %eax
+ movl %ss, %eax
/* see if on espfix stack */
- cmpw $__ESPFIX_SS, %ax
- jne 27f
- movl $__KERNEL_DS, %eax
- movl %eax, %ds
- movl %eax, %es
+ cmpw $__ESPFIX_SS, %ax
+ jne 27f
+ movl $__KERNEL_DS, %eax
+ movl %eax, %ds
+ movl %eax, %es
/* switch to normal stack */
FIXUP_ESPFIX_STACK
27:
@@ -645,7 +639,7 @@ END(sysenter_badsys)
ENTRY(irq_entries_start)
vector=FIRST_EXTERNAL_VECTOR
.rept (FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR)
- pushl $(~vector+0x80) /* Note: always in signed byte range */
+ pushl $(~vector+0x80) /* Note: always in signed byte range */
vector=vector+1
jmp common_interrupt
.align 8
@@ -659,35 +653,34 @@ END(irq_entries_start)
.p2align CONFIG_X86_L1_CACHE_SHIFT
common_interrupt:
ASM_CLAC
- addl $-0x80,(%esp) /* Adjust vector into the [-256,-1] range */
+ addl $-0x80, (%esp) /* Adjust vector into the [-256, -1] range */
SAVE_ALL
TRACE_IRQS_OFF
- movl %esp,%eax
- call do_IRQ
- jmp ret_from_intr
+ movl %esp, %eax
+ call do_IRQ
+ jmp ret_from_intr
ENDPROC(common_interrupt)

#define BUILD_INTERRUPT3(name, nr, fn) \
ENTRY(name) \
ASM_CLAC; \
- pushl $~(nr); \
+ pushl $~(nr); \
SAVE_ALL; \
TRACE_IRQS_OFF \
- movl %esp,%eax; \
- call fn; \
- jmp ret_from_intr; \
+ movl %esp, %eax; \
+ call fn; \
+ jmp ret_from_intr; \
ENDPROC(name)


#ifdef CONFIG_TRACING
-#define TRACE_BUILD_INTERRUPT(name, nr) \
- BUILD_INTERRUPT3(trace_##name, nr, smp_trace_##name)
+# define TRACE_BUILD_INTERRUPT(name, nr) BUILD_INTERRUPT3(trace_##name, nr, smp_trace_##name)
#else
-#define TRACE_BUILD_INTERRUPT(name, nr)
+# define TRACE_BUILD_INTERRUPT(name, nr)
#endif

-#define BUILD_INTERRUPT(name, nr) \
- BUILD_INTERRUPT3(name, nr, smp_##name); \
+#define BUILD_INTERRUPT(name, nr) \
+ BUILD_INTERRUPT3(name, nr, smp_##name); \
TRACE_BUILD_INTERRUPT(name, nr)

/* The include is where all of the SMP etc. interrupts come from */
@@ -695,30 +688,30 @@ ENDPROC(name)

ENTRY(coprocessor_error)
ASM_CLAC
- pushl $0
- pushl $do_coprocessor_error
- jmp error_code
+ pushl $0
+ pushl $do_coprocessor_error
+ jmp error_code
END(coprocessor_error)

ENTRY(simd_coprocessor_error)
ASM_CLAC
- pushl $0
+ pushl $0
#ifdef CONFIG_X86_INVD_BUG
/* AMD 486 bug: invd from userspace calls exception 19 instead of #GP */
- ALTERNATIVE "pushl $do_general_protection", \
- "pushl $do_simd_coprocessor_error", \
+ ALTERNATIVE "pushl $do_general_protection", \
+ "pushl $do_simd_coprocessor_error", \
X86_FEATURE_XMM
#else
- pushl $do_simd_coprocessor_error
+ pushl $do_simd_coprocessor_error
#endif
- jmp error_code
+ jmp error_code
END(simd_coprocessor_error)

ENTRY(device_not_available)
ASM_CLAC
- pushl $-1 # mark this as an int
- pushl $do_device_not_available
- jmp error_code
+ pushl $-1 # mark this as an int
+ pushl $do_device_not_available
+ jmp error_code
END(device_not_available)

#ifdef CONFIG_PARAVIRT
@@ -735,165 +728,171 @@ END(native_irq_enable_sysexit)

ENTRY(overflow)
ASM_CLAC
- pushl $0
- pushl $do_overflow
- jmp error_code
+ pushl $0
+ pushl $do_overflow
+ jmp error_code
END(overflow)

ENTRY(bounds)
ASM_CLAC
- pushl $0
- pushl $do_bounds
- jmp error_code
+ pushl $0
+ pushl $do_bounds
+ jmp error_code
END(bounds)

ENTRY(invalid_op)
ASM_CLAC
- pushl $0
- pushl $do_invalid_op
- jmp error_code
+ pushl $0
+ pushl $do_invalid_op
+ jmp error_code
END(invalid_op)

ENTRY(coprocessor_segment_overrun)
ASM_CLAC
- pushl $0
- pushl $do_coprocessor_segment_overrun
- jmp error_code
+ pushl $0
+ pushl $do_coprocessor_segment_overrun
+ jmp error_code
END(coprocessor_segment_overrun)

ENTRY(invalid_TSS)
ASM_CLAC
- pushl $do_invalid_TSS
- jmp error_code
+ pushl $do_invalid_TSS
+ jmp error_code
END(invalid_TSS)

ENTRY(segment_not_present)
ASM_CLAC
- pushl $do_segment_not_present
- jmp error_code
+ pushl $do_segment_not_present
+ jmp error_code
END(segment_not_present)

ENTRY(stack_segment)
ASM_CLAC
- pushl $do_stack_segment
- jmp error_code
+ pushl $do_stack_segment
+ jmp error_code
END(stack_segment)

ENTRY(alignment_check)
ASM_CLAC
- pushl $do_alignment_check
- jmp error_code
+ pushl $do_alignment_check
+ jmp error_code
END(alignment_check)

ENTRY(divide_error)
ASM_CLAC
- pushl $0 # no error code
- pushl $do_divide_error
- jmp error_code
+ pushl $0 # no error code
+ pushl $do_divide_error
+ jmp error_code
END(divide_error)

#ifdef CONFIG_X86_MCE
ENTRY(machine_check)
ASM_CLAC
- pushl $0
- pushl machine_check_vector
- jmp error_code
+ pushl $0
+ pushl machine_check_vector
+ jmp error_code
END(machine_check)
#endif

ENTRY(spurious_interrupt_bug)
ASM_CLAC
- pushl $0
- pushl $do_spurious_interrupt_bug
- jmp error_code
+ pushl $0
+ pushl $do_spurious_interrupt_bug
+ jmp error_code
END(spurious_interrupt_bug)

#ifdef CONFIG_XEN
-/* Xen doesn't set %esp to be precisely what the normal sysenter
- entrypoint expects, so fix it up before using the normal path. */
+/*
+ * Xen doesn't set %esp to be precisely what the normal SYSENTER
+ * entry point expects, so fix it up before using the normal path.
+ */
ENTRY(xen_sysenter_target)
- addl $5*4, %esp /* remove xen-provided frame */
- jmp sysenter_past_esp
+ addl $5*4, %esp /* remove xen-provided frame */
+ jmp sysenter_past_esp

ENTRY(xen_hypervisor_callback)
- pushl $-1 /* orig_ax = -1 => not a system call */
+ pushl $-1 /* orig_ax = -1 => not a system call */
SAVE_ALL
TRACE_IRQS_OFF

- /* Check to see if we got the event in the critical
- region in xen_iret_direct, after we've reenabled
- events and checked for pending events. This simulates
- iret instruction's behaviour where it delivers a
- pending interrupt when enabling interrupts. */
- movl PT_EIP(%esp),%eax
- cmpl $xen_iret_start_crit,%eax
- jb 1f
- cmpl $xen_iret_end_crit,%eax
- jae 1f
+ /*
+ * Check to see if we got the event in the critical
+ * region in xen_iret_direct, after we've reenabled
+ * events and checked for pending events. This simulates
+ * iret instruction's behaviour where it delivers a
+ * pending interrupt when enabling interrupts:
+ */
+ movl PT_EIP(%esp), %eax
+ cmpl $xen_iret_start_crit, %eax
+ jb 1f
+ cmpl $xen_iret_end_crit, %eax
+ jae 1f

- jmp xen_iret_crit_fixup
+ jmp xen_iret_crit_fixup

ENTRY(xen_do_upcall)
-1: mov %esp, %eax
- call xen_evtchn_do_upcall
+1: mov %esp, %eax
+ call xen_evtchn_do_upcall
#ifndef CONFIG_PREEMPT
- call xen_maybe_preempt_hcall
+ call xen_maybe_preempt_hcall
#endif
- jmp ret_from_intr
+ jmp ret_from_intr
ENDPROC(xen_hypervisor_callback)

-# Hypervisor uses this for application faults while it executes.
-# We get here for two reasons:
-# 1. Fault while reloading DS, ES, FS or GS
-# 2. Fault while executing IRET
-# Category 1 we fix up by reattempting the load, and zeroing the segment
-# register if the load fails.
-# Category 2 we fix up by jumping to do_iret_error. We cannot use the
-# normal Linux return path in this case because if we use the IRET hypercall
-# to pop the stack frame we end up in an infinite loop of failsafe callbacks.
-# We distinguish between categories by maintaining a status value in EAX.
+/*
+ * Hypervisor uses this for application faults while it executes.
+ * We get here for two reasons:
+ * 1. Fault while reloading DS, ES, FS or GS
+ * 2. Fault while executing IRET
+ * Category 1 we fix up by reattempting the load, and zeroing the segment
+ * register if the load fails.
+ * Category 2 we fix up by jumping to do_iret_error. We cannot use the
+ * normal Linux return path in this case because if we use the IRET hypercall
+ * to pop the stack frame we end up in an infinite loop of failsafe callbacks.
+ * We distinguish between categories by maintaining a status value in EAX.
+ */
ENTRY(xen_failsafe_callback)
- pushl %eax
- movl $1,%eax
-1: mov 4(%esp),%ds
-2: mov 8(%esp),%es
-3: mov 12(%esp),%fs
-4: mov 16(%esp),%gs
+ pushl %eax
+ movl $1, %eax
+1: mov 4(%esp), %ds
+2: mov 8(%esp), %es
+3: mov 12(%esp), %fs
+4: mov 16(%esp), %gs
/* EAX == 0 => Category 1 (Bad segment)
EAX != 0 => Category 2 (Bad IRET) */
- testl %eax,%eax
- popl %eax
- lea 16(%esp),%esp
- jz 5f
- jmp iret_exc
-5: pushl $-1 /* orig_ax = -1 => not a system call */
+ testl %eax, %eax
+ popl %eax
+ lea 16(%esp), %esp
+ jz 5f
+ jmp iret_exc
+5: pushl $-1 /* orig_ax = -1 => not a system call */
SAVE_ALL
- jmp ret_from_exception
-
-.section .fixup,"ax"
-6: xorl %eax,%eax
- movl %eax,4(%esp)
- jmp 1b
-7: xorl %eax,%eax
- movl %eax,8(%esp)
- jmp 2b
-8: xorl %eax,%eax
- movl %eax,12(%esp)
- jmp 3b
-9: xorl %eax,%eax
- movl %eax,16(%esp)
- jmp 4b
+ jmp ret_from_exception
+
+.section .fixup, "ax"
+6: xorl %eax, %eax
+ movl %eax, 4(%esp)
+ jmp 1b
+7: xorl %eax, %eax
+ movl %eax, 8(%esp)
+ jmp 2b
+8: xorl %eax, %eax
+ movl %eax, 12(%esp)
+ jmp 3b
+9: xorl %eax, %eax
+ movl %eax, 16(%esp)
+ jmp 4b
.previous
- _ASM_EXTABLE(1b,6b)
- _ASM_EXTABLE(2b,7b)
- _ASM_EXTABLE(3b,8b)
- _ASM_EXTABLE(4b,9b)
+ _ASM_EXTABLE(1b, 6b)
+ _ASM_EXTABLE(2b, 7b)
+ _ASM_EXTABLE(3b, 8b)
+ _ASM_EXTABLE(4b, 9b)
ENDPROC(xen_failsafe_callback)

BUILD_INTERRUPT3(xen_hvm_callback_vector, HYPERVISOR_CALLBACK_VECTOR,
xen_evtchn_do_upcall)

-#endif /* CONFIG_XEN */
+#endif /* CONFIG_XEN */

#if IS_ENABLED(CONFIG_HYPERV)

@@ -910,28 +909,28 @@ ENTRY(mcount)
END(mcount)

ENTRY(ftrace_caller)
- pushl %eax
- pushl %ecx
- pushl %edx
- pushl $0 /* Pass NULL as regs pointer */
- movl 4*4(%esp), %eax
- movl 0x4(%ebp), %edx
- movl function_trace_op, %ecx
- subl $MCOUNT_INSN_SIZE, %eax
+ pushl %eax
+ pushl %ecx
+ pushl %edx
+ pushl $0 /* Pass NULL as regs pointer */
+ movl 4*4(%esp), %eax
+ movl 0x4(%ebp), %edx
+ movl function_trace_op, %ecx
+ subl $MCOUNT_INSN_SIZE, %eax

.globl ftrace_call
ftrace_call:
- call ftrace_stub
+ call ftrace_stub

- addl $4,%esp /* skip NULL pointer */
- popl %edx
- popl %ecx
- popl %eax
+ addl $4, %esp /* skip NULL pointer */
+ popl %edx
+ popl %ecx
+ popl %eax
ftrace_ret:
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.globl ftrace_graph_call
ftrace_graph_call:
- jmp ftrace_stub
+ jmp ftrace_stub
#endif

.globl ftrace_stub
@@ -949,72 +948,72 @@ ENTRY(ftrace_regs_caller)
* as the current return ip is. We move the return ip into the
* ip location, and move flags into the return ip location.
*/
- pushl 4(%esp) /* save return ip into ip slot */
-
- pushl $0 /* Load 0 into orig_ax */
- pushl %gs
- pushl %fs
- pushl %es
- pushl %ds
- pushl %eax
- pushl %ebp
- pushl %edi
- pushl %esi
- pushl %edx
- pushl %ecx
- pushl %ebx
-
- movl 13*4(%esp), %eax /* Get the saved flags */
- movl %eax, 14*4(%esp) /* Move saved flags into regs->flags location */
- /* clobbering return ip */
- movl $__KERNEL_CS,13*4(%esp)
-
- movl 12*4(%esp), %eax /* Load ip (1st parameter) */
- subl $MCOUNT_INSN_SIZE, %eax /* Adjust ip */
- movl 0x4(%ebp), %edx /* Load parent ip (2nd parameter) */
- movl function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
- pushl %esp /* Save pt_regs as 4th parameter */
+ pushl 4(%esp) /* save return ip into ip slot */
+
+ pushl $0 /* Load 0 into orig_ax */
+ pushl %gs
+ pushl %fs
+ pushl %es
+ pushl %ds
+ pushl %eax
+ pushl %ebp
+ pushl %edi
+ pushl %esi
+ pushl %edx
+ pushl %ecx
+ pushl %ebx
+
+ movl 13*4(%esp), %eax /* Get the saved flags */
+ movl %eax, 14*4(%esp) /* Move saved flags into regs->flags location */
+ /* clobbering return ip */
+ movl $__KERNEL_CS, 13*4(%esp)
+
+ movl 12*4(%esp), %eax /* Load ip (1st parameter) */
+ subl $MCOUNT_INSN_SIZE, %eax /* Adjust ip */
+ movl 0x4(%ebp), %edx /* Load parent ip (2nd parameter) */
+ movl function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
+ pushl %esp /* Save pt_regs as 4th parameter */

GLOBAL(ftrace_regs_call)
- call ftrace_stub
-
- addl $4, %esp /* Skip pt_regs */
- movl 14*4(%esp), %eax /* Move flags back into cs */
- movl %eax, 13*4(%esp) /* Needed to keep addl from modifying flags */
- movl 12*4(%esp), %eax /* Get return ip from regs->ip */
- movl %eax, 14*4(%esp) /* Put return ip back for ret */
-
- popl %ebx
- popl %ecx
- popl %edx
- popl %esi
- popl %edi
- popl %ebp
- popl %eax
- popl %ds
- popl %es
- popl %fs
- popl %gs
- addl $8, %esp /* Skip orig_ax and ip */
- popf /* Pop flags at end (no addl to corrupt flags) */
- jmp ftrace_ret
+ call ftrace_stub
+
+ addl $4, %esp /* Skip pt_regs */
+ movl 14*4(%esp), %eax /* Move flags back into cs */
+ movl %eax, 13*4(%esp) /* Needed to keep addl from modifying flags */
+ movl 12*4(%esp), %eax /* Get return ip from regs->ip */
+ movl %eax, 14*4(%esp) /* Put return ip back for ret */
+
+ popl %ebx
+ popl %ecx
+ popl %edx
+ popl %esi
+ popl %edi
+ popl %ebp
+ popl %eax
+ popl %ds
+ popl %es
+ popl %fs
+ popl %gs
+ addl $8, %esp /* Skip orig_ax and ip */
+ popf /* Pop flags at end (no addl to corrupt flags) */
+ jmp ftrace_ret

popf
- jmp ftrace_stub
+ jmp ftrace_stub
#else /* ! CONFIG_DYNAMIC_FTRACE */

ENTRY(mcount)
- cmpl $__PAGE_OFFSET, %esp
- jb ftrace_stub /* Paging not enabled yet? */
+ cmpl $__PAGE_OFFSET, %esp
+ jb ftrace_stub /* Paging not enabled yet? */

- cmpl $ftrace_stub, ftrace_trace_function
- jnz trace
+ cmpl $ftrace_stub, ftrace_trace_function
+ jnz trace
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- cmpl $ftrace_stub, ftrace_graph_return
- jnz ftrace_graph_caller
+ cmpl $ftrace_stub, ftrace_graph_return
+ jnz ftrace_graph_caller

- cmpl $ftrace_graph_entry_stub, ftrace_graph_entry
- jnz ftrace_graph_caller
+ cmpl $ftrace_graph_entry_stub, ftrace_graph_entry
+ jnz ftrace_graph_caller
#endif
.globl ftrace_stub
ftrace_stub:
@@ -1022,92 +1021,92 @@ ENTRY(mcount)

/* taken from glibc */
trace:
- pushl %eax
- pushl %ecx
- pushl %edx
- movl 0xc(%esp), %eax
- movl 0x4(%ebp), %edx
- subl $MCOUNT_INSN_SIZE, %eax
-
- call *ftrace_trace_function
-
- popl %edx
- popl %ecx
- popl %eax
- jmp ftrace_stub
+ pushl %eax
+ pushl %ecx
+ pushl %edx
+ movl 0xc(%esp), %eax
+ movl 0x4(%ebp), %edx
+ subl $MCOUNT_INSN_SIZE, %eax
+
+ call *ftrace_trace_function
+
+ popl %edx
+ popl %ecx
+ popl %eax
+ jmp ftrace_stub
END(mcount)
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_FUNCTION_TRACER */

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
ENTRY(ftrace_graph_caller)
- pushl %eax
- pushl %ecx
- pushl %edx
- movl 0xc(%esp), %eax
- lea 0x4(%ebp), %edx
- movl (%ebp), %ecx
- subl $MCOUNT_INSN_SIZE, %eax
- call prepare_ftrace_return
- popl %edx
- popl %ecx
- popl %eax
+ pushl %eax
+ pushl %ecx
+ pushl %edx
+ movl 0xc(%esp), %eax
+ lea 0x4(%ebp), %edx
+ movl (%ebp), %ecx
+ subl $MCOUNT_INSN_SIZE, %eax
+ call prepare_ftrace_return
+ popl %edx
+ popl %ecx
+ popl %eax
ret
END(ftrace_graph_caller)

.globl return_to_handler
return_to_handler:
- pushl %eax
- pushl %edx
- movl %ebp, %eax
- call ftrace_return_to_handler
- movl %eax, %ecx
- popl %edx
- popl %eax
- jmp *%ecx
+ pushl %eax
+ pushl %edx
+ movl %ebp, %eax
+ call ftrace_return_to_handler
+ movl %eax, %ecx
+ popl %edx
+ popl %eax
+ jmp *%ecx
#endif

#ifdef CONFIG_TRACING
ENTRY(trace_page_fault)
ASM_CLAC
- pushl $trace_do_page_fault
- jmp error_code
+ pushl $trace_do_page_fault
+ jmp error_code
END(trace_page_fault)
#endif

ENTRY(page_fault)
ASM_CLAC
- pushl $do_page_fault
+ pushl $do_page_fault
ALIGN
error_code:
/* the function address is in %gs's slot on the stack */
- pushl %fs
- pushl %es
- pushl %ds
- pushl %eax
- pushl %ebp
- pushl %edi
- pushl %esi
- pushl %edx
- pushl %ecx
- pushl %ebx
+ pushl %fs
+ pushl %es
+ pushl %ds
+ pushl %eax
+ pushl %ebp
+ pushl %edi
+ pushl %esi
+ pushl %edx
+ pushl %ecx
+ pushl %ebx
cld
- movl $(__KERNEL_PERCPU), %ecx
- movl %ecx, %fs
+ movl $(__KERNEL_PERCPU), %ecx
+ movl %ecx, %fs
UNWIND_ESPFIX_STACK
GS_TO_REG %ecx
- movl PT_GS(%esp), %edi # get the function address
- movl PT_ORIG_EAX(%esp), %edx # get the error code
- movl $-1, PT_ORIG_EAX(%esp) # no syscall to restart
+ movl PT_GS(%esp), %edi # get the function address
+ movl PT_ORIG_EAX(%esp), %edx # get the error code
+ movl $-1, PT_ORIG_EAX(%esp) # no syscall to restart
REG_TO_PTGS %ecx
SET_KERNEL_GS %ecx
- movl $(__USER_DS), %ecx
- movl %ecx, %ds
- movl %ecx, %es
+ movl $(__USER_DS), %ecx
+ movl %ecx, %ds
+ movl %ecx, %es
TRACE_IRQS_OFF
- movl %esp,%eax # pt_regs pointer
- call *%edi
- jmp ret_from_exception
+ movl %esp, %eax # pt_regs pointer
+ call *%edi
+ jmp ret_from_exception
END(page_fault)

/*
@@ -1124,28 +1123,28 @@ END(page_fault)
* the instruction that would have done it for sysenter.
*/
.macro FIX_STACK offset ok label
- cmpw $__KERNEL_CS, 4(%esp)
- jne \ok
+ cmpw $__KERNEL_CS, 4(%esp)
+ jne \ok
\label:
- movl TSS_sysenter_sp0 + \offset(%esp), %esp
+ movl TSS_sysenter_sp0 + \offset(%esp), %esp
pushfl
- pushl $__KERNEL_CS
- pushl $sysenter_past_esp
+ pushl $__KERNEL_CS
+ pushl $sysenter_past_esp
.endm

ENTRY(debug)
ASM_CLAC
- cmpl $entry_SYSENTER_32,(%esp)
- jne debug_stack_correct
+ cmpl $entry_SYSENTER_32, (%esp)
+ jne debug_stack_correct
FIX_STACK 12, debug_stack_correct, debug_esp_fix_insn
debug_stack_correct:
- pushl $-1 # mark this as an int
+ pushl $-1 # mark this as an int
SAVE_ALL
TRACE_IRQS_OFF
- xorl %edx,%edx # error code 0
- movl %esp,%eax # pt_regs pointer
- call do_debug
- jmp ret_from_exception
+ xorl %edx, %edx # error code 0
+ movl %esp, %eax # pt_regs pointer
+ call do_debug
+ jmp ret_from_exception
END(debug)

/*
@@ -1159,91 +1158,91 @@ END(debug)
ENTRY(nmi)
ASM_CLAC
#ifdef CONFIG_X86_ESPFIX32
- pushl %eax
- movl %ss, %eax
- cmpw $__ESPFIX_SS, %ax
- popl %eax
- je nmi_espfix_stack
+ pushl %eax
+ movl %ss, %eax
+ cmpw $__ESPFIX_SS, %ax
+ popl %eax
+ je nmi_espfix_stack
#endif
- cmpl $entry_SYSENTER_32,(%esp)
- je nmi_stack_fixup
- pushl %eax
- movl %esp,%eax
- /* Do not access memory above the end of our stack page,
+ cmpl $entry_SYSENTER_32, (%esp)
+ je nmi_stack_fixup
+ pushl %eax
+ movl %esp, %eax
+ /*
+ * Do not access memory above the end of our stack page,
* it might not exist.
*/
- andl $(THREAD_SIZE-1),%eax
- cmpl $(THREAD_SIZE-20),%eax
- popl %eax
- jae nmi_stack_correct
- cmpl $entry_SYSENTER_32,12(%esp)
- je nmi_debug_stack_check
+ andl $(THREAD_SIZE-1), %eax
+ cmpl $(THREAD_SIZE-20), %eax
+ popl %eax
+ jae nmi_stack_correct
+ cmpl $entry_SYSENTER_32, 12(%esp)
+ je nmi_debug_stack_check
nmi_stack_correct:
- pushl %eax
+ pushl %eax
SAVE_ALL
- xorl %edx,%edx # zero error code
- movl %esp,%eax # pt_regs pointer
- call do_nmi
- jmp restore_all_notrace
+ xorl %edx, %edx # zero error code
+ movl %esp, %eax # pt_regs pointer
+ call do_nmi
+ jmp restore_all_notrace

nmi_stack_fixup:
FIX_STACK 12, nmi_stack_correct, 1
- jmp nmi_stack_correct
+ jmp nmi_stack_correct

nmi_debug_stack_check:
- cmpw $__KERNEL_CS,16(%esp)
- jne nmi_stack_correct
- cmpl $debug,(%esp)
- jb nmi_stack_correct
- cmpl $debug_esp_fix_insn,(%esp)
- ja nmi_stack_correct
+ cmpw $__KERNEL_CS, 16(%esp)
+ jne nmi_stack_correct
+ cmpl $debug, (%esp)
+ jb nmi_stack_correct
+ cmpl $debug_esp_fix_insn, (%esp)
+ ja nmi_stack_correct
FIX_STACK 24, nmi_stack_correct, 1
- jmp nmi_stack_correct
+ jmp nmi_stack_correct

#ifdef CONFIG_X86_ESPFIX32
nmi_espfix_stack:
/*
* create the pointer to lss back
*/
- pushl %ss
- pushl %esp
- addl $4, (%esp)
+ pushl %ss
+ pushl %esp
+ addl $4, (%esp)
/* copy the iret frame of 12 bytes */
.rept 3
- pushl 16(%esp)
+ pushl 16(%esp)
.endr
- pushl %eax
+ pushl %eax
SAVE_ALL
- FIXUP_ESPFIX_STACK # %eax == %esp
- xorl %edx,%edx # zero error code
- call do_nmi
+ FIXUP_ESPFIX_STACK # %eax == %esp
+ xorl %edx, %edx # zero error code
+ call do_nmi
RESTORE_REGS
- lss 12+4(%esp), %esp # back to espfix stack
- jmp irq_return
+ lss 12+4(%esp), %esp # back to espfix stack
+ jmp irq_return
#endif
END(nmi)

ENTRY(int3)
ASM_CLAC
- pushl $-1 # mark this as an int
+ pushl $-1 # mark this as an int
SAVE_ALL
TRACE_IRQS_OFF
- xorl %edx,%edx # zero error code
- movl %esp,%eax # pt_regs pointer
- call do_int3
- jmp ret_from_exception
+ xorl %edx, %edx # zero error code
+ movl %esp, %eax # pt_regs pointer
+ call do_int3
+ jmp ret_from_exception
END(int3)

ENTRY(general_protection)
- pushl $do_general_protection
- jmp error_code
+ pushl $do_general_protection
+ jmp error_code
END(general_protection)

#ifdef CONFIG_KVM_GUEST
ENTRY(async_page_fault)
ASM_CLAC
- pushl $do_async_page_fault
- jmp error_code
+ pushl $do_async_page_fault
+ jmp error_code
END(async_page_fault)
#endif
-
--
2.1.4
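For the record, the "No code changed" claim made for formatting-only patches like this one is typically backed by disassembling the object file before and after the patch and comparing checksums. A minimal sketch of that check (the file contents below are illustrative stand-ins for real `objdump -d` output of entry_32.o / entry_64.o):

```shell
# Sketch of the "no code changed" verification for a whitespace-only
# cleanup. In the real workflow, the two .asm files would be produced
# by: objdump -d arch/x86/entry/entry_64.o > entry_64.o.{before,after}.asm
# (file names and contents here are illustrative placeholders).
printf 'mov %%eax, %%ebx\n' > entry_64.o.before.asm
printf 'mov %%eax, %%ebx\n' > entry_64.o.after.asm

sum_before=$(md5sum < entry_64.o.before.asm)
sum_after=$(md5sum < entry_64.o.after.asm)

if [ "$sum_before" = "$sum_after" ]; then
    result="no code changed"       # identical disassembly: pure reformat
else
    result="object code differs"   # the patch was not formatting-only
fi
echo "$result"
```

The same comparison is what produces the `md5:` lines quoted in these changelogs.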

2015-06-08 08:47:50

by Borislav Petkov

Subject: Re: [PATCH 1/4] x86/asm/entry: Rename compat syscall entry points

On Mon, Jun 08, 2015 at 10:34:58AM +0200, Ingo Molnar wrote:
> Rename the following system call entry points:
>
> ia32_cstar_target -> entry_SYSCALL_compat
> ia32_syscall -> entry_INT80_compat
>
> The generic naming scheme for x86 system call entry points is:
>
> entry_MNEMONIC_qualifier
>
> where 'qualifier' is one of _32, _64 or _compat.
>
> Cc: Andy Lutomirski <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Brian Gerst <[email protected]>
> Cc: Denys Vlasenko <[email protected]>
> Cc: H. Peter Anvin <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: [email protected]
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> Documentation/x86/entry_64.txt | 4 ++--
> arch/x86/entry/entry_64_compat.S | 8 ++++----
> arch/x86/entry/syscall_32.c | 6 +++---
> arch/x86/include/asm/proto.h | 4 ++--
> arch/x86/kernel/asm-offsets_64.c | 2 +-
> arch/x86/kernel/cpu/common.c | 2 +-
> arch/x86/kernel/traps.c | 2 +-
> arch/x86/xen/xen-asm_64.S | 2 +-
> 8 files changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/Documentation/x86/entry_64.txt b/Documentation/x86/entry_64.txt
> index 9132b86176a3..33884d156125 100644
> --- a/Documentation/x86/entry_64.txt
> +++ b/Documentation/x86/entry_64.txt
> @@ -18,10 +18,10 @@ The IDT vector assignments are listed in arch/x86/include/asm/irq_vectors.h.
>
> - system_call: syscall instruction from 64-bit code.
>
> - - ia32_syscall: int 0x80 from 32-bit or 64-bit code; compat syscall
> + - entry_INT80_compat: int 0x80 from 32-bit or 64-bit code; compat syscall
> either way.

Haha, with the new naming scheme you don't even need the text anymore.
entry_INT80_compat is already enough.

:-)

Looks like an improvement to me, at a quick glance.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-06-08 13:15:05

by Denys Vlasenko

Subject: Re: [PATCH 4/4] x86/asm/entry/32: Clean up entry_32.S

On 06/08/2015 10:35 AM, Ingo Molnar wrote:
> Make the 32-bit syscall entry code a bit more readable:
>
> - use consistent assembly coding style similar to entry_64.S

Well, entry_64.S does not have consistent assembly coding style -
you already reformatted entry_64_compat.S, not entry_64.S ;)

Reformatting entry_64.S too would be great. I can send a patch,
just ask.

2015-06-08 18:51:22

by Ingo Molnar

Subject: [PATCH] x86/asm/entry/64: Clean up entry_64.S


* Denys Vlasenko <[email protected]> wrote:

> On 06/08/2015 10:35 AM, Ingo Molnar wrote:
> > Make the 32-bit syscall entry code a bit more readable:
> >
> > - use consistent assembly coding style similar to entry_64.S
>
> Well, entry_64.S does not have consistent assembly coding style - you already
> reformatted entry_64_compat.S, not entry_64.S ;)

Yeah, so I remembered entry_64.S as the 'clean' entry code file, to be used as
reference.

Turns out I was wrong, it has some work left as well! :-)

> Reformatting entry_64.S too would be great. I can send a patch, just ask.

No need, still had it all in muscle memory so I just did it - see the attached
patch. Lightly tested.

Thanks,

Ingo

=========================>
From 4d7321381e5c7102a3d3faf0a0a0035a09619612 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <[email protected]>
Date: Mon, 8 Jun 2015 20:43:07 +0200
Subject: [PATCH] x86/asm/entry/64: Clean up entry_64.S

Make the 64-bit syscall entry code a bit more readable:

- use consistent assembly coding style similar to the other entry_*.S files

- remove old comments that are not true anymore

- eliminate whitespace noise

- use consistent vertical spacing

- fix various comments

- reorganize entry point generation tables to be more readable

No code changed:

# arch/x86/entry/entry_64.o:

text data bss dec hex filename
12282 0 0 12282 2ffa entry_64.o.before
12282 0 0 12282 2ffa entry_64.o.after

md5:
cbab1f2d727a2a8a87618eeb79f391b7 entry_64.o.before.asm
cbab1f2d727a2a8a87618eeb79f391b7 entry_64.o.after.asm
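
The "No code changed" check above is the usual way to validate a pure
cleanup: disassemble the object file before and after the patch and
compare checksums. A minimal sketch of the idea (the file names and the
stand-in "disassembly" below are illustrative, not taken from the
patch; a real run would use objdump on the two .o files):

```shell
# Real workflow (assumed): build, save a disassembly, apply the patch,
# rebuild, save another disassembly, then compare:
#   objdump -d entry_64.o > entry_64.o.before.asm
#   ... apply patch, rebuild ...
#   objdump -d entry_64.o > entry_64.o.after.asm
#
# Simulated here with two identical stand-in disassembly files:
printf 'mov %%rsp, %%rbp\nret\n' > entry_64.o.before.asm
printf 'mov %%rsp, %%rbp\nret\n' > entry_64.o.after.asm

# Print both checksums, as shown in the changelog:
md5sum entry_64.o.before.asm entry_64.o.after.asm

# A whitespace/comment-only cleanup must leave them identical:
if cmp -s entry_64.o.before.asm entry_64.o.after.asm; then
	echo "no code changed"
fi
```

Comparing disassembly rather than the raw .o bytes avoids false
mismatches from metadata such as timestamps or build IDs embedded in
the object file.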

Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/entry/entry_64.S | 820 +++++++++++++++++++++++-----------------------
1 file changed, 404 insertions(+), 416 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d2a0ed211bed..bd97161f90cb 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -4,26 +4,20 @@
* Copyright (C) 1991, 1992 Linus Torvalds
* Copyright (C) 2000, 2001, 2002 Andi Kleen SuSE Labs
* Copyright (C) 2000 Pavel Machek <[email protected]>
- */
-
-/*
+ *
* entry.S contains the system-call and fault low-level handling routines.
*
* Some of this is documented in Documentation/x86/entry_64.txt
*
- * NOTE: This code handles signal-recognition, which happens every time
- * after an interrupt and after each system call.
- *
* A note on terminology:
- * - iret frame: Architecture defined interrupt frame from SS to RIP
- * at the top of the kernel process stack.
+ * - iret frame: Architecture defined interrupt frame from SS to RIP
+ * at the top of the kernel process stack.
*
* Some macro usage:
- * - ENTRY/END Define functions in the symbol table.
- * - TRACE_IRQ_* - Trace hard interrupt state for lock debugging.
- * - idtentry - Define exception entry points.
+ * - ENTRY/END: Define functions in the symbol table.
+ * - TRACE_IRQ_*: Trace hardirq state for lock debugging.
+ * - idtentry: Define exception entry points.
*/
-
#include <linux/linkage.h>
#include <asm/segment.h>
#include <asm/cache.h>
@@ -46,13 +40,12 @@

/* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this. */
#include <linux/elf-em.h>
-#define AUDIT_ARCH_X86_64 (EM_X86_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
-#define __AUDIT_ARCH_64BIT 0x80000000
-#define __AUDIT_ARCH_LE 0x40000000
-
- .code64
- .section .entry.text, "ax"
+#define AUDIT_ARCH_X86_64 (EM_X86_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
+#define __AUDIT_ARCH_64BIT 0x80000000
+#define __AUDIT_ARCH_LE 0x40000000

+.code64
+.section .entry.text, "ax"

#ifdef CONFIG_PARAVIRT
ENTRY(native_usergs_sysret64)
@@ -61,11 +54,10 @@ ENTRY(native_usergs_sysret64)
ENDPROC(native_usergs_sysret64)
#endif /* CONFIG_PARAVIRT */

-
.macro TRACE_IRQS_IRETQ
#ifdef CONFIG_TRACE_IRQFLAGS
- bt $9,EFLAGS(%rsp) /* interrupts off? */
- jnc 1f
+ bt $9, EFLAGS(%rsp) /* interrupts off? */
+ jnc 1f
TRACE_IRQS_ON
1:
#endif
@@ -85,34 +77,34 @@ ENDPROC(native_usergs_sysret64)
#if defined(CONFIG_DYNAMIC_FTRACE) && defined(CONFIG_TRACE_IRQFLAGS)

.macro TRACE_IRQS_OFF_DEBUG
- call debug_stack_set_zero
+ call debug_stack_set_zero
TRACE_IRQS_OFF
- call debug_stack_reset
+ call debug_stack_reset
.endm

.macro TRACE_IRQS_ON_DEBUG
- call debug_stack_set_zero
+ call debug_stack_set_zero
TRACE_IRQS_ON
- call debug_stack_reset
+ call debug_stack_reset
.endm

.macro TRACE_IRQS_IRETQ_DEBUG
- bt $9,EFLAGS(%rsp) /* interrupts off? */
- jnc 1f
+ bt $9, EFLAGS(%rsp) /* interrupts off? */
+ jnc 1f
TRACE_IRQS_ON_DEBUG
1:
.endm

#else
-# define TRACE_IRQS_OFF_DEBUG TRACE_IRQS_OFF
-# define TRACE_IRQS_ON_DEBUG TRACE_IRQS_ON
-# define TRACE_IRQS_IRETQ_DEBUG TRACE_IRQS_IRETQ
+# define TRACE_IRQS_OFF_DEBUG TRACE_IRQS_OFF
+# define TRACE_IRQS_ON_DEBUG TRACE_IRQS_ON
+# define TRACE_IRQS_IRETQ_DEBUG TRACE_IRQS_IRETQ
#endif

/*
- * 64bit SYSCALL instruction entry. Up to 6 arguments in registers.
+ * 64-bit SYSCALL instruction entry. Up to 6 arguments in registers.
*
- * 64bit SYSCALL saves rip to rcx, clears rflags.RF, then saves rflags to r11,
+ * 64-bit SYSCALL saves rip to rcx, clears rflags.RF, then saves rflags to r11,
* then loads new ss, cs, and rip from previously programmed MSRs.
* rflags gets masked by a value from another MSR (so CLD and CLAC
* are not needed). SYSCALL does not save anything on the stack
@@ -128,7 +120,7 @@ ENDPROC(native_usergs_sysret64)
* r10 arg3 (needs to be moved to rcx to conform to C ABI)
* r8 arg4
* r9 arg5
- * (note: r12-r15,rbp,rbx are callee-preserved in C ABI)
+ * (note: r12-r15, rbp, rbx are callee-preserved in C ABI)
*
* Only called from user space.
*
@@ -151,12 +143,12 @@ ENTRY(entry_SYSCALL_64)
*/
GLOBAL(entry_SYSCALL_64_after_swapgs)

- movq %rsp,PER_CPU_VAR(rsp_scratch)
- movq PER_CPU_VAR(cpu_current_top_of_stack),%rsp
+ movq %rsp, PER_CPU_VAR(rsp_scratch)
+ movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp

/* Construct struct pt_regs on stack */
- pushq $__USER_DS /* pt_regs->ss */
- pushq PER_CPU_VAR(rsp_scratch) /* pt_regs->sp */
+ pushq $__USER_DS /* pt_regs->ss */
+ pushq PER_CPU_VAR(rsp_scratch) /* pt_regs->sp */
/*
* Re-enable interrupts.
* We use 'rsp_scratch' as a scratch space, hence irq-off block above
@@ -165,34 +157,34 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
* with using rsp_scratch:
*/
ENABLE_INTERRUPTS(CLBR_NONE)
- pushq %r11 /* pt_regs->flags */
- pushq $__USER_CS /* pt_regs->cs */
- pushq %rcx /* pt_regs->ip */
- pushq %rax /* pt_regs->orig_ax */
- pushq %rdi /* pt_regs->di */
- pushq %rsi /* pt_regs->si */
- pushq %rdx /* pt_regs->dx */
- pushq %rcx /* pt_regs->cx */
- pushq $-ENOSYS /* pt_regs->ax */
- pushq %r8 /* pt_regs->r8 */
- pushq %r9 /* pt_regs->r9 */
- pushq %r10 /* pt_regs->r10 */
- pushq %r11 /* pt_regs->r11 */
- sub $(6*8),%rsp /* pt_regs->bp,bx,r12-15 not saved */
-
- testl $_TIF_WORK_SYSCALL_ENTRY, ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
- jnz tracesys
+ pushq %r11 /* pt_regs->flags */
+ pushq $__USER_CS /* pt_regs->cs */
+ pushq %rcx /* pt_regs->ip */
+ pushq %rax /* pt_regs->orig_ax */
+ pushq %rdi /* pt_regs->di */
+ pushq %rsi /* pt_regs->si */
+ pushq %rdx /* pt_regs->dx */
+ pushq %rcx /* pt_regs->cx */
+ pushq $-ENOSYS /* pt_regs->ax */
+ pushq %r8 /* pt_regs->r8 */
+ pushq %r9 /* pt_regs->r9 */
+ pushq %r10 /* pt_regs->r10 */
+ pushq %r11 /* pt_regs->r11 */
+ sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */
+
+ testl $_TIF_WORK_SYSCALL_ENTRY, ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
+ jnz tracesys
entry_SYSCALL_64_fastpath:
#if __SYSCALL_MASK == ~0
- cmpq $__NR_syscall_max,%rax
+ cmpq $__NR_syscall_max, %rax
#else
- andl $__SYSCALL_MASK,%eax
- cmpl $__NR_syscall_max,%eax
+ andl $__SYSCALL_MASK, %eax
+ cmpl $__NR_syscall_max, %eax
#endif
- ja 1f /* return -ENOSYS (already in pt_regs->ax) */
- movq %r10,%rcx
- call *sys_call_table(,%rax,8)
- movq %rax,RAX(%rsp)
+ ja 1f /* return -ENOSYS (already in pt_regs->ax) */
+ movq %r10, %rcx
+ call *sys_call_table(, %rax, 8)
+ movq %rax, RAX(%rsp)
1:
/*
* Syscall return path ending with SYSRET (fast path).
@@ -213,15 +205,15 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
* flags (TIF_NOTIFY_RESUME, TIF_USER_RETURN_NOTIFY, etc) set is
* very bad.
*/
- testl $_TIF_ALLWORK_MASK, ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
- jnz int_ret_from_sys_call_irqs_off /* Go to the slow path */
+ testl $_TIF_ALLWORK_MASK, ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
+ jnz int_ret_from_sys_call_irqs_off /* Go to the slow path */

RESTORE_C_REGS_EXCEPT_RCX_R11
- movq RIP(%rsp),%rcx
- movq EFLAGS(%rsp),%r11
- movq RSP(%rsp),%rsp
+ movq RIP(%rsp), %rcx
+ movq EFLAGS(%rsp), %r11
+ movq RSP(%rsp), %rsp
/*
- * 64bit SYSRET restores rip from rcx,
+ * 64-bit SYSRET restores rip from rcx,
* rflags from r11 (but RF and VM bits are forced to 0),
* cs and ss are loaded from MSRs.
* Restoration of rflags re-enables interrupts.
@@ -239,21 +231,21 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)

/* Do syscall entry tracing */
tracesys:
- movq %rsp, %rdi
- movl $AUDIT_ARCH_X86_64, %esi
- call syscall_trace_enter_phase1
- test %rax, %rax
- jnz tracesys_phase2 /* if needed, run the slow path */
- RESTORE_C_REGS_EXCEPT_RAX /* else restore clobbered regs */
- movq ORIG_RAX(%rsp), %rax
- jmp entry_SYSCALL_64_fastpath /* and return to the fast path */
+ movq %rsp, %rdi
+ movl $AUDIT_ARCH_X86_64, %esi
+ call syscall_trace_enter_phase1
+ test %rax, %rax
+ jnz tracesys_phase2 /* if needed, run the slow path */
+ RESTORE_C_REGS_EXCEPT_RAX /* else restore clobbered regs */
+ movq ORIG_RAX(%rsp), %rax
+ jmp entry_SYSCALL_64_fastpath /* and return to the fast path */

tracesys_phase2:
SAVE_EXTRA_REGS
- movq %rsp, %rdi
- movl $AUDIT_ARCH_X86_64, %esi
- movq %rax,%rdx
- call syscall_trace_enter_phase2
+ movq %rsp, %rdi
+ movl $AUDIT_ARCH_X86_64, %esi
+ movq %rax, %rdx
+ call syscall_trace_enter_phase2

/*
* Reload registers from stack in case ptrace changed them.
@@ -263,15 +255,15 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
RESTORE_C_REGS_EXCEPT_RAX
RESTORE_EXTRA_REGS
#if __SYSCALL_MASK == ~0
- cmpq $__NR_syscall_max,%rax
+ cmpq $__NR_syscall_max, %rax
#else
- andl $__SYSCALL_MASK,%eax
- cmpl $__NR_syscall_max,%eax
+ andl $__SYSCALL_MASK, %eax
+ cmpl $__NR_syscall_max, %eax
#endif
- ja 1f /* return -ENOSYS (already in pt_regs->ax) */
- movq %r10,%rcx /* fixup for C */
- call *sys_call_table(,%rax,8)
- movq %rax,RAX(%rsp)
+ ja 1f /* return -ENOSYS (already in pt_regs->ax) */
+ movq %r10, %rcx /* fixup for C */
+ call *sys_call_table(, %rax, 8)
+ movq %rax, RAX(%rsp)
1:
/* Use IRET because user could have changed pt_regs->foo */

@@ -283,31 +275,33 @@ GLOBAL(int_ret_from_sys_call)
DISABLE_INTERRUPTS(CLBR_NONE)
int_ret_from_sys_call_irqs_off: /* jumps come here from the irqs-off SYSRET path */
TRACE_IRQS_OFF
- movl $_TIF_ALLWORK_MASK,%edi
+ movl $_TIF_ALLWORK_MASK, %edi
/* edi: mask to check */
GLOBAL(int_with_check)
LOCKDEP_SYS_EXIT_IRQ
GET_THREAD_INFO(%rcx)
- movl TI_flags(%rcx),%edx
- andl %edi,%edx
- jnz int_careful
- andl $~TS_COMPAT,TI_status(%rcx)
+ movl TI_flags(%rcx), %edx
+ andl %edi, %edx
+ jnz int_careful
+ andl $~TS_COMPAT, TI_status(%rcx)
jmp syscall_return

- /* Either reschedule or signal or syscall exit tracking needed. */
- /* First do a reschedule test. */
- /* edx: work, edi: workmask */
+ /*
+ * Either reschedule or signal or syscall exit tracking needed.
+ * First do a reschedule test.
+ * edx: work, edi: workmask
+ */
int_careful:
- bt $TIF_NEED_RESCHED,%edx
- jnc int_very_careful
+ bt $TIF_NEED_RESCHED, %edx
+ jnc int_very_careful
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
- pushq %rdi
+ pushq %rdi
SCHEDULE_USER
- popq %rdi
+ popq %rdi
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
- jmp int_with_check
+ jmp int_with_check

/* handle signals and tracing -- both require a full pt_regs */
int_very_careful:
@@ -315,27 +309,27 @@ GLOBAL(int_with_check)
ENABLE_INTERRUPTS(CLBR_NONE)
SAVE_EXTRA_REGS
/* Check for syscall exit trace */
- testl $_TIF_WORK_SYSCALL_EXIT,%edx
- jz int_signal
- pushq %rdi
- leaq 8(%rsp),%rdi # &ptregs -> arg1
- call syscall_trace_leave
- popq %rdi
- andl $~(_TIF_WORK_SYSCALL_EXIT|_TIF_SYSCALL_EMU),%edi
- jmp int_restore_rest
+ testl $_TIF_WORK_SYSCALL_EXIT, %edx
+ jz int_signal
+ pushq %rdi
+ leaq 8(%rsp), %rdi /* &ptregs -> arg1 */
+ call syscall_trace_leave
+ popq %rdi
+ andl $~(_TIF_WORK_SYSCALL_EXIT|_TIF_SYSCALL_EMU), %edi
+ jmp int_restore_rest

int_signal:
- testl $_TIF_DO_NOTIFY_MASK,%edx
- jz 1f
- movq %rsp,%rdi # &ptregs -> arg1
- xorl %esi,%esi # oldset -> arg2
- call do_notify_resume
-1: movl $_TIF_WORK_MASK,%edi
+ testl $_TIF_DO_NOTIFY_MASK, %edx
+ jz 1f
+ movq %rsp, %rdi /* &ptregs -> arg1 */
+ xorl %esi, %esi /* oldset -> arg2 */
+ call do_notify_resume
+1: movl $_TIF_WORK_MASK, %edi
int_restore_rest:
RESTORE_EXTRA_REGS
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
- jmp int_with_check
+ jmp int_with_check

syscall_return:
/* The IRETQ could re-enable interrupts: */
@@ -346,10 +340,10 @@ GLOBAL(int_with_check)
* Try to use SYSRET instead of IRET if we're returning to
* a completely clean 64-bit userspace context.
*/
- movq RCX(%rsp),%rcx
- movq RIP(%rsp),%r11
- cmpq %rcx,%r11 /* RCX == RIP */
- jne opportunistic_sysret_failed
+ movq RCX(%rsp), %rcx
+ movq RIP(%rsp), %r11
+ cmpq %rcx, %r11 /* RCX == RIP */
+ jne opportunistic_sysret_failed

/*
* On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
@@ -362,19 +356,21 @@ GLOBAL(int_with_check)
.ifne __VIRTUAL_MASK_SHIFT - 47
.error "virtual address width changed -- SYSRET checks need update"
.endif
+
/* Change top 16 bits to be the sign-extension of 47th bit */
shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
+
/* If this changed %rcx, it was not canonical */
cmpq %rcx, %r11
jne opportunistic_sysret_failed

- cmpq $__USER_CS,CS(%rsp) /* CS must match SYSRET */
- jne opportunistic_sysret_failed
+ cmpq $__USER_CS, CS(%rsp) /* CS must match SYSRET */
+ jne opportunistic_sysret_failed

- movq R11(%rsp),%r11
- cmpq %r11,EFLAGS(%rsp) /* R11 == RFLAGS */
- jne opportunistic_sysret_failed
+ movq R11(%rsp), %r11
+ cmpq %r11, EFLAGS(%rsp) /* R11 == RFLAGS */
+ jne opportunistic_sysret_failed

/*
* SYSRET can't restore RF. SYSRET can restore TF, but unlike IRET,
@@ -383,29 +379,29 @@ GLOBAL(int_with_check)
* with register state that satisfies the opportunistic SYSRET
* conditions. For example, single-stepping this user code:
*
- * movq $stuck_here,%rcx
+ * movq $stuck_here, %rcx
* pushfq
* popq %r11
* stuck_here:
*
* would never get past 'stuck_here'.
*/
- testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
- jnz opportunistic_sysret_failed
+ testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
+ jnz opportunistic_sysret_failed

/* nothing to check for RSP */

- cmpq $__USER_DS,SS(%rsp) /* SS must match SYSRET */
- jne opportunistic_sysret_failed
+ cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */
+ jne opportunistic_sysret_failed

/*
- * We win! This label is here just for ease of understanding
- * perf profiles. Nothing jumps here.
+ * We win! This label is here just for ease of understanding
+ * perf profiles. Nothing jumps here.
*/
syscall_return_via_sysret:
/* rcx and r11 are already restored (see code above) */
RESTORE_C_REGS_EXCEPT_RCX_R11
- movq RSP(%rsp),%rsp
+ movq RSP(%rsp), %rsp
USERGS_SYSRET64

opportunistic_sysret_failed:
@@ -417,7 +413,7 @@ END(entry_SYSCALL_64)
.macro FORK_LIKE func
ENTRY(stub_\func)
SAVE_EXTRA_REGS 8
- jmp sys_\func
+ jmp sys_\func
END(stub_\func)
.endm

@@ -436,7 +432,7 @@ ENTRY(stub_execve)
/* must use IRET code path (pt_regs->cs may have changed) */
addq $8, %rsp
ZERO_EXTRA_REGS
- movq %rax,RAX(%rsp)
+ movq %rax, RAX(%rsp)
jmp int_ret_from_sys_call
END(stub_execve)
/*
@@ -479,19 +475,19 @@ ENTRY(stub_rt_sigreturn)
* we SAVE_EXTRA_REGS here.
*/
SAVE_EXTRA_REGS 8
- call sys_rt_sigreturn
+ call sys_rt_sigreturn
return_from_stub:
addq $8, %rsp
RESTORE_EXTRA_REGS
- movq %rax,RAX(%rsp)
- jmp int_ret_from_sys_call
+ movq %rax, RAX(%rsp)
+ jmp int_ret_from_sys_call
END(stub_rt_sigreturn)

#ifdef CONFIG_X86_X32_ABI
ENTRY(stub_x32_rt_sigreturn)
SAVE_EXTRA_REGS 8
- call sys32_x32_rt_sigreturn
- jmp return_from_stub
+ call sys32_x32_rt_sigreturn
+ jmp return_from_stub
END(stub_x32_rt_sigreturn)
#endif

@@ -502,16 +498,16 @@ END(stub_x32_rt_sigreturn)
*/
ENTRY(ret_from_fork)

- LOCK ; btr $TIF_FORK,TI_flags(%r8)
+ LOCK ; btr $TIF_FORK, TI_flags(%r8)

- pushq $0x0002
- popfq # reset kernel eflags
+ pushq $0x0002
+ popfq /* reset kernel eflags */

- call schedule_tail # rdi: 'prev' task parameter
+ call schedule_tail /* rdi: 'prev' task parameter */

RESTORE_EXTRA_REGS

- testb $3, CS(%rsp) # from kernel_thread?
+ testb $3, CS(%rsp) /* from kernel_thread? */

/*
* By the time we get here, we have no idea whether our pt_regs,
@@ -522,13 +518,15 @@ ENTRY(ret_from_fork)
*/
jnz int_ret_from_sys_call

- /* We came from kernel_thread */
- /* nb: we depend on RESTORE_EXTRA_REGS above */
- movq %rbp, %rdi
- call *%rbx
- movl $0, RAX(%rsp)
+ /*
+ * We came from kernel_thread
+ * nb: we depend on RESTORE_EXTRA_REGS above
+ */
+ movq %rbp, %rdi
+ call *%rbx
+ movl $0, RAX(%rsp)
RESTORE_EXTRA_REGS
- jmp int_ret_from_sys_call
+ jmp int_ret_from_sys_call
END(ret_from_fork)

/*
@@ -539,7 +537,7 @@ END(ret_from_fork)
ENTRY(irq_entries_start)
vector=FIRST_EXTERNAL_VECTOR
.rept (FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR)
- pushq $(~vector+0x80) /* Note: always in signed byte range */
+ pushq $(~vector+0x80) /* Note: always in signed byte range */
vector=vector+1
jmp common_interrupt
.align 8
@@ -569,7 +567,7 @@ END(irq_entries_start)
/* this goes to 0(%rsp) for unwinder, not for saving the value: */
SAVE_EXTRA_REGS_RBP -RBP

- leaq -RBP(%rsp),%rdi /* arg1 for \func (pointer to pt_regs) */
+ leaq -RBP(%rsp), %rdi /* arg1 for \func (pointer to pt_regs) */

testb $3, CS-RBP(%rsp)
jz 1f
@@ -582,14 +580,14 @@ END(irq_entries_start)
* a little cheaper to use a separate counter in the PDA (short of
* moving irq_enter into assembly, which would be too much work)
*/
- movq %rsp, %rsi
- incl PER_CPU_VAR(irq_count)
- cmovzq PER_CPU_VAR(irq_stack_ptr),%rsp
- pushq %rsi
+ movq %rsp, %rsi
+ incl PER_CPU_VAR(irq_count)
+ cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
+ pushq %rsi
/* We entered an interrupt context - irqs are off: */
TRACE_IRQS_OFF

- call \func
+ call \func
.endm

/*
@@ -599,36 +597,35 @@ END(irq_entries_start)
.p2align CONFIG_X86_L1_CACHE_SHIFT
common_interrupt:
ASM_CLAC
- addq $-0x80,(%rsp) /* Adjust vector to [-256,-1] range */
+ addq $-0x80, (%rsp) /* Adjust vector to [-256, -1] range */
interrupt do_IRQ
/* 0(%rsp): old RSP */
ret_from_intr:
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
- decl PER_CPU_VAR(irq_count)
+ decl PER_CPU_VAR(irq_count)

/* Restore saved previous stack */
- popq %rsi
+ popq %rsi
/* return code expects complete pt_regs - adjust rsp accordingly: */
- leaq -RBP(%rsi),%rsp
+ leaq -RBP(%rsi), %rsp

testb $3, CS(%rsp)
jz retint_kernel
/* Interrupt came from user space */
retint_user:
GET_THREAD_INFO(%rcx)
- /*
- * %rcx: thread info. Interrupts off.
- */
+
+ /* %rcx: thread info. Interrupts are off. */
retint_with_reschedule:
- movl $_TIF_WORK_MASK,%edi
+ movl $_TIF_WORK_MASK, %edi
retint_check:
LOCKDEP_SYS_EXIT_IRQ
- movl TI_flags(%rcx),%edx
- andl %edi,%edx
- jnz retint_careful
+ movl TI_flags(%rcx), %edx
+ andl %edi, %edx
+ jnz retint_careful

-retint_swapgs: /* return to user-space */
+retint_swapgs: /* return to user-space */
/*
* The iretq could re-enable interrupts:
*/
@@ -643,9 +640,9 @@ retint_swapgs: /* return to user-space */
#ifdef CONFIG_PREEMPT
/* Interrupts are off */
/* Check if we need preemption */
- bt $9,EFLAGS(%rsp) /* interrupts were off? */
+ bt $9, EFLAGS(%rsp) /* were interrupts off? */
jnc 1f
-0: cmpl $0,PER_CPU_VAR(__preempt_count)
+0: cmpl $0, PER_CPU_VAR(__preempt_count)
jnz 1f
call preempt_schedule_irq
jmp 0b
@@ -671,8 +668,8 @@ ENTRY(native_iret)
* 64-bit mode SS:RSP on the exception stack is always valid.
*/
#ifdef CONFIG_X86_ESPFIX64
- testb $4,(SS-RIP)(%rsp)
- jnz native_irq_return_ldt
+ testb $4, (SS-RIP)(%rsp)
+ jnz native_irq_return_ldt
#endif

.global native_irq_return_iret
@@ -687,59 +684,59 @@ ENTRY(native_iret)

#ifdef CONFIG_X86_ESPFIX64
native_irq_return_ldt:
- pushq %rax
- pushq %rdi
+ pushq %rax
+ pushq %rdi
SWAPGS
- movq PER_CPU_VAR(espfix_waddr),%rdi
- movq %rax,(0*8)(%rdi) /* RAX */
- movq (2*8)(%rsp),%rax /* RIP */
- movq %rax,(1*8)(%rdi)
- movq (3*8)(%rsp),%rax /* CS */
- movq %rax,(2*8)(%rdi)
- movq (4*8)(%rsp),%rax /* RFLAGS */
- movq %rax,(3*8)(%rdi)
- movq (6*8)(%rsp),%rax /* SS */
- movq %rax,(5*8)(%rdi)
- movq (5*8)(%rsp),%rax /* RSP */
- movq %rax,(4*8)(%rdi)
- andl $0xffff0000,%eax
- popq %rdi
- orq PER_CPU_VAR(espfix_stack),%rax
+ movq PER_CPU_VAR(espfix_waddr), %rdi
+ movq %rax, (0*8)(%rdi) /* RAX */
+ movq (2*8)(%rsp), %rax /* RIP */
+ movq %rax, (1*8)(%rdi)
+ movq (3*8)(%rsp), %rax /* CS */
+ movq %rax, (2*8)(%rdi)
+ movq (4*8)(%rsp), %rax /* RFLAGS */
+ movq %rax, (3*8)(%rdi)
+ movq (6*8)(%rsp), %rax /* SS */
+ movq %rax, (5*8)(%rdi)
+ movq (5*8)(%rsp), %rax /* RSP */
+ movq %rax, (4*8)(%rdi)
+ andl $0xffff0000, %eax
+ popq %rdi
+ orq PER_CPU_VAR(espfix_stack), %rax
SWAPGS
- movq %rax,%rsp
- popq %rax
- jmp native_irq_return_iret
+ movq %rax, %rsp
+ popq %rax
+ jmp native_irq_return_iret
#endif

/* edi: workmask, edx: work */
retint_careful:
- bt $TIF_NEED_RESCHED,%edx
- jnc retint_signal
+ bt $TIF_NEED_RESCHED, %edx
+ jnc retint_signal
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
- pushq %rdi
+ pushq %rdi
SCHEDULE_USER
- popq %rdi
+ popq %rdi
GET_THREAD_INFO(%rcx)
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
- jmp retint_check
+ jmp retint_check

retint_signal:
- testl $_TIF_DO_NOTIFY_MASK,%edx
- jz retint_swapgs
+ testl $_TIF_DO_NOTIFY_MASK, %edx
+ jz retint_swapgs
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_NONE)
SAVE_EXTRA_REGS
- movq $-1,ORIG_RAX(%rsp)
- xorl %esi,%esi # oldset
- movq %rsp,%rdi # &pt_regs
- call do_notify_resume
+ movq $-1, ORIG_RAX(%rsp)
+ xorl %esi, %esi /* oldset */
+ movq %rsp, %rdi /* &pt_regs */
+ call do_notify_resume
RESTORE_EXTRA_REGS
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
GET_THREAD_INFO(%rcx)
- jmp retint_with_reschedule
+ jmp retint_with_reschedule

END(common_interrupt)

@@ -749,10 +746,10 @@ END(common_interrupt)
.macro apicinterrupt3 num sym do_sym
ENTRY(\sym)
ASM_CLAC
- pushq $~(\num)
+ pushq $~(\num)
.Lcommon_\sym:
interrupt \do_sym
- jmp ret_from_intr
+ jmp ret_from_intr
END(\sym)
.endm

@@ -774,60 +771,45 @@ trace_apicinterrupt \num \sym
.endm

#ifdef CONFIG_SMP
-apicinterrupt3 IRQ_MOVE_CLEANUP_VECTOR \
- irq_move_cleanup_interrupt smp_irq_move_cleanup_interrupt
-apicinterrupt3 REBOOT_VECTOR \
- reboot_interrupt smp_reboot_interrupt
+apicinterrupt3 IRQ_MOVE_CLEANUP_VECTOR irq_move_cleanup_interrupt smp_irq_move_cleanup_interrupt
+apicinterrupt3 REBOOT_VECTOR reboot_interrupt smp_reboot_interrupt
#endif

#ifdef CONFIG_X86_UV
-apicinterrupt3 UV_BAU_MESSAGE \
- uv_bau_message_intr1 uv_bau_message_interrupt
+apicinterrupt3 UV_BAU_MESSAGE uv_bau_message_intr1 uv_bau_message_interrupt
#endif
-apicinterrupt LOCAL_TIMER_VECTOR \
- apic_timer_interrupt smp_apic_timer_interrupt
-apicinterrupt X86_PLATFORM_IPI_VECTOR \
- x86_platform_ipi smp_x86_platform_ipi
+
+apicinterrupt LOCAL_TIMER_VECTOR apic_timer_interrupt smp_apic_timer_interrupt
+apicinterrupt X86_PLATFORM_IPI_VECTOR x86_platform_ipi smp_x86_platform_ipi

#ifdef CONFIG_HAVE_KVM
-apicinterrupt3 POSTED_INTR_VECTOR \
- kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
-apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR \
- kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi
+apicinterrupt3 POSTED_INTR_VECTOR kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
+apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi
#endif

#ifdef CONFIG_X86_MCE_THRESHOLD
-apicinterrupt THRESHOLD_APIC_VECTOR \
- threshold_interrupt smp_threshold_interrupt
+apicinterrupt THRESHOLD_APIC_VECTOR threshold_interrupt smp_threshold_interrupt
#endif

#ifdef CONFIG_X86_MCE_AMD
-apicinterrupt DEFERRED_ERROR_VECTOR \
- deferred_error_interrupt smp_deferred_error_interrupt
+apicinterrupt DEFERRED_ERROR_VECTOR deferred_error_interrupt smp_deferred_error_interrupt
#endif

#ifdef CONFIG_X86_THERMAL_VECTOR
-apicinterrupt THERMAL_APIC_VECTOR \
- thermal_interrupt smp_thermal_interrupt
+apicinterrupt THERMAL_APIC_VECTOR thermal_interrupt smp_thermal_interrupt
#endif

#ifdef CONFIG_SMP
-apicinterrupt CALL_FUNCTION_SINGLE_VECTOR \
- call_function_single_interrupt smp_call_function_single_interrupt
-apicinterrupt CALL_FUNCTION_VECTOR \
- call_function_interrupt smp_call_function_interrupt
-apicinterrupt RESCHEDULE_VECTOR \
- reschedule_interrupt smp_reschedule_interrupt
+apicinterrupt CALL_FUNCTION_SINGLE_VECTOR call_function_single_interrupt smp_call_function_single_interrupt
+apicinterrupt CALL_FUNCTION_VECTOR call_function_interrupt smp_call_function_interrupt
+apicinterrupt RESCHEDULE_VECTOR reschedule_interrupt smp_reschedule_interrupt
#endif

-apicinterrupt ERROR_APIC_VECTOR \
- error_interrupt smp_error_interrupt
-apicinterrupt SPURIOUS_APIC_VECTOR \
- spurious_interrupt smp_spurious_interrupt
+apicinterrupt ERROR_APIC_VECTOR error_interrupt smp_error_interrupt
+apicinterrupt SPURIOUS_APIC_VECTOR spurious_interrupt smp_spurious_interrupt

#ifdef CONFIG_IRQ_WORK
-apicinterrupt IRQ_WORK_VECTOR \
- irq_work_interrupt smp_irq_work_interrupt
+apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
#endif

/*
@@ -846,54 +828,54 @@ ENTRY(\sym)
PARAVIRT_ADJUST_EXCEPTION_FRAME

.ifeq \has_error_code
- pushq $-1 /* ORIG_RAX: no syscall to restart */
+ pushq $-1 /* ORIG_RAX: no syscall to restart */
.endif

ALLOC_PT_GPREGS_ON_STACK

.if \paranoid
.if \paranoid == 1
- testb $3, CS(%rsp) /* If coming from userspace, switch */
- jnz 1f /* stacks. */
+ testb $3, CS(%rsp) /* If coming from userspace, switch stacks */
+ jnz 1f
.endif
- call paranoid_entry
+ call paranoid_entry
.else
- call error_entry
+ call error_entry
.endif
/* returned flag: ebx=0: need swapgs on exit, ebx=1: don't need it */

.if \paranoid
.if \shift_ist != -1
- TRACE_IRQS_OFF_DEBUG /* reload IDT in case of recursion */
+ TRACE_IRQS_OFF_DEBUG /* reload IDT in case of recursion */
.else
TRACE_IRQS_OFF
.endif
.endif

- movq %rsp,%rdi /* pt_regs pointer */
+ movq %rsp, %rdi /* pt_regs pointer */

.if \has_error_code
- movq ORIG_RAX(%rsp),%rsi /* get error code */
- movq $-1,ORIG_RAX(%rsp) /* no syscall to restart */
+ movq ORIG_RAX(%rsp), %rsi /* get error code */
+ movq $-1, ORIG_RAX(%rsp) /* no syscall to restart */
.else
- xorl %esi,%esi /* no error code */
+ xorl %esi, %esi /* no error code */
.endif

.if \shift_ist != -1
- subq $EXCEPTION_STKSZ, CPU_TSS_IST(\shift_ist)
+ subq $EXCEPTION_STKSZ, CPU_TSS_IST(\shift_ist)
.endif

- call \do_sym
+ call \do_sym

.if \shift_ist != -1
- addq $EXCEPTION_STKSZ, CPU_TSS_IST(\shift_ist)
+ addq $EXCEPTION_STKSZ, CPU_TSS_IST(\shift_ist)
.endif

/* these procedures expect "no swapgs" flag in ebx */
.if \paranoid
- jmp paranoid_exit
+ jmp paranoid_exit
.else
- jmp error_exit
+ jmp error_exit
.endif

.if \paranoid == 1
@@ -903,25 +885,25 @@ ENTRY(\sym)
* run in real process context if user_mode(regs).
*/
1:
- call error_entry
+ call error_entry


- movq %rsp,%rdi /* pt_regs pointer */
- call sync_regs
- movq %rax,%rsp /* switch stack */
+ movq %rsp, %rdi /* pt_regs pointer */
+ call sync_regs
+ movq %rax, %rsp /* switch stack */

- movq %rsp,%rdi /* pt_regs pointer */
+ movq %rsp, %rdi /* pt_regs pointer */

.if \has_error_code
- movq ORIG_RAX(%rsp),%rsi /* get error code */
- movq $-1,ORIG_RAX(%rsp) /* no syscall to restart */
+ movq ORIG_RAX(%rsp), %rsi /* get error code */
+ movq $-1, ORIG_RAX(%rsp) /* no syscall to restart */
.else
- xorl %esi,%esi /* no error code */
+ xorl %esi, %esi /* no error code */
.endif

- call \do_sym
+ call \do_sym

- jmp error_exit /* %ebx: no swapgs flag */
+ jmp error_exit /* %ebx: no swapgs flag */
.endif
END(\sym)
.endm
@@ -937,55 +919,57 @@ idtentry \sym \do_sym has_error_code=\has_error_code
.endm
#endif

-idtentry divide_error do_divide_error has_error_code=0
-idtentry overflow do_overflow has_error_code=0
-idtentry bounds do_bounds has_error_code=0
-idtentry invalid_op do_invalid_op has_error_code=0
-idtentry device_not_available do_device_not_available has_error_code=0
-idtentry double_fault do_double_fault has_error_code=1 paranoid=2
-idtentry coprocessor_segment_overrun do_coprocessor_segment_overrun has_error_code=0
-idtentry invalid_TSS do_invalid_TSS has_error_code=1
-idtentry segment_not_present do_segment_not_present has_error_code=1
-idtentry spurious_interrupt_bug do_spurious_interrupt_bug has_error_code=0
-idtentry coprocessor_error do_coprocessor_error has_error_code=0
-idtentry alignment_check do_alignment_check has_error_code=1
-idtentry simd_coprocessor_error do_simd_coprocessor_error has_error_code=0
-
-
- /* Reload gs selector with exception handling */
- /* edi: new selector */
+idtentry divide_error do_divide_error has_error_code=0
+idtentry overflow do_overflow has_error_code=0
+idtentry bounds do_bounds has_error_code=0
+idtentry invalid_op do_invalid_op has_error_code=0
+idtentry device_not_available do_device_not_available has_error_code=0
+idtentry double_fault do_double_fault has_error_code=1 paranoid=2
+idtentry coprocessor_segment_overrun do_coprocessor_segment_overrun has_error_code=0
+idtentry invalid_TSS do_invalid_TSS has_error_code=1
+idtentry segment_not_present do_segment_not_present has_error_code=1
+idtentry spurious_interrupt_bug do_spurious_interrupt_bug has_error_code=0
+idtentry coprocessor_error do_coprocessor_error has_error_code=0
+idtentry alignment_check do_alignment_check has_error_code=1
+idtentry simd_coprocessor_error do_simd_coprocessor_error has_error_code=0
+
+
+ /*
+ * Reload gs selector with exception handling
+ * edi: new selector
+ */
ENTRY(native_load_gs_index)
pushfq
DISABLE_INTERRUPTS(CLBR_ANY & ~CLBR_RDI)
SWAPGS
gs_change:
- movl %edi,%gs
-2: mfence /* workaround */
+ movl %edi, %gs
+2: mfence /* workaround */
SWAPGS
popfq
ret
END(native_load_gs_index)

- _ASM_EXTABLE(gs_change,bad_gs)
- .section .fixup,"ax"
+ _ASM_EXTABLE(gs_change, bad_gs)
+ .section .fixup, "ax"
/* running with kernelgs */
bad_gs:
- SWAPGS /* switch back to user gs */
- xorl %eax,%eax
- movl %eax,%gs
- jmp 2b
+ SWAPGS /* switch back to user gs */
+ xorl %eax, %eax
+ movl %eax, %gs
+ jmp 2b
.previous

/* Call softirq on interrupt stack. Interrupts are off. */
ENTRY(do_softirq_own_stack)
- pushq %rbp
- mov %rsp,%rbp
- incl PER_CPU_VAR(irq_count)
- cmove PER_CPU_VAR(irq_stack_ptr),%rsp
- push %rbp # backlink for old unwinder
- call __do_softirq
+ pushq %rbp
+ mov %rsp, %rbp
+ incl PER_CPU_VAR(irq_count)
+ cmove PER_CPU_VAR(irq_stack_ptr), %rsp
+ push %rbp /* frame pointer backlink */
+ call __do_softirq
leaveq
- decl PER_CPU_VAR(irq_count)
+ decl PER_CPU_VAR(irq_count)
ret
END(do_softirq_own_stack)

@@ -1005,23 +989,24 @@ idtentry xen_hypervisor_callback xen_do_hypervisor_callback has_error_code=0
* existing activation in its critical region -- if so, we pop the current
* activation and restart the handler using the previous one.
*/
-ENTRY(xen_do_hypervisor_callback) # do_hypervisor_callback(struct *pt_regs)
+ENTRY(xen_do_hypervisor_callback) /* do_hypervisor_callback(struct *pt_regs) */
+
/*
* Since we don't modify %rdi, evtchn_do_upall(struct *pt_regs) will
* see the correct pointer to the pt_regs
*/
- movq %rdi, %rsp # we don't return, adjust the stack frame
-11: incl PER_CPU_VAR(irq_count)
- movq %rsp,%rbp
- cmovzq PER_CPU_VAR(irq_stack_ptr),%rsp
- pushq %rbp # backlink for old unwinder
- call xen_evtchn_do_upcall
- popq %rsp
- decl PER_CPU_VAR(irq_count)
+ movq %rdi, %rsp /* we don't return, adjust the stack frame */
+11: incl PER_CPU_VAR(irq_count)
+ movq %rsp, %rbp
+ cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
+ pushq %rbp /* frame pointer backlink */
+ call xen_evtchn_do_upcall
+ popq %rsp
+ decl PER_CPU_VAR(irq_count)
#ifndef CONFIG_PREEMPT
- call xen_maybe_preempt_hcall
+ call xen_maybe_preempt_hcall
#endif
- jmp error_exit
+ jmp error_exit
END(xen_do_hypervisor_callback)

/*
@@ -1038,35 +1023,35 @@ END(xen_do_hypervisor_callback)
* with its current contents: any discrepancy means we in category 1.
*/
ENTRY(xen_failsafe_callback)
- movl %ds,%ecx
- cmpw %cx,0x10(%rsp)
- jne 1f
- movl %es,%ecx
- cmpw %cx,0x18(%rsp)
- jne 1f
- movl %fs,%ecx
- cmpw %cx,0x20(%rsp)
- jne 1f
- movl %gs,%ecx
- cmpw %cx,0x28(%rsp)
- jne 1f
+ movl %ds, %ecx
+ cmpw %cx, 0x10(%rsp)
+ jne 1f
+ movl %es, %ecx
+ cmpw %cx, 0x18(%rsp)
+ jne 1f
+ movl %fs, %ecx
+ cmpw %cx, 0x20(%rsp)
+ jne 1f
+ movl %gs, %ecx
+ cmpw %cx, 0x28(%rsp)
+ jne 1f
/* All segments match their saved values => Category 2 (Bad IRET). */
- movq (%rsp),%rcx
- movq 8(%rsp),%r11
- addq $0x30,%rsp
- pushq $0 /* RIP */
- pushq %r11
- pushq %rcx
- jmp general_protection
+ movq (%rsp), %rcx
+ movq 8(%rsp), %r11
+ addq $0x30, %rsp
+ pushq $0 /* RIP */
+ pushq %r11
+ pushq %rcx
+ jmp general_protection
1: /* Segment mismatch => Category 1 (Bad segment). Retry the IRET. */
- movq (%rsp),%rcx
- movq 8(%rsp),%r11
- addq $0x30,%rsp
- pushq $-1 /* orig_ax = -1 => not a system call */
+ movq (%rsp), %rcx
+ movq 8(%rsp), %r11
+ addq $0x30, %rsp
+ pushq $-1 /* orig_ax = -1 => not a system call */
ALLOC_PT_GPREGS_ON_STACK
SAVE_C_REGS
SAVE_EXTRA_REGS
- jmp error_exit
+ jmp error_exit
END(xen_failsafe_callback)

apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
@@ -1079,21 +1064,25 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
hyperv_callback_vector hyperv_vector_handler
#endif /* CONFIG_HYPERV */

-idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
-idtentry int3 do_int3 has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
-idtentry stack_segment do_stack_segment has_error_code=1
+idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
+idtentry int3 do_int3 has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
+idtentry stack_segment do_stack_segment has_error_code=1
+
#ifdef CONFIG_XEN
-idtentry xen_debug do_debug has_error_code=0
-idtentry xen_int3 do_int3 has_error_code=0
-idtentry xen_stack_segment do_stack_segment has_error_code=1
+idtentry xen_debug do_debug has_error_code=0
+idtentry xen_int3 do_int3 has_error_code=0
+idtentry xen_stack_segment do_stack_segment has_error_code=1
#endif
-idtentry general_protection do_general_protection has_error_code=1
-trace_idtentry page_fault do_page_fault has_error_code=1
+
+idtentry general_protection do_general_protection has_error_code=1
+trace_idtentry page_fault do_page_fault has_error_code=1
+
#ifdef CONFIG_KVM_GUEST
-idtentry async_page_fault do_async_page_fault has_error_code=1
+idtentry async_page_fault do_async_page_fault has_error_code=1
#endif
+
#ifdef CONFIG_X86_MCE
-idtentry machine_check has_error_code=0 paranoid=1 do_sym=*machine_check_vector(%rip)
+idtentry machine_check has_error_code=0 paranoid=1 do_sym=*machine_check_vector(%rip)
#endif

/*
@@ -1105,13 +1094,13 @@ ENTRY(paranoid_entry)
cld
SAVE_C_REGS 8
SAVE_EXTRA_REGS 8
- movl $1,%ebx
- movl $MSR_GS_BASE,%ecx
+ movl $1, %ebx
+ movl $MSR_GS_BASE, %ecx
rdmsr
- testl %edx,%edx
- js 1f /* negative -> in kernel */
+ testl %edx, %edx
+ js 1f /* negative -> in kernel */
SWAPGS
- xorl %ebx,%ebx
+ xorl %ebx, %ebx
1: ret
END(paranoid_entry)

@@ -1124,16 +1113,17 @@ END(paranoid_entry)
* in syscall entry), so checking for preemption here would
* be complicated. Fortunately, we there's no good reason
* to try to handle preemption here.
+ *
+ * On entry, ebx is "no swapgs" flag (1: don't need swapgs, 0: need it)
*/
-/* On entry, ebx is "no swapgs" flag (1: don't need swapgs, 0: need it) */
ENTRY(paranoid_exit)
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF_DEBUG
- testl %ebx,%ebx /* swapgs needed? */
- jnz paranoid_exit_no_swapgs
+ testl %ebx, %ebx /* swapgs needed? */
+ jnz paranoid_exit_no_swapgs
TRACE_IRQS_IRETQ
SWAPGS_UNSAFE_STACK
- jmp paranoid_exit_restore
+ jmp paranoid_exit_restore
paranoid_exit_no_swapgs:
TRACE_IRQS_IRETQ_DEBUG
paranoid_exit_restore:
@@ -1151,7 +1141,7 @@ ENTRY(error_entry)
cld
SAVE_C_REGS 8
SAVE_EXTRA_REGS 8
- xorl %ebx,%ebx
+ xorl %ebx, %ebx
testb $3, CS+8(%rsp)
jz error_kernelspace
error_swapgs:
@@ -1167,41 +1157,41 @@ ENTRY(error_entry)
* for these here too.
*/
error_kernelspace:
- incl %ebx
- leaq native_irq_return_iret(%rip),%rcx
- cmpq %rcx,RIP+8(%rsp)
- je error_bad_iret
- movl %ecx,%eax /* zero extend */
- cmpq %rax,RIP+8(%rsp)
- je bstep_iret
- cmpq $gs_change,RIP+8(%rsp)
- je error_swapgs
- jmp error_sti
+ incl %ebx
+ leaq native_irq_return_iret(%rip), %rcx
+ cmpq %rcx, RIP+8(%rsp)
+ je error_bad_iret
+ movl %ecx, %eax /* zero extend */
+ cmpq %rax, RIP+8(%rsp)
+ je bstep_iret
+ cmpq $gs_change, RIP+8(%rsp)
+ je error_swapgs
+ jmp error_sti

bstep_iret:
/* Fix truncated RIP */
- movq %rcx,RIP+8(%rsp)
+ movq %rcx, RIP+8(%rsp)
/* fall through */

error_bad_iret:
SWAPGS
- mov %rsp,%rdi
- call fixup_bad_iret
- mov %rax,%rsp
- decl %ebx /* Return to usergs */
- jmp error_sti
+ mov %rsp, %rdi
+ call fixup_bad_iret
+ mov %rax, %rsp
+ decl %ebx /* Return to usergs */
+ jmp error_sti
END(error_entry)


/* On entry, ebx is "no swapgs" flag (1: don't need swapgs, 0: need it) */
ENTRY(error_exit)
- movl %ebx,%eax
+ movl %ebx, %eax
RESTORE_EXTRA_REGS
DISABLE_INTERRUPTS(CLBR_NONE)
TRACE_IRQS_OFF
- testl %eax,%eax
- jnz retint_kernel
- jmp retint_user
+ testl %eax, %eax
+ jnz retint_kernel
+ jmp retint_user
END(error_exit)

/* Runs on exception stack */
@@ -1240,21 +1230,21 @@ ENTRY(nmi)
*/

/* Use %rdx as our temp variable throughout */
- pushq %rdx
+ pushq %rdx

/*
* If %cs was not the kernel segment, then the NMI triggered in user
* space, which means it is definitely not nested.
*/
- cmpl $__KERNEL_CS, 16(%rsp)
- jne first_nmi
+ cmpl $__KERNEL_CS, 16(%rsp)
+ jne first_nmi

/*
* Check the special variable on the stack to see if NMIs are
* executing.
*/
- cmpl $1, -8(%rsp)
- je nested_nmi
+ cmpl $1, -8(%rsp)
+ je nested_nmi

/*
* Now test if the previous stack was an NMI stack.
@@ -1268,6 +1258,7 @@ ENTRY(nmi)
cmpq %rdx, 4*8(%rsp)
/* If the stack pointer is above the NMI stack, this is a normal NMI */
ja first_nmi
+
subq $EXCEPTION_STKSZ, %rdx
cmpq %rdx, 4*8(%rsp)
/* If it is below the NMI stack, it is a normal NMI */
@@ -1280,29 +1271,29 @@ ENTRY(nmi)
* It's about to repeat the NMI handler, so we are fine
* with ignoring this one.
*/
- movq $repeat_nmi, %rdx
- cmpq 8(%rsp), %rdx
- ja 1f
- movq $end_repeat_nmi, %rdx
- cmpq 8(%rsp), %rdx
- ja nested_nmi_out
+ movq $repeat_nmi, %rdx
+ cmpq 8(%rsp), %rdx
+ ja 1f
+ movq $end_repeat_nmi, %rdx
+ cmpq 8(%rsp), %rdx
+ ja nested_nmi_out

1:
/* Set up the interrupted NMIs stack to jump to repeat_nmi */
- leaq -1*8(%rsp), %rdx
- movq %rdx, %rsp
- leaq -10*8(%rsp), %rdx
- pushq $__KERNEL_DS
- pushq %rdx
+ leaq -1*8(%rsp), %rdx
+ movq %rdx, %rsp
+ leaq -10*8(%rsp), %rdx
+ pushq $__KERNEL_DS
+ pushq %rdx
pushfq
- pushq $__KERNEL_CS
- pushq $repeat_nmi
+ pushq $__KERNEL_CS
+ pushq $repeat_nmi

/* Put stack back */
- addq $(6*8), %rsp
+ addq $(6*8), %rsp

nested_nmi_out:
- popq %rdx
+ popq %rdx

/* No need to check faults here */
INTERRUPT_RETURN
@@ -1344,19 +1335,17 @@ ENTRY(nmi)
* is also used by nested NMIs and can not be trusted on exit.
*/
/* Do not pop rdx, nested NMIs will corrupt that part of the stack */
- movq (%rsp), %rdx
+ movq (%rsp), %rdx

/* Set the NMI executing variable on the stack. */
- pushq $1
+ pushq $1

- /*
- * Leave room for the "copied" frame
- */
- subq $(5*8), %rsp
+ /* Leave room for the "copied" frame */
+ subq $(5*8), %rsp

/* Copy the stack frame to the Saved frame */
.rept 5
- pushq 11*8(%rsp)
+ pushq 11*8(%rsp)
.endr

/* Everything up to here is safe from nested NMIs */
@@ -1376,14 +1365,14 @@ ENTRY(nmi)
* is benign for the non-repeat case, where 1 was pushed just above
* to this very stack slot).
*/
- movq $1, 10*8(%rsp)
+ movq $1, 10*8(%rsp)

/* Make another copy, this one may be modified by nested NMIs */
- addq $(10*8), %rsp
+ addq $(10*8), %rsp
.rept 5
- pushq -6*8(%rsp)
+ pushq -6*8(%rsp)
.endr
- subq $(5*8), %rsp
+ subq $(5*8), %rsp
end_repeat_nmi:

/*
@@ -1391,7 +1380,7 @@ ENTRY(nmi)
* NMI if the first NMI took an exception and reset our iret stack
* so that we repeat another NMI.
*/
- pushq $-1 /* ORIG_RAX: no syscall to restart */
+ pushq $-1 /* ORIG_RAX: no syscall to restart */
ALLOC_PT_GPREGS_ON_STACK

/*
@@ -1401,7 +1390,7 @@ ENTRY(nmi)
* setting NEED_RESCHED or anything that normal interrupts and
* exceptions might do.
*/
- call paranoid_entry
+ call paranoid_entry

/*
* Save off the CR2 register. If we take a page fault in the NMI then
@@ -1412,21 +1401,21 @@ ENTRY(nmi)
* origin fault. Save it off and restore it if it changes.
* Use the r12 callee-saved register.
*/
- movq %cr2, %r12
+ movq %cr2, %r12

/* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */
- movq %rsp,%rdi
- movq $-1,%rsi
- call do_nmi
+ movq %rsp, %rdi
+ movq $-1, %rsi
+ call do_nmi

/* Did the NMI take a page fault? Restore cr2 if it did */
- movq %cr2, %rcx
- cmpq %rcx, %r12
- je 1f
- movq %r12, %cr2
+ movq %cr2, %rcx
+ cmpq %rcx, %r12
+ je 1f
+ movq %r12, %cr2
1:
- testl %ebx,%ebx /* swapgs needed? */
- jnz nmi_restore
+ testl %ebx, %ebx /* swapgs needed? */
+ jnz nmi_restore
nmi_swapgs:
SWAPGS_UNSAFE_STACK
nmi_restore:
@@ -1436,12 +1425,11 @@ ENTRY(nmi)
REMOVE_PT_GPREGS_FROM_STACK 6*8

/* Clear the NMI executing stack variable */
- movq $0, 5*8(%rsp)
+ movq $0, 5*8(%rsp)
INTERRUPT_RETURN
END(nmi)

ENTRY(ignore_sysret)
- mov $-ENOSYS,%eax
+ mov $-ENOSYS, %eax
sysret
END(ignore_sysret)
-

2015-06-09 00:13:50

by Andy Lutomirski

Subject: Re: [PATCH 2/4] x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat

On Mon, Jun 8, 2015 at 1:34 AM, Ingo Molnar <[email protected]> wrote:
> So the SYSENTER instruction is pretty quirky and it has different behavior
> depending on bitness and CPU maker.
>
> Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
> in both of the cases.
>
> Split the name into its two uses:
>
> ia32_sysenter_target (32) -> entry_SYSENTER_32
> ia32_sysenter_target (64) -> entry_SYSENTER_compat
>

Now that I'm rebasing my pile on top of this, I have a minor gripe
about this one. There are (in my mind, anyway), two SYSENTER
instructions: the 32-bit one and the 64-bit one. (That is, there's
SYSENTER32, which happens when you do SYSENTER in 32-bit or compat
mode, and SYSENTER64, which happens when you do SYSENTER in long
mode.) SYSENTER32, from user code's perspective, does the same thing
in either case [1]. That means that it really does make sense that
we'd have two implementations of the same entry point, one written in
32-bit asm and one written in 64-bit asm.

The patch I'm rebasing merges the two wrmsrs to MSR_IA32_SYSENTER, and
this change makes it uglier.

[1] Sort of. We probably have differently nonsensical calling
conventions, but that's our fault and has nothing to do with the
hardware.

--Andy

2015-06-09 09:33:29

by Ingo Molnar

Subject: Re: [PATCH 2/4] x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat


* Andy Lutomirski <[email protected]> wrote:

> On Mon, Jun 8, 2015 at 1:34 AM, Ingo Molnar <[email protected]> wrote:
> > So the SYSENTER instruction is pretty quirky and it has different behavior
> > depending on bitness and CPU maker.
> >
> > Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
> > in both of the cases.
> >
> > Split the name into its two uses:
> >
> > ia32_sysenter_target (32) -> entry_SYSENTER_32
> > ia32_sysenter_target (64) -> entry_SYSENTER_compat
> >
>
> Now that I'm rebasing my pile on top of this, I have a minor gripe
> about this one. There are (in my mind, anyway), two SYSENTER
> instructions: the 32-bit one and the 64-bit one. (That is, there's
> SYSENTER32, which happens when you do SYSENTER in 32-bit or compat
> mode, and SYSENTER64, which happens when you do SYSENTER in long
> mode.) SYSENTER32, from user code's perspective, does the same thing
> in either case [1]. That means that it really does make sense that
> we'd have two implementations of the same entry point, one written in
> 32-bit asm and one written in 64-bit asm.
>
> The patch I'm rebasing merges the two wrmsrs to MSR_IA32_SYSENTER, and
> this change makes it uglier.
>
> [1] Sort of. We probably have differently nonsensical calling
> conventions, but that's our fault and has nothing to do with the
> hardware.

Did you intend to merge these two wrmsr()s:

#ifdef CONFIG_X86_64
void syscall_init(void)
{
...
wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat);
...
}
#endif

#ifdef CONFIG_X86_32
void enable_sep_cpu(void)
{
...
wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)entry_SYSENTER_32, 0);
...
}
#endif

... and the new bifurcated names preserve the #ifdef, right?

So I mostly agree with you, but still I'm a bit torn about this, for the following
reason:

- SYSENTER on a 32-bit kernel behaves a bit differently from SYSENTER on a 64-bit
kernel: for example on 32-bit kernels we'll return with SYSEXIT, while on
64-bit kernels we return with SYSRET. The difference is small but user-space
observable: for example EDX is 0 on SYSRET while it points to ->sysenter_return
in the SYSEXIT case.

This kind of user-observable asymmetry does not exist for other unified syscall
ABIs, such as the INT80 method.

So I think that despite having to preserve a small non-unified #ifdef for this
initialization, we are still better off naming the two entry points differently,
along the pattern we use, because the behavior is slightly different depending on
the bitness of the kernel.

Thanks,

Ingo

2015-06-09 16:34:09

by Andy Lutomirski

Subject: Re: [PATCH 2/4] x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat

On Tue, Jun 9, 2015 at 2:33 AM, Ingo Molnar <[email protected]> wrote:
>
> * Andy Lutomirski <[email protected]> wrote:
>
>> On Mon, Jun 8, 2015 at 1:34 AM, Ingo Molnar <[email protected]> wrote:
>> > So the SYSENTER instruction is pretty quirky and it has different behavior
>> > depending on bitness and CPU maker.
>> >
>> > Yet we create a false sense of coherency by naming it 'ia32_sysenter_target'
>> > in both of the cases.
>> >
>> > Split the name into its two uses:
>> >
>> > ia32_sysenter_target (32) -> entry_SYSENTER_32
>> > ia32_sysenter_target (64) -> entry_SYSENTER_compat
>> >
>>
>> Now that I'm rebasing my pile on top of this, I have a minor gripe
>> about this one. There are (in my mind, anyway), two SYSENTER
>> instructions: the 32-bit one and the 64-bit one. (That is, there's
>> SYSENTER32, which happens when you do SYSENTER in 32-bit or compat
>> mode, and SYSENTER64, which happens when you do SYSENTER in long
>> mode.) SYSENTER32, from user code's perspective, does the same thing
>> in either case [1]. That means that it really does make sense that
>> we'd have two implementations of the same entry point, one written in
>> 32-bit asm and one written in 64-bit asm.
>>
>> The patch I'm rebasing merges the two wrmsrs to MSR_IA32_SYSENTER, and
>> this change makes it uglier.
>>
>> [1] Sort of. We probably have differently nonsensical calling
>> conventions, but that's our fault and has nothing to do with the
>> hardware.
>
> Did you intend to merge these two wrmsr()s:
>
> #ifdef CONFIG_X86_64
> void syscall_init(void)
> {
> ...
> wrmsrl_safe(MSR_IA32_SYSENTER_EIP, (u64)entry_SYSENTER_compat);
> ...
> }
> #endif
>
> #ifdef CONFIG_X86_32
> void enable_sep_cpu(void)
> {
> ...
> wrmsr(MSR_IA32_SYSENTER_EIP, (unsigned long)entry_SYSENTER_32, 0);
> ...
> }
> #endif
>
> ... and the new bifurcated names preserve the #ifdef, right?

Exactly.

>
> So I mostly agree with you, but still I'm a bit torn about this, for the following
> reason:
>
> - SYSENTER on a 32-bit kernel behaves a bit differently from SYSENTER on a 64-bit
> kernel: for example on 32-bit kernels we'll return with SYSEXIT, while on
> 64-bit kernels we return with SYSRET. The difference is small but user-space
> observable: for example EDX is 0 on SYSRET while it points to ->sysenter_return
> in the SYSEXIT case.
>
> This kind of user-observable asymmetry does not exist for other unified syscall
> ABIs, such as the INT80 method.
>
> So I think that despite having to preserve a small non-unified #ifdef for this
> initialization, we are still better off naming the two entry points differently,
> along the pattern we use, because the behavior is slightly different depending on
> the bitness of the kernel.
>

Fair enough. This is certainly not a big deal either way. Maybe when
this really gets cleaned up, we can merge the entry points again.

--Andy

> Thanks,
>
> Ingo



--
Andy Lutomirski
AMA Capital Management, LLC

2015-07-06 15:02:28

by Sasha Levin

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On 06/08/2015 02:51 PM, Ingo Molnar wrote:
> From 4d7321381e5c7102a3d3faf0a0a0035a09619612 Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <[email protected]>
> Date: Mon, 8 Jun 2015 20:43:07 +0200
> Subject: [PATCH] x86/asm/entry/64: Clean up entry_64.S
>
> Make the 64-bit syscall entry code a bit more readable:
>
> - use consistent assembly coding style similar to the other entry_*.S files
>
> - remove old comments that are not true anymore
>
> - eliminate whitespace noise
>
> - use consistent vertical spacing
>
> - fix various comments
>
> - reorganize entry point generation tables to be more readable
>
> No code changed:
>
> # arch/x86/entry/entry_64.o:
>
> text data bss dec hex filename
> 12282 0 0 12282 2ffa entry_64.o.before
> 12282 0 0 12282 2ffa entry_64.o.after
>
> md5:
> cbab1f2d727a2a8a87618eeb79f391b7 entry_64.o.before.asm
> cbab1f2d727a2a8a87618eeb79f391b7 entry_64.o.after.asm

Hey Ingo,

I've started seeing the fuzzer hitting the BUG() at arch/x86/kernel/nmi.c:533. git
blame pointed to this patch. I know that you didn't see any changes in the compiled
file in your testcase, but I do see changes in mine.

Below is what the fuzzer was hitting, and lower are the differences in the compiled
output of entry_64.o.

[3157054.661763] ------------[ cut here ]------------
[3157054.662552] kernel BUG at arch/x86/kernel/nmi.c:533!
[3157054.663277] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
[3157054.664164] Dumping ftrace buffer:
[3157054.664740] (ftrace buffer empty)
[3157054.665274] Modules linked in:
[3157054.665768] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157054.667203] task: ffff880408813000 ti: ffff8803d29c8000 task.ti: ffff8803d29c8000
[3157054.668256] RIP: do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157054.669378] RSP: 0018:ffff88077800bed8 EFLAGS: 00010006
[3157054.670141] ==================================================================
[3157054.671268] BUG: KASan: out of bounds on stack in __show_regs+0x7f6/0x940 at addr ffff88077800be50
[3157054.674604] Read of size 8 by task trinity-main/11446
[3157054.676521] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
[3157054.679451] flags: 0x42fffff80000400(reserved)
[3157054.681237] page dumped because: kasan: bad access detected
[3157054.683326] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157054.687097] ffff88077800be50 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
[3157054.690303] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
[3157054.693365] ffff88077800bab8 ffffffffa0abe0b3 0000000000000082 ffffffffa2fe39e4
[3157054.696209] Call Trace:
[3157054.697180] <NMI> dump_stack (lib/dump_stack.c:52)
[3157054.699390] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[3157054.701663] ? printk (kernel/printk/printk.c:1896)
[3157054.703531] ? bitmap_weight (include/linux/bitmap.h:303)
[3157054.705553] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
[3157054.708014] ? __show_regs (arch/x86/kernel/process_64.c:68)
[3157054.710046] __show_regs (arch/x86/kernel/process_64.c:68)
[3157054.712066] ? printk (kernel/printk/printk.c:1896)
[3157054.713878] ? bitmap_weight (include/linux/bitmap.h:303)
[3157054.715875] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
[3157054.718336] ? dump_stack_print_info (kernel/printk/printk.c:3121)
[3157054.720619] show_regs (arch/x86/kernel/dumpstack_64.c:313)
[3157054.722530] __die (arch/x86/kernel/dumpstack.c:294)
[3157054.724290] die (arch/x86/kernel/dumpstack.c:316)
[3157054.725962] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
[3157054.727805] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
[3157054.729843] ? do_device_not_available (arch/x86/kernel/traps.c:291)
[3157054.732211] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157054.734101] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
[3157054.736165] ? sched_clock (arch/x86/kernel/tsc.c:305)
[3157054.738126] ? nmi_handle (arch/x86/kernel/nmi.c:134)
[3157054.740133] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
[3157054.742997] do_invalid_op (arch/x86/kernel/traps.c:313)
[3157054.744991] invalid_op (arch/x86/entry/entry_64.S:925)
[3157054.746873] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157054.748769] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
[3157054.750658] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157054.752712] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.754514] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.756313] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.758106] <<EOE>> <#DB> ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157054.760665] <<EOE>> <UNK>
[3157054.761826] Memory state around the buggy address:
[3157054.763672] ffff88077800bd00: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.766266] ffff88077800bd80: 00 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00
[3157054.768848] >ffff88077800be00: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[3157054.771469] ^
[3157054.774302] ffff88077800be80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.776910] ffff88077800bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.779636] ==================================================================
[3157054.784428] RAX: 0000000080120001 RBX: 0000000000000001 RCX: 00000000c0000101
[3157054.801838] RDX: 1ffffffff4691cd0 RSI: ffffffffa0c10620 RDI: ffffffffa344dc00
[3157054.804414] ==================================================================
[3157054.807050] BUG: KASan: out of bounds on stack in __show_regs+0x897/0x940 at addr ffff88077800be48
[3157054.810374] Read of size 8 by task trinity-main/11446
[3157054.813129] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
[3157054.816012] flags: 0x42fffff80000400(reserved)
[3157054.817718] page dumped because: kasan: bad access detected
[3157054.819766] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157054.823531] ffff88077800be48 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
[3157054.826320] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
[3157054.829107] ffff88077800bab8 ffffffffa0abe0b3 0000000000000082 ffffffffa2fe39e4
[3157054.831922] Call Trace:
[3157054.832864] <NMI> dump_stack (lib/dump_stack.c:52)
[3157054.835025] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[3157054.837224] ? printk (kernel/printk/printk.c:1896)
[3157054.839040] ? bitmap_weight (include/linux/bitmap.h:303)
[3157054.841011] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
[3157054.843454] ? __show_regs (arch/x86/kernel/process_64.c:72)
[3157054.845477] __show_regs (arch/x86/kernel/process_64.c:72)
[3157054.847442] ? printk (kernel/printk/printk.c:1896)
[3157054.849276] ? bitmap_weight (include/linux/bitmap.h:303)
[3157054.851272] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
[3157054.853949] ? dump_stack_print_info (kernel/printk/printk.c:3121)
[3157054.856236] show_regs (arch/x86/kernel/dumpstack_64.c:313)
[3157054.858114] __die (arch/x86/kernel/dumpstack.c:294)
[3157054.859871] die (arch/x86/kernel/dumpstack.c:316)
[3157054.861624] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
[3157054.863479] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
[3157054.865508] ? do_device_not_available (arch/x86/kernel/traps.c:291)
[3157054.867842] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157054.869736] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
[3157054.871910] ? sched_clock (arch/x86/kernel/tsc.c:305)
[3157054.872787] ? nmi_handle (arch/x86/kernel/nmi.c:134)
[3157054.873674] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
[3157054.874725] do_invalid_op (arch/x86/kernel/traps.c:313)
[3157054.875605] invalid_op (arch/x86/entry/entry_64.S:925)
[3157054.876439] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157054.877275] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
[3157054.878112] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157054.879012] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.879810] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.880605] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.881678] <<EOE>> <#DB> ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157054.882830] <<EOE>> <UNK>
[3157054.883319] Memory state around the buggy address:
[3157054.884153] ffff88077800bd00: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.885300] ffff88077800bd80: 00 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00
[3157054.886443] >ffff88077800be00: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[3157054.887580] ^
[3157054.888469] ffff88077800be80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.889605] ffff88077800bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.890743] ==================================================================
[3157054.891910] RBP: ffff88077800bee8 R08: 0000000000000001 R09: 000000000000002e
[3157054.893039] ==================================================================
[3157054.894188] BUG: KASan: out of bounds on stack in __show_regs+0x87f/0x940 at addr ffff88077800be40
[3157054.895585] Read of size 8 by task trinity-main/11446
[3157054.896401] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
[3157054.897657] flags: 0x42fffff80000400(reserved)
[3157054.898431] page dumped because: kasan: bad access detected
[3157054.899325] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157054.900940] ffff88077800be40 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
[3157054.902128] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
[3157054.903273] ffff88077800bab8 ffffffffa0abe0b3 0000000000000082 ffffffffa2fe39e4
[3157054.904415] Call Trace:
[3157054.904793] <NMI> dump_stack (lib/dump_stack.c:52)
[3157054.905668] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[3157054.906527] ? printk (kernel/printk/printk.c:1896)
[3157054.907254] ? bitmap_weight (include/linux/bitmap.h:303)
[3157054.908034] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
[3157054.908973] ? __show_regs (arch/x86/kernel/process_64.c:74)
[3157054.909774] __show_regs (arch/x86/kernel/process_64.c:74)
[3157054.910558] ? printk (kernel/printk/printk.c:1896)
[3157054.911555] ? bitmap_weight (include/linux/bitmap.h:303)
[3157054.913530] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
[3157054.916152] ? dump_stack_print_info (kernel/printk/printk.c:3121)
[3157054.918430] show_regs (arch/x86/kernel/dumpstack_64.c:313)
[3157054.920318] __die (arch/x86/kernel/dumpstack.c:294)
[3157054.922112] die (arch/x86/kernel/dumpstack.c:316)
[3157054.923801] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
[3157054.925643] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
[3157054.927671] ? do_device_not_available (arch/x86/kernel/traps.c:291)
[3157054.930005] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157054.931948] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
[3157054.934024] ? sched_clock (arch/x86/kernel/tsc.c:305)
[3157054.935990] ? nmi_handle (arch/x86/kernel/nmi.c:134)
[3157054.937983] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
[3157054.940346] do_invalid_op (arch/x86/kernel/traps.c:313)
[3157054.942337] invalid_op (arch/x86/entry/entry_64.S:925)
[3157054.944211] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157054.946085] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
[3157054.947953] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157054.949950] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.951993] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.953778] ? debug (arch/x86/entry/entry_64.S:1067)
[3157054.955568] <<EOE>> <#DB> ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157054.958110] <<EOE>> <UNK>
[3157054.959168] Memory state around the buggy address:
[3157054.960999] ffff88077800bd00: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.963654] ffff88077800bd80: 00 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00
[3157054.966249] >ffff88077800be00: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[3157054.968833] ^
[3157054.970757] ffff88077800be80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.973408] ffff88077800bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157054.975995] ==================================================================
[3157054.978574] ==================================================================
[3157054.981228] BUG: KASan: out of bounds on stack in __show_regs+0x7ae/0x940 at addr ffff88077800be58
[3157054.984458] Read of size 8 by task trinity-main/11446
[3157054.986295] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
[3157054.989141] flags: 0x42fffff80000400(reserved)
[3157054.990824] page dumped because: kasan: bad access detected
[3157054.992895] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157054.996590] ffff88077800be58 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
[3157054.999365] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
[3157055.002164] 0000000000000010 ffffffff00000000 0000000000000082 ffffed00ef0017c8
[3157055.004929] Call Trace:
[3157055.005866] <NMI> dump_stack (lib/dump_stack.c:52)
[3157055.007983] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[3157055.010155] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
[3157055.012580] ? __show_regs (arch/x86/kernel/process_64.c:74)
[3157055.014603] __show_regs (arch/x86/kernel/process_64.c:74)
[3157055.016574] ? printk (kernel/printk/printk.c:1896)
[3157055.018396] ? bitmap_weight (include/linux/bitmap.h:303)
[3157055.020358] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
[3157055.023068] ? dump_stack_print_info (kernel/printk/printk.c:3121)
[3157055.025348] show_regs (arch/x86/kernel/dumpstack_64.c:313)
[3157055.027228] __die (arch/x86/kernel/dumpstack.c:294)
[3157055.028983] die (arch/x86/kernel/dumpstack.c:316)
[3157055.030664] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
[3157055.032552] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
[3157055.034572] ? do_device_not_available (arch/x86/kernel/traps.c:291)
[3157055.036891] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.038782] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
[3157055.040840] ? sched_clock (arch/x86/kernel/tsc.c:305)
[3157055.042835] ? nmi_handle (arch/x86/kernel/nmi.c:134)
[3157055.044842] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
[3157055.047189] do_invalid_op (arch/x86/kernel/traps.c:313)
[3157055.049155] invalid_op (arch/x86/entry/entry_64.S:925)
[3157055.051022] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.052945] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
[3157055.054819] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157055.056824] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.058595] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.060379] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.062197] <<EOE>> <#DB> ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157055.064731] <<EOE>> <UNK>
[3157055.065800] Memory state around the buggy address:
[3157055.067623] ffff88077800bd00: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.070203] ffff88077800bd80: 00 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00
[3157055.072827] >ffff88077800be00: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[3157055.075416] ^
[3157055.077617] ffff88077800be80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.080205] ffff88077800bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.083104] ==================================================================
[3157055.085692] ==================================================================
[3157055.088288] BUG: KASan: out of bounds on stack in __show_regs+0x8e2/0x940 at addr ffff88077800be60
[3157055.091538] Read of size 8 by task trinity-main/11446
[3157055.093378] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
[3157055.096225] flags: 0x42fffff80000400(reserved)
[3157055.097905] page dumped because: kasan: bad access detected
[3157055.099925] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157055.103650] ffff88077800be60 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
[3157055.106430] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
[3157055.109192] 0000000000000010 ffffffff00000000 0000000000000082 ffffed00ef0017cb
[3157055.111989] Call Trace:
[3157055.112927] <NMI> dump_stack (lib/dump_stack.c:52)
[3157055.115044] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[3157055.117220] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
[3157055.119608] ? __show_regs (arch/x86/kernel/process_64.c:74)
[3157055.121667] __show_regs (arch/x86/kernel/process_64.c:74)
[3157055.123627] ? printk (kernel/printk/printk.c:1896)
[3157055.125449] ? bitmap_weight (include/linux/bitmap.h:303)
[3157055.127429] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
[3157055.130055] ? dump_stack_print_info (kernel/printk/printk.c:3121)
[3157055.132355] show_regs (arch/x86/kernel/dumpstack_64.c:313)
[3157055.134243] __die (arch/x86/kernel/dumpstack.c:294)
[3157055.135988] die (arch/x86/kernel/dumpstack.c:316)
[3157055.137648] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
[3157055.139500] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
[3157055.141530] ? do_device_not_available (arch/x86/kernel/traps.c:291)
[3157055.143859] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.145741] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
[3157055.147811] ? sched_clock (arch/x86/kernel/tsc.c:305)
[3157055.149771] ? nmi_handle (arch/x86/kernel/nmi.c:134)
[3157055.151856] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
[3157055.154231] do_invalid_op (arch/x86/kernel/traps.c:313)
[3157055.156219] invalid_op (arch/x86/entry/entry_64.S:925)
[3157055.158111] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.159993] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
[3157055.161923] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157055.163912] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.165701] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.167472] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.169247] <<EOE>> <#DB> ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157055.171839] <<EOE>> <UNK>
[3157055.172903] Memory state around the buggy address:
[3157055.174732] ffff88077800bd00: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.177327] ffff88077800bd80: 00 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00
[3157055.179921] >ffff88077800be00: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[3157055.182583] ^
[3157055.184885] ffff88077800be80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.187483] ffff88077800bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.190072] ==================================================================
[3157055.191450] pps pps0: PPS event at 4682.682479766
[3157055.191456] pps pps0: capture assert seq #4932
[3157055.196385] R10: ffffed014e1e4883 R11: ffffed014e1e4881 R12: ffff88077800bef8
[3157055.198934] ==================================================================
[3157055.201581] BUG: KASan: out of bounds on stack in __show_regs+0x901/0x940 at addr ffff88077800be30
[3157055.204771] Read of size 8 by task trinity-main/11446
[3157055.206617] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
[3157055.209469] flags: 0x42fffff80000400(reserved)
[3157055.211321] page dumped because: kasan: bad access detected
[3157055.213356] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157055.217047] ffff88077800be30 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
[3157055.219821] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
[3157055.222620] ffff88077800bab8 ffffffffa0abe0b3 0000000000000082 ffffffffa2fe39e4
[3157055.225392] Call Trace:
[3157055.226326] <NMI> dump_stack (lib/dump_stack.c:52)
[3157055.228460] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[3157055.230633] ? printk (kernel/printk/printk.c:1896)
[3157055.232508] ? bitmap_weight (include/linux/bitmap.h:303)
[3157055.234471] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
[3157055.236860] ? __show_regs (arch/x86/kernel/process_64.c:76)
[3157055.238885] __show_regs (arch/x86/kernel/process_64.c:76)
[3157055.240849] ? printk (kernel/printk/printk.c:1896)
[3157055.242726] ? bitmap_weight (include/linux/bitmap.h:303)
[3157055.244694] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
[3157055.247332] ? dump_stack_print_info (kernel/printk/printk.c:3121)
[3157055.249599] show_regs (arch/x86/kernel/dumpstack_64.c:313)
[3157055.251525] __die (arch/x86/kernel/dumpstack.c:294)
[3157055.253277] die (arch/x86/kernel/dumpstack.c:316)
[3157055.254948] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
[3157055.256791] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
[3157055.258825] ? do_device_not_available (arch/x86/kernel/traps.c:291)
[3157055.261184] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.263075] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
[3157055.265127] ? sched_clock (arch/x86/kernel/tsc.c:305)
[3157055.267091] ? nmi_handle (arch/x86/kernel/nmi.c:134)
[3157055.269083] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
[3157055.271489] do_invalid_op (arch/x86/kernel/traps.c:313)
[3157055.273463] invalid_op (arch/x86/entry/entry_64.S:925)
[3157055.275344] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.277229] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
[3157055.279103] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157055.281096] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.283115] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.284903] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.286702] <<EOE>> <#DB> ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157055.289236] <<EOE>> <UNK>
[3157055.290296] Memory state around the buggy address:
[3157055.292224] ffff88077800bd00: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.294827] ffff88077800bd80: 00 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00
[3157055.297424] >ffff88077800be00: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[3157055.300003] ^
[3157055.301810] ffff88077800be80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.304413] ffff88077800bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.307011] ==================================================================
[3157055.309596] ==================================================================
[3157055.312309] BUG: KASan: out of bounds on stack in __show_regs+0x73e/0x940 at addr ffff88077800be38
[3157055.315505] Read of size 8 by task trinity-main/11446
[3157055.317354] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
[3157055.320204] flags: 0x42fffff80000400(reserved)
[3157055.321928] page dumped because: kasan: bad access detected
[3157055.323953] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157055.327653] ffff88077800be38 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
[3157055.330417] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
[3157055.333251] 0000000000000010 ffffffff00000000 0000000000000082 ffffed00ef0017c6
[3157055.336017] Call Trace:
[3157055.336958] <NMI> dump_stack (lib/dump_stack.c:52)
[3157055.339087] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
[3157055.341276] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
[3157055.343674] ? __show_regs (arch/x86/kernel/process_64.c:76)
[3157055.345699] __show_regs (arch/x86/kernel/process_64.c:76)
[3157055.347659] ? printk (kernel/printk/printk.c:1896)
[3157055.349473] ? bitmap_weight (include/linux/bitmap.h:303)
[3157055.351520] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
[3157055.354146] ? dump_stack_print_info (kernel/printk/printk.c:3121)
[3157055.356412] show_regs (arch/x86/kernel/dumpstack_64.c:313)
[3157055.358289] __die (arch/x86/kernel/dumpstack.c:294)
[3157055.360045] die (arch/x86/kernel/dumpstack.c:316)
[3157055.361735] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
[3157055.363595] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
[3157055.365653] ? do_device_not_available (arch/x86/kernel/traps.c:291)
[3157055.367973] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.369858] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
[3157055.371934] ? sched_clock (arch/x86/kernel/tsc.c:305)
[3157055.373889] ? nmi_handle (arch/x86/kernel/nmi.c:134)
[3157055.375882] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
[3157055.378249] do_invalid_op (arch/x86/kernel/traps.c:313)
[3157055.380216] invalid_op (arch/x86/entry/entry_64.S:925)
[3157055.382139] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.384024] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
[3157055.385907] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157055.387896] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.389669] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.391502] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.393282] <<EOE>> <#DB> ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157055.395816] <<EOE>> <UNK>
[3157055.396865] Memory state around the buggy address:
[3157055.398693] ffff88077800bd00: f1 f1 f1 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.401307] ffff88077800bd80: 00 00 00 f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00
[3157055.403897] >ffff88077800be00: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00
[3157055.406472] ^
[3157055.408309] ffff88077800be80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.410885] ffff88077800bf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[3157055.413508] ==================================================================
[3157055.416083] R13: 000b375311a5d4ab R14: ffffffffa3485190 R15: ffffffffa3485180
[3157055.418637] FS: 00007f6d93c6f700(0000) GS:ffff880778000000(0000) knlGS:0000000000000000
[3157055.421726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3157055.423798] CR2: 0000000004378000 CR3: 00000003d2987000 CR4: 00000000000007e0
[3157055.426363] DR0: ffffffff81000000 DR1: 0000000000000000 DR2: 0000000000000000
[3157055.428933] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[3157055.431526] Stack:
[3157055.432310] 0000000000000001 0000000004378000 ffff88077800be98 ffffffffa0b2ff6f
[3157055.435066] ffffffffa3485180 ffffffffa3485190 000b375311a5d4ab 0000000000000000
[3157055.437846] ffff88077800be98 dffffc0000000000 ffffed014e1e4881 ffffed014e1e4883
[3157055.440612] Call Trace:
[3157055.441576] <NMI>
[3157055.442347] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157055.444426] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.446211] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.447992] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.449762] <<EOE>>
[3157055.450579] <#DB> [3157055.451465] ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157055.453456] <<EOE>>
[3157055.454274] <UNK> Code: c9 ff 68 85 c0 75 28 5b 41 5c 5d c3 4c 89 e7 e8 4a fc ff ff eb 8c e8 73 8a 02 00 65 c7 05 78 c9 ff 68 01 00 00 00 e9 04 ff ff ff <0f> 0b 0f 0b e8 8a 8b 02 00 65 c7 05 5f c9 ff 68 00 00 00 00 eb
All code
========
0: c9 leaveq
1: ff 68 85 ljmpq *-0x7b(%rax)
4: c0 (bad)
5: 75 28 jne 0x2f
7: 5b pop %rbx
8: 41 5c pop %r12
a: 5d pop %rbp
b: c3 retq
c: 4c 89 e7 mov %r12,%rdi
f: e8 4a fc ff ff callq 0xfffffffffffffc5e
14: eb 8c jmp 0xffffffffffffffa2
16: e8 73 8a 02 00 callq 0x28a8e
1b: 65 c7 05 78 c9 ff 68 movl $0x1,%gs:0x68ffc978(%rip) # 0x68ffc99e
22: 01 00 00 00
26: e9 04 ff ff ff jmpq 0xffffffffffffff2f
2b:* 0f 0b ud2 <-- trapping instruction
2d: 0f 0b ud2
2f: e8 8a 8b 02 00 callq 0x28bbe
34: 65 c7 05 5f c9 ff 68 movl $0x0,%gs:0x68ffc95f(%rip) # 0x68ffc99e
3b: 00 00 00 00
3f: eb 00 jmp 0x41

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 0f 0b ud2
4: e8 8a 8b 02 00 callq 0x28b93
9: 65 c7 05 5f c9 ff 68 movl $0x0,%gs:0x68ffc95f(%rip) # 0x68ffc973
10: 00 00 00 00
14: eb 00 jmp 0x16
[3157055.463226] RIP do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157055.465196] RSP <ffff88077800bed8>
[3157055.466582] ---[ end trace 778a5a25355bda0f ]---
[3157055.468290] Kernel panic - not syncing: Fatal exception in interrupt
[3157055.470836] Dumping ftrace buffer:
[3157055.471807] (ftrace buffer empty)
[3157055.472408] Kernel Offset: 0x16000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[3157055.474066] Rebooting in 1 seconds..




--- entry.before.o.cmd 2015-07-06 10:48:32.110189938 -0400
+++ entry.after.o.cmd 2015-07-06 10:48:23.509645442 -0400
@@ -1,5 +1,5 @@

-entry.before.o: file format elf64-x86-64
+entry.after.o: file format elf64-x86-64


Disassembly of section .entry.text:
@@ -3961,8 +3961,8 @@
3b: 09 02 or %eax,(%rdx)
...
3d: R_X86_64_64 .entry.text
- 45: 03 3a add (%rdx),%edi
- 47: 01 3d 03 d6 00 c8 add %edi,-0x37ff29fd(%rip) # ffffffffc800d650 <ignore_sysret+0xffffffffc800b1f0>
+ 45: 03 33 add (%rbx),%esi
+ 47: 01 3d 03 d5 00 c8 add %edi,-0x37ff2afd(%rip) # ffffffffc800d550 <ignore_sysret+0xffffffffc800b0f0>
4d: 44 91 rex.R xchg %eax,%ecx
4f: 93 xchg %eax,%ebx
50: 2f (bad)
@@ -3998,7 +3998,7 @@
94: 09 58 84 or %ebx,-0x7c(%rax)
97: 59 pop %rcx
98: 5c pop %rsp
- 99: f3 3d 2f 2f 4b 5e repz cmp $0x5e4b2f2f,%eax
+ 99: f3 3d 2f 2f 4b 60 repz cmp $0x604b2f2f,%eax
9f: 4b 2f rex.WXB (bad)
a1: 59 pop %rcx
a2: 83 21 59 andl $0x59,(%rcx)
@@ -4010,7 +4010,7 @@
bc: bb 83 59 5c 67 mov $0x675c5983,%ebx
c1: f8 clc
c2: 59 pop %rcx
- c3: 83 3d 03 0e 2e 4b 4c cmpl $0x4c,0x4b2e0e03(%rip) # 4b2e0ecd <ignore_sysret+0x4b2dea6d>
+ c3: 83 3d 03 0f 2e 4b 4d cmpl $0x4d,0x4b2e0f03(%rip) # 4b2e0fcd <ignore_sysret+0x4b2deb6d>
ca: 3d 30 91 30 59 cmp $0x59309130,%eax
cf: 83 03 10 addl $0x10,(%rbx)
d2: 2e 75 32 jne,pn 107 <.debug_line+0x107>
@@ -4038,14 +4038,14 @@
104: 03 0b add (%rbx),%ecx
106: c8 76 2f 22 enterq $0x2f76,$0x22
10a: 5a pop %rdx
- 10b: 08 bc 03 09 82 6a 3d or %bh,0x3d6a8209(%rbx,%rax,1)
+ 10b: 08 bc 03 09 82 6c 3d or %bh,0x3d6c8209(%rbx,%rax,1)
112: 2f (bad)
113: 83 08 bb orl $0xffffffbb,(%rax)
116: 03 0f add (%rdi),%ecx
118: ba 03 38 02 93 mov $0x93023803,%edx
11d: 0d 01 59 02 68 or $0x68025901,%eax
122: 15 83 59 85 22 adc $0x22855983,%eax
- 127: 4c 83 31 f7 rex.WR xorq $0xfffffffffffffff7,(%rcx)
+ 127: 4c 83 31 f6 rex.WR xorq $0xfffffffffffffff6,(%rcx)
12b: 5b pop %rbx
12c: 3d 2f 6c 67 f4 cmp $0xf4676c2f,%eax
131: 67 35 91 2f 91 2f addr32 xor $0x2f912f91,%eax
@@ -4073,21 +4073,20 @@
16d: 3d 59 08 bb 83 cmp $0x83bb0859,%eax
172: 59 pop %rcx
173: f3 03 23 repz add (%rbx),%esp
- 176: ba 02 80 01 14 mov $0x14018002,%edx
- 17b: 02 80 01 17 02 80 add -0x7ffde8ff(%rax),%al
- 181: 01 15 02 80 02 14 add %edx,0x14028002(%rip) # 14028189 <ignore_sysret+0x14025d29>
- 187: 02 80 02 16 02 80 add -0x7ffde9fe(%rax),%al
- 18d: 01 14 02 add %edx,(%rdx,%rax,1)
- 190: 80 01 17 addb $0x17,(%rcx)
- 193: 02 80 02 17 02 80 add -0x7ffde8fe(%rax),%al
- 199: 02 17 add (%rdi),%dl
- 19b: 02 80 02 17 02 80 add -0x7ffde8fe(%rax),%al
- 1a1: 02 14 02 add (%rdx,%rax,1),%dl
- 1a4: 80 02 14 addb $0x14,(%rdx)
- 1a7: 02 80 02 16 02 80 add -0x7ffde9fe(%rax),%al
- 1ad: 02 14 02 add (%rdx,%rax,1),%dl
- 1b0: 80 02 16 addb $0x16,(%rdx)
- 1b3: 03 ef add %edi,%ebp
+ 176: ba 02 80 01 13 mov $0x13018002,%edx
+ 17b: 02 80 01 16 02 80 add -0x7ffde9ff(%rax),%al
+ 181: 01 15 02 80 02 13 add %edx,0x13028002(%rip) # 13028189 <ignore_sysret+0x13025d29>
+ 187: 02 80 02 15 02 80 add -0x7ffdeafe(%rax),%al
+ 18d: 01 13 add %edx,(%rbx)
+ 18f: 02 80 01 16 02 80 add -0x7ffde9ff(%rax),%al
+ 195: 02 16 add (%rsi),%dl
+ 197: 02 80 02 16 02 80 add -0x7ffde9fe(%rax),%al
+ 19d: 02 16 add (%rsi),%dl
+ 19f: 02 80 02 13 02 80 add -0x7ffdecfe(%rax),%al
+ 1a5: 02 13 add (%rbx),%dl
+ 1a7: 02 80 02 15 02 80 add -0x7ffdeafe(%rax),%al
+ 1ad: 02 13 add (%rbx),%dl
+ 1af: 02 80 02 15 03 ee add -0x11fceafe(%rax),%al
1b5: 00 02 add %al,(%rdx)
1b7: 80 02 01 addb $0x1,(%rdx)
1ba: 02 30 add (%rax),%dh
@@ -4108,28 +4107,30 @@
1d9: 30 13 xor %dl,(%rbx)
1db: 02 30 add (%rax),%dh
1dd: 13 02 adc (%rdx),%eax
- 1df: 2d 18 21 67 68 sub $0x68672118,%eax
+ 1df: 2d 1a 21 67 68 sub $0x6867211a,%eax
1e4: 2f (bad)
1e5: 3d 67 21 03 0f cmp $0xf032167,%eax
1ea: 74 21 je 20d <.debug_line+0x20d>
1ec: 3d 83 9f 21 59 cmp $0x59219f83,%eax
- 1f1: 21 83 03 d9 00 d6 and %eax,-0x29ff26fd(%rbx)
+ 1f1: 21 83 03 da 00 d6 and %eax,-0x29ff25fd(%rbx)
1f7: 02 80 01 16 02 80 add -0x7ffde9ff(%rax),%al
1fd: 01 13 add %edx,(%rbx)
1ff: 02 80 01 13 02 30 add 0x30021301(%rax),%al
- 205: 18 02 sbb %al,(%rdx)
+ 205: 1a 02 sbb (%rdx),%al
207: 30 13 xor %dl,(%rbx)
- 209: 02 60 14 add 0x14(%rax),%ah
+ 209: 02 60 15 add 0x15(%rax),%ah
20c: 02 30 add (%rax),%dh
- 20e: 15 03 09 02 5d adc $0x5d020903,%eax
- 213: 01 21 add %esp,(%rcx)
- 215: 02 2d 13 08 c9 59 add 0x59c90813(%rip),%ch # 59c90a2e <ignore_sysret+0x59c8e5ce>
+ 20e: 16 (bad)
+ 20f: 03 09 add (%rcx),%ecx
+ 211: 02 5d 01 add 0x1(%rbp),%bl
+ 214: 21 02 and %eax,(%rdx)
+ 216: 2d 13 08 c9 59 sub $0x59c90813,%eax
21b: 59 pop %rcx
21c: 2f (bad)
21d: 2f (bad)
21e: 2f (bad)
21f: 67 2f addr32 (bad)
- 221: 03 0f add (%rdi),%ecx
+ 221: 03 10 add (%rax),%edx
223: ba 83 e5 2f 2f mov $0x2f2fe583,%edx
228: f3 3d 30 08 92 08 repz cmp $0x8920830,%eax
22e: bb 02 2d 13 4b mov $0x4b132d02,%ebx
@@ -4155,8 +4156,7 @@
264: 2e cs
265: 5a pop %rdx
266: 5a pop %rdx
- 267: 2f (bad)
- 268: 75 5a jne 2c4 <syscall_return+0x38>
+ 267: 30 75 5a xor %dh,0x5a(%rbp)
26a: 03 09 add (%rcx),%ecx
26c: 2e 75 59 jne,pn 2c8 <syscall_return+0x3c>
26f: 2f (bad)
@@ -4169,7 +4169,7 @@
27a: 5b pop %rbx
27b: 4d 23 03 and (%r11),%r8
27e: 27 (bad)
- 27f: 66 4d 33 4f 03 data32 xor 0x3(%r15),%r9
+ 27f: 66 4d 31 4f 03 data32 xor %r9,0x3(%r15)
284: 13 08 adc (%rax),%ecx
286: 3c 93 cmp $0x93,%al
288: 4d 08 3d 52 2f 03 09 rex.WRB or %r15b,0x9032f52(%rip) # 90331e1 <ignore_sysret+0x9030d81>
@@ -4188,10 +4188,7 @@
2ae: 09 02 or %eax,(%rdx)
...
2b0: R_X86_64_64 .fixup
- 2b8: 03 cc add %esp,%ecx
- 2ba: 07 (bad)
- 2bb: 01 67 2f add %esp,0x2f(%rdi)
- 2be: 2f (bad)
+ 2b8: 03 bc 07 01 67 2f 2f add 0x2f2f6701(%rdi,%rax,1),%edi
2bf: 02 .byte 0x2
2c0: 05 .byte 0x5
2c1: 00 01 add %al,(%rcx)

2015-07-06 16:07:46

by Ingo Molnar

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S


* Sasha Levin <[email protected]> wrote:

> Hey Ingo,
>
> I've started seeing the fuzzer hitting the BUG() at arch/x86/kernel/nmi.c:533. git
> blame pointed to this patch. I know that you didn't see any changes in the compiled
> file in your testcase, but I do see changes in mine.

Hm, weird - could you send me your .config please?

Thanks,

Ingo

2015-07-06 16:21:09

by Sasha Levin

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On 07/06/2015 12:07 PM, Ingo Molnar wrote:
>
> * Sasha Levin <[email protected]> wrote:
>
>> Hey Ingo,
>>
>> I've started seeing the fuzzer hitting the BUG() at arch/x86/kernel/nmi.c:533. git
>> blame pointed to this patch. I know that you didn't see any changes in the compiled
>> file in your testcase, but I do see changes in mine.
>
> Hm, weird - could you send me your .config please?

Attached.


Thanks,
Sasha


Attachments:
config-sasha (163.44 kB)

2015-07-06 16:24:08

by Ingo Molnar

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S


* Sasha Levin <[email protected]> wrote:

> On 07/06/2015 12:07 PM, Ingo Molnar wrote:
> >
> > * Sasha Levin <[email protected]> wrote:
> >
> >> Hey Ingo,
> >>
> >> I've started seeing the fuzzer hitting the BUG() at arch/x86/kernel/nmi.c:533. git
> >> blame pointed to this patch. I know that you didn't see any changes in the compiled
> >> file in your testcase, but I do see changes in mine.
> >
> > Hm, weird - could you send me your .config please?
>
> Attached.

It's still weird: I copied your config to 4d7321381e5c and ran 'make oldconfig' -
and it asked a couple of questions, which suggests that your config does not come
from 4d7321381e5c?

Furthermore, building 4d7321381e5c and 4d7321381e5c^1 gives me:

# arch/x86/entry/entry_64.o:

text data bss dec hex filename
11530 0 0 11530 2d0a entry_64.o.before
11530 0 0 11530 2d0a entry_64.o.after

md5:
1430e793250d5d8572bc2df2997d3929 entry_64.o.before.asm
1430e793250d5d8572bc2df2997d3929 entry_64.o.after.asm

i.e. I have trouble reproducing the discrepancy you are seeing. (I also tried
allmodconfig/allyesconfig again for good measure - no luck.)

Thanks,

Ingo
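
The before/after comparison described above - disassemble both object files and hash the output - can be sketched as follows. This is a self-contained demo that compiles a trivial stand-in file twice rather than the actual kernel's entry_64.o, and the exact commands are an assumption about the workflow, not Ingo's script:

```shell
# Compare the generated code of two builds: disassemble each object
# and hash the disassembly. Identical generated code => identical hashes.
cat > demo.c <<'EOF'
int add(int a, int b) { return a + b; }
EOF
gcc -c demo.c -o before.o
gcc -c demo.c -o after.o
# tail -n +3 drops objdump's leading blank line and the
# "before.o: file format ..." header, which would otherwise make the
# hashes differ just because the filenames differ.
objdump -d before.o | tail -n +3 > before.asm
objdump -d after.o  | tail -n +3 > after.asm
md5sum before.asm after.asm
```

Comparing the disassembly rather than the raw object files ignores irrelevant metadata differences and catches only changes in the emitted instructions, which is what matters for a supposedly no-op cleanup patch.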

2015-07-06 16:38:21

by Sasha Levin

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On 07/06/2015 12:23 PM, Ingo Molnar wrote:
>
> * Sasha Levin <[email protected]> wrote:
>
>> On 07/06/2015 12:07 PM, Ingo Molnar wrote:
>>>
>>> * Sasha Levin <[email protected]> wrote:
>>>
>>>> Hey Ingo,
>>>>
>>>> I've started seeing the fuzzer hitting the BUG() at arch/x86/kernel/nmi.c:533. git
>>>> blame pointed to this patch. I know that you didn't see any changes in the compiled
>>>> file in your testcase, but I do see changes in mine.
>>>
>>> Hm, weird - could you send me your .config please?
>>
>> Attached.
>
> It's still weird: I copied your config to 4d7321381e5c and made 'make oldconfig' -
> and it asked a couple of questions, which suggests that your config does not come
> from 4d7321381e5c?

Right, sorry, this is my base config for -next. I just did 'yes "" | make oldconfig'
for the build test.

I've attached it.

> Furthermore, building 4d7321381e5c and 4d7321381e5c^1 gives me:
>
> # arch/x86/entry/entry_64.o:
>
> text data bss dec hex filename
> 11530 0 0 11530 2d0a entry_64.o.before
> 11530 0 0 11530 2d0a entry_64.o.after
>
> md5:
> 1430e793250d5d8572bc2df2997d3929 entry_64.o.before.asm
> 1430e793250d5d8572bc2df2997d3929 entry_64.o.after.asm
>
> i.e. I have trouble reproducing the discrepancy you are seeing. (I also tried
> allmodconfig/allyesconfig again for good measure - no luck.)

With allyesconfig I don't actually see a change. Only with the config I've attached.


Thanks,
Sasha


Attachments:
config-sasha (157.47 kB)

2015-07-06 16:43:37

by Ingo Molnar

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S


* Sasha Levin <[email protected]> wrote:

> On 07/06/2015 12:23 PM, Ingo Molnar wrote:
> >
> > * Sasha Levin <[email protected]> wrote:
> >
> >> On 07/06/2015 12:07 PM, Ingo Molnar wrote:
> >>>
> >>> * Sasha Levin <[email protected]> wrote:
> >>>
> >>>> Hey Ingo,
> >>>>
> >>>> I've started seeing the fuzzer hitting the BUG() at arch/x86/kernel/nmi.c:533. git
> >>>> blame pointed to this patch. I know that you didn't see any changes in the compiled
> >>>> file in your testcase, but I do see changes in mine.
> >>>
> >>> Hm, weird - could you send me your .config please?
> >>
> >> Attached.
> >
> > It's still weird: I copied your config to 4d7321381e5c and made 'make oldconfig' -
> > and it asked a couple of questions, which suggests that your config does not come
> > from 4d7321381e5c?
>
> Right, sorry, this is my base config for -next. I just did 'yes "" | make oldconfig'
> for the build test.
>
> I've attached it.
>
> > Furthermore, building 4d7321381e5c and 4d7321381e5c^1 gives me:
> >
> > # arch/x86/entry/entry_64.o:
> >
> > text data bss dec hex filename
> > 11530 0 0 11530 2d0a entry_64.o.before
> > 11530 0 0 11530 2d0a entry_64.o.after
> >
> > md5:
> > 1430e793250d5d8572bc2df2997d3929 entry_64.o.before.asm
> > 1430e793250d5d8572bc2df2997d3929 entry_64.o.after.asm
> >
> > i.e. I have trouble reproducing the discrepancy you are seeing. (I also tried
> > allmodconfig/allyesconfig again for good measure - no luck.)
>
> With allyesconfig I don't actually see a change. Only with the config I've attached.

So I still cannot reproduce it even with your latest config:

# arch/x86/entry/entry_64.o:

text data bss dec hex filename
11530 0 0 11530 2d0a entry_64.o.before
11530 0 0 11530 2d0a entry_64.o.after

md5:
1430e793250d5d8572bc2df2997d3929 entry_64.o.before.asm
1430e793250d5d8572bc2df2997d3929 entry_64.o.after.asm

but I saw this during the build:

scripts/Makefile.kasan:23: CONFIG_KASAN: compiler does not support all options. Trying minimal configuration

Is your KASAN build more fully enabled, and does the problem reproduce with
CONFIG_KASAN turned off?

Thanks,

Ingo

2015-07-06 17:03:28

by Sasha Levin

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On 07/06/2015 12:43 PM, Ingo Molnar wrote:
>
> * Sasha Levin <[email protected]> wrote:
>
>> On 07/06/2015 12:23 PM, Ingo Molnar wrote:
>>>
>>> * Sasha Levin <[email protected]> wrote:
>>>
>>>> On 07/06/2015 12:07 PM, Ingo Molnar wrote:
>>>>>
>>>>> * Sasha Levin <[email protected]> wrote:
>>>>>
>>>>>> Hey Ingo,
>>>>>>
>>>>>> I've started seeing the fuzzer hitting the BUG() at arch/x86/kernel/nmi.c:533. git
>>>>>> blame pointed to this patch. I know that you didn't see any changes in the compiled
>>>>>> file in your testcase, but I do see changes in mine.
>>>>>
>>>>> Hm, weird - could you send me your .config please?
>>>>
>>>> Attached.
>>>
>>> It's still weird: I copied your config to 4d7321381e5c and made 'make oldconfig' -
>>> and it asked a couple of questions, which suggests that your config does not come
>>> from 4d7321381e5c?
>>
>> Right, sorry, this is my base config for -next. I just did 'yes "" | make oldconfig'
>> for the build test.
>>
>> I've attached it.
>>
>>> Furthermore, building 4d7321381e5c and 4d7321381e5c^1 gives me:
>>>
>>> # arch/x86/entry/entry_64.o:
>>>
>>> text data bss dec hex filename
>>> 11530 0 0 11530 2d0a entry_64.o.before
>>> 11530 0 0 11530 2d0a entry_64.o.after
>>>
>>> md5:
>>> 1430e793250d5d8572bc2df2997d3929 entry_64.o.before.asm
>>> 1430e793250d5d8572bc2df2997d3929 entry_64.o.after.asm
>>>
>>> i.e. I have trouble reproducing the discrepancy you are seeing. (I also tried
>>> allmodconfig/allyesconfig again for good measure - no luck.)
>>
>> With allyesconfig I don't actually see a change. Only with the config I've attached.
>
> So I still cannot reproduce it even with your latest config:
>
> # arch/x86/entry/entry_64.o:
>
> text data bss dec hex filename
> 11530 0 0 11530 2d0a entry_64.o.before
> 11530 0 0 11530 2d0a entry_64.o.after
>
> md5:
> 1430e793250d5d8572bc2df2997d3929 entry_64.o.before.asm
> 1430e793250d5d8572bc2df2997d3929 entry_64.o.after.asm
>
> but I saw this during the build:
>
> scripts/Makefile.kasan:23: CONFIG_KASAN: compiler does not support all options. Trying minimal configuration
>
> is your KASAN build more fully enabled, and does the problem reproduce with
> CONFIG_KASAN turned off?

KASAN was fully enabled, but this reproduces with CONFIG_KASAN turned off as well.

I also thought that it might be because I'm using gcc6, but going back to 4.7.2
produced the same difference.


Thanks,
Sasha

2015-07-06 17:20:45

by Andy Lutomirski

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
>
> --- entry.before.o.cmd 2015-07-06 10:48:32.110189938 -0400
> +++ entry.after.o.cmd 2015-07-06 10:48:23.509645442 -0400
> @@ -1,5 +1,5 @@
>
> -entry.before.o: file format elf64-x86-64
> +entry.after.o: file format elf64-x86-64
>
>
> Disassembly of section .entry.text:
> @@ -3961,8 +3961,8 @@
> 3b: 09 02 or %eax,(%rdx)
> ...
> 3d: R_X86_64_64 .entry.text
> - 45: 03 3a add (%rdx),%edi
> - 47: 01 3d 03 d6 00 c8 add %edi,-0x37ff29fd(%rip) # ffffffffc800d650 <ignore_sysret+0xffffffffc800b1f0>
> + 45: 03 33 add (%rbx),%esi
> + 47: 01 3d 03 d5 00 c8 add %edi,-0x37ff2afd(%rip) # ffffffffc800d550 <ignore_sysret+0xffffffffc800b0f0>

What exactly are you doing to generate this diff? This all looks really weird.

> 4d: 44 91 rex.R xchg %eax,%ecx
> 4f: 93 xchg %eax,%ebx
> 50: 2f (bad)

For example: what on earth is the asm above?

--Andy

2015-07-06 17:35:29

by Sasha Levin

Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On 07/06/2015 01:20 PM, Andy Lutomirski wrote:
> On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
>> >
>> > --- entry.before.o.cmd 2015-07-06 10:48:32.110189938 -0400
>> > +++ entry.after.o.cmd 2015-07-06 10:48:23.509645442 -0400
>> > @@ -1,5 +1,5 @@
>> >
>> > -entry.before.o: file format elf64-x86-64
>> > +entry.after.o: file format elf64-x86-64
>> >
>> >
>> > Disassembly of section .entry.text:
>> > @@ -3961,8 +3961,8 @@
>> > 3b: 09 02 or %eax,(%rdx)
>> > ...
>> > 3d: R_X86_64_64 .entry.text
>> > - 45: 03 3a add (%rdx),%edi
>> > - 47: 01 3d 03 d6 00 c8 add %edi,-0x37ff29fd(%rip) # ffffffffc800d650 <ignore_sysret+0xffffffffc800b1f0>
>> > + 45: 03 33 add (%rbx),%esi
>> > + 47: 01 3d 03 d5 00 c8 add %edi,-0x37ff2afd(%rip) # ffffffffc800d550 <ignore_sysret+0xffffffffc800b0f0>
> What exactly are you doing to generate this diff? This all looks really weird.
>
>> > 4d: 44 91 rex.R xchg %eax,%ecx
>> > 4f: 93 xchg %eax,%ebx
>> > 50: 2f (bad)
> For example: what on earth is the asm above?

objdump...


Thanks,
Sasha

2015-07-06 17:41:52

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S


* Sasha Levin <[email protected]> wrote:

> On 07/06/2015 01:20 PM, Andy Lutomirski wrote:
> > On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
> >> >
> >> > --- entry.before.o.cmd 2015-07-06 10:48:32.110189938 -0400
> >> > +++ entry.after.o.cmd 2015-07-06 10:48:23.509645442 -0400
> >> > @@ -1,5 +1,5 @@
> >> >
> >> > -entry.before.o: file format elf64-x86-64
> >> > +entry.after.o: file format elf64-x86-64
> >> >
> >> >
> >> > Disassembly of section .entry.text:
> >> > @@ -3961,8 +3961,8 @@
> >> > 3b: 09 02 or %eax,(%rdx)
> >> > ...
> >> > 3d: R_X86_64_64 .entry.text
> >> > - 45: 03 3a add (%rdx),%edi
> >> > - 47: 01 3d 03 d6 00 c8 add %edi,-0x37ff29fd(%rip) # ffffffffc800d650 <ignore_sysret+0xffffffffc800b1f0>
> >> > + 45: 03 33 add (%rbx),%esi
> >> > + 47: 01 3d 03 d5 00 c8 add %edi,-0x37ff2afd(%rip) # ffffffffc800d550 <ignore_sysret+0xffffffffc800b0f0>
> > What exactly are you doing to generate this diff? This all looks really weird.
> >
> >> > 4d: 44 91 rex.R xchg %eax,%ecx
> >> > 4f: 93 xchg %eax,%ebx
> >> > 50: 2f (bad)
> > For example: what on earth is the asm above?
>
> objdump...

Oh, so I'm using 'objdump -d' to compare - but you probably used 'objdump
--disassemble-all', to disassemble .data sections as well?

Thanks,

Ingo

2015-07-06 18:35:47

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On Mon, Jul 6, 2015 at 10:41 AM, Ingo Molnar <[email protected]> wrote:
>
> * Sasha Levin <[email protected]> wrote:
>
>> On 07/06/2015 01:20 PM, Andy Lutomirski wrote:
>> > On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
>> >> >
>> >> > --- entry.before.o.cmd 2015-07-06 10:48:32.110189938 -0400
>> >> > +++ entry.after.o.cmd 2015-07-06 10:48:23.509645442 -0400
>> >> > @@ -1,5 +1,5 @@
>> >> >
>> >> > -entry.before.o: file format elf64-x86-64
>> >> > +entry.after.o: file format elf64-x86-64
>> >> >
>> >> >
>> >> > Disassembly of section .entry.text:
>> >> > @@ -3961,8 +3961,8 @@
>> >> > 3b: 09 02 or %eax,(%rdx)
>> >> > ...
>> >> > 3d: R_X86_64_64 .entry.text
>> >> > - 45: 03 3a add (%rdx),%edi
>> >> > - 47: 01 3d 03 d6 00 c8 add %edi,-0x37ff29fd(%rip) # ffffffffc800d650 <ignore_sysret+0xffffffffc800b1f0>
>> >> > + 45: 03 33 add (%rbx),%esi
>> >> > + 47: 01 3d 03 d5 00 c8 add %edi,-0x37ff2afd(%rip) # ffffffffc800d550 <ignore_sysret+0xffffffffc800b0f0>
>> > What exactly are you doing to generate this diff? This all looks really weird.
>> >
>> >> > 4d: 44 91 rex.R xchg %eax,%ecx
>> >> > 4f: 93 xchg %eax,%ebx
>> >> > 50: 2f (bad)
>> > For example: what on earth is the asm above?
>>
>> objdump...
>
> Oh, so I'm using 'objdump -d' to compare - but you probably used 'objdump
> --disassemble-all', to disassemble .data sections as well?
>

I can reproduce the difference now. Give me a few minutes to see if I
can figure out what's causing it.

--Andy

> Thanks,
>
> Ingo



--
Andy Lutomirski
AMA Capital Management, LLC

2015-07-06 18:39:51

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On Mon, Jul 6, 2015 at 11:35 AM, Andy Lutomirski <[email protected]> wrote:
> On Mon, Jul 6, 2015 at 10:41 AM, Ingo Molnar <[email protected]> wrote:
>>
>> * Sasha Levin <[email protected]> wrote:
>>
>>> On 07/06/2015 01:20 PM, Andy Lutomirski wrote:
>>> > On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
>>> >> >
>>> >> > --- entry.before.o.cmd 2015-07-06 10:48:32.110189938 -0400
>>> >> > +++ entry.after.o.cmd 2015-07-06 10:48:23.509645442 -0400
>>> >> > @@ -1,5 +1,5 @@
>>> >> >
>>> >> > -entry.before.o: file format elf64-x86-64
>>> >> > +entry.after.o: file format elf64-x86-64
>>> >> >
>>> >> >
>>> >> > Disassembly of section .entry.text:
>>> >> > @@ -3961,8 +3961,8 @@
>>> >> > 3b: 09 02 or %eax,(%rdx)
>>> >> > ...
>>> >> > 3d: R_X86_64_64 .entry.text
>>> >> > - 45: 03 3a add (%rdx),%edi
>>> >> > - 47: 01 3d 03 d6 00 c8 add %edi,-0x37ff29fd(%rip) # ffffffffc800d650 <ignore_sysret+0xffffffffc800b1f0>
>>> >> > + 45: 03 33 add (%rbx),%esi
>>> >> > + 47: 01 3d 03 d5 00 c8 add %edi,-0x37ff2afd(%rip) # ffffffffc800d550 <ignore_sysret+0xffffffffc800b0f0>
>>> > What exactly are you doing to generate this diff? This all looks really weird.
>>> >
>>> >> > 4d: 44 91 rex.R xchg %eax,%ecx
>>> >> > 4f: 93 xchg %eax,%ebx
>>> >> > 50: 2f (bad)
>>> > For example: what on earth is the asm above?
>>>
>>> objdump...
>>
>> Oh, so I'm using 'objdump -d' to compare - but you probably used 'objdump
>> --disassemble-all', to disassemble .data sections as well?
>>
>
> I can reproduce the difference now. Give me a few minutes to see if I
> can figure out what's causing it.

It's debug info. If I strip the two .o files, the results are
identical. I think the differences are just line numbers.
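
That check can be sketched roughly like this (a sketch assuming GNU
binutils; the stripped_identical name is made up for illustration):

```shell
#!/bin/bash
# Sketch of the check described above (assumes GNU binutils).
# --strip-debug removes the .debug_* sections (and their relocations)
# but keeps code, data and symbols, so byte-comparing the stripped
# copies ignores debug-info-only differences such as shifted line
# numbers or a different build directory.
stripped_identical() {
    local a b
    a=$(mktemp) b=$(mktemp)
    objcopy --strip-debug "$1" "$a"
    objcopy --strip-debug "$2" "$b"
    cmp -s "$a" "$b"
}
```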

Why would this matter? Sasha, can you double-check that this patch
really introduced the problem?

--Andy

2015-07-07 07:02:02

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S


* Andy Lutomirski <[email protected]> wrote:

> On Mon, Jul 6, 2015 at 10:41 AM, Ingo Molnar <[email protected]> wrote:
> >
> > * Sasha Levin <[email protected]> wrote:
> >
> >> On 07/06/2015 01:20 PM, Andy Lutomirski wrote:
> >> > On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
> >> >> >
> >> >> > --- entry.before.o.cmd 2015-07-06 10:48:32.110189938 -0400
> >> >> > +++ entry.after.o.cmd 2015-07-06 10:48:23.509645442 -0400
> >> >> > @@ -1,5 +1,5 @@
> >> >> >
> >> >> > -entry.before.o: file format elf64-x86-64
> >> >> > +entry.after.o: file format elf64-x86-64
> >> >> >
> >> >> >
> >> >> > Disassembly of section .entry.text:
> >> >> > @@ -3961,8 +3961,8 @@
> >> >> > 3b: 09 02 or %eax,(%rdx)
> >> >> > ...
> >> >> > 3d: R_X86_64_64 .entry.text
> >> >> > - 45: 03 3a add (%rdx),%edi
> >> >> > - 47: 01 3d 03 d6 00 c8 add %edi,-0x37ff29fd(%rip) # ffffffffc800d650 <ignore_sysret+0xffffffffc800b1f0>
> >> >> > + 45: 03 33 add (%rbx),%esi
> >> >> > + 47: 01 3d 03 d5 00 c8 add %edi,-0x37ff2afd(%rip) # ffffffffc800d550 <ignore_sysret+0xffffffffc800b0f0>
> >> > What exactly are you doing to generate this diff? This all looks really weird.
> >> >
> >> >> > 4d: 44 91 rex.R xchg %eax,%ecx
> >> >> > 4f: 93 xchg %eax,%ebx
> >> >> > 50: 2f (bad)
> >> > For example: what on earth is the asm above?
> >>
> >> objdump...
> >
> > Oh, so I'm using 'objdump -d' to compare - but you probably used 'objdump
> > --disassemble-all', to disassemble .data sections as well?
> >
>
> I can reproduce the difference now. Give me a few minutes to see if I
> can figure out what's causing it.

Yeah, so I'm using -d instead of --disassemble-all precisely to avoid false
positives from .data details such as line-number-sensitive debug info.
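
For reference, the distinction can be sketched as a small helper (a sketch
assuming GNU binutils on Linux; the diff_code name is made up for
illustration):

```shell
#!/bin/bash
# Sketch of the comparison described above (assumes GNU binutils).
# 'objdump -d' disassembles only code sections, so debug-info-only
# changes cannot produce false positives; 'objdump -D'
# (--disassemble-all) also dumps .data and .debug_* sections, which
# differ whenever line numbers move.  The 'file format' header line
# names each input file, so filter it out before diffing.
diff_code() {
    diff <(objdump -d "$1" | grep -v 'file format') \
         <(objdump -d "$2" | grep -v 'file format')
}
```

With two objects built from the same source, diff_code should exit 0 even if
only one of them carries debug info, while a diff of the `objdump -D` output
of the same pair would not.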

Thanks,

Ingo

2015-07-08 15:40:53

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On 07/06/2015 02:39 PM, Andy Lutomirski wrote:
>> I can reproduce the difference now. Give me a few minutes to see if I
>> > can figure out what's causing it.
> It's debug info. If I strip the two .o files, the results are
> identical. I think the differences are just line numbers.
>
> Why would this matter? Sasha, can you double-check that this patch
> really introduced the problem?

I wasn't sure that patch was causing the problem to begin with; I just
singled it out because I saw it generated changes and git blame pointed
to it for the modified lines.

I wasn't able to reproduce that issue again; I'll update when/if I do.


Thanks,
Sasha

2015-07-09 00:59:54

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

Having failed to bisect, let's look at the trace:

On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
> [3157054.661763] ------------[ cut here ]------------
> [3157054.662552] kernel BUG at arch/x86/kernel/nmi.c:533!
> [3157054.663277] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> [3157054.664164] Dumping ftrace buffer:
> [3157054.664740] (ftrace buffer empty)
> [3157054.665274] Modules linked in:
> [3157054.665768] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
> [3157054.667203] task: ffff880408813000 ti: ffff8803d29c8000 task.ti: ffff8803d29c8000
> [3157054.668256] RIP: do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
> [3157054.669378] RSP: 0018:ffff88077800bed8 EFLAGS: 00010006
> [3157054.670141] ==================================================================
> [3157054.671268] BUG: KASan: out of bounds on stack in __show_regs+0x7f6/0x940 at addr ffff88077800be50

I bet that __show_regs interacts poorly with KASan for some reason.
But that's not the underlying bug. In fact, the bad read is quite
close to RSP, so this is almost certainly a bug in KASan or
__show_regs.

> [3157054.674604] Read of size 8 by task trinity-main/11446
> [3157054.676521] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
> [3157054.679451] flags: 0x42fffff80000400(reserved)
> [3157054.681237] page dumped because: kasan: bad access detected
> [3157054.683326] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
> [3157054.687097] ffff88077800be50 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
> [3157054.690303] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
> [3157054.693365] ffff88077800bab8 ffffffffa0abe0b3 0000000000000082 ffffffffa2fe39e4
> [3157054.696209] Call Trace:
> [3157054.697180] <NMI> dump_stack (lib/dump_stack.c:52)
> [3157054.699390] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
> [3157054.701663] ? printk (kernel/printk/printk.c:1896)
> [3157054.703531] ? bitmap_weight (include/linux/bitmap.h:303)
> [3157054.705553] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
> [3157054.708014] ? __show_regs (arch/x86/kernel/process_64.c:68)
> [3157054.710046] __show_regs (arch/x86/kernel/process_64.c:68)
> [3157054.712066] ? printk (kernel/printk/printk.c:1896)
> [3157054.713878] ? bitmap_weight (include/linux/bitmap.h:303)
> [3157054.715875] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
> [3157054.718336] ? dump_stack_print_info (kernel/printk/printk.c:3121)
> [3157054.720619] show_regs (arch/x86/kernel/dumpstack_64.c:313)
> [3157054.722530] __die (arch/x86/kernel/dumpstack.c:294)
> [3157054.724290] die (arch/x86/kernel/dumpstack.c:316)
> [3157054.725962] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
> [3157054.727805] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
> [3157054.729843] ? do_device_not_available (arch/x86/kernel/traps.c:291)
> [3157054.732211] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
> [3157054.734101] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
> [3157054.736165] ? sched_clock (arch/x86/kernel/tsc.c:305)
> [3157054.738126] ? nmi_handle (arch/x86/kernel/nmi.c:134)
> [3157054.740133] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
> [3157054.742997] do_invalid_op (arch/x86/kernel/traps.c:313)
> [3157054.744991] invalid_op (arch/x86/entry/entry_64.S:925)

So we got #UD somewhere...

> [3157054.746873] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
> [3157054.748769] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
> [3157054.750658] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)

...here, perhaps?

Do you know what line 1435 was in the version you tested? There
shouldn't be funny instructions in end_repeat_nmi, though. Did we end
up off an instruction boundary?

Here's my wild guess. The repeat_nmi thing is really rare. What if
there's a CPU or emulator that can't do mov %cr2, %r12 or vice versa?
mov from cr has a somewhat unusual encoding. What platform is this?
Does KASan play games that would cause KVM to emulate a mov to or from
cr2?
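
(That encoding is easy to inspect from userspace: assembling the
instruction is fine, only executing it in ring 3 would fault. A sketch
assuming GNU as/objdump on an x86-64 host; the function name is made up
for illustration:)

```shell
#!/bin/bash
# Sketch: assemble 'mov %cr2, %r12' and dump its encoding (assumes GNU
# as/objdump on x86-64).  MOV from a control register uses opcode 0f 20
# with the CR number in the ModRM reg field and the destination GPR in
# the rm field -- there is no memory-operand form, which is what makes
# the encoding unusual.
encode_mov_cr2_r12() {
    local obj
    obj=$(mktemp)
    echo 'mov %cr2, %r12' | as -o "$obj"
    objdump -d "$obj" | grep 'mov'
}
```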


--Andy

2015-07-10 13:30:35

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On 07/08/2015 08:59 PM, Andy Lutomirski wrote:
> Having failed to bisect, let's look at the trace:
>
> On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
>> [3157054.661763] ------------[ cut here ]------------
>> [3157054.662552] kernel BUG at arch/x86/kernel/nmi.c:533!
>> [3157054.663277] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
>> [3157054.664164] Dumping ftrace buffer:
>> [3157054.664740] (ftrace buffer empty)
>> [3157054.665274] Modules linked in:
>> [3157054.665768] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
>> [3157054.667203] task: ffff880408813000 ti: ffff8803d29c8000 task.ti: ffff8803d29c8000
>> [3157054.668256] RIP: do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>> [3157054.669378] RSP: 0018:ffff88077800bed8 EFLAGS: 00010006
>> [3157054.670141] ==================================================================
>> [3157054.671268] BUG: KASan: out of bounds on stack in __show_regs+0x7f6/0x940 at addr ffff88077800be50
>
> I bet that__show_regs interacts poorly with KASan for some reason.
> But that's not the underlying bug. In fact, the bad read is quite
> close the RSP, so this is almost certainly a bug in KASan or
> __show_regs.
>
>> [3157054.674604] Read of size 8 by task trinity-main/11446
>> [3157054.676521] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
>> [3157054.679451] flags: 0x42fffff80000400(reserved)
>> [3157054.681237] page dumped because: kasan: bad access detected
>> [3157054.683326] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
>> [3157054.687097] ffff88077800be50 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
>> [3157054.690303] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
>> [3157054.693365] ffff88077800bab8 ffffffffa0abe0b3 0000000000000082 ffffffffa2fe39e4
>> [3157054.696209] Call Trace:
>> [3157054.697180] <NMI> dump_stack (lib/dump_stack.c:52)
>> [3157054.699390] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
>> [3157054.701663] ? printk (kernel/printk/printk.c:1896)
>> [3157054.703531] ? bitmap_weight (include/linux/bitmap.h:303)
>> [3157054.705553] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
>> [3157054.708014] ? __show_regs (arch/x86/kernel/process_64.c:68)
>> [3157054.710046] __show_regs (arch/x86/kernel/process_64.c:68)
>> [3157054.712066] ? printk (kernel/printk/printk.c:1896)
>> [3157054.713878] ? bitmap_weight (include/linux/bitmap.h:303)
>> [3157054.715875] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
>> [3157054.718336] ? dump_stack_print_info (kernel/printk/printk.c:3121)
>> [3157054.720619] show_regs (arch/x86/kernel/dumpstack_64.c:313)
>> [3157054.722530] __die (arch/x86/kernel/dumpstack.c:294)
>> [3157054.724290] die (arch/x86/kernel/dumpstack.c:316)
>> [3157054.725962] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
>> [3157054.727805] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
>> [3157054.729843] ? do_device_not_available (arch/x86/kernel/traps.c:291)
>> [3157054.732211] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>> [3157054.734101] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
>> [3157054.736165] ? sched_clock (arch/x86/kernel/tsc.c:305)
>> [3157054.738126] ? nmi_handle (arch/x86/kernel/nmi.c:134)
>> [3157054.740133] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
>> [3157054.742997] do_invalid_op (arch/x86/kernel/traps.c:313)
>> [3157054.744991] invalid_op (arch/x86/entry/entry_64.S:925)
>
> So we got #UD somewhere...
>
>> [3157054.746873] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>> [3157054.748769] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
>> [3157054.750658] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
>
> ...here, perhaps?
>
> Do you know what line 1435 was in the version you tested? There
> shouldn't be funny instructions in end_repeat_nmi, though. Did we end
> up off an instruction boundary?

Yes, arch/x86/entry/entry_64.S:1435 is 'call do_nmi', and do_nmi does
nmi_enter(), which is actually:

? do_nmi (arch/x86/kernel/nmi.c:533

And that has a BUG_ON(in_nmi()); in it.

>
> Here's my wild guess. The repeat_nmi thing is really rare. What if
> there's a CPU or emulator that can't do mov %cr2, %r12 or vice versa?
> mov from cr has a somewhat unusual encoding. What platform is this?
> Does KASan play games that would cause KVM to emulate a mov to or from
> cr2?

It's just a regular KVM guest; I'm not sure about KASan (Andrey Cc'ed).


Thanks,
Sasha

2015-07-10 15:26:48

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On 07/09/2015 03:59 AM, Andy Lutomirski wrote:
> Having failed to bisect, let's look at the trace:
>
> On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
>> [3157054.661763] ------------[ cut here ]------------
>> [3157054.662552] kernel BUG at arch/x86/kernel/nmi.c:533!
>> [3157054.663277] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
>> [3157054.664164] Dumping ftrace buffer:
>> [3157054.664740] (ftrace buffer empty)
>> [3157054.665274] Modules linked in:
>> [3157054.665768] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
>> [3157054.667203] task: ffff880408813000 ti: ffff8803d29c8000 task.ti: ffff8803d29c8000
>> [3157054.668256] RIP: do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>> [3157054.669378] RSP: 0018:ffff88077800bed8 EFLAGS: 00010006
>> [3157054.670141] ==================================================================
>> [3157054.671268] BUG: KASan: out of bounds on stack in __show_regs+0x7f6/0x940 at addr ffff88077800be50
>
> I bet that__show_regs interacts poorly with KASan for some reason.
> But that's not the underlying bug. In fact, the bad read is quite
> close the RSP, so this is almost certainly a bug in KASan or
> __show_regs.
>
>> [3157054.674604] Read of size 8 by task trinity-main/11446
>> [3157054.676521] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
>> [3157054.679451] flags: 0x42fffff80000400(reserved)
>> [3157054.681237] page dumped because: kasan: bad access detected
>> [3157054.683326] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
>> [3157054.687097] ffff88077800be50 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
>> [3157054.690303] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
>> [3157054.693365] ffff88077800bab8 ffffffffa0abe0b3 0000000000000082 ffffffffa2fe39e4
>> [3157054.696209] Call Trace:
>> [3157054.697180] <NMI> dump_stack (lib/dump_stack.c:52)
>> [3157054.699390] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
>> [3157054.701663] ? printk (kernel/printk/printk.c:1896)
>> [3157054.703531] ? bitmap_weight (include/linux/bitmap.h:303)
>> [3157054.705553] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
>> [3157054.708014] ? __show_regs (arch/x86/kernel/process_64.c:68)
>> [3157054.710046] __show_regs (arch/x86/kernel/process_64.c:68)
>> [3157054.712066] ? printk (kernel/printk/printk.c:1896)
>> [3157054.713878] ? bitmap_weight (include/linux/bitmap.h:303)
>> [3157054.715875] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
>> [3157054.718336] ? dump_stack_print_info (kernel/printk/printk.c:3121)
>> [3157054.720619] show_regs (arch/x86/kernel/dumpstack_64.c:313)
>> [3157054.722530] __die (arch/x86/kernel/dumpstack.c:294)
>> [3157054.724290] die (arch/x86/kernel/dumpstack.c:316)
>> [3157054.725962] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
>> [3157054.727805] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
>> [3157054.729843] ? do_device_not_available (arch/x86/kernel/traps.c:291)
>> [3157054.732211] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>> [3157054.734101] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
>> [3157054.736165] ? sched_clock (arch/x86/kernel/tsc.c:305)
>> [3157054.738126] ? nmi_handle (arch/x86/kernel/nmi.c:134)
>> [3157054.740133] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
>> [3157054.742997] do_invalid_op (arch/x86/kernel/traps.c:313)
>> [3157054.744991] invalid_op (arch/x86/entry/entry_64.S:925)
>
> So we got #UD somewhere...
>
>> [3157054.746873] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>> [3157054.748769] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
>> [3157054.750658] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
>
> ...here, perhaps?
>

Huh? Perhaps here:
kernel BUG at arch/x86/kernel/nmi.c:533

We are in the middle of processing a BUG_ON().
IOW, we hit the BUG_ON(); its handler calls __show_regs(), which for some reason
makes KASan complain here and print a backtrace of the bad access.

So, after cutting off all kasan reports:

[3157054.661763] ------------[ cut here ]------------
[3157054.662552] kernel BUG at arch/x86/kernel/nmi.c:533!
[3157054.663277] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
[3157054.664164] Dumping ftrace buffer:
[3157054.664740] (ftrace buffer empty)
[3157054.665274] Modules linked in:
[3157054.665768] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
[3157054.667203] task: ffff880408813000 ti: ffff8803d29c8000 task.ti: ffff8803d29c8000
[3157054.668256] RIP: do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
[3157054.669378] RSP: 0018:ffff88077800bed8 EFLAGS: 00010006
[3157054.784428] RAX: 0000000080120001 RBX: 0000000000000001 RCX: 00000000c0000101
[3157054.801838] RDX: 1ffffffff4691cd0 RSI: ffffffffa0c10620 RDI: ffffffffa344dc00
[3157054.891910] RBP: ffff88077800bee8 R08: 0000000000000001 R09: 000000000000002e
[3157055.191450] pps pps0: PPS event at 4682.682479766
[3157055.191456] pps pps0: capture assert seq #4932
[3157055.196385] R10: ffffed014e1e4883 R11: ffffed014e1e4881 R12: ffff88077800bef8
[3157055.416083] R13: 000b375311a5d4ab R14: ffffffffa3485190 R15: ffffffffa3485180
[3157055.418637] FS: 00007f6d93c6f700(0000) GS:ffff880778000000(0000) knlGS:0000000000000000
[3157055.421726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3157055.423798] CR2: 0000000004378000 CR3: 00000003d2987000 CR4: 00000000000007e0
[3157055.426363] DR0: ffffffff81000000 DR1: 0000000000000000 DR2: 0000000000000000
[3157055.428933] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[3157055.431526] Stack:
[3157055.432310] 0000000000000001 0000000004378000 ffff88077800be98 ffffffffa0b2ff6f
[3157055.435066] ffffffffa3485180 ffffffffa3485190 000b375311a5d4ab 0000000000000000
[3157055.437846] ffff88077800be98 dffffc0000000000 ffffed014e1e4881 ffffed014e1e4883
[3157055.440612] Call Trace:
[3157055.441576] <NMI>
[3157055.442347] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
[3157055.444426] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.446211] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.447992] ? debug (arch/x86/entry/entry_64.S:1067)
[3157055.449762] <<EOE>>
[3157055.450579] <#DB> [3157055.451465] ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
[3157055.453456] <<EOE>>

> Do you know what line 1435 was in the version you tested? There
> shouldn't be funny instructions in end_repeat_nmi, though. Did we end
> up off an instruction boundary?
>
> Here's my wild guess. The repeat_nmi thing is really rare. What if
> there's a CPU or emulator that can't do mov %cr2, %r12 or vice versa?
> mov from cr has a somewhat unusual encoding. What platform is this?
> Does KASan play games that would cause KVM to emulate a mov to or from
> cr2?
>

I can't tell you what or how KVM emulates anything, but kasan just maps some
memory at a certain location and reads/writes it. That's it.


2015-07-10 15:37:04

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] x86/asm/entry/64: Clean up entry_64.S

On Fri, Jul 10, 2015 at 8:26 AM, Andrey Ryabinin <[email protected]> wrote:
> On 07/09/2015 03:59 AM, Andy Lutomirski wrote:
>> Having failed to bisect, let's look at the trace:
>>
>> On Mon, Jul 6, 2015 at 8:00 AM, Sasha Levin <[email protected]> wrote:
>>> [3157054.661763] ------------[ cut here ]------------
>>> [3157054.662552] kernel BUG at arch/x86/kernel/nmi.c:533!
>>> [3157054.663277] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
>>> [3157054.664164] Dumping ftrace buffer:
>>> [3157054.664740] (ftrace buffer empty)
>>> [3157054.665274] Modules linked in:
>>> [3157054.665768] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
>>> [3157054.667203] task: ffff880408813000 ti: ffff8803d29c8000 task.ti: ffff8803d29c8000
>>> [3157054.668256] RIP: do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>>> [3157054.669378] RSP: 0018:ffff88077800bed8 EFLAGS: 00010006
>>> [3157054.670141] ==================================================================
>>> [3157054.671268] BUG: KASan: out of bounds on stack in __show_regs+0x7f6/0x940 at addr ffff88077800be50
>>
>> I bet that__show_regs interacts poorly with KASan for some reason.
>> But that's not the underlying bug. In fact, the bad read is quite
>> close the RSP, so this is almost certainly a bug in KASan or
>> __show_regs.
>>
>>> [3157054.674604] Read of size 8 by task trinity-main/11446
>>> [3157054.676521] page:ffffea001de002c0 count:1 mapcount:0 mapping: (null) index:0x0
>>> [3157054.679451] flags: 0x42fffff80000400(reserved)
>>> [3157054.681237] page dumped because: kasan: bad access detected
>>> [3157054.683326] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
>>> [3157054.687097] ffff88077800be50 000000009c65e33f ffff88077800b9f8 ffffffffa0ac8938
>>> [3157054.690303] 1ffffd4003bc0058 ffff88077800ba88 ffff88077800ba78 ffffffff9759796e
>>> [3157054.693365] ffff88077800bab8 ffffffffa0abe0b3 0000000000000082 ffffffffa2fe39e4
>>> [3157054.696209] Call Trace:
>>> [3157054.697180] <NMI> dump_stack (lib/dump_stack.c:52)
>>> [3157054.699390] kasan_report_error (mm/kasan/report.c:132 mm/kasan/report.c:193)
>>> [3157054.701663] ? printk (kernel/printk/printk.c:1896)
>>> [3157054.703531] ? bitmap_weight (include/linux/bitmap.h:303)
>>> [3157054.705553] __asan_report_load8_noabort (mm/kasan/report.c:230 mm/kasan/report.c:251)
>>> [3157054.708014] ? __show_regs (arch/x86/kernel/process_64.c:68)
>>> [3157054.710046] __show_regs (arch/x86/kernel/process_64.c:68)
>>> [3157054.712066] ? printk (kernel/printk/printk.c:1896)
>>> [3157054.713878] ? bitmap_weight (include/linux/bitmap.h:303)
>>> [3157054.715875] ? start_thread_common.constprop.0 (arch/x86/kernel/process_64.c:58)
>>> [3157054.718336] ? dump_stack_print_info (kernel/printk/printk.c:3121)
>>> [3157054.720619] show_regs (arch/x86/kernel/dumpstack_64.c:313)
>>> [3157054.722530] __die (arch/x86/kernel/dumpstack.c:294)
>>> [3157054.724290] die (arch/x86/kernel/dumpstack.c:316)
>>> [3157054.725962] do_trap (arch/x86/kernel/traps.c:214 arch/x86/kernel/traps.c:260)
>>> [3157054.727805] do_error_trap (arch/x86/kernel/traps.c:298 include/linux/jump_label.h:125 include/linux/context_tracking_state.h:29 include/linux/context_tracking.h:46 arch/x86/kernel/traps.c:302)
>>> [3157054.729843] ? do_device_not_available (arch/x86/kernel/traps.c:291)
>>> [3157054.732211] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>>> [3157054.734101] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:86)
>>> [3157054.736165] ? sched_clock (arch/x86/kernel/tsc.c:305)
>>> [3157054.738126] ? nmi_handle (arch/x86/kernel/nmi.c:134)
>>> [3157054.740133] ? trace_hardirqs_off_thunk (arch/x86/entry/thunk_64.S:40)
>>> [3157054.742997] do_invalid_op (arch/x86/kernel/traps.c:313)
>>> [3157054.744991] invalid_op (arch/x86/entry/entry_64.S:925)
>>
>> So we got #UD somewhere...
>>
>>> [3157054.746873] ? do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
>>> [3157054.748769] ? do_nmi (arch/x86/kernel/nmi.c:515 arch/x86/kernel/nmi.c:531)
>>> [3157054.750658] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
>>
>> ...here, perhaps?
>>
>
> Huh? Perhaps here:
> kernel BUG at arch/x86/kernel/nmi.c:533
>
> We are in the middle of processing BUG_ON()
> IOW, we hit BUG_ON(). BUG_ON() handler calls __show_regs(), for some reason kasan to complains here
> and prints backtrace of bad access.

Yeah, so something's wrong there. Cc: Steven.

BTW, Steven, I think I thought up a very clean fix to the old RSP
issue, and I'll email something out later today. I doubt it's the
same thing here, though.

>
> So, after cutting off all kasan reports:
>
> [3157054.661763] ------------[ cut here ]------------
> [3157054.662552] kernel BUG at arch/x86/kernel/nmi.c:533!
> [3157054.663277] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> [3157054.664164] Dumping ftrace buffer:
> [3157054.664740] (ftrace buffer empty)
> [3157054.665274] Modules linked in:
> [3157054.665768] CPU: 16 PID: 11446 Comm: trinity-main Not tainted 4.1.0-next-20150703-sasha-00040-gd868f14-dirty #2292
> [3157054.667203] task: ffff880408813000 ti: ffff8803d29c8000 task.ti: ffff8803d29c8000
> [3157054.668256] RIP: do_nmi (arch/x86/kernel/nmi.c:533 (discriminator 1))
> [3157054.669378] RSP: 0018:ffff88077800bed8 EFLAGS: 00010006
> [3157054.784428] RAX: 0000000080120001 RBX: 0000000000000001 RCX: 00000000c0000101
> [3157054.801838] RDX: 1ffffffff4691cd0 RSI: ffffffffa0c10620 RDI: ffffffffa344dc00
> [3157054.891910] RBP: ffff88077800bee8 R08: 0000000000000001 R09: 000000000000002e
> [3157055.191450] pps pps0: PPS event at 4682.682479766
> [3157055.191456] pps pps0: capture assert seq #4932
> [3157055.196385] R10: ffffed014e1e4883 R11: ffffed014e1e4881 R12: ffff88077800bef8
> [3157055.416083] R13: 000b375311a5d4ab R14: ffffffffa3485190 R15: ffffffffa3485180
> [3157055.418637] FS: 00007f6d93c6f700(0000) GS:ffff880778000000(0000) knlGS:0000000000000000
> [3157055.421726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [3157055.423798] CR2: 0000000004378000 CR3: 00000003d2987000 CR4: 00000000000007e0
> [3157055.426363] DR0: ffffffff81000000 DR1: 0000000000000000 DR2: 0000000000000000
> [3157055.428933] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> [3157055.431526] Stack:
> [3157055.432310] 0000000000000001 0000000004378000 ffff88077800be98 ffffffffa0b2ff6f
> [3157055.435066] ffffffffa3485180 ffffffffa3485190 000b375311a5d4ab 0000000000000000
> [3157055.437846] ffff88077800be98 dffffc0000000000 ffffed014e1e4881 ffffed014e1e4883
> [3157055.440612] Call Trace:
> [3157055.441576] <NMI>
> [3157055.442347] end_repeat_nmi (arch/x86/entry/entry_64.S:1435)
> [3157055.444426] ? debug (arch/x86/entry/entry_64.S:1067)
> [3157055.446211] ? debug (arch/x86/entry/entry_64.S:1067)
> [3157055.447992] ? debug (arch/x86/entry/entry_64.S:1067)
> [3157055.449762] <<EOE>>
> [3157055.450579] <#DB> [3157055.451465] ? nmi_handle (arch/x86/kernel/nmi.c:134 include/linux/jump_label.h:125 include/trace/events/nmi.h:10 arch/x86/kernel/nmi.c:135)
> [3157055.453456] <<EOE>>
>
>> Do you know what line 1435 was in the version you tested? There
>> shouldn't be funny instructions in end_repeat_nmi, though. Did we end
>> up off an instruction boundary?
>>
>> Here's my wild guess. The repeat_nmi thing is really rare. What if
>> there's a CPU or emulator that can't do mov %cr2, %r12 or vice versa?
>> mov from cr has a somewhat unusual encoding. What platform is this?
>> Does KASan play games that would cause KVM to emulate a mov to or from
>> cr2?
>>
>
> I can't tell you what and how KVM emulates anything, but kasan just maps some memory at
> certain location and read/writes it. That's it.

My wild idea was clearly wrong, since that r12 thing happens on all
NMIs. Also, I directly tested the KVM emulator, and it's fine.

--Andy