2017-10-11 20:30:00

by Thomas Garnier

Subject: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization

Changes:
- patch v1:
  - Simplify ftrace implementation.
  - Use gcc -mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
  - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
    mapped memory. It also simplifies the relocation process.
  - Move the start of the module section next to the kernel. Remove the need
    for -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
  - Support for XEN PVH as 32-bit relocations can be ignored with
    --emit-relocs.
  - Support for GOT relocations previously done automatically with -pie.
  - Remove need for dynamic PLT in modules.
  - Support dynamic GOT for modules.
- rfc v2:
  - Add support for global stack cookie while the compiler defaults to fs
    without mcmodel=kernel.
  - Change patch 7 to correctly jump out of the identity mapping on kexec load
    preserve.

These patches make the changes necessary to build the kernel as a Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space, which makes it possible to optionally
extend the KASLR randomization range from 1G to 3G.

Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
feedback on using -pie versus --emit-relocs and details on compiler code
generation.

The patches:
- 1-3, 5-13, 17-18: Changes in assembly code to be PIE compliant.
- 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
- 14: Adapt percpu design to work correctly when PIE is enabled.
- 15: Provide an option to default visibility to hidden except for key symbols.
It removes errors between compilation units.
- 16: Adapt relocation tool to handle PIE binary correctly.
- 19: Add support for global cookie.
- 20: Support ftrace with PIE (used on Ubuntu config).
- 21: Fix incorrect address marker on dump_pagetables.
- 22: Add option to move the module section just after the kernel.
- 23: Adapt module loading to support PIE with dynamic GOT.
- 24: Make the GOT read-only.
- 25: Add the CONFIG_X86_PIE option (off by default).
- 26: Adapt relocation tool to generate a 64-bit relocation table.
- 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
from 1G to 3G (off by default).

Performance/Size impact:

Size of vmlinux (Default configuration):
File size:
- PIE disabled: +0.000031%
- PIE enabled: -3.210% (less relocations)
.text section:
- PIE disabled: +0.000644%
- PIE enabled: +0.837%

Size of vmlinux (Ubuntu configuration):
File size:
- PIE disabled: -0.201%
- PIE enabled: -0.082%
.text section:
- PIE disabled: same
- PIE enabled: +1.319%

Size of vmlinux (Default configuration + ORC):
File size:
- PIE enabled: -3.167%
.text section:
- PIE enabled: +0.814%

Size of vmlinux (Ubuntu configuration + ORC):
File size:
- PIE enabled: -3.167%
.text section:
- PIE enabled: +1.26%

The size increase is mainly due to not having access to the 32-bit signed
relocation that can be used with mcmodel=kernel. A small part is due to reduced
optimization for PIE code. Bug [1] was opened against gcc to request better
code generation for kernel PIE.
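
As a rough illustration of the 32-bit signed relocation mentioned above: an
absolute address can only be encoded that way if it equals the sign extension
of its low 32 bits, which on x86_64 means the bottom 2G or the top 2G of the
address space. The helper names below are hypothetical and are not part of
this series; this is a sketch of the range check only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when addr can be written as a sign-extended 32-bit immediate,
 * i.e. it lies in [0, 2G) or in the top 2G of the 64-bit address space.
 * Hypothetical helper, for illustration only.
 */
bool fits_signed_32(uint64_t addr)
{
	return addr <= UINT64_C(0x7fffffff) ||
	       addr >= UINT64_C(0xffffffff80000000);
}

/* Usage: -2G is the lowest kernel address reachable this way. */
void check_fits_signed_32(void)
{
	assert(fits_signed_32(UINT64_C(0xffffffff80000000)));  /* exactly -2G */
	assert(!fits_signed_32(UINT64_C(0xffffffff7fffffff))); /* just below -2G */
}
```

A kernel relocated below -2G falls outside this range, which is why PIE code
has to use RIP-relative or 64-bit references instead.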

Hackbench (50% and 1600% on thread/process for pipe/sockets):
- PIE disabled: no significant change (avg +0.1% on latest test).
- PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).

slab_test (average of 10 runs):
- PIE disabled: no significant change (-2% on latest run, likely noise).
- PIE enabled: between -1% and +0.8% on latest runs.

Kernbench (average of 10 Half and Optimal runs):
Elapsed Time:
- PIE disabled: no significant change (avg -0.239%)
- PIE enabled: average +0.07%
System Time:
- PIE disabled: no significant change (avg -0.277%)
- PIE enabled: average +0.7%

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303

diffstat:
Documentation/x86/x86_64/mm.txt | 3
arch/x86/Kconfig | 43 ++++++
arch/x86/Makefile | 40 +++++
arch/x86/boot/boot.h | 2
arch/x86/boot/compressed/Makefile | 5
arch/x86/boot/compressed/misc.c | 10 +
arch/x86/crypto/aes-x86_64-asm_64.S | 45 ++++--
arch/x86/crypto/aesni-intel_asm.S | 14 +-
arch/x86/crypto/aesni-intel_avx-x86_64.S | 6
arch/x86/crypto/camellia-aesni-avx-asm_64.S | 42 +++---
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 44 +++---
arch/x86/crypto/camellia-x86_64-asm_64.S | 8 -
arch/x86/crypto/cast5-avx-x86_64-asm_64.S | 50 ++++---
arch/x86/crypto/cast6-avx-x86_64-asm_64.S | 44 +++---
arch/x86/crypto/des3_ede-asm_64.S | 96 +++++++++-----
arch/x86/crypto/ghash-clmulni-intel_asm.S | 4
arch/x86/crypto/glue_helper-asm-avx.S | 4
arch/x86/crypto/glue_helper-asm-avx2.S | 6
arch/x86/entry/entry_32.S | 3
arch/x86/entry/entry_64.S | 29 ++--
arch/x86/include/asm/asm.h | 13 +
arch/x86/include/asm/bug.h | 2
arch/x86/include/asm/ftrace.h | 6
arch/x86/include/asm/jump_label.h | 8 -
arch/x86/include/asm/kvm_host.h | 6
arch/x86/include/asm/module.h | 11 +
arch/x86/include/asm/page_64_types.h | 9 +
arch/x86/include/asm/paravirt_types.h | 12 +
arch/x86/include/asm/percpu.h | 25 ++-
arch/x86/include/asm/pgtable_64_types.h | 6
arch/x86/include/asm/pm-trace.h | 2
arch/x86/include/asm/processor.h | 12 +
arch/x86/include/asm/sections.h | 8 +
arch/x86/include/asm/setup.h | 2
arch/x86/include/asm/stackprotector.h | 19 ++
arch/x86/kernel/acpi/wakeup_64.S | 31 ++--
arch/x86/kernel/asm-offsets.c | 3
arch/x86/kernel/asm-offsets_32.c | 3
arch/x86/kernel/asm-offsets_64.c | 3
arch/x86/kernel/cpu/common.c | 7 -
arch/x86/kernel/cpu/microcode/core.c | 4
arch/x86/kernel/ftrace.c | 42 +++++-
arch/x86/kernel/head64.c | 32 +++-
arch/x86/kernel/head_32.S | 3
arch/x86/kernel/head_64.S | 41 +++++-
arch/x86/kernel/kvm.c | 6
arch/x86/kernel/module.c | 182 ++++++++++++++++++++++++++-
arch/x86/kernel/module.lds | 3
arch/x86/kernel/process.c | 5
arch/x86/kernel/relocate_kernel_64.S | 8 -
arch/x86/kernel/setup_percpu.c | 2
arch/x86/kernel/vmlinux.lds.S | 13 +
arch/x86/kvm/svm.c | 4
arch/x86/lib/cmpxchg16b_emu.S | 8 -
arch/x86/mm/dump_pagetables.c | 11 +
arch/x86/power/hibernate_asm_64.S | 4
arch/x86/tools/relocs.c | 170 +++++++++++++++++++++++--
arch/x86/tools/relocs.h | 4
arch/x86/tools/relocs_common.c | 15 +-
arch/x86/xen/xen-asm.S | 12 -
arch/x86/xen/xen-head.S | 9 -
arch/x86/xen/xen-pvh.S | 13 +
drivers/base/firmware_class.c | 4
include/asm-generic/sections.h | 6
include/asm-generic/vmlinux.lds.h | 12 +
include/linux/compiler.h | 8 +
init/Kconfig | 9 +
kernel/kallsyms.c | 16 +-
kernel/trace/trace.h | 4
lib/dynamic_debug.c | 4
70 files changed, 1032 insertions(+), 308 deletions(-)


2017-10-11 20:30:03

by Thomas Garnier

Subject: [PATCH v1 03/27] x86: Use symbol name in jump table for PIE support

Replace the %c operand modifier with %P. %c is incompatible with PIE
because it implies an immediate value, whereas %P references a symbol.

Position Independent Executable (PIE) support will allow the KASLR
randomization range to be extended below the -2G memory limit.
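
The diff below folds the two operands (key and branch) into the single
expression &((char *)key)[branch]. A minimal sketch of that pointer
arithmetic follows; struct demo_key and the helper names are hypothetical
stand-ins, and recovering the flag from the low bit is an assumption about
how the jump table entry is consumed, not something shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct static_key. */
struct demo_key {
	int enabled;
};

struct demo_key example_key;

/* Same expression as the new asm operand: key address plus 0 or 1. */
uintptr_t encode_key(struct demo_key *key, bool branch)
{
	/* The key must be at least 2-byte aligned for bit 0 to be free. */
	assert(((uintptr_t)key & 1) == 0);
	return (uintptr_t)&((char *)key)[branch];
}

/* Assumed decoding: bit 0 carries the branch flag, the rest the key. */
struct demo_key *decode_key(uintptr_t v)
{
	return (struct demo_key *)(v & ~(uintptr_t)1);
}

bool decode_branch(uintptr_t v)
{
	return (v & 1) != 0;
}
```

The result is the same value the old "%c0 + %c1" form produced, but it is
expressed as one symbol plus offset, so it no longer needs two immediates.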

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/include/asm/jump_label.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index adc54c12cbd1..6e558e4524dc 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -36,9 +36,9 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
".pushsection __jump_table, \"aw\" \n\t"
_ASM_ALIGN "\n\t"
- _ASM_PTR "1b, %l[l_yes], %c0 + %c1 \n\t"
+ _ASM_PTR "1b, %l[l_yes], %P0 \n\t"
".popsection \n\t"
- : : "i" (key), "i" (branch) : : l_yes);
+ : : "X" (&((char *)key)[branch]) : : l_yes);

return false;
l_yes:
@@ -52,9 +52,9 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
"2:\n\t"
".pushsection __jump_table, \"aw\" \n\t"
_ASM_ALIGN "\n\t"
- _ASM_PTR "1b, %l[l_yes], %c0 + %c1 \n\t"
+ _ASM_PTR "1b, %l[l_yes], %P0 \n\t"
".popsection \n\t"
- : : "i" (key), "i" (branch) : : l_yes);
+ : : "X" (&((char *)key)[branch]) : : l_yes);

return false;
l_yes:
--
2.15.0.rc0.271.g36b669edcc-goog


_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xen.org/xen-devel

2017-10-11 20:30:08

by Thomas Garnier

Subject: [PATCH v1 08/27] x86/CPU: Adapt assembly for PIE support

Change the assembly code to use only relative references to symbols for the
kernel to be PIE compatible. Use the new _ASM_GET_PTR macro instead of
the 'mov $symbol, %dst' construct to avoid an absolute reference.

Position Independent Executable (PIE) support will allow the KASLR
randomization range to be extended below the -2G memory limit.
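
A minimal user-space sketch of the relative form follows. _ASM_GET_PTR is
defined in patch 04 and its definition is not shown here, so this example
hardcodes the RIP-relative lea that the sync_core() hunk below also uses.
The function name is hypothetical, and the non-x86_64 branch is only a
fallback so the sketch builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the address of the local label that follows the lea.
 * The address is computed relative to %rip, so no absolute
 * relocation is needed. Hypothetical helper, illustration only.
 */
void *text_addr_rel(void)
{
	void *pc;

#if defined(__x86_64__)
	__asm__ volatile("leaq 1f(%%rip), %0\n"
			 "1:"
			 : "=r" (pc));
#else
	/* Fallback: the sketch only covers x86_64. */
	pc = (void *)&text_addr_rel;
#endif
	assert(pc != 0);
	return pc;
}
```

The absolute form 'mov $1f, %0' needs the label address as an immediate,
which is the kind of reference this patch removes.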

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/include/asm/processor.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b446c5a082ad..b09bd50b06c7 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -49,7 +49,7 @@ static inline void *current_text_addr(void)
{
void *pc;

- asm volatile("mov $1f, %0; 1:":"=r" (pc));
+ asm volatile(_ASM_GET_PTR(1f, %0) "; 1:":"=r" (pc));

return pc;
}
@@ -695,6 +695,7 @@ static inline void sync_core(void)
: ASM_CALL_CONSTRAINT : : "memory");
#else
unsigned int tmp;
+ unsigned long tmp2;

asm volatile (
UNWIND_HINT_SAVE
@@ -705,11 +706,13 @@ static inline void sync_core(void)
"pushfq\n\t"
"mov %%cs, %0\n\t"
"pushq %q0\n\t"
- "pushq $1f\n\t"
+ "leaq 1f(%%rip), %1\n\t"
+ "pushq %1\n\t"
"iretq\n\t"
UNWIND_HINT_RESTORE
"1:"
- : "=&r" (tmp), ASM_CALL_CONSTRAINT : : "cc", "memory");
+ : "=&r" (tmp), "=&r" (tmp2), ASM_CALL_CONSTRAINT
+ : : "cc", "memory");
#endif
}

--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:01

by Thomas Garnier

Subject: [PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support

Change the assembly code to use only relative references to symbols for the
kernel to be PIE compatible.

Position Independent Executable (PIE) support will allow the KASLR
randomization range to be extended below the -2G memory limit.
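
Most hunks below follow one pattern: an indexed access of the form
'TAB(,idx,4)', which needs the absolute address of TAB, becomes a lea of the
table base followed by a base-plus-index access. A minimal sketch of that
pattern in user-space C follows; demo_tab and lookup_rel are hypothetical
names. The "m" operand lets the compiler print the table reference, which
for a file-local symbol in position-independent code is typically
'demo_tab(%rip)'.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lookup table standing in for a cipher S-box. */
static const uint32_t demo_tab[4] = { 0x11, 0x22, 0x33, 0x44 };

/* Load demo_tab[idx] through a base register, as round_mov() does. */
uint32_t lookup_rel(uint64_t idx)
{
	uint32_t out;

	assert(idx < 4);
#if defined(__x86_64__)
	const uint32_t *base;

	__asm__("leaq %2, %1\n\t"         /* base = &demo_tab[0] */
		"movl (%1,%3,4), %0"      /* out = base[idx] */
		: "=r" (out), "=&r" (base)
		: "m" (demo_tab[0]), "r" (idx));
#else
	/* Fallback: the sketch only covers x86_64. */
	out = demo_tab[idx];
#endif
	return out;
}
```

The cost is one extra register for the base, which is why the AES code
below reserves RBASE (%r12) and saves and restores it around each function.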

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/crypto/aes-x86_64-asm_64.S | 45 ++++++++-----
arch/x86/crypto/aesni-intel_asm.S | 14 ++--
arch/x86/crypto/aesni-intel_avx-x86_64.S | 6 +-
arch/x86/crypto/camellia-aesni-avx-asm_64.S | 42 ++++++------
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 44 ++++++-------
arch/x86/crypto/camellia-x86_64-asm_64.S | 8 ++-
arch/x86/crypto/cast5-avx-x86_64-asm_64.S | 50 ++++++++-------
arch/x86/crypto/cast6-avx-x86_64-asm_64.S | 44 +++++++------
arch/x86/crypto/des3_ede-asm_64.S | 96 ++++++++++++++++++----------
arch/x86/crypto/ghash-clmulni-intel_asm.S | 4 +-
arch/x86/crypto/glue_helper-asm-avx.S | 4 +-
arch/x86/crypto/glue_helper-asm-avx2.S | 6 +-
12 files changed, 211 insertions(+), 152 deletions(-)

diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
index 8739cf7795de..86fa068e5e81 100644
--- a/arch/x86/crypto/aes-x86_64-asm_64.S
+++ b/arch/x86/crypto/aes-x86_64-asm_64.S
@@ -48,8 +48,12 @@
#define R10 %r10
#define R11 %r11

+/* Hold global for PIE support */
+#define RBASE %r12
+
#define prologue(FUNC,KEY,B128,B192,r1,r2,r5,r6,r7,r8,r9,r10,r11) \
ENTRY(FUNC); \
+ pushq RBASE; \
movq r1,r2; \
leaq KEY+48(r8),r9; \
movq r10,r11; \
@@ -74,54 +78,63 @@
movl r6 ## E,4(r9); \
movl r7 ## E,8(r9); \
movl r8 ## E,12(r9); \
+ popq RBASE; \
ret; \
ENDPROC(FUNC);

+#define round_mov(tab_off, reg_i, reg_o) \
+ leaq tab_off(%rip), RBASE; \
+ movl (RBASE,reg_i,4), reg_o;
+
+#define round_xor(tab_off, reg_i, reg_o) \
+ leaq tab_off(%rip), RBASE; \
+ xorl (RBASE,reg_i,4), reg_o;
+
#define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
movzbl r2 ## H,r5 ## E; \
movzbl r2 ## L,r6 ## E; \
- movl TAB+1024(,r5,4),r5 ## E;\
+ round_mov(TAB+1024, r5, r5 ## E)\
movw r4 ## X,r2 ## X; \
- movl TAB(,r6,4),r6 ## E; \
+ round_mov(TAB, r6, r6 ## E) \
roll $16,r2 ## E; \
shrl $16,r4 ## E; \
movzbl r4 ## L,r7 ## E; \
movzbl r4 ## H,r4 ## E; \
xorl OFFSET(r8),ra ## E; \
xorl OFFSET+4(r8),rb ## E; \
- xorl TAB+3072(,r4,4),r5 ## E;\
- xorl TAB+2048(,r7,4),r6 ## E;\
+ round_xor(TAB+3072, r4, r5 ## E)\
+ round_xor(TAB+2048, r7, r6 ## E)\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r4 ## E; \
- movl TAB+1024(,r4,4),r4 ## E;\
+ round_mov(TAB+1024, r4, r4 ## E)\
movw r3 ## X,r1 ## X; \
roll $16,r1 ## E; \
shrl $16,r3 ## E; \
- xorl TAB(,r7,4),r5 ## E; \
+ round_xor(TAB, r7, r5 ## E) \
movzbl r3 ## L,r7 ## E; \
movzbl r3 ## H,r3 ## E; \
- xorl TAB+3072(,r3,4),r4 ## E;\
- xorl TAB+2048(,r7,4),r5 ## E;\
+ round_xor(TAB+3072, r3, r4 ## E)\
+ round_xor(TAB+2048, r7, r5 ## E)\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r3 ## E; \
shrl $16,r1 ## E; \
- xorl TAB+3072(,r3,4),r6 ## E;\
- movl TAB+2048(,r7,4),r3 ## E;\
+ round_xor(TAB+3072, r3, r6 ## E)\
+ round_mov(TAB+2048, r7, r3 ## E)\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r1 ## E; \
- xorl TAB+1024(,r1,4),r6 ## E;\
- xorl TAB(,r7,4),r3 ## E; \
+ round_xor(TAB+1024, r1, r6 ## E)\
+ round_xor(TAB, r7, r3 ## E) \
movzbl r2 ## H,r1 ## E; \
movzbl r2 ## L,r7 ## E; \
shrl $16,r2 ## E; \
- xorl TAB+3072(,r1,4),r3 ## E;\
- xorl TAB+2048(,r7,4),r4 ## E;\
+ round_xor(TAB+3072, r1, r3 ## E)\
+ round_xor(TAB+2048, r7, r4 ## E)\
movzbl r2 ## H,r1 ## E; \
movzbl r2 ## L,r2 ## E; \
xorl OFFSET+8(r8),rc ## E; \
xorl OFFSET+12(r8),rd ## E; \
- xorl TAB+1024(,r1,4),r3 ## E;\
- xorl TAB(,r2,4),r4 ## E;
+ round_xor(TAB+1024, r1, r3 ## E)\
+ round_xor(TAB, r2, r4 ## E)

#define move_regs(r1,r2,r3,r4) \
movl r3 ## E,r1 ## E; \
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 16627fec80b2..5f73201dff32 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -325,7 +325,8 @@ _get_AAD_rest0\num_initial_blocks\operation:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \TMP1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \TMP1
PSHUFB_XMM \TMP1, %xmm\i
_get_AAD_rest_final\num_initial_blocks\operation:
PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
@@ -584,7 +585,8 @@ _get_AAD_rest0\num_initial_blocks\operation:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \TMP1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \TMP1
PSHUFB_XMM \TMP1, %xmm\i
_get_AAD_rest_final\num_initial_blocks\operation:
PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
@@ -2722,7 +2724,7 @@ ENDPROC(aesni_cbc_dec)
*/
.align 4
_aesni_inc_init:
- movaps .Lbswap_mask, BSWAP_MASK
+ movaps .Lbswap_mask(%rip), BSWAP_MASK
movaps IV, CTR
PSHUFB_XMM BSWAP_MASK CTR
mov $1, TCTR_LOW
@@ -2850,12 +2852,12 @@ ENTRY(aesni_xts_crypt8)
cmpb $0, %cl
movl $0, %ecx
movl $240, %r10d
- leaq _aesni_enc4, %r11
- leaq _aesni_dec4, %rax
+ leaq _aesni_enc4(%rip), %r11
+ leaq _aesni_dec4(%rip), %rax
cmovel %r10d, %ecx
cmoveq %rax, %r11

- movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
+ movdqa .Lgf128mul_x_ble_mask(%rip), GF128MUL_MASK
movups (IVP), IV

mov 480(KEYP), KLEN
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index faecb1518bf8..488605b19fe8 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -454,7 +454,8 @@ _get_AAD_rest0\@:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \T1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \T1
vpshufb \T1, reg_i, reg_i
_get_AAD_rest_final\@:
vpshufb SHUF_MASK(%rip), reg_i, reg_i
@@ -1761,7 +1762,8 @@ _get_AAD_rest0\@:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \T1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \T1
vpshufb \T1, reg_i, reg_i
_get_AAD_rest_final\@:
vpshufb SHUF_MASK(%rip), reg_i, reg_i
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index f7c495e2863c..46feaea52632 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -52,10 +52,10 @@
/* \
* S-function with AES subbytes \
*/ \
- vmovdqa .Linv_shift_row, t4; \
- vbroadcastss .L0f0f0f0f, t7; \
- vmovdqa .Lpre_tf_lo_s1, t0; \
- vmovdqa .Lpre_tf_hi_s1, t1; \
+ vmovdqa .Linv_shift_row(%rip), t4; \
+ vbroadcastss .L0f0f0f0f(%rip), t7; \
+ vmovdqa .Lpre_tf_lo_s1(%rip), t0; \
+ vmovdqa .Lpre_tf_hi_s1(%rip), t1; \
\
/* AES inverse shift rows */ \
vpshufb t4, x0, x0; \
@@ -68,8 +68,8 @@
vpshufb t4, x6, x6; \
\
/* prefilter sboxes 1, 2 and 3 */ \
- vmovdqa .Lpre_tf_lo_s4, t2; \
- vmovdqa .Lpre_tf_hi_s4, t3; \
+ vmovdqa .Lpre_tf_lo_s4(%rip), t2; \
+ vmovdqa .Lpre_tf_hi_s4(%rip), t3; \
filter_8bit(x0, t0, t1, t7, t6); \
filter_8bit(x7, t0, t1, t7, t6); \
filter_8bit(x1, t0, t1, t7, t6); \
@@ -83,8 +83,8 @@
filter_8bit(x6, t2, t3, t7, t6); \
\
/* AES subbytes + AES shift rows */ \
- vmovdqa .Lpost_tf_lo_s1, t0; \
- vmovdqa .Lpost_tf_hi_s1, t1; \
+ vmovdqa .Lpost_tf_lo_s1(%rip), t0; \
+ vmovdqa .Lpost_tf_hi_s1(%rip), t1; \
vaesenclast t4, x0, x0; \
vaesenclast t4, x7, x7; \
vaesenclast t4, x1, x1; \
@@ -95,16 +95,16 @@
vaesenclast t4, x6, x6; \
\
/* postfilter sboxes 1 and 4 */ \
- vmovdqa .Lpost_tf_lo_s3, t2; \
- vmovdqa .Lpost_tf_hi_s3, t3; \
+ vmovdqa .Lpost_tf_lo_s3(%rip), t2; \
+ vmovdqa .Lpost_tf_hi_s3(%rip), t3; \
filter_8bit(x0, t0, t1, t7, t6); \
filter_8bit(x7, t0, t1, t7, t6); \
filter_8bit(x3, t0, t1, t7, t6); \
filter_8bit(x6, t0, t1, t7, t6); \
\
/* postfilter sbox 3 */ \
- vmovdqa .Lpost_tf_lo_s2, t4; \
- vmovdqa .Lpost_tf_hi_s2, t5; \
+ vmovdqa .Lpost_tf_lo_s2(%rip), t4; \
+ vmovdqa .Lpost_tf_hi_s2(%rip), t5; \
filter_8bit(x2, t2, t3, t7, t6); \
filter_8bit(x5, t2, t3, t7, t6); \
\
@@ -443,7 +443,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
transpose_4x4(c0, c1, c2, c3, a0, a1); \
transpose_4x4(d0, d1, d2, d3, a0, a1); \
\
- vmovdqu .Lshufb_16x16b, a0; \
+ vmovdqu .Lshufb_16x16b(%rip), a0; \
vmovdqu st1, a1; \
vpshufb a0, a2, a2; \
vpshufb a0, a3, a3; \
@@ -482,7 +482,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
#define inpack16_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
y6, y7, rio, key) \
vmovq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor 0 * 16(rio), x0, y7; \
vpxor 1 * 16(rio), x0, y6; \
@@ -533,7 +533,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
vmovdqu x0, stack_tmp0; \
\
vmovq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor x0, y7, y7; \
vpxor x0, y6, y6; \
@@ -1016,7 +1016,7 @@ ENTRY(camellia_ctr_16way)
subq $(16 * 16), %rsp;
movq %rsp, %rax;

- vmovdqa .Lbswap128_mask, %xmm14;
+ vmovdqa .Lbswap128_mask(%rip), %xmm14;

/* load IV and byteswap */
vmovdqu (%rcx), %xmm0;
@@ -1065,7 +1065,7 @@ ENTRY(camellia_ctr_16way)

/* inpack16_pre: */
vmovq (key_table)(CTX), %xmm15;
- vpshufb .Lpack_bswap, %xmm15, %xmm15;
+ vpshufb .Lpack_bswap(%rip), %xmm15, %xmm15;
vpxor %xmm0, %xmm15, %xmm0;
vpxor %xmm1, %xmm15, %xmm1;
vpxor %xmm2, %xmm15, %xmm2;
@@ -1133,7 +1133,7 @@ camellia_xts_crypt_16way:
subq $(16 * 16), %rsp;
movq %rsp, %rax;

- vmovdqa .Lxts_gf128mul_and_shl1_mask, %xmm14;
+ vmovdqa .Lxts_gf128mul_and_shl1_mask(%rip), %xmm14;

/* load IV */
vmovdqu (%rcx), %xmm0;
@@ -1209,7 +1209,7 @@ camellia_xts_crypt_16way:

/* inpack16_pre: */
vmovq (key_table)(CTX, %r8, 8), %xmm15;
- vpshufb .Lpack_bswap, %xmm15, %xmm15;
+ vpshufb .Lpack_bswap(%rip), %xmm15, %xmm15;
vpxor 0 * 16(%rax), %xmm15, %xmm0;
vpxor %xmm1, %xmm15, %xmm1;
vpxor %xmm2, %xmm15, %xmm2;
@@ -1264,7 +1264,7 @@ ENTRY(camellia_xts_enc_16way)
*/
xorl %r8d, %r8d; /* input whitening key, 0 for enc */

- leaq __camellia_enc_blk16, %r9;
+ leaq __camellia_enc_blk16(%rip), %r9;

jmp camellia_xts_crypt_16way;
ENDPROC(camellia_xts_enc_16way)
@@ -1282,7 +1282,7 @@ ENTRY(camellia_xts_dec_16way)
movl $24, %eax;
cmovel %eax, %r8d; /* input whitening key, last for dec */

- leaq __camellia_dec_blk16, %r9;
+ leaq __camellia_dec_blk16(%rip), %r9;

jmp camellia_xts_crypt_16way;
ENDPROC(camellia_xts_dec_16way)
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index eee5b3982cfd..93da327fec83 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -69,12 +69,12 @@
/* \
* S-function with AES subbytes \
*/ \
- vbroadcasti128 .Linv_shift_row, t4; \
- vpbroadcastd .L0f0f0f0f, t7; \
- vbroadcasti128 .Lpre_tf_lo_s1, t5; \
- vbroadcasti128 .Lpre_tf_hi_s1, t6; \
- vbroadcasti128 .Lpre_tf_lo_s4, t2; \
- vbroadcasti128 .Lpre_tf_hi_s4, t3; \
+ vbroadcasti128 .Linv_shift_row(%rip), t4; \
+ vpbroadcastd .L0f0f0f0f(%rip), t7; \
+ vbroadcasti128 .Lpre_tf_lo_s1(%rip), t5; \
+ vbroadcasti128 .Lpre_tf_hi_s1(%rip), t6; \
+ vbroadcasti128 .Lpre_tf_lo_s4(%rip), t2; \
+ vbroadcasti128 .Lpre_tf_hi_s4(%rip), t3; \
\
/* AES inverse shift rows */ \
vpshufb t4, x0, x0; \
@@ -120,8 +120,8 @@
vinserti128 $1, t2##_x, x6, x6; \
vextracti128 $1, x1, t3##_x; \
vextracti128 $1, x4, t2##_x; \
- vbroadcasti128 .Lpost_tf_lo_s1, t0; \
- vbroadcasti128 .Lpost_tf_hi_s1, t1; \
+ vbroadcasti128 .Lpost_tf_lo_s1(%rip), t0; \
+ vbroadcasti128 .Lpost_tf_hi_s1(%rip), t1; \
vaesenclast t4##_x, x2##_x, x2##_x; \
vaesenclast t4##_x, t6##_x, t6##_x; \
vinserti128 $1, t6##_x, x2, x2; \
@@ -136,16 +136,16 @@
vinserti128 $1, t2##_x, x4, x4; \
\
/* postfilter sboxes 1 and 4 */ \
- vbroadcasti128 .Lpost_tf_lo_s3, t2; \
- vbroadcasti128 .Lpost_tf_hi_s3, t3; \
+ vbroadcasti128 .Lpost_tf_lo_s3(%rip), t2; \
+ vbroadcasti128 .Lpost_tf_hi_s3(%rip), t3; \
filter_8bit(x0, t0, t1, t7, t6); \
filter_8bit(x7, t0, t1, t7, t6); \
filter_8bit(x3, t0, t1, t7, t6); \
filter_8bit(x6, t0, t1, t7, t6); \
\
/* postfilter sbox 3 */ \
- vbroadcasti128 .Lpost_tf_lo_s2, t4; \
- vbroadcasti128 .Lpost_tf_hi_s2, t5; \
+ vbroadcasti128 .Lpost_tf_lo_s2(%rip), t4; \
+ vbroadcasti128 .Lpost_tf_hi_s2(%rip), t5; \
filter_8bit(x2, t2, t3, t7, t6); \
filter_8bit(x5, t2, t3, t7, t6); \
\
@@ -482,7 +482,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
transpose_4x4(c0, c1, c2, c3, a0, a1); \
transpose_4x4(d0, d1, d2, d3, a0, a1); \
\
- vbroadcasti128 .Lshufb_16x16b, a0; \
+ vbroadcasti128 .Lshufb_16x16b(%rip), a0; \
vmovdqu st1, a1; \
vpshufb a0, a2, a2; \
vpshufb a0, a3, a3; \
@@ -521,7 +521,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
#define inpack32_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
y6, y7, rio, key) \
vpbroadcastq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor 0 * 32(rio), x0, y7; \
vpxor 1 * 32(rio), x0, y6; \
@@ -572,7 +572,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
vmovdqu x0, stack_tmp0; \
\
vpbroadcastq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor x0, y7, y7; \
vpxor x0, y6, y6; \
@@ -1112,7 +1112,7 @@ ENTRY(camellia_ctr_32way)
vmovdqu (%rcx), %xmm0;
vmovdqa %xmm0, %xmm1;
inc_le128(%xmm0, %xmm15, %xmm14);
- vbroadcasti128 .Lbswap128_mask, %ymm14;
+ vbroadcasti128 .Lbswap128_mask(%rip), %ymm14;
vinserti128 $1, %xmm0, %ymm1, %ymm0;
vpshufb %ymm14, %ymm0, %ymm13;
vmovdqu %ymm13, 15 * 32(%rax);
@@ -1158,7 +1158,7 @@ ENTRY(camellia_ctr_32way)

/* inpack32_pre: */
vpbroadcastq (key_table)(CTX), %ymm15;
- vpshufb .Lpack_bswap, %ymm15, %ymm15;
+ vpshufb .Lpack_bswap(%rip), %ymm15, %ymm15;
vpxor %ymm0, %ymm15, %ymm0;
vpxor %ymm1, %ymm15, %ymm1;
vpxor %ymm2, %ymm15, %ymm2;
@@ -1242,13 +1242,13 @@ camellia_xts_crypt_32way:
subq $(16 * 32), %rsp;
movq %rsp, %rax;

- vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0, %ymm12;
+ vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0(%rip), %ymm12;

/* load IV and construct second IV */
vmovdqu (%rcx), %xmm0;
vmovdqa %xmm0, %xmm15;
gf128mul_x_ble(%xmm0, %xmm12, %xmm13);
- vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1, %ymm13;
+ vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1(%rip), %ymm13;
vinserti128 $1, %xmm0, %ymm15, %ymm0;
vpxor 0 * 32(%rdx), %ymm0, %ymm15;
vmovdqu %ymm15, 15 * 32(%rax);
@@ -1325,7 +1325,7 @@ camellia_xts_crypt_32way:

/* inpack32_pre: */
vpbroadcastq (key_table)(CTX, %r8, 8), %ymm15;
- vpshufb .Lpack_bswap, %ymm15, %ymm15;
+ vpshufb .Lpack_bswap(%rip), %ymm15, %ymm15;
vpxor 0 * 32(%rax), %ymm15, %ymm0;
vpxor %ymm1, %ymm15, %ymm1;
vpxor %ymm2, %ymm15, %ymm2;
@@ -1383,7 +1383,7 @@ ENTRY(camellia_xts_enc_32way)

xorl %r8d, %r8d; /* input whitening key, 0 for enc */

- leaq __camellia_enc_blk32, %r9;
+ leaq __camellia_enc_blk32(%rip), %r9;

jmp camellia_xts_crypt_32way;
ENDPROC(camellia_xts_enc_32way)
@@ -1401,7 +1401,7 @@ ENTRY(camellia_xts_dec_32way)
movl $24, %eax;
cmovel %eax, %r8d; /* input whitening key, last for dec */

- leaq __camellia_dec_blk32, %r9;
+ leaq __camellia_dec_blk32(%rip), %r9;

jmp camellia_xts_crypt_32way;
ENDPROC(camellia_xts_dec_32way)
diff --git a/arch/x86/crypto/camellia-x86_64-asm_64.S b/arch/x86/crypto/camellia-x86_64-asm_64.S
index 95ba6956a7f6..ef1137406959 100644
--- a/arch/x86/crypto/camellia-x86_64-asm_64.S
+++ b/arch/x86/crypto/camellia-x86_64-asm_64.S
@@ -92,11 +92,13 @@
#define RXORbl %r9b

#define xor2ror16(T0, T1, tmp1, tmp2, ab, dst) \
+ leaq T0(%rip), tmp1; \
movzbl ab ## bl, tmp2 ## d; \
+ xorq (tmp1, tmp2, 8), dst; \
+ leaq T1(%rip), tmp2; \
movzbl ab ## bh, tmp1 ## d; \
- rorq $16, ab; \
- xorq T0(, tmp2, 8), dst; \
- xorq T1(, tmp1, 8), dst;
+ xorq (tmp2, tmp1, 8), dst; \
+ rorq $16, ab;

/**********************************************************************
1-way camellia
diff --git a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
index 86107c961bb4..64eb5c87d04a 100644
--- a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
@@ -98,16 +98,20 @@


#define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
- movzbl src ## bh, RID1d; \
- movzbl src ## bl, RID2d; \
- shrq $16, src; \
- movl s1(, RID1, 4), dst ## d; \
- op1 s2(, RID2, 4), dst ## d; \
- movzbl src ## bh, RID1d; \
- movzbl src ## bl, RID2d; \
- interleave_op(il_reg); \
- op2 s3(, RID1, 4), dst ## d; \
- op3 s4(, RID2, 4), dst ## d;
+ movzbl src ## bh, RID1d; \
+ leaq s1(%rip), RID2; \
+ movl (RID2, RID1, 4), dst ## d; \
+ movzbl src ## bl, RID2d; \
+ leaq s2(%rip), RID1; \
+ op1 (RID1, RID2, 4), dst ## d; \
+ shrq $16, src; \
+ movzbl src ## bh, RID1d; \
+ leaq s3(%rip), RID2; \
+ op2 (RID2, RID1, 4), dst ## d; \
+ movzbl src ## bl, RID2d; \
+ leaq s4(%rip), RID1; \
+ op3 (RID1, RID2, 4), dst ## d; \
+ interleave_op(il_reg);

#define dummy(d) /* do nothing */

@@ -166,15 +170,15 @@
subround(l ## 3, r ## 3, l ## 4, r ## 4, f);

#define enc_preload_rkr() \
- vbroadcastss .L16_mask, RKR; \
+ vbroadcastss .L16_mask(%rip), RKR; \
/* add 16-bit rotation to key rotations (mod 32) */ \
vpxor kr(CTX), RKR, RKR;

#define dec_preload_rkr() \
- vbroadcastss .L16_mask, RKR; \
+ vbroadcastss .L16_mask(%rip), RKR; \
/* add 16-bit rotation to key rotations (mod 32) */ \
vpxor kr(CTX), RKR, RKR; \
- vpshufb .Lbswap128_mask, RKR, RKR;
+ vpshufb .Lbswap128_mask(%rip), RKR, RKR;

#define transpose_2x4(x0, x1, t0, t1) \
vpunpckldq x1, x0, t0; \
@@ -251,9 +255,9 @@ __cast5_enc_blk16:

movq %rdi, CTX;

- vmovdqa .Lbswap_mask, RKM;
- vmovd .Lfirst_mask, R1ST;
- vmovd .L32_mask, R32;
+ vmovdqa .Lbswap_mask(%rip), RKM;
+ vmovd .Lfirst_mask(%rip), R1ST;
+ vmovd .L32_mask(%rip), R32;
enc_preload_rkr();

inpack_blocks(RL1, RR1, RTMP, RX, RKM);
@@ -287,7 +291,7 @@ __cast5_enc_blk16:
popq %rbx;
popq %r15;

- vmovdqa .Lbswap_mask, RKM;
+ vmovdqa .Lbswap_mask(%rip), RKM;

outunpack_blocks(RR1, RL1, RTMP, RX, RKM);
outunpack_blocks(RR2, RL2, RTMP, RX, RKM);
@@ -325,9 +329,9 @@ __cast5_dec_blk16:

movq %rdi, CTX;

- vmovdqa .Lbswap_mask, RKM;
- vmovd .Lfirst_mask, R1ST;
- vmovd .L32_mask, R32;
+ vmovdqa .Lbswap_mask(%rip), RKM;
+ vmovd .Lfirst_mask(%rip), R1ST;
+ vmovd .L32_mask(%rip), R32;
dec_preload_rkr();

inpack_blocks(RL1, RR1, RTMP, RX, RKM);
@@ -358,7 +362,7 @@ __cast5_dec_blk16:
round(RL, RR, 1, 2);
round(RR, RL, 0, 1);

- vmovdqa .Lbswap_mask, RKM;
+ vmovdqa .Lbswap_mask(%rip), RKM;
popq %rbx;
popq %r15;

@@ -521,8 +525,8 @@ ENTRY(cast5_ctr_16way)

vpcmpeqd RKR, RKR, RKR;
vpaddq RKR, RKR, RKR; /* low: -2, high: -2 */
- vmovdqa .Lbswap_iv_mask, R1ST;
- vmovdqa .Lbswap128_mask, RKM;
+ vmovdqa .Lbswap_iv_mask(%rip), R1ST;
+ vmovdqa .Lbswap128_mask(%rip), RKM;

/* load IV and byteswap */
vmovq (%rcx), RX;
diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index 7f30b6f0d72c..da1b7e4a23e4 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -98,16 +98,20 @@


#define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
- movzbl src ## bh, RID1d; \
- movzbl src ## bl, RID2d; \
- shrq $16, src; \
- movl s1(, RID1, 4), dst ## d; \
- op1 s2(, RID2, 4), dst ## d; \
- movzbl src ## bh, RID1d; \
- movzbl src ## bl, RID2d; \
- interleave_op(il_reg); \
- op2 s3(, RID1, 4), dst ## d; \
- op3 s4(, RID2, 4), dst ## d;
+ movzbl src ## bh, RID1d; \
+ leaq s1(%rip), RID2; \
+ movl (RID2, RID1, 4), dst ## d; \
+ movzbl src ## bl, RID2d; \
+ leaq s2(%rip), RID1; \
+ op1 (RID1, RID2, 4), dst ## d; \
+ shrq $16, src; \
+ movzbl src ## bh, RID1d; \
+ leaq s3(%rip), RID2; \
+ op2 (RID2, RID1, 4), dst ## d; \
+ movzbl src ## bl, RID2d; \
+ leaq s4(%rip), RID1; \
+ op3 (RID1, RID2, 4), dst ## d; \
+ interleave_op(il_reg);

#define dummy(d) /* do nothing */

@@ -190,10 +194,10 @@
qop(RD, RC, 1);

#define shuffle(mask) \
- vpshufb mask, RKR, RKR;
+ vpshufb mask(%rip), RKR, RKR;

#define preload_rkr(n, do_mask, mask) \
- vbroadcastss .L16_mask, RKR; \
+ vbroadcastss .L16_mask(%rip), RKR; \
/* add 16-bit rotation to key rotations (mod 32) */ \
vpxor (kr+n*16)(CTX), RKR, RKR; \
do_mask(mask);
@@ -275,9 +279,9 @@ __cast6_enc_blk8:

movq %rdi, CTX;

- vmovdqa .Lbswap_mask, RKM;
- vmovd .Lfirst_mask, R1ST;
- vmovd .L32_mask, R32;
+ vmovdqa .Lbswap_mask(%rip), RKM;
+ vmovd .Lfirst_mask(%rip), R1ST;
+ vmovd .L32_mask(%rip), R32;

inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -301,7 +305,7 @@ __cast6_enc_blk8:
popq %rbx;
popq %r15;

- vmovdqa .Lbswap_mask, RKM;
+ vmovdqa .Lbswap_mask(%rip), RKM;

outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -323,9 +327,9 @@ __cast6_dec_blk8:

movq %rdi, CTX;

- vmovdqa .Lbswap_mask, RKM;
- vmovd .Lfirst_mask, R1ST;
- vmovd .L32_mask, R32;
+ vmovdqa .Lbswap_mask(%rip), RKM;
+ vmovd .Lfirst_mask(%rip), R1ST;
+ vmovd .L32_mask(%rip), R32;

inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -349,7 +353,7 @@ __cast6_dec_blk8:
popq %rbx;
popq %r15;

- vmovdqa .Lbswap_mask, RKM;
+ vmovdqa .Lbswap_mask(%rip), RKM;
outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);

diff --git a/arch/x86/crypto/des3_ede-asm_64.S b/arch/x86/crypto/des3_ede-asm_64.S
index 8e49ce117494..4bbd3ec78df5 100644
--- a/arch/x86/crypto/des3_ede-asm_64.S
+++ b/arch/x86/crypto/des3_ede-asm_64.S
@@ -138,21 +138,29 @@
movzbl RW0bl, RT2d; \
movzbl RW0bh, RT3d; \
shrq $16, RW0; \
- movq s8(, RT0, 8), RT0; \
- xorq s6(, RT1, 8), to; \
+ leaq s8(%rip), RW1; \
+ movq (RW1, RT0, 8), RT0; \
+ leaq s6(%rip), RW1; \
+ xorq (RW1, RT1, 8), to; \
movzbl RW0bl, RL1d; \
movzbl RW0bh, RT1d; \
shrl $16, RW0d; \
- xorq s4(, RT2, 8), RT0; \
- xorq s2(, RT3, 8), to; \
+ leaq s4(%rip), RW1; \
+ xorq (RW1, RT2, 8), RT0; \
+ leaq s2(%rip), RW1; \
+ xorq (RW1, RT3, 8), to; \
movzbl RW0bl, RT2d; \
movzbl RW0bh, RT3d; \
- xorq s7(, RL1, 8), RT0; \
- xorq s5(, RT1, 8), to; \
- xorq s3(, RT2, 8), RT0; \
+ leaq s7(%rip), RW1; \
+ xorq (RW1, RL1, 8), RT0; \
+ leaq s5(%rip), RW1; \
+ xorq (RW1, RT1, 8), to; \
+ leaq s3(%rip), RW1; \
+ xorq (RW1, RT2, 8), RT0; \
load_next_key(n, RW0); \
xorq RT0, to; \
- xorq s1(, RT3, 8), to; \
+ leaq s1(%rip), RW1; \
+ xorq (RW1, RT3, 8), to; \

#define load_next_key(n, RWx) \
movq (((n) + 1) * 8)(CTX), RWx;
@@ -364,65 +372,89 @@ ENDPROC(des3_ede_x86_64_crypt_blk)
movzbl RW0bl, RT3d; \
movzbl RW0bh, RT1d; \
shrq $16, RW0; \
- xorq s8(, RT3, 8), to##0; \
- xorq s6(, RT1, 8), to##0; \
+ leaq s8(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##0; \
+ leaq s6(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##0; \
movzbl RW0bl, RT3d; \
movzbl RW0bh, RT1d; \
shrq $16, RW0; \
- xorq s4(, RT3, 8), to##0; \
- xorq s2(, RT1, 8), to##0; \
+ leaq s4(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##0; \
+ leaq s2(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##0; \
movzbl RW0bl, RT3d; \
movzbl RW0bh, RT1d; \
shrl $16, RW0d; \
- xorq s7(, RT3, 8), to##0; \
- xorq s5(, RT1, 8), to##0; \
+ leaq s7(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##0; \
+ leaq s5(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##0; \
movzbl RW0bl, RT3d; \
movzbl RW0bh, RT1d; \
load_next_key(n, RW0); \
- xorq s3(, RT3, 8), to##0; \
- xorq s1(, RT1, 8), to##0; \
+ leaq s3(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##0; \
+ leaq s1(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##0; \
xorq from##1, RW1; \
movzbl RW1bl, RT3d; \
movzbl RW1bh, RT1d; \
shrq $16, RW1; \
- xorq s8(, RT3, 8), to##1; \
- xorq s6(, RT1, 8), to##1; \
+ leaq s8(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##1; \
+ leaq s6(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##1; \
movzbl RW1bl, RT3d; \
movzbl RW1bh, RT1d; \
shrq $16, RW1; \
- xorq s4(, RT3, 8), to##1; \
- xorq s2(, RT1, 8), to##1; \
+ leaq s4(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##1; \
+ leaq s2(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##1; \
movzbl RW1bl, RT3d; \
movzbl RW1bh, RT1d; \
shrl $16, RW1d; \
- xorq s7(, RT3, 8), to##1; \
- xorq s5(, RT1, 8), to##1; \
+ leaq s7(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##1; \
+ leaq s5(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##1; \
movzbl RW1bl, RT3d; \
movzbl RW1bh, RT1d; \
do_movq(RW0, RW1); \
- xorq s3(, RT3, 8), to##1; \
- xorq s1(, RT1, 8), to##1; \
+ leaq s3(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##1; \
+ leaq s1(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##1; \
xorq from##2, RW2; \
movzbl RW2bl, RT3d; \
movzbl RW2bh, RT1d; \
shrq $16, RW2; \
- xorq s8(, RT3, 8), to##2; \
- xorq s6(, RT1, 8), to##2; \
+ leaq s8(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##2; \
+ leaq s6(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##2; \
movzbl RW2bl, RT3d; \
movzbl RW2bh, RT1d; \
shrq $16, RW2; \
- xorq s4(, RT3, 8), to##2; \
- xorq s2(, RT1, 8), to##2; \
+ leaq s4(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##2; \
+ leaq s2(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##2; \
movzbl RW2bl, RT3d; \
movzbl RW2bh, RT1d; \
shrl $16, RW2d; \
- xorq s7(, RT3, 8), to##2; \
- xorq s5(, RT1, 8), to##2; \
+ leaq s7(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##2; \
+ leaq s5(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##2; \
movzbl RW2bl, RT3d; \
movzbl RW2bh, RT1d; \
do_movq(RW0, RW2); \
- xorq s3(, RT3, 8), to##2; \
- xorq s1(, RT1, 8), to##2;
+ leaq s3(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##2; \
+ leaq s1(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##2;

#define __movq(src, dst) \
movq src, dst;
diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index f94375a8dcd1..d56a281221fb 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -97,7 +97,7 @@ ENTRY(clmul_ghash_mul)
FRAME_BEGIN
movups (%rdi), DATA
movups (%rsi), SHASH
- movaps .Lbswap_mask, BSWAP
+ movaps .Lbswap_mask(%rip), BSWAP
PSHUFB_XMM BSWAP DATA
call __clmul_gf128mul_ble
PSHUFB_XMM BSWAP DATA
@@ -114,7 +114,7 @@ ENTRY(clmul_ghash_update)
FRAME_BEGIN
cmp $16, %rdx
jb .Lupdate_just_ret # check length
- movaps .Lbswap_mask, BSWAP
+ movaps .Lbswap_mask(%rip), BSWAP
movups (%rdi), DATA
movups (%rcx), SHASH
PSHUFB_XMM BSWAP DATA
diff --git a/arch/x86/crypto/glue_helper-asm-avx.S b/arch/x86/crypto/glue_helper-asm-avx.S
index 02ee2308fb38..8a49ab1699ef 100644
--- a/arch/x86/crypto/glue_helper-asm-avx.S
+++ b/arch/x86/crypto/glue_helper-asm-avx.S
@@ -54,7 +54,7 @@
#define load_ctr_8way(iv, bswap, x0, x1, x2, x3, x4, x5, x6, x7, t0, t1, t2) \
vpcmpeqd t0, t0, t0; \
vpsrldq $8, t0, t0; /* low: -1, high: 0 */ \
- vmovdqa bswap, t1; \
+ vmovdqa bswap(%rip), t1; \
\
/* load IV and byteswap */ \
vmovdqu (iv), x7; \
@@ -99,7 +99,7 @@

#define load_xts_8way(iv, src, dst, x0, x1, x2, x3, x4, x5, x6, x7, tiv, t0, \
t1, xts_gf128mul_and_shl1_mask) \
- vmovdqa xts_gf128mul_and_shl1_mask, t0; \
+ vmovdqa xts_gf128mul_and_shl1_mask(%rip), t0; \
\
/* load IV */ \
vmovdqu (iv), tiv; \
diff --git a/arch/x86/crypto/glue_helper-asm-avx2.S b/arch/x86/crypto/glue_helper-asm-avx2.S
index a53ac11dd385..e04c80467bd2 100644
--- a/arch/x86/crypto/glue_helper-asm-avx2.S
+++ b/arch/x86/crypto/glue_helper-asm-avx2.S
@@ -67,7 +67,7 @@
vmovdqu (iv), t2x; \
vmovdqa t2x, t3x; \
inc_le128(t2x, t0x, t1x); \
- vbroadcasti128 bswap, t1; \
+ vbroadcasti128 bswap(%rip), t1; \
vinserti128 $1, t2x, t3, t2; /* ab: le0 ; cd: le1 */ \
vpshufb t1, t2, x0; \
\
@@ -124,13 +124,13 @@
tivx, t0, t0x, t1, t1x, t2, t2x, t3, \
xts_gf128mul_and_shl1_mask_0, \
xts_gf128mul_and_shl1_mask_1) \
- vbroadcasti128 xts_gf128mul_and_shl1_mask_0, t1; \
+ vbroadcasti128 xts_gf128mul_and_shl1_mask_0(%rip), t1; \
\
/* load IV and construct second IV */ \
vmovdqu (iv), tivx; \
vmovdqa tivx, t0x; \
gf128mul_x_ble(tivx, t1x, t2x); \
- vbroadcasti128 xts_gf128mul_and_shl1_mask_1, t2; \
+ vbroadcasti128 xts_gf128mul_and_shl1_mask_1(%rip), t2; \
vinserti128 $1, tivx, t0, tiv; \
vpxor (0*32)(src), tiv, x0; \
vmovdqu tiv, (0*32)(dst); \
--
2.15.0.rc0.271.g36b669edcc-goog


_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xen.org/xen-devel

2017-10-11 20:30:02

by Thomas Garnier

Subject: [PATCH v1 02/27] x86: Use symbol name on bug table for PIE support

Replace the %c operand modifier with %P. The %c is incompatible with PIE
because it implies an immediate value whereas %P references a symbol.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/include/asm/bug.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index aa6b2023d8f8..1210d22ad547 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -37,7 +37,7 @@ do { \
asm volatile("1:\t" ins "\n" \
".pushsection __bug_table,\"aw\"\n" \
"2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n" \
- "\t" __BUG_REL(%c0) "\t# bug_entry::file\n" \
+ "\t" __BUG_REL(%P0) "\t# bug_entry::file\n" \
"\t.word %c1" "\t# bug_entry::line\n" \
"\t.word %c2" "\t# bug_entry::flags\n" \
"\t.org 2b+%c3\n" \
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:04

by Thomas Garnier

Subject: [PATCH v1 04/27] x86: Add macro to get symbol address for PIE support

Add a new _ASM_GET_PTR macro to fetch a symbol address. It will be used
to replace "_ASM_MOV $<symbol>, %dst" code constructs that are not compatible
with PIE.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/include/asm/asm.h | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index b0dc91f4bedc..6de365b8e3fd 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -57,6 +57,19 @@
# define CC_OUT(c) [_cc_ ## c] "=qm"
#endif

+/* Macros to get a global variable address with PIE support on 64-bit */
+#ifdef CONFIG_X86_32
+#define __ASM_GET_PTR_PRE(_src) __ASM_FORM_COMMA(movl $##_src)
+#else
+#ifdef __ASSEMBLY__
+#define __ASM_GET_PTR_PRE(_src) __ASM_FORM_COMMA(leaq (_src)(%rip))
+#else
+#define __ASM_GET_PTR_PRE(_src) __ASM_FORM_COMMA(leaq (_src)(%%rip))
+#endif
+#endif
+#define _ASM_GET_PTR(_src, _dst) \
+ __ASM_GET_PTR_PRE(_src) __ASM_FORM(_dst)
+
/* Exception table entry */
#ifdef __ASSEMBLY__
# define _ASM_EXTABLE_HANDLE(from, to, handler) \
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:05

by Thomas Garnier

Subject: [PATCH v1 05/27] x86: relocate_kernel - Adapt assembly for PIE support

Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/kernel/relocate_kernel_64.S | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 307d3bac5f04..2ecbdcbe985b 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -200,9 +200,11 @@ identity_mapped:
movq %rax, %cr3
lea PAGE_SIZE(%r8), %rsp
call swap_pages
- movq $virtual_mapped, %rax
- pushq %rax
- ret
+ jmp *virtual_mapped_addr(%rip)
+
+ /* Absolute value for PIE support */
+virtual_mapped_addr:
+ .quad virtual_mapped

virtual_mapped:
movq RSP(%r8), %rsp
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:11

by Thomas Garnier

Subject: [PATCH v1 11/27] x86/power/64: Adapt assembly for PIE support

Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/power/hibernate_asm_64.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index ce8da3a0412c..6fdd7bbc3c33 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -24,7 +24,7 @@
#include <asm/frame.h>

ENTRY(swsusp_arch_suspend)
- movq $saved_context, %rax
+ leaq saved_context(%rip), %rax
movq %rsp, pt_regs_sp(%rax)
movq %rbp, pt_regs_bp(%rax)
movq %rsi, pt_regs_si(%rax)
@@ -115,7 +115,7 @@ ENTRY(restore_registers)
movq %rax, %cr4; # turn PGE back on

/* We don't restore %rax, it must be 0 anyway */
- movq $saved_context, %rax
+ leaq saved_context(%rip), %rax
movq pt_regs_sp(%rax), %rsp
movq pt_regs_bp(%rax), %rbp
movq pt_regs_si(%rax), %rsi
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:06

by Thomas Garnier

Subject: [PATCH v1 06/27] x86/entry/64: Adapt assembly for PIE support

Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/entry/entry_64.S | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 49167258d587..15bd5530d2ae 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -194,12 +194,15 @@ entry_SYSCALL_64_fastpath:
ja 1f /* return -ENOSYS (already in pt_regs->ax) */
movq %r10, %rcx

+ /* Ensures the call is position independent */
+ leaq sys_call_table(%rip), %r11
+
/*
* This call instruction is handled specially in stub_ptregs_64.
* It might end up jumping to the slow path. If it jumps, RAX
* and all argument registers are clobbered.
*/
- call *sys_call_table(, %rax, 8)
+ call *(%r11, %rax, 8)
.Lentry_SYSCALL_64_after_fastpath_call:

movq %rax, RAX(%rsp)
@@ -334,7 +337,8 @@ ENTRY(stub_ptregs_64)
* RAX stores a pointer to the C function implementing the syscall.
* IRQs are on.
*/
- cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
+ leaq .Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11
+ cmpq %r11, (%rsp)
jne 1f

/*
@@ -1172,7 +1176,8 @@ ENTRY(error_entry)
movl %ecx, %eax /* zero extend */
cmpq %rax, RIP+8(%rsp)
je .Lbstep_iret
- cmpq $.Lgs_change, RIP+8(%rsp)
+ leaq .Lgs_change(%rip), %rcx
+ cmpq %rcx, RIP+8(%rsp)
jne .Lerror_entry_done

/*
@@ -1383,10 +1388,10 @@ ENTRY(nmi)
* resume the outer NMI.
*/

- movq $repeat_nmi, %rdx
+ leaq repeat_nmi(%rip), %rdx
cmpq 8(%rsp), %rdx
ja 1f
- movq $end_repeat_nmi, %rdx
+ leaq end_repeat_nmi(%rip), %rdx
cmpq 8(%rsp), %rdx
ja nested_nmi_out
1:
@@ -1440,7 +1445,8 @@ nested_nmi:
pushq %rdx
pushfq
pushq $__KERNEL_CS
- pushq $repeat_nmi
+ leaq repeat_nmi(%rip), %rdx
+ pushq %rdx

/* Put stack back */
addq $(6*8), %rsp
@@ -1479,7 +1485,9 @@ first_nmi:
addq $8, (%rsp) /* Fix up RSP */
pushfq /* RFLAGS */
pushq $__KERNEL_CS /* CS */
- pushq $1f /* RIP */
+ pushq %rax /* Support Position Independent Code */
+ leaq 1f(%rip), %rax /* RIP */
+ xchgq %rax, (%rsp) /* Restore RAX, put 1f */
INTERRUPT_RETURN /* continues at repeat_nmi below */
UNWIND_HINT_IRET_REGS
1:
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:24

by Thomas Garnier

Subject: [PATCH v1 24/27] x86/mm: Make the x86 GOT read-only

The GOT is changed during early boot when relocations are applied. Make
it read-only directly. This table exists only for PIE binaries.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index e549bff87c5b..a2301c292e26 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -279,6 +279,17 @@
VMLINUX_SYMBOL(__end_ro_after_init) = .;
#endif

+#ifdef CONFIG_X86_PIE
+#define RO_GOT_X86 \
+ .got : AT(ADDR(.got) - LOAD_OFFSET) { \
+ VMLINUX_SYMBOL(__start_got) = .; \
+ *(.got); \
+ VMLINUX_SYMBOL(__end_got) = .; \
+ }
+#else
+#define RO_GOT_X86
+#endif
+
/*
* Read only Data
*/
@@ -335,6 +346,7 @@
VMLINUX_SYMBOL(__end_builtin_fw) = .; \
} \
\
+ RO_GOT_X86 \
TRACEDATA \
\
/* Kernel symbol table: Normal symbols */ \
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:17

by Thomas Garnier

Subject: [PATCH v1 17/27] xen: Adapt assembly for PIE support

Change the assembly code to use the new _ASM_GET_PTR macro which gets a
symbol reference while being PIE compatible. Adapt the relocation tool
to ignore 32-bit Xen code.
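
The relocation-tool change comes down to an address range test: a 32-bit
relocation is skipped when its offset falls inside the pvh_start_xen symbol.
A minimal userspace sketch of that test follows. It is an illustration only:
struct demo_sym and offset_in_symbol are hypothetical names, and the struct
keeps just the two ELF symbol fields the check reads.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the ELF symbol fields the check needs. */
struct demo_sym {
	uint64_t st_value;	/* start address of the symbol */
	uint64_t st_size;	/* size of the symbol in bytes */
};

/* True when offset lies in the half-open range [st_value, st_value + st_size). */
static bool offset_in_symbol(const struct demo_sym *sym, uint64_t offset)
{
	return sym != NULL &&
	       offset >= sym->st_value &&
	       offset < sym->st_value + sym->st_size;
}
```

The upper bound is exclusive, so an offset equal to st_value + st_size belongs
to whatever follows the symbol and would still be reported as an error.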

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/tools/relocs.c | 16 +++++++++++++++-
arch/x86/xen/xen-head.S | 9 +++++----
arch/x86/xen/xen-pvh.S | 13 +++++++++----
3 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 5d3eb2760198..bc032ad88b22 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -831,6 +831,16 @@ static int is_percpu_sym(ElfW(Sym) *sym, const char *symname)
strncmp(symname, "init_per_cpu_", 13);
}

+/*
+ * Check if the 32-bit relocation is within the xenpvh 32-bit code.
+ * If so, ignore it.
+ */
+static int is_in_xenpvh_assembly(ElfW(Addr) offset)
+{
+ ElfW(Sym) *sym = sym_lookup("pvh_start_xen");
+ return sym && (offset >= sym->st_value) &&
+ (offset < (sym->st_value + sym->st_size));
+}

static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
const char *symname)
@@ -892,8 +902,12 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
* the relocations are processed.
* Make sure that the offset will fit.
*/
- if (r_type != R_X86_64_64 && (int32_t)offset != (int64_t)offset)
+ if (r_type != R_X86_64_64 &&
+ (int32_t)offset != (int64_t)offset) {
+ if (is_in_xenpvh_assembly(offset))
+ break;
die("Relocation offset doesn't fit in 32 bits\n");
+ }

if (r_type == R_X86_64_64)
add_reloc(&relocs64, offset);
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 124941d09b2b..e5b7b9566191 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -25,14 +25,15 @@ ENTRY(startup_xen)

/* Clear .bss */
xor %eax,%eax
- mov $__bss_start, %_ASM_DI
- mov $__bss_stop, %_ASM_CX
+ _ASM_GET_PTR(__bss_start, %_ASM_DI)
+ _ASM_GET_PTR(__bss_stop, %_ASM_CX)
sub %_ASM_DI, %_ASM_CX
shr $__ASM_SEL(2, 3), %_ASM_CX
rep __ASM_SIZE(stos)

- mov %_ASM_SI, xen_start_info
- mov $init_thread_union+THREAD_SIZE, %_ASM_SP
+ _ASM_GET_PTR(xen_start_info, %_ASM_AX)
+ mov %_ASM_SI, (%_ASM_AX)
+ _ASM_GET_PTR(init_thread_union+THREAD_SIZE, %_ASM_SP)

jmp xen_start_kernel
END(startup_xen)
diff --git a/arch/x86/xen/xen-pvh.S b/arch/x86/xen/xen-pvh.S
index e1a5fbeae08d..43e234c7c2de 100644
--- a/arch/x86/xen/xen-pvh.S
+++ b/arch/x86/xen/xen-pvh.S
@@ -101,8 +101,8 @@ ENTRY(pvh_start_xen)
call xen_prepare_pvh

/* startup_64 expects boot_params in %rsi. */
- mov $_pa(pvh_bootparams), %rsi
- mov $_pa(startup_64), %rax
+ movabs $_pa(pvh_bootparams), %rsi
+ movabs $_pa(startup_64), %rax
jmp *%rax

#else /* CONFIG_X86_64 */
@@ -137,10 +137,15 @@ END(pvh_start_xen)

.section ".init.data","aw"
.balign 8
+ /*
+ * Use a quad for _pa(gdt_start) because PIE does not understand a
+ * long is enough. The resulting value will still be in the lower long
+ * part.
+ */
gdt:
.word gdt_end - gdt_start
- .long _pa(gdt_start)
- .word 0
+ .quad _pa(gdt_start)
+ .balign 8
gdt_start:
.quad 0x0000000000000000 /* NULL descriptor */
.quad 0x0000000000000000 /* reserved */
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:07

by Thomas Garnier

Subject: [PATCH v1 07/27] x86: pm-trace - Adapt assembly for PIE support

Change assembly to use the new _ASM_GET_PTR macro instead of _ASM_MOV for
the assembly to be PIE compatible.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/include/asm/pm-trace.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pm-trace.h b/arch/x86/include/asm/pm-trace.h
index 7b7ac42c3661..a3801261f0dd 100644
--- a/arch/x86/include/asm/pm-trace.h
+++ b/arch/x86/include/asm/pm-trace.h
@@ -7,7 +7,7 @@
do { \
if (pm_trace_enabled) { \
const void *tracedata; \
- asm volatile(_ASM_MOV " $1f,%0\n" \
+ asm volatile(_ASM_GET_PTR(1f, %0) "\n" \
".section .tracedata,\"a\"\n" \
"1:\t.word %c1\n\t" \
_ASM_PTR " %c2\n" \
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:25

by Thomas Garnier

Subject: [PATCH v1 25/27] x86/pie: Add option to build the kernel as PIE

Add the CONFIG_X86_PIE option which builds the kernel as a Position
Independent Executable (PIE). The kernel is currently built with the
mcmodel=kernel option which forces it to stay in the top 2G of the
virtual address space. With PIE, the kernel will be able to move below
the current limit.
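
The 2G limit follows from how mcmodel=kernel encodes symbol addresses: each one
must be representable as a sign-extended 32-bit value. The sketch below shows
that property in plain C. It is an illustration under that assumption, and
fits_sign_extended_32 is a hypothetical helper, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when a 64-bit virtual address survives a round
 * trip through a signed 32-bit value, i.e. it can be encoded as a
 * sign-extended 32-bit immediate or displacement.
 */
static bool fits_sign_extended_32(uint64_t addr)
{
	return (int64_t)addr == (int64_t)(int32_t)addr;
}
```

An address at the -2G boundary (0xffffffff80000000) passes the check, while one
byte below it does not, which is why a kernel moved below that boundary cannot
keep using 32-bit absolute references.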

The --emit-relocs linker option was kept instead of using -pie to limit
the impact on mapped sections. Any incompatible relocation will be
caught by the arch/x86/tools/relocs binary at compile time.

If segment based stack cookies are enabled, try to use the compiler
option to select the segment register. If not available, automatically
enable the global stack cookie in auto mode. Otherwise, recommend a
compiler update or the global stack cookie option.
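
The stack cookie selection described above is a small decision tree. The
sketch below restates it as a C function purely for illustration; the real
logic lives in the Makefile hunk of this patch, and the enum, the function
name and its boolean parameters are hypothetical.

```c
#include <stdbool.h>

enum demo_cookie_choice {
	DEMO_NO_CHANGE,		/* protector off, or global cookie already selected */
	DEMO_SEGMENT_REG_FLAG,	/* pass -mstack-protector-guard-reg=%gs */
	DEMO_FALLBACK_GLOBAL,	/* warn and switch to the global stack cookie */
	DEMO_BUILD_ERROR,	/* stop and ask for a newer compiler */
};

/* Hypothetical restatement of the choice made for a PIE build. */
static enum demo_cookie_choice demo_pick_cookie(bool protector_none,
						bool global_cookie,
						bool cc_has_guard_reg,
						bool auto_mode)
{
	if (protector_none || global_cookie)
		return DEMO_NO_CHANGE;
	if (cc_has_guard_reg)
		return DEMO_SEGMENT_REG_FLAG;
	return auto_mode ? DEMO_FALLBACK_GLOBAL : DEMO_BUILD_ERROR;
}
```

Only the auto mode is allowed to change the configuration silently; an
explicit regular or strong protector choice ends in a build error instead.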

Performance/Size impact:
Size of vmlinux (Default configuration):
File size:
- PIE disabled: +0.000031%
- PIE enabled: -3.210% (less relocations)
.text section:
- PIE disabled: +0.000644%
- PIE enabled: +0.837%

Size of vmlinux (Ubuntu configuration):
File size:
- PIE disabled: -0.201%
- PIE enabled: -0.082%
.text section:
- PIE disabled: same
- PIE enabled: +1.319%

Size of vmlinux (Default configuration + ORC):
File size:
- PIE enabled: -3.167%
.text section:
- PIE enabled: +0.814%

Size of vmlinux (Ubuntu configuration + ORC):
File size:
- PIE enabled: -3.167%
.text section:
- PIE enabled: +1.26%

The size increase is mainly due to not having access to the 32-bit signed
relocation that can be used with mcmodel=kernel. A small part is due to reduced
optimization for PIE code. This bug [1] was opened with gcc to provide better
code generation for kernel PIE.

Hackbench (50% and 1600% on thread/process for pipe/sockets):
- PIE disabled: no significant change (avg +0.1% on latest test).
- PIE enabled: between -0.50% and +0.86% on average (default and Ubuntu config).

slab_test (average of 10 runs):
- PIE disabled: no significant change (-2% on latest run, likely noise).
- PIE enabled: between -1% and +0.8% on latest runs.

Kernbench (average of 10 Half and Optimal runs):
Elapsed Time:
- PIE disabled: no significant change (avg -0.239%)
- PIE enabled: average +0.07%
System Time:
- PIE disabled: no significant change (avg -0.277%)
- PIE enabled: average +0.7%

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303

Signed-off-by: Thomas Garnier <[email protected]>

merge PIE
---
arch/x86/Kconfig | 7 +++++++
arch/x86/Makefile | 27 +++++++++++++++++++++++++++
2 files changed, 34 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1f2b731c9ec3..bbd28a46ab55 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2148,6 +2148,13 @@ config X86_GLOBAL_STACKPROTECTOR

If unsure, say N

+config X86_PIE
+ bool
+ depends on X86_64
+ select DEFAULT_HIDDEN
+ select DYNAMIC_MODULE_BASE
+ select MODULE_REL_CRCS if MODVERSIONS
+
config HOTPLUG_CPU
bool "Support for hot-pluggable CPUs"
depends on SMP
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index b592d57c531b..beae9504c3f4 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -135,7 +135,34 @@ else

KBUILD_CFLAGS += -mno-red-zone
ifdef CONFIG_X86_PIE
+ KBUILD_CFLAGS += -fPIC
KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/x86/kernel/module.lds
+
+ifndef CONFIG_CC_STACKPROTECTOR_NONE
+ifndef CONFIG_X86_GLOBAL_STACKPROTECTOR
+ stackseg-flag := -mstack-protector-guard-reg=%gs
+ ifeq ($(call cc-option-yn,$(stackseg-flag)),n)
+ ifdef CONFIG_CC_STACKPROTECTOR_AUTO
+ $(warning Cannot use CONFIG_CC_STACKPROTECTOR_* while \
+ building a position independent kernel. \
+ Default to global stack protector \
+ (CONFIG_X86_GLOBAL_STACKPROTECTOR).)
+ CONFIG_X86_GLOBAL_STACKPROTECTOR := y
+ KBUILD_CFLAGS += -DCONFIG_X86_GLOBAL_STACKPROTECTOR
+ KBUILD_AFLAGS += -DCONFIG_X86_GLOBAL_STACKPROTECTOR
+ else
+ $(error echo Cannot use \
+ CONFIG_CC_STACKPROTECTOR_(REGULAR|STRONG) \
+ while building a position independent binary \
+ Update your compiler or use \
+ CONFIG_X86_GLOBAL_STACKPROTECTOR)
+ endif
+ else
+ KBUILD_CFLAGS += $(stackseg-flag)
+ endif
+endif
+endif
+
else
KBUILD_CFLAGS += -mcmodel=kernel
endif
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:23

by Thomas Garnier

Subject: [PATCH v1 23/27] x86/modules: Adapt module loading for PIE support

Adapt module loading to support PIE relocations. Generate a dynamic GOT if
a symbol requires it but no entry exists in the kernel GOT.
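
Sizing the dynamic GOT relies on sorting the relocations so that duplicates
become adjacent, then counting each distinct GOTPCREL entry once. The
following userspace sketch shows that counting step only. It assumes a
simplified record (struct demo_rela is hypothetical, not Elf64_Rela), uses
qsort in place of the kernel's sort(), and leaves out the lookup in the kernel
GOT that the patch performs before reserving a slot.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_R_GOTPCREL 9	/* value of R_X86_64_GOTPCREL in the x86-64 psABI */

/* Hypothetical, simplified relocation record. */
struct demo_rela {
	uint32_t type;
	uint32_t sym;
	int64_t addend;
};

#define CMP_3WAY(a, b) ((a) < (b) ? -1 : (a) > (b))

/* Order by type, then symbol index, then addend. */
static int demo_cmp_rela(const void *a, const void *b)
{
	const struct demo_rela *x = a, *y = b;
	int i = CMP_3WAY(x->type, y->type);

	if (i == 0)
		i = CMP_3WAY(x->sym, y->sym);
	if (i == 0)
		i = CMP_3WAY(x->addend, y->addend);
	return i;
}

/* Sort in place, then count GOTPCREL entries that differ from their predecessor. */
static unsigned int demo_count_gots(struct demo_rela *rela, size_t num)
{
	unsigned int count = 0;
	size_t i;

	qsort(rela, num, sizeof(*rela), demo_cmp_rela);
	for (i = 0; i < num; i++) {
		if (rela[i].type != DEMO_R_GOTPCREL)
			continue;
		if (i > 0 && demo_cmp_rela(&rela[i], &rela[i - 1]) == 0)
			continue;	/* duplicate of the preceding slot */
		count++;
	}
	return count;
}
```

Because the array is sorted first, a duplicate can only sit in the slot just
before the current one, so a single comparison per entry is enough.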

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/Makefile | 4 +
arch/x86/include/asm/module.h | 11 +++
arch/x86/include/asm/sections.h | 4 +
arch/x86/kernel/module.c | 182 ++++++++++++++++++++++++++++++++++++++--
arch/x86/kernel/module.lds | 3 +
5 files changed, 199 insertions(+), 5 deletions(-)
create mode 100644 arch/x86/kernel/module.lds

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index de228200ef2a..b592d57c531b 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -134,7 +134,11 @@ else
KBUILD_CFLAGS += $(cflags-y)

KBUILD_CFLAGS += -mno-red-zone
+ifdef CONFIG_X86_PIE
+ KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/x86/kernel/module.lds
+else
KBUILD_CFLAGS += -mcmodel=kernel
+endif

# -funit-at-a-time shrinks the kernel .text considerably
# unfortunately it makes reading oopses harder.
diff --git a/arch/x86/include/asm/module.h b/arch/x86/include/asm/module.h
index 9eb7c718aaf8..21e0e02c0343 100644
--- a/arch/x86/include/asm/module.h
+++ b/arch/x86/include/asm/module.h
@@ -4,12 +4,23 @@
#include <asm-generic/module.h>
#include <asm/orc_types.h>

+#ifdef CONFIG_X86_PIE
+struct mod_got_sec {
+ struct elf64_shdr *got;
+ int got_num_entries;
+ int got_max_entries;
+};
+#endif
+
struct mod_arch_specific {
#ifdef CONFIG_ORC_UNWINDER
unsigned int num_orcs;
int *orc_unwind_ip;
struct orc_entry *orc_unwind;
#endif
+#ifdef CONFIG_X86_PIE
+ struct mod_got_sec core;
+#endif
};

#ifdef CONFIG_X86_64
diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
index 6b2d496cf1aa..92d796109da1 100644
--- a/arch/x86/include/asm/sections.h
+++ b/arch/x86/include/asm/sections.h
@@ -15,4 +15,8 @@ extern char __end_rodata_hpage_align[];
extern char __start_got[], __end_got[];
#endif

+#if defined(CONFIG_X86_PIE)
+extern char __start_got[], __end_got[];
+#endif
+
#endif /* _ASM_X86_SECTIONS_H */
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index 62e7d70aadd5..aed24dfac1d3 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -30,6 +30,7 @@
#include <linux/gfp.h>
#include <linux/jump_label.h>
#include <linux/random.h>
+#include <linux/sort.h>

#include <asm/text-patching.h>
#include <asm/page.h>
@@ -77,6 +78,173 @@ static unsigned long int get_module_load_offset(void)
}
#endif

+#ifdef CONFIG_X86_PIE
+static u64 find_got_kernel_entry(Elf64_Sym *sym, const Elf64_Rela *rela)
+{
+ u64 *pos;
+
+ for (pos = (u64*)__start_got; pos < (u64*)__end_got; pos++) {
+ if (*pos == sym->st_value)
+ return (u64)pos + rela->r_addend;
+ }
+
+ return 0;
+}
+
+static u64 module_emit_got_entry(struct module *mod, void *loc,
+ const Elf64_Rela *rela, Elf64_Sym *sym)
+{
+ struct mod_got_sec *gotsec = &mod->arch.core;
+ u64 *got = (u64*)gotsec->got->sh_addr;
+ int i = gotsec->got_num_entries;
+ u64 ret;
+
+ /* Check if we can use the kernel GOT */
+ ret = find_got_kernel_entry(sym, rela);
+ if (ret)
+ return ret;
+
+ got[i] = sym->st_value;
+
+ /*
+ * Check if the entry we just created is a duplicate. Given that the
+ * relocations are sorted, this will be the last entry we allocated.
+ * (if one exists).
+ */
+ if (i > 0 && got[i] == got[i - 1]) {
+ ret = (u64)&got[i - 1];
+ } else {
+ gotsec->got_num_entries++;
+ BUG_ON(gotsec->got_num_entries > gotsec->got_max_entries);
+ ret = (u64)&got[i];
+ }
+
+ return ret + rela->r_addend;
+}
+
+#define cmp_3way(a,b) ((a) < (b) ? -1 : (a) > (b))
+
+static int cmp_rela(const void *a, const void *b)
+{
+ const Elf64_Rela *x = a, *y = b;
+ int i;
+
+ /* sort by type, symbol index and addend */
+ i = cmp_3way(ELF64_R_TYPE(x->r_info), ELF64_R_TYPE(y->r_info));
+ if (i == 0)
+ i = cmp_3way(ELF64_R_SYM(x->r_info), ELF64_R_SYM(y->r_info));
+ if (i == 0)
+ i = cmp_3way(x->r_addend, y->r_addend);
+ return i;
+}
+
+static bool duplicate_rel(const Elf64_Rela *rela, int num)
+{
+ /*
+ * Entries are sorted by type, symbol index and addend. That means
+ * that, if a duplicate entry exists, it must be in the preceding
+ * slot.
+ */
+ return num > 0 && cmp_rela(rela + num, rela + num - 1) == 0;
+}
+
+static unsigned int count_gots(Elf64_Sym *syms, Elf64_Rela *rela, int num)
+{
+ unsigned int ret = 0;
+ Elf64_Sym *s;
+ int i;
+
+ for (i = 0; i < num; i++) {
+ switch (ELF64_R_TYPE(rela[i].r_info)) {
+ case R_X86_64_GOTPCREL:
+ s = syms + ELF64_R_SYM(rela[i].r_info);
+
+ /*
+ * Use the kernel GOT when possible, else reserve a
+ * custom one for this module.
+ */
+ if (!duplicate_rel(rela, i) &&
+ !find_got_kernel_entry(s, rela + i))
+ ret++;
+ break;
+ }
+ }
+ return ret;
+}
+
+/*
+ * Generate GOT entries for GOTPCREL relocations that do not exist in the
+ * kernel GOT. Based on arm64 module-plts implementation.
+ */
+int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
+ char *secstrings, struct module *mod)
+{
+ unsigned long gots = 0;
+ Elf_Shdr *symtab = NULL;
+ Elf64_Sym *syms = NULL;
+ char *strings, *name;
+ int i;
+
+ /*
+ * Find the empty .got section so we can expand it to store the GOT
+ * entries. Record the symtab address as well.
+ */
+ for (i = 0; i < ehdr->e_shnum; i++) {
+ if (!strcmp(secstrings + sechdrs[i].sh_name, ".got")) {
+ mod->arch.core.got = sechdrs + i;
+ } else if (sechdrs[i].sh_type == SHT_SYMTAB) {
+ symtab = sechdrs + i;
+ syms = (Elf64_Sym *)symtab->sh_addr;
+ }
+ }
+
+ if (!mod->arch.core.got) {
+ pr_err("%s: module GOT section missing\n", mod->name);
+ return -ENOEXEC;
+ }
+ if (!syms) {
+ pr_err("%s: module symtab section missing\n", mod->name);
+ return -ENOEXEC;
+ }
+
+ for (i = 0; i < ehdr->e_shnum; i++) {
+ Elf64_Rela *rels = (void *)ehdr + sechdrs[i].sh_offset;
+ int numrels = sechdrs[i].sh_size / sizeof(Elf64_Rela);
+
+ if (sechdrs[i].sh_type != SHT_RELA)
+ continue;
+
+ /* sort by type, symbol index and addend */
+ sort(rels, numrels, sizeof(Elf64_Rela), cmp_rela, NULL);
+
+ gots += count_gots(syms, rels, numrels);
+ }
+
+ mod->arch.core.got->sh_type = SHT_NOBITS;
+ mod->arch.core.got->sh_flags = SHF_ALLOC;
+ mod->arch.core.got->sh_addralign = L1_CACHE_BYTES;
+ mod->arch.core.got->sh_size = (gots + 1) * sizeof(u64);
+ mod->arch.core.got_num_entries = 0;
+ mod->arch.core.got_max_entries = gots;
+
+ /*
+ * If a _GLOBAL_OFFSET_TABLE_ symbol exists, make it absolute for
+ * modules to correctly reference it. Similar to s390 implementation.
+ */
+ strings = (void *) ehdr + sechdrs[symtab->sh_link].sh_offset;
+ for (i = 0; i < symtab->sh_size/sizeof(Elf_Sym); i++) {
+ if (syms[i].st_shndx != SHN_UNDEF)
+ continue;
+ name = strings + syms[i].st_name;
+ if (!strcmp(name, "_GLOBAL_OFFSET_TABLE_")) {
+ syms[i].st_shndx = SHN_ABS;
+ break;
+ }
+ }
+ return 0;
+}
+#endif
+
void *module_alloc(unsigned long size)
{
void *p;
@@ -184,13 +352,18 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
if ((s64)val != *(s32 *)loc)
goto overflow;
break;
+#ifdef CONFIG_X86_PIE
+ case R_X86_64_GOTPCREL:
+ val = module_emit_got_entry(me, loc, rel + i, sym);
+ /* fallthrough */
+#endif
+ case R_X86_64_PLT32:
case R_X86_64_PC32:
val -= (u64)loc;
*(u32 *)loc = val;
-#if 0
- if ((s64)val != *(s32 *)loc)
+ if (IS_ENABLED(CONFIG_X86_PIE) &&
+ (s64)val != *(s32 *)loc)
goto overflow;
-#endif
break;
default:
pr_err("%s: Unknown rela relocation: %llu\n",
@@ -203,8 +376,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
overflow:
pr_err("overflow in relocation type %d val %Lx\n",
(int)ELF64_R_TYPE(rel[i].r_info), val);
- pr_err("`%s' likely not compiled with -mcmodel=kernel\n",
- me->name);
+ pr_err("`%s' likely too far from the kernel\n", me->name);
return -ENOEXEC;
}
#endif
diff --git a/arch/x86/kernel/module.lds b/arch/x86/kernel/module.lds
new file mode 100644
index 000000000000..fd6e95a4b454
--- /dev/null
+++ b/arch/x86/kernel/module.lds
@@ -0,0 +1,3 @@
+SECTIONS {
+ .got (NOLOAD) : { BYTE(0) }
+}
--
2.15.0.rc0.271.g36b669edcc-goog


_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xen.org/xen-devel

2017-10-11 20:30:13

by Thomas Garnier

Subject: [PATCH v1 13/27] x86/boot/64: Use _text in a global for PIE support

By default, PIE-generated code creates only relative references, so _text
points to the temporary virtual address. Instead, use a global variable
so the relocation is done as expected.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/kernel/head64.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index bab4fa579450..675f1dba3b21 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -45,8 +45,14 @@ static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
return ptr - (void *)_text + (void *)physaddr;
}

-unsigned long __head __startup_64(unsigned long physaddr,
- struct boot_params *bp)
+/*
+ * Use a global variable to properly calculate _text delta on PIE. By default
+ * a PIE binary does a RIP-relative difference instead of the relocated address.
+ */
+unsigned long _text_offset = (unsigned long)(_text - __START_KERNEL_map);
+
+unsigned long __head notrace __startup_64(unsigned long physaddr,
+ struct boot_params *bp)
{
unsigned long load_delta, *p;
unsigned long pgtable_flags;
@@ -65,7 +71,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
* Compute the delta between the address I am compiled to run at
* and the address I am actually running at.
*/
- load_delta = physaddr - (unsigned long)(_text - __START_KERNEL_map);
+ load_delta = physaddr - _text_offset;

/* Is the address not 2M aligned? */
if (load_delta & ~PMD_PAGE_MASK)
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:16

by Thomas Garnier

Subject: [PATCH v1 16/27] x86/relocs: Handle PIE relocations

Change the relocation tool to correctly handle relocations generated by
the -fPIE option:

- Add a relocation for each entry of the .got section, given that the linker
does not generate R_X86_64_GLOB_DAT on a simple link.
- Ignore R_X86_64_GOTPCREL and R_X86_64_PLT32.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/tools/relocs.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 93 insertions(+), 1 deletion(-)

diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index 73eb7fd4aec4..5d3eb2760198 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -31,6 +31,7 @@ struct section {
Elf_Sym *symtab;
Elf_Rel *reltab;
char *strtab;
+ Elf_Addr *got;
};
static struct section *secs;

@@ -292,6 +293,35 @@ static Elf_Sym *sym_lookup(const char *symname)
return 0;
}

+static Elf_Sym *sym_lookup_addr(Elf_Addr addr, const char **name)
+{
+ int i;
+ for (i = 0; i < ehdr.e_shnum; i++) {
+ struct section *sec = &secs[i];
+ long nsyms;
+ Elf_Sym *symtab;
+ Elf_Sym *sym;
+
+ if (sec->shdr.sh_type != SHT_SYMTAB)
+ continue;
+
+ nsyms = sec->shdr.sh_size/sizeof(Elf_Sym);
+ symtab = sec->symtab;
+
+ for (sym = symtab; --nsyms >= 0; sym++) {
+ if (sym->st_value == addr) {
+ if (name) {
+ *name = sym_name(sec->link->strtab,
+ sym);
+ }
+ return sym;
+ }
+ }
+ }
+ return 0;
+}
+
+
#if BYTE_ORDER == LITTLE_ENDIAN
#define le16_to_cpu(val) (val)
#define le32_to_cpu(val) (val)
@@ -512,6 +542,33 @@ static void read_relocs(FILE *fp)
}
}

+static void read_got(FILE *fp)
+{
+ int i;
+ for (i = 0; i < ehdr.e_shnum; i++) {
+ struct section *sec = &secs[i];
+ sec->got = NULL;
+ if (sec->shdr.sh_type != SHT_PROGBITS ||
+ strcmp(sec_name(i), ".got")) {
+ continue;
+ }
+ sec->got = malloc(sec->shdr.sh_size);
+ if (!sec->got) {
+ die("malloc of %d bytes for got failed\n",
+ sec->shdr.sh_size);
+ }
+ if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) {
+ die("Seek to %d failed: %s\n",
+ sec->shdr.sh_offset, strerror(errno));
+ }
+ if (fread(sec->got, 1, sec->shdr.sh_size, fp)
+ != sec->shdr.sh_size) {
+ die("Cannot read got: %s\n",
+ strerror(errno));
+ }
+ }
+}
+

static void print_absolute_symbols(void)
{
@@ -642,6 +699,32 @@ static void add_reloc(struct relocs *r, uint32_t offset)
r->offset[r->count++] = offset;
}

+/*
+ * The linker does not generate relocations for the GOT for the kernel.
+ * If a GOT is found, simulate the relocations that should have been included.
+ */
+static void walk_got_table(int (*process)(struct section *sec, Elf_Rel *rel,
+ Elf_Sym *sym, const char *symname),
+ struct section *sec)
+{
+ int i;
+ Elf_Addr entry;
+ Elf_Sym *sym;
+ const char *symname;
+ Elf_Rel rel;
+
+ for (i = 0; i < sec->shdr.sh_size/sizeof(Elf_Addr); i++) {
+ entry = sec->got[i];
+ sym = sym_lookup_addr(entry, &symname);
+ if (!sym)
+ die("Could not find GOT symbol for entry %d\n", i);
+ rel.r_offset = sec->shdr.sh_addr + i * sizeof(Elf_Addr);
+ rel.r_info = ELF_BITS == 64 ? R_X86_64_GLOB_DAT
+ : R_386_GLOB_DAT;
+ process(sec, &rel, sym, symname);
+ }
+}
+
static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
Elf_Sym *sym, const char *symname))
{
@@ -655,6 +738,8 @@ static void walk_relocs(int (*process)(struct section *sec, Elf_Rel *rel,
struct section *sec = &secs[i];

if (sec->shdr.sh_type != SHT_REL_TYPE) {
+ if (sec->got)
+ walk_got_table(process, sec);
continue;
}
sec_symtab = sec->link;
@@ -764,6 +849,8 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
offset += per_cpu_load_addr;

switch (r_type) {
+ case R_X86_64_PLT32:
+ case R_X86_64_GOTPCREL:
case R_X86_64_NONE:
/* NONE can be ignored. */
break;
@@ -805,7 +892,7 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
* the relocations are processed.
* Make sure that the offset will fit.
*/
- if ((int32_t)offset != (int64_t)offset)
+ if (r_type != R_X86_64_64 && (int32_t)offset != (int64_t)offset)
die("Relocation offset doesn't fit in 32 bits\n");

if (r_type == R_X86_64_64)
@@ -814,6 +901,10 @@ static int do_reloc64(struct section *sec, Elf_Rel *rel, ElfW(Sym) *sym,
add_reloc(&relocs32, offset);
break;

+ case R_X86_64_GLOB_DAT:
+ add_reloc(&relocs64, offset);
+ break;
+
default:
die("Unsupported relocation type: %s (%d)\n",
rel_type(r_type), r_type);
@@ -1083,6 +1174,7 @@ void process(FILE *fp, int use_real_mode, int as_text,
read_strtabs(fp);
read_symtabs(fp);
read_relocs(fp);
+ read_got(fp);
if (ELF_BITS == 64)
percpu_init();
if (show_absolute_syms) {
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:20

by Thomas Garnier

Subject: [PATCH v1 20/27] x86/ftrace: Adapt function tracing for PIE support

When using -fPIE/PIC with function tracing, the compiler generates a
call through the GOT (call *__fentry__@GOTPCREL). This instruction
takes 6 bytes instead of the 5 bytes of the usual relative call.

If PIE is enabled, replace the 6th byte of the GOT call with a 1-byte nop
so ftrace can handle the previous 5 bytes as before.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/include/asm/ftrace.h | 6 ++++--
arch/x86/include/asm/sections.h | 4 ++++
arch/x86/kernel/ftrace.c | 42 +++++++++++++++++++++++++++++++++++++++--
3 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index eccd0ac6bc38..183990157a5e 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -24,9 +24,11 @@ extern void __fentry__(void);
static inline unsigned long ftrace_call_adjust(unsigned long addr)
{
/*
- * addr is the address of the mcount call instruction.
- * recordmcount does the necessary offset calculation.
+ * addr is the address of the mcount call instruction. PIE always has a
+ * byte added to the start of the function.
*/
+ if (IS_ENABLED(CONFIG_X86_PIE))
+ addr -= 1;
return addr;
}

diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h
index 2f75f30cb2f6..6b2d496cf1aa 100644
--- a/arch/x86/include/asm/sections.h
+++ b/arch/x86/include/asm/sections.h
@@ -11,4 +11,8 @@ extern struct exception_table_entry __stop___ex_table[];
extern char __end_rodata_hpage_align[];
#endif

+#if defined(CONFIG_X86_PIE)
+extern char __start_got[], __end_got[];
+#endif
+
#endif /* _ASM_X86_SECTIONS_H */
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 9bef1bbeba63..a253601e783b 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -101,7 +101,7 @@ static const unsigned char *ftrace_nop_replace(void)

static int
ftrace_modify_code_direct(unsigned long ip, unsigned const char *old_code,
- unsigned const char *new_code)
+ unsigned const char *new_code)
{
unsigned char replaced[MCOUNT_INSN_SIZE];

@@ -134,6 +134,44 @@ ftrace_modify_code_direct(unsigned long ip, unsigned const char *old_code,
return 0;
}

+/* Bytes before call GOT offset */
+const unsigned char got_call_preinsn[] = { 0xff, 0x15 };
+
+static int
+ftrace_modify_initial_code(unsigned long ip, unsigned const char *old_code,
+ unsigned const char *new_code)
+{
+ unsigned char replaced[MCOUNT_INSN_SIZE + 1];
+
+ ftrace_expected = old_code;
+
+ /*
+ * If PIE is not enabled or no GOT call was found, default to the
+ * original approach to code modification.
+ */
+ if (!IS_ENABLED(CONFIG_X86_PIE)
+ || probe_kernel_read(replaced, (void *)ip, sizeof(replaced))
+ || memcmp(replaced, got_call_preinsn, sizeof(got_call_preinsn)))
+ return ftrace_modify_code_direct(ip, old_code, new_code);
+
+ /*
+ * Build a nop slide with a 5-byte nop and 1-byte nop to keep the ftrace
+ * hooking algorithm working with the expected 5 bytes instruction.
+ */
+ memcpy(replaced, new_code, MCOUNT_INSN_SIZE);
+ replaced[MCOUNT_INSN_SIZE] = ideal_nops[1][0];
+
+ ip = text_ip_addr(ip);
+
+ if (probe_kernel_write((void *)ip, replaced, sizeof(replaced)))
+ return -EPERM;
+
+ sync_core();
+
+ return 0;
+
+}
+
int ftrace_make_nop(struct module *mod,
struct dyn_ftrace *rec, unsigned long addr)
{
@@ -152,7 +190,7 @@ int ftrace_make_nop(struct module *mod,
* just modify the code directly.
*/
if (addr == MCOUNT_ADDR)
- return ftrace_modify_code_direct(rec->ip, old, new);
+ return ftrace_modify_initial_code(rec->ip, old, new);

ftrace_expected = NULL;

--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:27

by Thomas Garnier

Subject: [PATCH v1 27/27] x86/kaslr: Add option to extend KASLR range from 1GB to 3GB

Add a new CONFIG_RANDOMIZE_BASE_LARGE option to benefit from PIE
support. It increases the KASLR range from 1GB to 3GB. The new range
starts at 0xffffffff00000000, just above the EFI memory region. This
option is off by default.

The boot code is adapted to create the appropriate page table spanning
three PUD pages.

The relocation table uses 64-bit integers generated by the updated
relocation tool with the --large-reloc option.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/Kconfig | 21 +++++++++++++++++++++
arch/x86/boot/compressed/Makefile | 5 +++++
arch/x86/boot/compressed/misc.c | 10 +++++++++-
arch/x86/include/asm/page_64_types.h | 9 +++++++++
arch/x86/kernel/head64.c | 15 ++++++++++++---
arch/x86/kernel/head_64.S | 11 ++++++++++-
6 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index bbd28a46ab55..54627dd06348 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2155,6 +2155,27 @@ config X86_PIE
select DYNAMIC_MODULE_BASE
select MODULE_REL_CRCS if MODVERSIONS

+config RANDOMIZE_BASE_LARGE
+ bool "Increase the randomization range of the kernel image"
+ depends on X86_64 && RANDOMIZE_BASE
+ select X86_PIE
+ select X86_MODULE_PLTS if MODULES
+ default n
+ ---help---
+ Build the kernel as a Position Independent Executable (PIE) and
+ increase the available randomization range from 1GB to 3GB.
+
+ This option impacts performance on kernel CPU intensive workloads up
+ to 10% due to PIE generated code. Impact on user-mode processes and
+ typical usage would be significantly less (0.50% when you build the
+ kernel).
+
+ The kernel and modules will generate slightly more assembly (1 to 2%
+ increase on the .text sections). The vmlinux binary will be
+ significantly smaller due to less relocations.
+
+ If unsure say N
+
config HOTPLUG_CPU
bool "Support for hot-pluggable CPUs"
depends on SMP
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 8a958274b54c..94dfee5a7cd2 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -112,7 +112,12 @@ $(obj)/vmlinux.bin: vmlinux FORCE

targets += $(patsubst $(obj)/%,%,$(vmlinux-objs-y)) vmlinux.bin.all vmlinux.relocs

+# Large randomization requires a bigger relocation table
+ifeq ($(CONFIG_RANDOMIZE_BASE_LARGE),y)
+CMD_RELOCS = arch/x86/tools/relocs --large-reloc
+else
CMD_RELOCS = arch/x86/tools/relocs
+endif
quiet_cmd_relocs = RELOCS $@
cmd_relocs = $(CMD_RELOCS) $< > $@;$(CMD_RELOCS) --abs-relocs $<
$(obj)/vmlinux.relocs: vmlinux FORCE
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index c14217cd0155..c1ac9f2e283d 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -169,10 +169,18 @@ void __puthex(unsigned long value)
}

#if CONFIG_X86_NEED_RELOCS
+
+/* Large randomization go lower than -2G and use large relocation table */
+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+typedef long rel_t;
+#else
+typedef int rel_t;
+#endif
+
static void handle_relocations(void *output, unsigned long output_len,
unsigned long virt_addr)
{
- int *reloc;
+ rel_t *reloc;
unsigned long delta, map, ptr;
unsigned long min_addr = (unsigned long)output;
unsigned long max_addr = min_addr + (VO___bss_start - VO__text);
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 3f5f08b010d0..6b65f846dd64 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -48,7 +48,11 @@
#define __PAGE_OFFSET __PAGE_OFFSET_BASE
#endif /* CONFIG_RANDOMIZE_MEMORY */

+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+#define __START_KERNEL_map _AC(0xffffffff00000000, UL)
+#else
#define __START_KERNEL_map _AC(0xffffffff80000000, UL)
+#endif /* CONFIG_RANDOMIZE_BASE_LARGE */

/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
#ifdef CONFIG_X86_5LEVEL
@@ -65,9 +69,14 @@
* 512MiB by default, leaving 1.5GiB for modules once the page tables
* are fully set up. If kernel ASLR is configured, it can extend the
* kernel page table mapping, reducing the size of the modules area.
+ * On PIE, we relocate the binary 2G lower so add this extra space.
*/
#if defined(CONFIG_RANDOMIZE_BASE)
+#ifdef CONFIG_RANDOMIZE_BASE_LARGE
+#define KERNEL_IMAGE_SIZE (_AC(3, UL) * 1024 * 1024 * 1024)
+#else
#define KERNEL_IMAGE_SIZE (1024 * 1024 * 1024)
+#endif
#else
#define KERNEL_IMAGE_SIZE (512 * 1024 * 1024)
#endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index b6363f0d11a7..d603d0f5a40a 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -39,6 +39,7 @@ static unsigned int __initdata next_early_pgt;
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);

#define __head __section(.head.text)
+#define pud_count(x) (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)

static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
{
@@ -56,6 +57,8 @@ unsigned long __head notrace __startup_64(unsigned long physaddr,
{
unsigned long load_delta, *p;
unsigned long pgtable_flags;
+ unsigned long level3_kernel_start, level3_kernel_count;
+ unsigned long level3_fixmap_start;
pgdval_t *pgd;
p4dval_t *p4d;
pudval_t *pud;
@@ -83,6 +86,11 @@ unsigned long __head notrace __startup_64(unsigned long physaddr,
/* Include the SME encryption mask in the fixup value */
load_delta += sme_get_me_mask();

+ /* Look at the randomization spread to adapt page table used */
+ level3_kernel_start = pud_index(__START_KERNEL_map);
+ level3_kernel_count = pud_count(KERNEL_IMAGE_SIZE);
+ level3_fixmap_start = level3_kernel_start + level3_kernel_count;
+
/* Fixup the physical addresses in the page table */

pgd = fixup_pointer(&early_top_pgt, physaddr);
@@ -94,8 +102,9 @@ unsigned long __head notrace __startup_64(unsigned long physaddr,
}

pud = fixup_pointer(&level3_kernel_pgt, physaddr);
- pud[510] += load_delta;
- pud[511] += load_delta;
+ for (i = 0; i < level3_kernel_count; i++)
+ pud[level3_kernel_start + i] += load_delta;
+ pud[level3_fixmap_start] += load_delta;

pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
pmd[506] += load_delta;
@@ -150,7 +159,7 @@ unsigned long __head notrace __startup_64(unsigned long physaddr,
*/

pmd = fixup_pointer(level2_kernel_pgt, physaddr);
- for (i = 0; i < PTRS_PER_PMD; i++) {
+ for (i = 0; i < PTRS_PER_PMD * level3_kernel_count; i++) {
if (pmd[i] & _PAGE_PRESENT)
pmd[i] += load_delta;
}
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index df5198e310fc..7918ffefc9c9 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -39,11 +39,15 @@

#define p4d_index(x) (((x) >> P4D_SHIFT) & (PTRS_PER_P4D-1))
#define pud_index(x) (((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+#define pud_count(x) (((x + (PUD_SIZE - 1)) & ~(PUD_SIZE - 1)) >> PUD_SHIFT)

PGD_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE)
PGD_START_KERNEL = pgd_index(__START_KERNEL_map)
L3_START_KERNEL = pud_index(__START_KERNEL_map)

+/* Adapt page table L3 space based on range of randomization */
+L3_KERNEL_ENTRY_COUNT = pud_count(KERNEL_IMAGE_SIZE)
+
.text
__HEAD
.code64
@@ -413,7 +417,12 @@ NEXT_PAGE(level4_kernel_pgt)
NEXT_PAGE(level3_kernel_pgt)
.fill L3_START_KERNEL,8,0
/* (2^48-(2*1024*1024*1024)-((2^39)*511))/(2^30) = 510 */
- .quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC
+ i = 0
+ .rept L3_KERNEL_ENTRY_COUNT
+ .quad level2_kernel_pgt - __START_KERNEL_map + _KERNPG_TABLE_NOENC \
+ + PAGE_SIZE*i
+ i = i + 1
+ .endr
.quad level2_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC

NEXT_PAGE(level2_kernel_pgt)
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:15

by Thomas Garnier

Subject: [PATCH v1 15/27] compiler: Option to default to hidden symbols

Provide an option to default visibility to hidden except for key
symbols. This option is disabled by default and will be used by x86_64
PIE support to remove errors between compilation units.

The default visibility is also enabled for external symbols that are
compared, as they may be equal (start/end of sections). In this case,
older versions of GCC will remove the comparison if the symbols are
hidden. This issue exists at least on gcc 4.9 and before.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/boot/boot.h | 2 +-
arch/x86/include/asm/setup.h | 2 +-
arch/x86/kernel/cpu/microcode/core.c | 4 ++--
drivers/base/firmware_class.c | 4 ++--
include/asm-generic/sections.h | 6 ++++++
include/linux/compiler.h | 8 ++++++++
init/Kconfig | 7 +++++++
kernel/kallsyms.c | 16 ++++++++--------
kernel/trace/trace.h | 4 ++--
lib/dynamic_debug.c | 4 ++--
10 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index ef5a9cc66fb8..d726c35bdd96 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -193,7 +193,7 @@ static inline bool memcmp_gs(const void *s1, addr_t s2, size_t len)
}

/* Heap -- available for dynamic lists. */
-extern char _end[];
+extern char _end[] __default_visibility;
extern char *HEAP;
extern char *heap_end;
#define RESET_HEAP() ((void *)( HEAP = _end ))
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index a65cf544686a..7e0b54f605c6 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -67,7 +67,7 @@ static inline void x86_ce4100_early_setup(void) { }
* This is set up by the setup-routine at boot-time
*/
extern struct boot_params boot_params;
-extern char _text[];
+extern char _text[] __default_visibility;

static inline bool kaslr_enabled(void)
{
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 86e8f0b2537b..8f021783a929 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -144,8 +144,8 @@ static bool __init check_loader_disabled_bsp(void)
return *res;
}

-extern struct builtin_fw __start_builtin_fw[];
-extern struct builtin_fw __end_builtin_fw[];
+extern struct builtin_fw __start_builtin_fw[] __default_visibility;
+extern struct builtin_fw __end_builtin_fw[] __default_visibility;

bool get_builtin_firmware(struct cpio_data *cd, const char *name)
{
diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 4b57cf5bc81d..77d4727f6594 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -45,8 +45,8 @@ MODULE_LICENSE("GPL");

#ifdef CONFIG_FW_LOADER

-extern struct builtin_fw __start_builtin_fw[];
-extern struct builtin_fw __end_builtin_fw[];
+extern struct builtin_fw __start_builtin_fw[] __default_visibility;
+extern struct builtin_fw __end_builtin_fw[] __default_visibility;

static bool fw_get_builtin_firmware(struct firmware *fw, const char *name,
void *buf, size_t size)
diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index e5da44eddd2f..1aa5d6dac9e1 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -30,6 +30,9 @@
* __irqentry_text_start, __irqentry_text_end
* __softirqentry_text_start, __softirqentry_text_end
*/
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility push(default)
+#endif
extern char _text[], _stext[], _etext[];
extern char _data[], _sdata[], _edata[];
extern char __bss_start[], __bss_stop[];
@@ -46,6 +49,9 @@ extern char __softirqentry_text_start[], __softirqentry_text_end[];

/* Start and end of .ctors section - used for constructor calls. */
extern char __ctors_start[], __ctors_end[];
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility pop
+#endif

extern __visible const void __nosave_begin, __nosave_end;

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e95a2631e545..6997716f73bf 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -78,6 +78,14 @@ extern void __chk_io_ptr(const volatile void __iomem *);
#include <linux/compiler-clang.h>
#endif

+/* Useful for Position Independent Code to reduce global references */
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility push(hidden)
+#define __default_visibility __attribute__((visibility ("default")))
+#else
+#define __default_visibility
+#endif
+
/*
* Generic compiler-dependent macros required for kernel
* build go below this comment. Actual compiler/compiler version
diff --git a/init/Kconfig b/init/Kconfig
index ccb1d8daf241..b640201fcff7 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1649,6 +1649,13 @@ config PROFILING
config TRACEPOINTS
bool

+#
+# Default to hidden visibility for all symbols.
+# Useful for Position Independent Code to reduce global references.
+#
+config DEFAULT_HIDDEN
+ bool
+
source "arch/Kconfig"

endmenu # General setup
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 127e7cfafa55..252019c8c3a9 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -32,24 +32,24 @@
* These will be re-linked against their real values
* during the second link stage.
*/
-extern const unsigned long kallsyms_addresses[] __weak;
-extern const int kallsyms_offsets[] __weak;
-extern const u8 kallsyms_names[] __weak;
+extern const unsigned long kallsyms_addresses[] __weak __default_visibility;
+extern const int kallsyms_offsets[] __weak __default_visibility;
+extern const u8 kallsyms_names[] __weak __default_visibility;

/*
* Tell the compiler that the count isn't in the small data section if the arch
* has one (eg: FRV).
*/
extern const unsigned long kallsyms_num_syms
-__attribute__((weak, section(".rodata")));
+__attribute__((weak, section(".rodata"))) __default_visibility;

extern const unsigned long kallsyms_relative_base
-__attribute__((weak, section(".rodata")));
+__attribute__((weak, section(".rodata"))) __default_visibility;

-extern const u8 kallsyms_token_table[] __weak;
-extern const u16 kallsyms_token_index[] __weak;
+extern const u8 kallsyms_token_table[] __weak __default_visibility;
+extern const u16 kallsyms_token_index[] __weak __default_visibility;

-extern const unsigned long kallsyms_markers[] __weak;
+extern const unsigned long kallsyms_markers[] __weak __default_visibility;

static inline int is_kernel_inittext(unsigned long addr)
{
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 652c682707cd..31cb920039a2 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1742,8 +1742,8 @@ extern int trace_event_enable_disable(struct trace_event_file *file,
int enable, int soft_disable);
extern int tracing_alloc_snapshot(void);

-extern const char *__start___trace_bprintk_fmt[];
-extern const char *__stop___trace_bprintk_fmt[];
+extern const char *__start___trace_bprintk_fmt[] __default_visibility;
+extern const char *__stop___trace_bprintk_fmt[] __default_visibility;

extern const char *__start___tracepoint_str[];
extern const char *__stop___tracepoint_str[];
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index da796e2dc4f5..10ed20177354 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -37,8 +37,8 @@
#include <linux/device.h>
#include <linux/netdevice.h>

-extern struct _ddebug __start___verbose[];
-extern struct _ddebug __stop___verbose[];
+extern struct _ddebug __start___verbose[] __default_visibility;
+extern struct _ddebug __stop___verbose[] __default_visibility;

struct ddebug_table {
struct list_head link;
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:22

by Thomas Garnier

Subject: [PATCH v1 22/27] x86/modules: Add option to start module section after kernel

Add an option so the module section is just after the mapped kernel. It
will ensure position independent modules are always at the right
distance from the kernel and do not require -mcmodel=large. It also
optimizes the available size for modules by getting rid of the empty
space in the kernel randomization range.

Signed-off-by: Thomas Garnier <[email protected]>
---
Documentation/x86/x86_64/mm.txt | 3 +++
arch/x86/Kconfig | 4 ++++
arch/x86/include/asm/pgtable_64_types.h | 6 +++++-
arch/x86/kernel/head64.c | 5 ++++-
arch/x86/mm/dump_pagetables.c | 4 ++--
5 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index b0798e281aa6..b51d66386e32 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -73,4 +73,7 @@ Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
physical memory, vmalloc/ioremap space and virtual memory map are randomized.
Their order is preserved but their base will be offset early at boot time.

+If CONFIG_DYNAMIC_MODULE_BASE is enabled, the module section follows the end of
+the mapped kernel.
+
-Andi Kleen, Jul 2004
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 772ff3e0f623..1f2b731c9ec3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2133,6 +2133,10 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING

If unsure, leave at the default value.

+# Module section starts just after the end of the kernel module
+config DYNAMIC_MODULE_BASE
+ bool
+
config X86_GLOBAL_STACKPROTECTOR
bool "Stack cookie using a global variable"
select CC_STACKPROTECTOR
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 06470da156ba..e00fc429b898 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -6,6 +6,7 @@
#ifndef __ASSEMBLY__
#include <linux/types.h>
#include <asm/kaslr.h>
+#include <asm/sections.h>

/*
* These are used to make use of C type-checking..
@@ -18,7 +19,6 @@ typedef unsigned long pgdval_t;
typedef unsigned long pgprotval_t;

typedef struct { pteval_t pte; } pte_t;
-
#endif /* !__ASSEMBLY__ */

#define SHARED_KERNEL_PMD 0
@@ -93,7 +93,11 @@ typedef struct { pteval_t pte; } pte_t;
#define VMEMMAP_START __VMEMMAP_BASE
#endif /* CONFIG_RANDOMIZE_MEMORY */
#define VMALLOC_END (VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
+#ifdef CONFIG_DYNAMIC_MODULE_BASE
+#define MODULES_VADDR ALIGN(((unsigned long)_end + PAGE_SIZE), PMD_SIZE)
+#else
#define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
+#endif
/* The module sections ends with the start of the fixmap */
#define MODULES_END __fix_to_virt(__end_of_fixed_addresses + 1)
#define MODULES_LEN (MODULES_END - MODULES_VADDR)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 675f1dba3b21..b6363f0d11a7 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -321,12 +321,15 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
* Build-time sanity checks on the kernel image and module
* area mappings. (these are purely build-time and produce no code)
*/
+#ifndef CONFIG_DYNAMIC_MODULE_BASE
BUILD_BUG_ON(MODULES_VADDR < __START_KERNEL_map);
BUILD_BUG_ON(MODULES_VADDR - __START_KERNEL_map < KERNEL_IMAGE_SIZE);
- BUILD_BUG_ON(MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
+ BUILD_BUG_ON(!IS_ENABLED(CONFIG_RANDOMIZE_BASE_LARGE) &&
+ MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
BUILD_BUG_ON((__START_KERNEL_map & ~PMD_MASK) != 0);
BUILD_BUG_ON((MODULES_VADDR & ~PMD_MASK) != 0);
BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
+#endif
BUILD_BUG_ON(!(((MODULES_END - 1) & PGDIR_MASK) ==
(__START_KERNEL & PGDIR_MASK)));
BUILD_BUG_ON(__fix_to_virt(__end_of_fixed_addresses) <= MODULES_END);
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 8691a57da63e..8565b2b45848 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -95,7 +95,7 @@ static struct addr_marker address_markers[] = {
{ EFI_VA_END, "EFI Runtime Services" },
# endif
{ __START_KERNEL_map, "High Kernel Mapping" },
- { MODULES_VADDR, "Modules" },
+ { 0/* MODULES_VADDR */, "Modules" },
{ MODULES_END, "End Modules" },
#else
{ PAGE_OFFSET, "Kernel Mapping" },
@@ -529,7 +529,7 @@ static int __init pt_dump_init(void)
# endif
address_markers[FIXADDR_START_NR].start_address = FIXADDR_START;
#endif
-
+ address_markers[MODULES_VADDR_NR].start_address = MODULES_VADDR;
return 0;
}
__initcall(pt_dump_init);
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:10

by Thomas Garnier

Subject: [PATCH v1 10/27] x86/boot/64: Adapt assembly for PIE support

Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.

Early at boot, the kernel is mapped at a temporary address while preparing
the page table. To know the changes needed for the page table with KASLR,
the boot code calculates the difference between the expected address of the
kernel and the one chosen by KASLR. This does not work with PIE because all
symbol references in code are relative: instead of getting the future
relocated virtual address, you get the address in the current temporary
mapping. The solution is to use global variables, which are relocated as
expected.
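
The difference between the two kinds of reference can be modeled with plain
address arithmetic. The sketch below is not kernel code: the struct, the
function names and the base addresses are hypothetical, and it only
illustrates why a RIP-relative reference resolves to the temporary mapping
while a relocated data word resolves to the final address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one kernel image seen at three base addresses. */
struct image_model {
	uint64_t link_base;  /* address the image was linked at */
	uint64_t temp_base;  /* temporary mapping used early at boot */
	uint64_t final_base; /* address chosen by KASLR */
};

/*
 * A RIP-relative reference (e.g. "leaq sym(%rip), %rax") encodes only the
 * distance from the instruction to the symbol, so it resolves against
 * wherever the code is currently running.
 */
static uint64_t rip_relative_address(const struct image_model *m,
				     uint64_t sym_link_addr)
{
	assert(sym_link_addr >= m->link_base);
	return m->temp_base + (sym_link_addr - m->link_base);
}

/*
 * A data word such as ".quad sym" holds the link-time address. The
 * relocation pass adds the KASLR delta to it, so reading the word gives
 * the address the symbol has in the final mapping.
 */
static uint64_t relocated_quad(const struct image_model *m,
			       uint64_t sym_link_addr)
{
	return sym_link_addr + (m->final_base - m->link_base);
}
```

With these assumptions, the same symbol yields two different values early at
boot, which is why the patch loads the offsets from data words instead of
forming them from immediate operands.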

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/kernel/head_64.S | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 42e32c2e51bb..32d1899f48df 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -86,8 +86,21 @@ startup_64:
popq %rsi

/* Form the CR3 value being sure to include the CR3 modifier */
- addq $(early_top_pgt - __START_KERNEL_map), %rax
+ addq _early_top_pgt_offset(%rip), %rax
jmp 1f
+
+ /*
+ * Position Independent Code takes only relative references in code
+ * meaning a global variable address is relative to RIP and not its
+ * future virtual address. Global variables can be used instead as they
+ * are still relocated on the expected kernel mapping address.
+ */
+ .align 8
+_early_top_pgt_offset:
+ .quad early_top_pgt - __START_KERNEL_map
+_init_top_offset:
+ .quad init_top_pgt - __START_KERNEL_map
+
ENTRY(secondary_startup_64)
UNWIND_HINT_EMPTY
/*
@@ -116,7 +129,7 @@ ENTRY(secondary_startup_64)
popq %rsi

/* Form the CR3 value being sure to include the CR3 modifier */
- addq $(init_top_pgt - __START_KERNEL_map), %rax
+ addq _init_top_offset(%rip), %rax
1:

/* Enable PAE mode, PGE and LA57 */
@@ -131,7 +144,7 @@ ENTRY(secondary_startup_64)
movq %rax, %cr3

/* Ensure I am executing from virtual addresses */
- movq $1f, %rax
+ movabs $1f, %rax
jmp *%rax
1:
UNWIND_HINT_EMPTY
@@ -230,11 +243,12 @@ ENTRY(secondary_startup_64)
* REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
* address given in m16:64.
*/
- pushq $.Lafter_lret # put return address on stack for unwinder
+ leaq .Lafter_lret(%rip), %rax
+ pushq %rax # put return address on stack for unwinder
xorq %rbp, %rbp # clear frame pointer
- movq initial_code(%rip), %rax
+ leaq initial_code(%rip), %rax
pushq $__KERNEL_CS # set correct cs
- pushq %rax # target address in negative space
+ pushq (%rax) # target address in negative space
lretq
.Lafter_lret:
END(secondary_startup_64)
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:21

by Thomas Garnier

Subject: [PATCH v1 21/27] x86/mm/dump_pagetables: Fix address markers index on x86_64

The address_markers_idx enum is not aligned with the table when EFI is
enabled. Add an EFI_VA_END_NR entry in this case.
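
The kind of mismatch being fixed can be shown with a reduced example. The
names below (DEMO_HAS_EFI, demo_markers and so on) are hypothetical and are
not the identifiers used in dump_pagetables.c. The sketch keeps an index enum
and a table under the same conditionals and adds a compile-time check so the
two cannot drift apart silently.

```c
#include <assert.h>

/* Hypothetical switch standing in for a kernel config option. */
#define DEMO_HAS_EFI 1

/* Index names: every conditional table entry needs a matching index. */
enum demo_marker_idx {
	DEMO_USER_NR,
	DEMO_KERNEL_NR,
#if DEMO_HAS_EFI
	DEMO_EFI_END_NR,
#endif
	DEMO_MODULES_NR,
	DEMO_MARKER_COUNT
};

struct demo_marker {
	unsigned long start;
	const char *name;
};

/* Table built with the same conditionals as the enum above. */
static struct demo_marker demo_markers[] = {
	{ 0x0000UL, "User Space" },
	{ 0x1000UL, "Kernel Space" },
#if DEMO_HAS_EFI
	{ 0x2000UL, "EFI Runtime Services" },
#endif
	{ 0UL, "Modules" }, /* start address filled in at run time */
};

/* Fails to compile if the enum and the table ever get out of sync. */
_Static_assert(sizeof(demo_markers) / sizeof(demo_markers[0]) ==
	       DEMO_MARKER_COUNT, "enum and table out of sync");

/* Patch the start address of the entry selected by its enum index. */
static void set_marker_start(enum demo_marker_idx idx, unsigned long start)
{
	demo_markers[idx].start = start;
}
```

If the EFI entry were present in the table but missing from the enum, a
run-time update through DEMO_MODULES_NR would land on the wrong row, which is
the class of error this patch addresses.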

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/mm/dump_pagetables.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 5e3ac6fe6c9e..8691a57da63e 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -52,12 +52,15 @@ enum address_markers_idx {
LOW_KERNEL_NR,
VMALLOC_START_NR,
VMEMMAP_START_NR,
-#ifdef CONFIG_KASAN
+# ifdef CONFIG_KASAN
KASAN_SHADOW_START_NR,
KASAN_SHADOW_END_NR,
-#endif
+# endif
# ifdef CONFIG_X86_ESPFIX64
ESPFIX_START_NR,
+# endif
+# ifdef CONFIG_EFI
+ EFI_VA_END_NR,
# endif
HIGH_KERNEL_NR,
MODULES_VADDR_NR,
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:12

by Thomas Garnier

Subject: [PATCH v1 12/27] x86/paravirt: Adapt assembly for PIE support

If PIE is enabled, switch the paravirt assembly constraints to be
compatible. The %c/i constraints generate smaller code, so they are kept by
default.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/include/asm/paravirt_types.h | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 280d94c36dad..e6961f8a74aa 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -335,9 +335,17 @@ extern struct pv_lock_ops pv_lock_ops;
#define PARAVIRT_PATCH(x) \
(offsetof(struct paravirt_patch_template, x) / sizeof(void *))

+#ifdef CONFIG_X86_PIE
+#define paravirt_opptr_call "a"
+#define paravirt_opptr_type "p"
+#else
+#define paravirt_opptr_call "c"
+#define paravirt_opptr_type "i"
+#endif
+
#define paravirt_type(op) \
[paravirt_typenum] "i" (PARAVIRT_PATCH(op)), \
- [paravirt_opptr] "i" (&(op))
+ [paravirt_opptr] paravirt_opptr_type (&(op))
#define paravirt_clobber(clobber) \
[paravirt_clobber] "i" (clobber)

@@ -391,7 +399,7 @@ int paravirt_disable_iospace(void);
* offset into the paravirt_patch_template structure, and can therefore be
* freely converted back into a structure offset.
*/
-#define PARAVIRT_CALL "call *%c[paravirt_opptr];"
+#define PARAVIRT_CALL "call *%" paravirt_opptr_call "[paravirt_opptr];"

/*
* These macros are intended to wrap calls through one of the paravirt
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:09

by Thomas Garnier

Subject: [PATCH v1 09/27] x86/acpi: Adapt assembly for PIE support

Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/kernel/acpi/wakeup_64.S | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 50b8ed0317a3..472659c0f811 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -14,7 +14,7 @@
* Hooray, we are in Long 64-bit mode (but still running in low memory)
*/
ENTRY(wakeup_long64)
- movq saved_magic, %rax
+ movq saved_magic(%rip), %rax
movq $0x123456789abcdef0, %rdx
cmpq %rdx, %rax
jne bogus_64_magic
@@ -25,14 +25,14 @@ ENTRY(wakeup_long64)
movw %ax, %es
movw %ax, %fs
movw %ax, %gs
- movq saved_rsp, %rsp
+ movq saved_rsp(%rip), %rsp

- movq saved_rbx, %rbx
- movq saved_rdi, %rdi
- movq saved_rsi, %rsi
- movq saved_rbp, %rbp
+ movq saved_rbx(%rip), %rbx
+ movq saved_rdi(%rip), %rdi
+ movq saved_rsi(%rip), %rsi
+ movq saved_rbp(%rip), %rbp

- movq saved_rip, %rax
+ movq saved_rip(%rip), %rax
jmp *%rax
ENDPROC(wakeup_long64)

@@ -45,7 +45,7 @@ ENTRY(do_suspend_lowlevel)
xorl %eax, %eax
call save_processor_state

- movq $saved_context, %rax
+ leaq saved_context(%rip), %rax
movq %rsp, pt_regs_sp(%rax)
movq %rbp, pt_regs_bp(%rax)
movq %rsi, pt_regs_si(%rax)
@@ -64,13 +64,14 @@ ENTRY(do_suspend_lowlevel)
pushfq
popq pt_regs_flags(%rax)

- movq $.Lresume_point, saved_rip(%rip)
+ leaq .Lresume_point(%rip), %rax
+ movq %rax, saved_rip(%rip)

- movq %rsp, saved_rsp
- movq %rbp, saved_rbp
- movq %rbx, saved_rbx
- movq %rdi, saved_rdi
- movq %rsi, saved_rsi
+ movq %rsp, saved_rsp(%rip)
+ movq %rbp, saved_rbp(%rip)
+ movq %rbx, saved_rbx(%rip)
+ movq %rdi, saved_rdi(%rip)
+ movq %rsi, saved_rsi(%rip)

addq $8, %rsp
movl $3, %edi
@@ -82,7 +83,7 @@ ENTRY(do_suspend_lowlevel)
.align 4
.Lresume_point:
/* We don't restore %rax, it must be 0 anyway */
- movq $saved_context, %rax
+ leaq saved_context(%rip), %rax
movq saved_context_cr4(%rax), %rbx
movq %rbx, %cr4
movq saved_context_cr3(%rax), %rbx
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:19

by Thomas Garnier

Subject: [PATCH v1 19/27] x86: Support global stack cookie

Add an off-by-default configuration option to use a global stack cookie
instead of the default TLS. This configuration option will only be used
with PIE binaries.

For the kernel stack cookie, the compiler uses mcmodel=kernel to switch
from the fs segment to the gs segment. A PIE binary does not use
mcmodel=kernel because it can be relocated anywhere, therefore the
compiler will default to the fs segment register. This is going to be
fixed with a compiler change allowing the segment register to be picked, as
done on PowerPC. In the meantime, this configuration can be used to
support older compilers.
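
A single global guard implies a set-once initialization, since every task
then shares the same value. The sketch below illustrates that idea in
isolation. The names stack_guard, GUARD_MASK and init_global_guard are
hypothetical, and the mask value is an assumption (low byte cleared), not a
quote of the kernel definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the global guard variable. */
static uint64_t stack_guard;

/* Assumed mask: clear the low byte so the guard contains a zero byte. */
#define GUARD_MASK 0xffffffffffffff00ULL

/*
 * Initialize the global guard once. A guard of zero means "not yet set",
 * so a masked value of zero is replaced by 1 to keep that meaning.
 */
static void init_global_guard(uint64_t random_value)
{
	uint64_t canary = random_value & GUARD_MASK;

	if (stack_guard == 0)
		stack_guard = canary ? canary : 1;
}
```

Calling init_global_guard() a second time leaves the guard unchanged, which
matters because functions already on the stack were entered with the first
value.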

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/Kconfig | 11 +++++++++++
arch/x86/Makefile | 9 +++++++++
arch/x86/entry/entry_32.S | 3 ++-
arch/x86/entry/entry_64.S | 3 ++-
arch/x86/include/asm/processor.h | 3 ++-
arch/x86/include/asm/stackprotector.h | 19 ++++++++++++++-----
arch/x86/kernel/asm-offsets.c | 3 ++-
arch/x86/kernel/asm-offsets_32.c | 3 ++-
arch/x86/kernel/asm-offsets_64.c | 3 ++-
arch/x86/kernel/cpu/common.c | 3 ++-
arch/x86/kernel/head_32.S | 3 ++-
arch/x86/kernel/process.c | 5 +++++
12 files changed, 55 insertions(+), 13 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 063f1e0d51aa..772ff3e0f623 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2133,6 +2133,17 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING

If unsure, leave at the default value.

+config X86_GLOBAL_STACKPROTECTOR
+ bool "Stack cookie using a global variable"
+ select CC_STACKPROTECTOR
+ ---help---
+ This option turns on the "stack-protector" GCC feature using a global
+ variable instead of a segment register. It is useful when the
+ compiler does not support custom segment registers when building a
+ position independent (PIE) binary.
+
+ If unsure, say N
+
config HOTPLUG_CPU
bool "Support for hot-pluggable CPUs"
depends on SMP
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 6276572259c8..de228200ef2a 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -141,6 +141,15 @@ else
KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
endif

+ifdef CONFIG_X86_GLOBAL_STACKPROTECTOR
+ ifeq ($(call cc-option, -mstack-protector-guard=global),)
+ $(error Cannot use CONFIG_X86_GLOBAL_STACKPROTECTOR: \
+ -mstack-protector-guard=global not supported \
+ by compiler)
+ endif
+ KBUILD_CFLAGS += -mstack-protector-guard=global
+endif
+
ifdef CONFIG_X86_X32
x32_ld_ok := $(call try-run,\
/bin/echo -e '1: .quad 1b' | \
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 8a13d468635a..ab3e5056722f 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -237,7 +237,8 @@ ENTRY(__switch_to_asm)
movl %esp, TASK_threadsp(%eax)
movl TASK_threadsp(%edx), %esp

-#ifdef CONFIG_CC_STACKPROTECTOR
+#if defined(CONFIG_CC_STACKPROTECTOR) && \
+ !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
movl TASK_stack_canary(%edx), %ebx
movl %ebx, PER_CPU_VAR(stack_canary)+stack_canary_offset
#endif
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d3a52d2342af..01be62c1b436 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -390,7 +390,8 @@ ENTRY(__switch_to_asm)
movq %rsp, TASK_threadsp(%rdi)
movq TASK_threadsp(%rsi), %rsp

-#ifdef CONFIG_CC_STACKPROTECTOR
+#if defined(CONFIG_CC_STACKPROTECTOR) && \
+ !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
movq TASK_stack_canary(%rsi), %rbx
movq %rbx, PER_CPU_VAR(irq_stack_union + stack_canary_offset)
#endif
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b09bd50b06c7..e3a7ef8d5fb8 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -394,7 +394,8 @@ DECLARE_PER_CPU(char *, irq_stack_ptr);
DECLARE_PER_CPU(unsigned int, irq_count);
extern asmlinkage void ignore_sysret(void);
#else /* X86_64 */
-#ifdef CONFIG_CC_STACKPROTECTOR
+#if defined(CONFIG_CC_STACKPROTECTOR) && \
+ defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
/*
* Make sure stack canary segment base is cached-aligned:
* "For Intel Atom processors, avoid non zero segment base address
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index 8abedf1d650e..66462d778dc5 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -51,6 +51,10 @@
#define GDT_STACK_CANARY_INIT \
[GDT_ENTRY_STACK_CANARY] = GDT_ENTRY_INIT(0x4090, 0, 0x18),

+#ifdef CONFIG_X86_GLOBAL_STACKPROTECTOR
+extern unsigned long __stack_chk_guard;
+#endif
+
/*
* Initialize the stackprotector canary value.
*
@@ -62,7 +66,7 @@ static __always_inline void boot_init_stack_canary(void)
u64 canary;
u64 tsc;

-#ifdef CONFIG_X86_64
+#if defined(CONFIG_X86_64) && !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
BUILD_BUG_ON(offsetof(union irq_stack_union, stack_canary) != 40);
#endif
/*
@@ -76,17 +80,22 @@ static __always_inline void boot_init_stack_canary(void)
canary += tsc + (tsc << 32UL);
canary &= CANARY_MASK;

+#ifdef CONFIG_X86_GLOBAL_STACKPROTECTOR
+ if (__stack_chk_guard == 0)
+ __stack_chk_guard = canary ?: 1;
+#else /* !CONFIG_X86_GLOBAL_STACKPROTECTOR */
current->stack_canary = canary;
#ifdef CONFIG_X86_64
this_cpu_write(irq_stack_union.stack_canary, canary);
-#else
+#else /* CONFIG_X86_32 */
this_cpu_write(stack_canary.canary, canary);
#endif
+#endif
}

static inline void setup_stack_canary_segment(int cpu)
{
-#ifdef CONFIG_X86_32
+#if defined(CONFIG_X86_32) && !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
unsigned long canary = (unsigned long)&per_cpu(stack_canary, cpu);
struct desc_struct *gdt_table = get_cpu_gdt_rw(cpu);
struct desc_struct desc;
@@ -99,7 +108,7 @@ static inline void setup_stack_canary_segment(int cpu)

static inline void load_stack_canary_segment(void)
{
-#ifdef CONFIG_X86_32
+#if defined(CONFIG_X86_32) && !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
asm("mov %0, %%gs" : : "r" (__KERNEL_STACK_CANARY) : "memory");
#endif
}
@@ -115,7 +124,7 @@ static inline void setup_stack_canary_segment(int cpu)

static inline void load_stack_canary_segment(void)
{
-#ifdef CONFIG_X86_32
+#if defined(CONFIG_X86_32) && !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
asm volatile ("mov %0, %%gs" : : "r" (0));
#endif
}
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index de827d6ac8c2..b30a12cd021e 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -30,7 +30,8 @@
void common(void) {
BLANK();
OFFSET(TASK_threadsp, task_struct, thread.sp);
-#ifdef CONFIG_CC_STACKPROTECTOR
+#if defined(CONFIG_CC_STACKPROTECTOR) && \
+ !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
OFFSET(TASK_stack_canary, task_struct, stack_canary);
#endif

diff --git a/arch/x86/kernel/asm-offsets_32.c b/arch/x86/kernel/asm-offsets_32.c
index 710edab9e644..33584e7e486b 100644
--- a/arch/x86/kernel/asm-offsets_32.c
+++ b/arch/x86/kernel/asm-offsets_32.c
@@ -54,7 +54,8 @@ void foo(void)
/* Size of SYSENTER_stack */
DEFINE(SIZEOF_SYSENTER_stack, sizeof(((struct tss_struct *)0)->SYSENTER_stack));

-#ifdef CONFIG_CC_STACKPROTECTOR
+#if defined(CONFIG_CC_STACKPROTECTOR) && \
+ !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
BLANK();
OFFSET(stack_canary_offset, stack_canary, canary);
#endif
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index cf42206926af..06feb31a09f5 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -64,7 +64,8 @@ int main(void)
OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
BLANK();

-#ifdef CONFIG_CC_STACKPROTECTOR
+#if defined(CONFIG_CC_STACKPROTECTOR) && \
+ !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
DEFINE(stack_canary_offset, offsetof(union irq_stack_union, stack_canary));
BLANK();
#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index fac71a3ee0b5..99c8af974874 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1431,7 +1431,8 @@ DEFINE_PER_CPU(unsigned long, cpu_current_top_of_stack) =
(unsigned long)&init_thread_union + THREAD_SIZE;
EXPORT_PER_CPU_SYMBOL(cpu_current_top_of_stack);

-#ifdef CONFIG_CC_STACKPROTECTOR
+#if defined(CONFIG_CC_STACKPROTECTOR) && \
+ !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
DEFINE_PER_CPU_ALIGNED(struct stack_canary, stack_canary);
#endif

diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 9ed3074d0d27..a55a67b33934 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -377,7 +377,8 @@ ENDPROC(startup_32_smp)
*/
__INIT
setup_once:
-#ifdef CONFIG_CC_STACKPROTECTOR
+#if defined(CONFIG_CC_STACKPROTECTOR) && \
+ !defined(CONFIG_X86_GLOBAL_STACKPROTECTOR)
/*
* Configure the stack canary. The linker can't handle this by
* relocation. Manually set base address in stack canary
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index bd6b85fac666..66ea1a35413e 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -73,6 +73,11 @@ EXPORT_PER_CPU_SYMBOL(cpu_tss);
DEFINE_PER_CPU(bool, __tss_limit_invalid);
EXPORT_PER_CPU_SYMBOL_GPL(__tss_limit_invalid);

+#ifdef CONFIG_X86_GLOBAL_STACKPROTECTOR
+unsigned long __stack_chk_guard __read_mostly;
+EXPORT_SYMBOL(__stack_chk_guard);
+#endif
+
/*
* this gets called so that we can store lazy state into memory and copy the
* current task into the new thread.
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:26

by Thomas Garnier

Subject: [PATCH v1 26/27] x86/relocs: Add option to generate 64-bit relocations

The x86 relocation tool generates a list of 32-bit signed integers. There
was no need to use 64-bit integers because all addresses were in the top 2G
of the virtual address space.

This change adds a large-reloc option to generate 64-bit unsigned integers.
It can be used when the kernel plans to go below the top 2G and 32-bit
integers are not enough.
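
The limit can be checked with a few lines of arithmetic. The helpers below
are a hypothetical illustration, not part of the relocs tool: they test
whether an address survives being stored as a 32-bit value and sign-extended
back, and show a byte-by-byte little-endian store of a 64-bit entry.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of an address back to 64 bits. */
static uint64_t sign_extend32(uint32_t lo)
{
	if (lo & 0x80000000u)
		return 0xffffffff00000000ULL | lo;
	return lo;
}

/* Return 1 if a 32-bit signed entry can represent addr, 0 otherwise. */
static int fits_in_reloc32(uint64_t addr)
{
	return sign_extend32((uint32_t)addr) == addr;
}

/* Store a 64-bit entry in little-endian byte order, one byte at a time. */
static void store_le64(uint64_t v, unsigned char *buf)
{
	int i;

	assert(buf);
	for (i = 0; i < 8; i++)
		buf[i] = (unsigned char)(v >> (8 * i));
}
```

An address such as 0xffffffff81000000 lies in the top 2G and round-trips
through 32 bits, while 0xffffffff40000000 lies below -2G and does not, so it
needs a 64-bit entry.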

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/tools/relocs.c | 60 +++++++++++++++++++++++++++++++++---------
arch/x86/tools/relocs.h | 4 +--
arch/x86/tools/relocs_common.c | 15 +++++++----
3 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/arch/x86/tools/relocs.c b/arch/x86/tools/relocs.c
index bc032ad88b22..e7497ea1fe76 100644
--- a/arch/x86/tools/relocs.c
+++ b/arch/x86/tools/relocs.c
@@ -12,8 +12,14 @@

static Elf_Ehdr ehdr;

+#if ELF_BITS == 64
+typedef uint64_t rel_off_t;
+#else
+typedef uint32_t rel_off_t;
+#endif
+
struct relocs {
- uint32_t *offset;
+ rel_off_t *offset;
unsigned long count;
unsigned long size;
};
@@ -684,7 +690,7 @@ static void print_absolute_relocs(void)
printf("\n");
}

-static void add_reloc(struct relocs *r, uint32_t offset)
+static void add_reloc(struct relocs *r, rel_off_t offset)
{
if (r->count == r->size) {
unsigned long newsize = r->size + 50000;
@@ -1058,26 +1064,48 @@ static void sort_relocs(struct relocs *r)
qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
}

-static int write32(uint32_t v, FILE *f)
+static int write32(rel_off_t rel, FILE *f)
{
- unsigned char buf[4];
+ unsigned char buf[sizeof(uint32_t)];
+ uint32_t v = (uint32_t)rel;

put_unaligned_le32(v, buf);
- return fwrite(buf, 1, 4, f) == 4 ? 0 : -1;
+ return fwrite(buf, 1, sizeof(buf), f) == sizeof(buf) ? 0 : -1;
}

-static int write32_as_text(uint32_t v, FILE *f)
+static int write32_as_text(rel_off_t rel, FILE *f)
{
+ uint32_t v = (uint32_t)rel;
return fprintf(f, "\t.long 0x%08"PRIx32"\n", v) > 0 ? 0 : -1;
}

-static void emit_relocs(int as_text, int use_real_mode)
+static int write64(rel_off_t rel, FILE *f)
+{
+ unsigned char buf[sizeof(uint64_t)];
+ uint64_t v = (uint64_t)rel;
+
+ put_unaligned_le64(v, buf);
+ return fwrite(buf, 1, sizeof(buf), f) == sizeof(buf) ? 0 : -1;
+}
+
+static int write64_as_text(rel_off_t rel, FILE *f)
+{
+ uint64_t v = (uint64_t)rel;
+ return fprintf(f, "\t.quad 0x%016"PRIx64"\n", v) > 0 ? 0 : -1;
+}
+
+static void emit_relocs(int as_text, int use_real_mode, int use_large_reloc)
{
int i;
- int (*write_reloc)(uint32_t, FILE *) = write32;
+ int (*write_reloc)(rel_off_t, FILE *);
int (*do_reloc)(struct section *sec, Elf_Rel *rel, Elf_Sym *sym,
const char *symname);

+ if (use_large_reloc)
+ write_reloc = write64;
+ else
+ write_reloc = write32;
+
#if ELF_BITS == 64
if (!use_real_mode)
do_reloc = do_reloc64;
@@ -1088,6 +1116,9 @@ static void emit_relocs(int as_text, int use_real_mode)
do_reloc = do_reloc32;
else
do_reloc = do_reloc_real;
+
+ /* Large relocations only for 64-bit */
+ use_large_reloc = 0;
#endif

/* Collect up the relocations */
@@ -1111,8 +1142,13 @@ static void emit_relocs(int as_text, int use_real_mode)
* gas will like.
*/
printf(".section \".data.reloc\",\"a\"\n");
- printf(".balign 4\n");
- write_reloc = write32_as_text;
+ if (use_large_reloc) {
+ printf(".balign 8\n");
+ write_reloc = write64_as_text;
+ } else {
+ printf(".balign 4\n");
+ write_reloc = write32_as_text;
+ }
}

if (use_real_mode) {
@@ -1180,7 +1216,7 @@ static void print_reloc_info(void)

void process(FILE *fp, int use_real_mode, int as_text,
int show_absolute_syms, int show_absolute_relocs,
- int show_reloc_info)
+ int show_reloc_info, int use_large_reloc)
{
regex_init(use_real_mode);
read_ehdr(fp);
@@ -1203,5 +1239,5 @@ void process(FILE *fp, int use_real_mode, int as_text,
print_reloc_info();
return;
}
- emit_relocs(as_text, use_real_mode);
+ emit_relocs(as_text, use_real_mode, use_large_reloc);
}
diff --git a/arch/x86/tools/relocs.h b/arch/x86/tools/relocs.h
index 1d23bf953a4a..cb771cc4412d 100644
--- a/arch/x86/tools/relocs.h
+++ b/arch/x86/tools/relocs.h
@@ -30,8 +30,8 @@ enum symtype {

void process_32(FILE *fp, int use_real_mode, int as_text,
int show_absolute_syms, int show_absolute_relocs,
- int show_reloc_info);
+ int show_reloc_info, int use_large_reloc);
void process_64(FILE *fp, int use_real_mode, int as_text,
int show_absolute_syms, int show_absolute_relocs,
- int show_reloc_info);
+ int show_reloc_info, int use_large_reloc);
#endif /* RELOCS_H */
diff --git a/arch/x86/tools/relocs_common.c b/arch/x86/tools/relocs_common.c
index acab636bcb34..9cf1391af50a 100644
--- a/arch/x86/tools/relocs_common.c
+++ b/arch/x86/tools/relocs_common.c
@@ -11,14 +11,14 @@ void die(char *fmt, ...)

static void usage(void)
{
- die("relocs [--abs-syms|--abs-relocs|--reloc-info|--text|--realmode]" \
- " vmlinux\n");
+ die("relocs [--abs-syms|--abs-relocs|--reloc-info|--text|--realmode|" \
+ "--large-reloc] vmlinux\n");
}

int main(int argc, char **argv)
{
int show_absolute_syms, show_absolute_relocs, show_reloc_info;
- int as_text, use_real_mode;
+ int as_text, use_real_mode, use_large_reloc;
const char *fname;
FILE *fp;
int i;
@@ -29,6 +29,7 @@ int main(int argc, char **argv)
show_reloc_info = 0;
as_text = 0;
use_real_mode = 0;
+ use_large_reloc = 0;
fname = NULL;
for (i = 1; i < argc; i++) {
char *arg = argv[i];
@@ -53,6 +54,10 @@ int main(int argc, char **argv)
use_real_mode = 1;
continue;
}
+ if (strcmp(arg, "--large-reloc") == 0) {
+ use_large_reloc = 1;
+ continue;
+ }
}
else if (!fname) {
fname = arg;
@@ -74,11 +79,11 @@ int main(int argc, char **argv)
if (e_ident[EI_CLASS] == ELFCLASS64)
process_64(fp, use_real_mode, as_text,
show_absolute_syms, show_absolute_relocs,
- show_reloc_info);
+ show_reloc_info, use_large_reloc);
else
process_32(fp, use_real_mode, as_text,
show_absolute_syms, show_absolute_relocs,
- show_reloc_info);
+ show_reloc_info, use_large_reloc);
fclose(fp);
return 0;
}
--
2.15.0.rc0.271.g36b669edcc-goog



2017-10-11 20:30:14

by Thomas Garnier

Subject: [PATCH v1 14/27] x86/percpu: Adapt percpu for PIE support

Percpu uses a clever design where the .percpu ELF section has a virtual
address of zero and the relocation code avoids relocating specific
symbols. It makes the code simple and easily adaptable with or without
SMP support.

This design is incompatible with PIE because generated code always tries to
access the zero virtual address relative to the default mapping address.
This becomes impossible when KASLR is configured to go below -2G. This
patch solves the problem by removing the zero mapping and adapting the GS
base to be relative to the expected address. These changes are done only
when PIE is enabled. The original implementation is kept as-is
by default.
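
The GS base adjustment can be followed with 64-bit modular arithmetic. The
functions below are a hypothetical model with made-up names and addresses,
not kernel code. They compare the zero-based layout with a layout where
per-cpu symbols keep ordinary kernel addresses and the GS base is biased by
the start of the per-cpu section.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Original layout: per-cpu symbols are linked at offsets from zero, so
 * the GS base is simply the start of this CPU's per-cpu area.
 */
static uint64_t gs_base_zero_based(uint64_t cpu_area)
{
	return cpu_area;
}

/*
 * PIE layout: per-cpu symbols keep ordinary kernel addresses, so the GS
 * base is biased by the link-time start of the per-cpu section.
 */
static uint64_t gs_base_pie(uint64_t cpu_area, uint64_t percpu_start)
{
	return cpu_area - percpu_start;
}

/* Address reached by a %gs-prefixed access; wraps modulo 2^64. */
static uint64_t gs_access(uint64_t gs_base, uint64_t sym_addr)
{
	return gs_base + sym_addr;
}
```

In both layouts the access lands at the same place inside the CPU's per-cpu
area; only the split between the GS base and the symbol address changes.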

The assembly and PER_CPU macros are changed to use relative references
when PIE is enabled.

The KALLSYMS_ABSOLUTE_PERCPU configuration is disabled with PIE given
percpu symbols are not absolute in this case.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/entry/entry_64.S | 4 ++--
arch/x86/include/asm/percpu.h | 25 +++++++++++++++++++------
arch/x86/kernel/cpu/common.c | 4 +++-
arch/x86/kernel/head_64.S | 4 ++++
arch/x86/kernel/setup_percpu.c | 2 +-
arch/x86/kernel/vmlinux.lds.S | 13 +++++++++++--
arch/x86/lib/cmpxchg16b_emu.S | 8 ++++----
arch/x86/xen/xen-asm.S | 12 ++++++------
init/Kconfig | 2 +-
9 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 15bd5530d2ae..d3a52d2342af 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -392,7 +392,7 @@ ENTRY(__switch_to_asm)

#ifdef CONFIG_CC_STACKPROTECTOR
movq TASK_stack_canary(%rsi), %rbx
- movq %rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset
+ movq %rbx, PER_CPU_VAR(irq_stack_union + stack_canary_offset)
#endif

/* restore callee-saved registers */
@@ -808,7 +808,7 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
/*
* Exception entry points.
*/
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
+#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss + (TSS_ist + ((x) - 1) * 8))

.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
ENTRY(\sym)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index b21a475fd7ed..07250f1099b5 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -4,9 +4,11 @@
#ifdef CONFIG_X86_64
#define __percpu_seg gs
#define __percpu_mov_op movq
+#define __percpu_rel (%rip)
#else
#define __percpu_seg fs
#define __percpu_mov_op movl
+#define __percpu_rel
#endif

#ifdef __ASSEMBLY__
@@ -27,10 +29,14 @@
#define PER_CPU(var, reg) \
__percpu_mov_op %__percpu_seg:this_cpu_off, reg; \
lea var(reg), reg
-#define PER_CPU_VAR(var) %__percpu_seg:var
+/* Compatible with Position Independent Code */
+#define PER_CPU_VAR(var) %__percpu_seg:(var)##__percpu_rel
+/* Rare absolute reference */
+#define PER_CPU_VAR_ABS(var) %__percpu_seg:var
#else /* ! SMP */
#define PER_CPU(var, reg) __percpu_mov_op $var, reg
-#define PER_CPU_VAR(var) var
+#define PER_CPU_VAR(var) (var)##__percpu_rel
+#define PER_CPU_VAR_ABS(var) var
#endif /* SMP */

#ifdef CONFIG_X86_64_SMP
@@ -208,27 +214,34 @@ do { \
pfo_ret__; \
})

+/* Position Independent code uses relative addresses only */
+#ifdef CONFIG_X86_PIE
+#define __percpu_stable_arg __percpu_arg(a1)
+#else
+#define __percpu_stable_arg __percpu_arg(P1)
+#endif
+
#define percpu_stable_op(op, var) \
({ \
typeof(var) pfo_ret__; \
switch (sizeof(var)) { \
case 1: \
- asm(op "b "__percpu_arg(P1)",%0" \
+ asm(op "b "__percpu_stable_arg ",%0" \
: "=q" (pfo_ret__) \
: "p" (&(var))); \
break; \
case 2: \
- asm(op "w "__percpu_arg(P1)",%0" \
+ asm(op "w "__percpu_stable_arg ",%0" \
: "=r" (pfo_ret__) \
: "p" (&(var))); \
break; \
case 4: \
- asm(op "l "__percpu_arg(P1)",%0" \
+ asm(op "l "__percpu_stable_arg ",%0" \
: "=r" (pfo_ret__) \
: "p" (&(var))); \
break; \
case 8: \
- asm(op "q "__percpu_arg(P1)",%0" \
+ asm(op "q "__percpu_stable_arg ",%0" \
: "=r" (pfo_ret__) \
: "p" (&(var))); \
break; \
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 03f9a1a8a314..fac71a3ee0b5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -461,7 +461,9 @@ void load_percpu_segment(int cpu)
loadsegment(fs, __KERNEL_PERCPU);
#else
__loadsegment_simple(gs, 0);
- wrmsrl(MSR_GS_BASE, (unsigned long)per_cpu(irq_stack_union.gs_base, cpu));
+ wrmsrl(MSR_GS_BASE,
+ (unsigned long)per_cpu(irq_stack_union.gs_base, cpu) -
+ (unsigned long)__per_cpu_start);
#endif
load_stack_canary_segment();
}
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 32d1899f48df..df5198e310fc 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -274,7 +274,11 @@ ENDPROC(start_cpu0)
GLOBAL(initial_code)
.quad x86_64_start_kernel
GLOBAL(initial_gs)
+#ifdef CONFIG_X86_PIE
+ .quad 0
+#else
.quad INIT_PER_CPU_VAR(irq_stack_union)
+#endif
GLOBAL(initial_stack)
/*
* The SIZEOF_PTREGS gap is a convention which helps the in-kernel
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index 28dafed6c682..271829a1cc38 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -25,7 +25,7 @@
DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number);
EXPORT_PER_CPU_SYMBOL(cpu_number);

-#ifdef CONFIG_X86_64
+#if defined(CONFIG_X86_64) && !defined(CONFIG_X86_PIE)
#define BOOT_PERCPU_OFFSET ((unsigned long)__per_cpu_load)
#else
#define BOOT_PERCPU_OFFSET 0
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index f05f00acac89..48268d059ebe 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -186,9 +186,14 @@ SECTIONS
/*
* percpu offsets are zero-based on SMP. PERCPU_VADDR() changes the
* output PHDR, so the next output section - .init.text - should
- * start another segment - init.
+ * start another segment - init. For Position Independent Code, the
+ * per-cpu section cannot be zero-based because everything is relative.
*/
+#ifdef CONFIG_X86_PIE
+ PERCPU_SECTION(INTERNODE_CACHE_BYTES)
+#else
PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu)
+#endif
ASSERT(SIZEOF(.data..percpu) < CONFIG_PHYSICAL_START,
"per-CPU data too large - increase CONFIG_PHYSICAL_START")
#endif
@@ -364,7 +369,11 @@ SECTIONS
* Per-cpu symbols which need to be offset from __per_cpu_load
* for the boot processor.
*/
+#ifdef CONFIG_X86_PIE
+#define INIT_PER_CPU(x) init_per_cpu__##x = x
+#else
#define INIT_PER_CPU(x) init_per_cpu__##x = x + __per_cpu_load
+#endif
INIT_PER_CPU(gdt_page);
INIT_PER_CPU(irq_stack_union);

@@ -374,7 +383,7 @@ INIT_PER_CPU(irq_stack_union);
. = ASSERT((_end - _text <= KERNEL_IMAGE_SIZE),
"kernel image bigger than KERNEL_IMAGE_SIZE");

-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) && !defined(CONFIG_X86_PIE)
. = ASSERT((irq_stack_union == 0),
"irq_stack_union is not at start of per-cpu area");
#endif
diff --git a/arch/x86/lib/cmpxchg16b_emu.S b/arch/x86/lib/cmpxchg16b_emu.S
index 9b330242e740..254950604ae4 100644
--- a/arch/x86/lib/cmpxchg16b_emu.S
+++ b/arch/x86/lib/cmpxchg16b_emu.S
@@ -33,13 +33,13 @@ ENTRY(this_cpu_cmpxchg16b_emu)
pushfq
cli

- cmpq PER_CPU_VAR((%rsi)), %rax
+ cmpq PER_CPU_VAR_ABS((%rsi)), %rax
jne .Lnot_same
- cmpq PER_CPU_VAR(8(%rsi)), %rdx
+ cmpq PER_CPU_VAR_ABS(8(%rsi)), %rdx
jne .Lnot_same

- movq %rbx, PER_CPU_VAR((%rsi))
- movq %rcx, PER_CPU_VAR(8(%rsi))
+ movq %rbx, PER_CPU_VAR_ABS((%rsi))
+ movq %rcx, PER_CPU_VAR_ABS(8(%rsi))

popfq
mov $1, %al
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index dcd31fa39b5d..495d7f42f254 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -20,7 +20,7 @@
ENTRY(xen_irq_enable_direct)
FRAME_BEGIN
/* Unmask events */
- movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ movb $0, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)

/*
* Preempt here doesn't matter because that will deal with any
@@ -29,7 +29,7 @@ ENTRY(xen_irq_enable_direct)
*/

/* Test for pending */
- testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending
+ testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending)
jz 1f

call check_events
@@ -44,7 +44,7 @@ ENTRY(xen_irq_enable_direct)
* non-zero.
*/
ENTRY(xen_irq_disable_direct)
- movb $1, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ movb $1, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
ret
ENDPROC(xen_irq_disable_direct)

@@ -58,7 +58,7 @@ ENDPROC(xen_irq_disable_direct)
* x86 use opposite senses (mask vs enable).
*/
ENTRY(xen_save_fl_direct)
- testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
setz %ah
addb %ah, %ah
ret
@@ -79,7 +79,7 @@ ENTRY(xen_restore_fl_direct)
#else
testb $X86_EFLAGS_IF>>8, %ah
#endif
- setz PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ setz PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
/*
* Preempt here doesn't matter because that will deal with any
* pending interrupts. The pending check may end up being run
@@ -87,7 +87,7 @@ ENTRY(xen_restore_fl_direct)
*/

/* check for unmasked and pending */
- cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending
+ cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending)
jnz 1f
call check_events
1:
diff --git a/init/Kconfig b/init/Kconfig
index 78cb2461012e..ccb1d8daf241 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1201,7 +1201,7 @@ config KALLSYMS_ALL
config KALLSYMS_ABSOLUTE_PERCPU
bool
depends on KALLSYMS
- default X86_64 && SMP
+ default X86_64 && SMP && !X86_PIE

config KALLSYMS_BASE_RELATIVE
bool
--
2.15.0.rc0.271.g36b669edcc-goog
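
A note on the operand changes in the hunks above: the xen-asm.S edits move the
field offset inside the PER_CPU_VAR() argument, and cmpxchg16b_emu.S switches
to PER_CPU_VAR_ABS() for register-based operands. A plausible reason, assuming
that PER_CPU_VAR() appends a %rip-relative suffix when CONFIG_X86_PIE is set
(its definition lives in percpu.h and is not shown in this part of the thread),
is that anything written after the macro call would end up after that suffix.
The sketch below models the macros with C string concatenation; the DEMO_ names
are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdio.h>

/*
 * Hypothetical models of the assembler macros, built from string literals.
 * ASSUMPTION: under PIE the per-cpu operand gets a "(%rip)" suffix, while
 * the _ABS variant leaves the operand untouched.
 */
#define DEMO_PER_CPU_VAR(var)     "%gs:" var "(%rip)"
#define DEMO_PER_CPU_VAR_ABS(var) "%gs:" var

/* Old spelling: the offset is added after the macro, so it trails the suffix. */
const char *offset_outside(void)
{
    return DEMO_PER_CPU_VAR("xen_vcpu_info") " + XEN_vcpu_info_mask";
}

/* New spelling: the offset is part of the displacement, before the suffix. */
const char *offset_inside(void)
{
    return DEMO_PER_CPU_VAR("xen_vcpu_info + XEN_vcpu_info_mask");
}

/* Register-based operand: there is no symbol, so no suffix is wanted. */
const char *register_operand(void)
{
    return DEMO_PER_CPU_VAR_ABS("8(%rsi)");
}

/* Print the three operand spellings side by side. */
void print_operands(void)
{
    printf("outside:  %s\n", offset_outside());
    printf("inside:   %s\n", offset_inside());
    printf("register: %s\n", register_operand());
}
```

Calling print_operands() shows that only the "inside" spelling keeps symbol and
offset together ahead of the suffix, which matches the direction of the
xen-asm.S changes.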


_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xen.org/xen-devel

2017-10-11 20:30:18

by Thomas Garnier

Subject: [PATCH v1 18/27] kvm: Adapt assembly for PIE support

Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible. The new __ASM_GET_PTR_PRE macro is used to
get the address of a symbol on both 32 and 64-bit with PIE support.

Position Independent Executable (PIE) support will allow extending the
KASLR randomization range below the -2G memory limit.

Signed-off-by: Thomas Garnier <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 6 ++++--
arch/x86/kernel/kvm.c | 6 ++++--
arch/x86/kvm/svm.c | 4 ++--
3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9d7d856b2d89..14073fda75fb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1342,9 +1342,11 @@ asmlinkage void kvm_spurious_fault(void);
".pushsection .fixup, \"ax\" \n" \
"667: \n\t" \
cleanup_insn "\n\t" \
- "cmpb $0, kvm_rebooting \n\t" \
+ "cmpb $0, kvm_rebooting" __ASM_SEL(,(%%rip)) " \n\t" \
"jne 668b \n\t" \
- __ASM_SIZE(push) " $666b \n\t" \
+ __ASM_SIZE(push) "%%" _ASM_AX " \n\t" \
+ __ASM_GET_PTR_PRE(666b) "%%" _ASM_AX "\n\t" \
+ "xchg %%" _ASM_AX ", (%%" _ASM_SP ") \n\t" \
"call kvm_spurious_fault \n\t" \
".popsection \n\t" \
_ASM_EXTABLE(666b, 667b)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 8bb9594d0761..4464c3667831 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -627,8 +627,10 @@ asm(
".global __raw_callee_save___kvm_vcpu_is_preempted;"
".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
"__raw_callee_save___kvm_vcpu_is_preempted:"
-"movq __per_cpu_offset(,%rdi,8), %rax;"
-"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
+"leaq __per_cpu_offset(%rip), %rax;"
+"movq (%rax,%rdi,8), %rax;"
+"addq " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rax;"
+"cmpb $0, (%rax);"
"setne %al;"
"ret;"
".popsection");
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0e68f0b3cbf7..364536080438 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -568,12 +568,12 @@ static u32 svm_msrpm_offset(u32 msr)

static inline void clgi(void)
{
- asm volatile (__ex(SVM_CLGI));
+ asm volatile (__ex(SVM_CLGI) : :);
}

static inline void stgi(void)
{
- asm volatile (__ex(SVM_STGI));
+ asm volatile (__ex(SVM_STGI) : :);
}

static inline void invlpga(unsigned long addr, u32 asid)
--
2.15.0.rc0.271.g36b669edcc-goog
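
A note on the kvm_host.h hunk above: it replaces "push $666b", whose immediate
is an absolute 32-bit address, with a three-instruction sequence. Assuming
__ASM_GET_PTR_PRE(666b) expands to a %rip-relative lea into the named register
(the macro comes from an earlier patch and is not shown here), the sequence
leaves the label address on the stack and restores the scratch register. A
minimal C model of that data flow, with a hypothetical struct standing in for
the CPU state:

```c
#include <stdint.h>

#define DEMO_STACK_SLOTS 8

/* Hypothetical model: one scratch register and a small downward-growing stack. */
struct demo_cpu {
    uint64_t rax;
    uint64_t stack[DEMO_STACK_SLOTS];
    int sp;     /* index of the top slot; DEMO_STACK_SLOTS when the stack is empty */
};

/* Models "push %rax": reserve a slot and save the scratch register in it. */
void demo_push_rax(struct demo_cpu *c)
{
    c->sp--;
    c->stack[c->sp] = c->rax;
}

/* Models "lea label(%rip), %rax": the address is formed without an absolute immediate. */
void demo_lea_label(struct demo_cpu *c, uint64_t label_addr)
{
    c->rax = label_addr;
}

/* Models "xchg %rax, (%rsp)": swap the register with the top stack slot. */
void demo_xchg_rax_top(struct demo_cpu *c)
{
    uint64_t tmp = c->stack[c->sp];
    c->stack[c->sp] = c->rax;
    c->rax = tmp;
}

/*
 * The full replacement sequence. Afterwards the top of the stack holds
 * label_addr and rax holds its original value, which is the same visible
 * result as a direct push of the address. The caller must leave at least
 * one free stack slot.
 */
void push_label_address(struct demo_cpu *c, uint64_t label_addr)
{
    demo_push_rax(c);
    demo_lea_label(c, label_addr);
    demo_xchg_rax_top(c);
}
```

The extra push and xchg are the cost of not clobbering a register in a fixup
path that has none to spare.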



2017-10-11 21:34:21

by Tom Lendacky

Subject: Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization

On 10/11/2017 3:30 PM, Thomas Garnier wrote:
> Changes:
> - patch v1:
> - Simplify ftrace implementation.
> - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
> - rfc v3:
> - Use --emit-relocs instead of -pie to reduce dynamic relocation space on
> mapped memory. It also simplifies the relocation process.
> - Move the start of the module section next to the kernel. Remove the need for
> -mcmodel=large on modules. Extends module space from 1 to 2G maximum.
> - Support for XEN PVH as 32-bit relocations can be ignored with
> --emit-relocs.
> - Support for GOT relocations previously done automatically with -pie.
> - Remove need for dynamic PLT in modules.
> - Support dynamic GOT for modules.
> - rfc v2:
> - Add support for global stack cookie while the compiler defaults to fs without
> mcmodel=kernel
> - Change patch 7 to correctly jump out of the identity mapping on kexec load
> preserve.
>
> These patches make the changes necessary to build the kernel as Position
> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
> the top 2G of the virtual address space. This optionally allows extending the
> KASLR randomization range from 1G to 3G.
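
As a rough illustration of what the 1G to 3G change means for randomization,
the number of candidate base addresses can be estimated by dividing the range
by the kernel alignment. The sketch assumes 2 MiB alignment (a common
CONFIG_PHYSICAL_ALIGN value on x86_64) and ignores the size of the image
itself, so the real counts are somewhat lower. The names are hypothetical.

```c
#include <stdint.h>

#define DEMO_GIB (1ULL << 30)
#define DEMO_MIB (1ULL << 20)

/* ASSUMPTION: slots are spaced at a fixed alignment and the image size is ignored. */
uint64_t kaslr_slot_estimate(uint64_t range_bytes, uint64_t align_bytes)
{
    return range_bytes / align_bytes;
}

/* Estimate for the current 1G randomization range. */
uint64_t slots_1g(void)
{
    return kaslr_slot_estimate(1 * DEMO_GIB, 2 * DEMO_MIB);
}

/* Estimate for the extended 3G randomization range. */
uint64_t slots_3g(void)
{
    return kaslr_slot_estimate(3 * DEMO_GIB, 2 * DEMO_MIB);
}
```

Under these assumptions the estimate goes from 512 to 1536 candidate slots.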

Hi Thomas,

I've applied your patches so that I can verify that SME works with PIE.
Unfortunately, I'm running into build warnings and errors when I enable
PIE.

With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:

drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup

Disabling CONFIG_STACK_VALIDATION suppresses those.

But near the end of the build, I receive errors like this:

arch/x86/kernel/setup.o: In function `dump_kernel_offset':
.../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
.
. about 10 more of the above type messages
.
make: *** [vmlinux] Error 1
Error building kernel, exiting

Are there any config options that should or should not be enabled when
building with PIE enabled? Is there a compiler requirement for PIE (I'm
using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?

Thanks,
Tom
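
For background on the link error quoted above: R_X86_64_32S is the relocation
for a 32-bit field that is sign-extended to 64 bits, so the linker reports
"relocation truncated to fit" when the symbol's address does not survive that
round trip. Only addresses in the lowest 2G or the top 2G of the 64-bit space
qualify. The check below is a sketch of that rule, not a diagnosis of this
particular configuration; the function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when a 64-bit address can be stored in a 32-bit field and recovered
 * by sign extension, which is the condition an R_X86_64_32S relocation needs.
 */
bool fits_sign_extended_32(uint64_t addr)
{
    /* Lowest 2G: bit 31 and all upper bits are zero. */
    if (addr <= 0x7fffffffULL)
        return true;
    /* Top 2G: bit 31 and all upper bits are one. */
    if (addr >= 0xffffffff80000000ULL)
        return true;
    return false;
}
```

For example, 0xffffffff81000000 (a typical kernel text address in the top 2G)
passes the check, while an address just below 0xffffffff80000000 does not,
which is the region a PIE kernel would need to reach.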



2017-10-12 15:34:13

by Thomas Garnier

Subject: Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization

On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <[email protected]> wrote:
> Hi Thomas,
>
> I've applied your patches so that I can verify that SME works with PIE.
> Unfortunately, I'm running into build warnings and errors when I enable
> PIE.
>
> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>
> drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>
> Disabling CONFIG_STACK_VALIDATION suppresses those.

I ran into that, I plan to fix it in the next iteration.

>
> But near the end of the build, I receive errors like this:
>
> arch/x86/kernel/setup.o: In function `dump_kernel_offset':
> .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
> .
> . about 10 more of the above type messages
> .
> make: *** [vmlinux] Error 1
> Error building kernel, exiting
>
> Are there any config options that should or should not be enabled when
> building with PIE enabled? Is there a compiler requirement for PIE (I'm
> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?

I never ran into these ones and I tested compilers older and newer.
What was your exact configuration?




--
Thomas

2017-10-12 16:28:15

by Tom Lendacky

Subject: Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization

On 10/12/2017 10:34 AM, Thomas Garnier wrote:
> On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <[email protected]> wrote:
>> Hi Thomas,
>>
>> I've applied your patches so that I can verify that SME works with PIE.
>> Unfortunately, I'm running into build warnings and errors when I enable
>> PIE.
>>
>> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>>
>> drivers/scsi/libfc/fc_exch.o: warning: objtool: fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>>
>> Disabling CONFIG_STACK_VALIDATION suppresses those.
>
> I ran into that, I plan to fix it in the next iteration.
>
>>
>> But near the end of the build, I receive errors like this:
>>
>> arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>> .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `_text' defined in .text section in .tmp_vmlinux1
>> .
>> . about 10 more of the above type messages
>> .
>> make: *** [vmlinux] Error 1
>> Error building kernel, exiting
>>
>> Are there any config options that should or should not be enabled when
>> building with PIE enabled? Is there a compiler requirement for PIE (I'm
>> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?
>
> I never ran into these ones and I tested compilers older and newer.
> What was your exact configuration?

I'll send you the config in a separate email.

Thanks,
Tom



2017-10-12 20:02:01

by Luis Chamberlain

Subject: Re: [PATCH v1 15/27] compiler: Option to default to hidden symbols

On Wed, Oct 11, 2017 at 01:30:15PM -0700, Thomas Garnier wrote:
> Provide an option to default visibility to hidden except for key
> symbols. This option is disabled by default and will be used by x86_64
> PIE support to remove errors between compilation units.
>
> The default visibility is also enabled for external symbols that are
> compared as they may be equal (start/end of sections). In this case,
> older versions of GCC will remove the comparison if the symbols are
> hidden. This issue exists at least on gcc 4.9 and before.
>
> Signed-off-by: Thomas Garnier <[email protected]>

<-- snip -->

> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 86e8f0b2537b..8f021783a929 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -144,8 +144,8 @@ static bool __init check_loader_disabled_bsp(void)
> return *res;
> }
>
> -extern struct builtin_fw __start_builtin_fw[];
> -extern struct builtin_fw __end_builtin_fw[];
> +extern struct builtin_fw __start_builtin_fw[] __default_visibility;
> +extern struct builtin_fw __end_builtin_fw[] __default_visibility;
>
> bool get_builtin_firmware(struct cpio_data *cd, const char *name)
> {

<-- snip -->

> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
> index e5da44eddd2f..1aa5d6dac9e1 100644
> --- a/include/asm-generic/sections.h
> +++ b/include/asm-generic/sections.h
> @@ -30,6 +30,9 @@
> * __irqentry_text_start, __irqentry_text_end
> * __softirqentry_text_start, __softirqentry_text_end
> */
> +#ifdef CONFIG_DEFAULT_HIDDEN
> +#pragma GCC visibility push(default)
> +#endif
> extern char _text[], _stext[], _etext[];
> extern char _data[], _sdata[], _edata[];
> extern char __bss_start[], __bss_stop[];
> @@ -46,6 +49,9 @@ extern char __softirqentry_text_start[], __softirqentry_text_end[];
>
> /* Start and end of .ctors section - used for constructor calls. */
> extern char __ctors_start[], __ctors_end[];
> +#ifdef CONFIG_DEFAULT_HIDDEN
> +#pragma GCC visibility pop
> +#endif
>
> extern __visible const void __nosave_begin, __nosave_end;
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index e95a2631e545..6997716f73bf 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -78,6 +78,14 @@ extern void __chk_io_ptr(const volatile void __iomem *);
> #include <linux/compiler-clang.h>
> #endif
>
> +/* Useful for Position Independent Code to reduce global references */
> +#ifdef CONFIG_DEFAULT_HIDDEN
> +#pragma GCC visibility push(hidden)
> +#define __default_visibility __attribute__((visibility ("default")))

Does this still work with CONFIG_LD_DEAD_CODE_DATA_ELIMINATION ?

> +#else
> +#define __default_visibility
> +#endif
> +
> /*
> * Generic compiler-dependent macros required for kernel
> * build go below this comment. Actual compiler/compiler version
> diff --git a/init/Kconfig b/init/Kconfig
> index ccb1d8daf241..b640201fcff7 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1649,6 +1649,13 @@ config PROFILING
> config TRACEPOINTS
> bool
>
> +#
> +# Default to hidden visibility for all symbols.
> +# Useful for Position Independent Code to reduce global references.
> +#
> +config DEFAULT_HIDDEN
> + bool

Note it is default.

Has 0-day run through this git tree? It should be easy to get it added for
testing. Also, even though most changes are x86 based, there are some generic
changes, and I'd love a warm fuzzy that this won't break odd / random builds.
Although 0-day does cover a lot of test cases, it only has limited run time
tests. There are some other test beds which also cover some more obscure
architectures. Having a test pass on Guenter's test bed would be nice to
see. For that, please coordinate with Guenter to see if he's willing to run
this as a test for you.

Luis

_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xen.org/xen-devel

2017-10-18 23:15:10

by Thomas Garnier

Subject: Re: [PATCH v1 15/27] compiler: Option to default to hidden symbols

On Thu, Oct 12, 2017 at 1:02 PM, Luis R. Rodriguez <[email protected]> wrote:
> On Wed, Oct 11, 2017 at 01:30:15PM -0700, Thomas Garnier wrote:
>> Provide an option to default visibility to hidden except for key
>> symbols. This option is disabled by default and will be used by x86_64
>> PIE support to remove errors between compilation units.
>>
>> The default visibility is also enabled for external symbols that are
>> compared as they maybe equals (start/end of sections). In this case,
>> older versions of GCC will remove the comparison if the symbols are
>> hidden. This issue exists at least on gcc 4.9 and before.
>>
>> Signed-off-by: Thomas Garnier <[email protected]>
>
> <-- snip -->
>
>> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
>> index 86e8f0b2537b..8f021783a929 100644
>> --- a/arch/x86/kernel/cpu/microcode/core.c
>> +++ b/arch/x86/kernel/cpu/microcode/core.c
>> @@ -144,8 +144,8 @@ static bool __init check_loader_disabled_bsp(void)
>> return *res;
>> }
>>
>> -extern struct builtin_fw __start_builtin_fw[];
>> -extern struct builtin_fw __end_builtin_fw[];
>> +extern struct builtin_fw __start_builtin_fw[] __default_visibility;
>> +extern struct builtin_fw __end_builtin_fw[] __default_visibility;
>>
>> bool get_builtin_firmware(struct cpio_data *cd, const char *name)
>> {
>
> <-- snip -->
>
>> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
>> index e5da44eddd2f..1aa5d6dac9e1 100644
>> --- a/include/asm-generic/sections.h
>> +++ b/include/asm-generic/sections.h
>> @@ -30,6 +30,9 @@
>> * __irqentry_text_start, __irqentry_text_end
>> * __softirqentry_text_start, __softirqentry_text_end
>> */
>> +#ifdef CONFIG_DEFAULT_HIDDEN
>> +#pragma GCC visibility push(default)
>> +#endif
>> extern char _text[], _stext[], _etext[];
>> extern char _data[], _sdata[], _edata[];
>> extern char __bss_start[], __bss_stop[];
>> @@ -46,6 +49,9 @@ extern char __softirqentry_text_start[], __softirqentry_text_end[];
>>
>> /* Start and end of .ctors section - used for constructor calls. */
>> extern char __ctors_start[], __ctors_end[];
>> +#ifdef CONFIG_DEFAULT_HIDDEN
>> +#pragma GCC visibility pop
>> +#endif
>>
>> extern __visible const void __nosave_begin, __nosave_end;
>>
>> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
>> index e95a2631e545..6997716f73bf 100644
>> --- a/include/linux/compiler.h
>> +++ b/include/linux/compiler.h
>> @@ -78,6 +78,14 @@ extern void __chk_io_ptr(const volatile void __iomem *);
>> #include <linux/compiler-clang.h>
>> #endif
>>
>> +/* Useful for Position Independent Code to reduce global references */
>> +#ifdef CONFIG_DEFAULT_HIDDEN
>> +#pragma GCC visibility push(hidden)
>> +#define __default_visibility __attribute__((visibility ("default")))
>
> Does this still work with CONFIG_LD_DEAD_CODE_DATA_ELIMINATION ?

I cannot make it work with or without this change. How is it supposed
to be used?

For me, it crashes with a bad consdev at:
http://elixir.free-electrons.com/linux/latest/source/drivers/tty/tty_io.c#L3194

>
>> +#else
>> +#define __default_visibility
>> +#endif
>> +
>> /*
>> * Generic compiler-dependent macros required for kernel
>> * build go below this comment. Actual compiler/compiler version
>> diff --git a/init/Kconfig b/init/Kconfig
>> index ccb1d8daf241..b640201fcff7 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1649,6 +1649,13 @@ config PROFILING
>> config TRACEPOINTS
>> bool
>>
>> +#
>> +# Default to hidden visibility for all symbols.
>> +# Useful for Position Independent Code to reduce global references.
>> +#
>> +config DEFAULT_HIDDEN
>> + bool
>
> Note it is default.
>
> Has 0-day ran through this git tree? It should be easy to get it added for
> testing. Also, even though most changes are x86 based there are some generic
> changes and I'd love a warm fuzzy this won't break odd / random builds.
> Although 0-day does cover a lot of test cases, it only has limited run time
> tests. There are some other test beds which also cover some more obscure
> architectures. Having a test pass on Guenter's test bed would be nice to
> see. For that please coordinate with Guenter if he's willing to run this
> a test for you.

Not yet. I plan to give a v1.5 to Kees Cook to keep in one of his trees
for a couple of weeks. I expect it will identify interesting issues.

>
> Luis



--
Thomas
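
The visibility mechanism discussed in this message can be tried in user space. What follows is a minimal sketch, assuming an ELF target, GCC or Clang, and a GNU ld-compatible linker that defines __start_<section> and __stop_<section> for sections whose name is a valid C identifier. The names demo_fw, DEMO_FW, demo_default_visibility and the selfcheck helper are hypothetical; only the pragma and the visibility attribute are taken from the patch. It makes hidden the default visibility and keeps the two section boundary symbols at default visibility, which is what the patch does for symbols that are compared against each other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct builtin_fw. */
struct demo_fw {
	const char *name;
	unsigned long size;
};

/* From here on, declarations default to hidden visibility. */
#pragma GCC visibility push(hidden)

/* Mirrors the patch's __default_visibility macro (name is illustrative). */
#define demo_default_visibility __attribute__((visibility("default")))

/*
 * Section boundary symbols provided by the linker. They stay at default
 * visibility because the loop below compares them, and the patch
 * description reports that older GCC may drop such a comparison when
 * both symbols are hidden.
 */
extern const struct demo_fw __start_demo_fw[] demo_default_visibility;
extern const struct demo_fw __stop_demo_fw[] demo_default_visibility;

/* Place one entry into the "demo_fw" section. */
#define DEMO_FW(var, fw_name, fw_size) \
	static const struct demo_fw var \
	__attribute__((used, section("demo_fw"))) = { fw_name, fw_size }

DEMO_FW(demo_fw_a, "a.bin", 16);
DEMO_FW(demo_fw_b, "b.bin", 32);

/* Walk the section from its start symbol to its stop symbol. */
size_t demo_fw_count(void)
{
	size_t n = 0;
	const struct demo_fw *fw;

	for (fw = __start_demo_fw; fw != __stop_demo_fw; fw++)
		n++;
	return n;
}

#pragma GCC visibility pop

/* Usage: both entries registered above are found by the walk. */
void demo_fw_selfcheck(void)
{
	assert(demo_fw_count() == 2);
}
```

In the kernel, the boundary symbols come from the linker script rather than from automatic __start_/__stop_ generation, so this only approximates the setup the patch touches.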

2017-10-18 23:17:44

by Thomas Garnier

Subject: Re: [PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization

On Thu, Oct 12, 2017 at 9:28 AM, Tom Lendacky <[email protected]> wrote:
> On 10/12/2017 10:34 AM, Thomas Garnier wrote:
>>
>> On Wed, Oct 11, 2017 at 2:34 PM, Tom Lendacky <[email protected]>
>> wrote:
>>>
>>> On 10/11/2017 3:30 PM, Thomas Garnier wrote:
>>>>
>>>> Changes:
>>>> - patch v1:
>>>> - Simplify ftrace implementation.
>>>> - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
>>>> - rfc v3:
>>>> - Use --emit-relocs instead of -pie to reduce dynamic relocation
>>>> space on
>>>> mapped memory. It also simplifies the relocation process.
>>>> - Move the start the module section next to the kernel. Remove the
>>>> need for
>>>> -mcmodel=large on modules. Extends module space from 1 to 2G
>>>> maximum.
>>>> - Support for XEN PVH as 32-bit relocations can be ignored with
>>>> --emit-relocs.
>>>> - Support for GOT relocations previously done automatically with
>>>> -pie.
>>>> - Remove need for dynamic PLT in modules.
>>>> - Support dymamic GOT for modules.
>>>> - rfc v2:
>>>> - Add support for global stack cookie while compiler default to fs
>>>> without
>>>> mcmodel=kernel
>>>> - Change patch 7 to correctly jump out of the identity mapping on
>>>> kexec load
>>>> preserve.
>>>>
>>>> These patches make the changes necessary to build the kernel as Position
>>>> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated
>>>> below
>>>> the top 2G of the virtual address space. It allows to optionally extend
>>>> the
>>>> KASLR randomization range from 1G to 3G.
>>>
>>>
>>> Hi Thomas,
>>>
>>> I've applied your patches so that I can verify that SME works with PIE.
>>> Unfortunately, I'm running into build warnings and errors when I enable
>>> PIE.
>>>
>>> With CONFIG_STACK_VALIDATION=y I receive lots of messages like this:
>>>
>>> drivers/scsi/libfc/fc_exch.o: warning: objtool:
>>> fc_destroy_exch_mgr()+0x0: call without frame pointer save/setup
>>>
>>> Disabling CONFIG_STACK_VALIDATION suppresses those.
>>
>>
>> I ran into that, I plan to fix it in the next iteration.
>>
>>>
>>> But near the end of the build, I receive errors like this:
>>>
>>> arch/x86/kernel/setup.o: In function `dump_kernel_offset':
>>> .../arch/x86/kernel/setup.c:801:(.text+0x32): relocation truncated to
>>> fit: R_X86_64_32S against symbol `_text' defined in .text section in
>>> .tmp_vmlinux1
>>> .
>>> . about 10 more of the above type messages
>>> .
>>> make: *** [vmlinux] Error 1
>>> Error building kernel, exiting
>>>
>>> Are there any config options that should or should not be enabled when
>>> building with PIE enabled? Is there a compiler requirement for PIE (I'm
>>> using gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5))?
>>
>>
>> I never ran into these ones and I tested compilers older and newer.
>> What was your exact configuration?
>
>
> I'll send you the config in a separate email.
>
> Thanks,
> Tom

Thanks for your feedback (Tom and Markus). The issue was linked to
using a modern gcc with a modern linker; I managed to reproduce and fix
it on my current version.

I will create a v1.5 for Kees Cook to keep on one of his branches for a
few weeks so I can collect as much feedback as possible from 0day. After
that I will send v2.

>
>
>>
>>>
>>> Thanks,
>>> Tom
>>>
>>>>
>>>> Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler
>>>> changes, PIE support and KASLR in general. Thanks to Roland McGrath on
>>>> his
>>>> feedback for using -pie versus --emit-relocs and details on compiler
>>>> code
>>>> generation.
>>>>
>>>> The patches:
>>>> - 1-3, 5-1#, 17-18: Change in assembly code to be PIE compliant.
>>>> - 4: Add a new _ASM_GET_PTR macro to fetch a symbol address
>>>> generically.
>>>> - 14: Adapt percpu design to work correctly when PIE is enabled.
>>>> - 15: Provide an option to default visibility to hidden except for
>>>> key symbols.
>>>> It removes errors between compilation units.
>>>> - 16: Adapt relocation tool to handle PIE binary correctly.
>>>> - 19: Add support for global cookie.
>>>> - 20: Support ftrace with PIE (used on Ubuntu config).
>>>> - 21: Fix incorrect address marker on dump_pagetables.
>>>> - 22: Add option to move the module section just after the kernel.
>>>> - 23: Adapt module loading to support PIE with dynamic GOT.
>>>> - 24: Make the GOT read-only.
>>>> - 25: Add the CONFIG_X86_PIE option (off by default).
>>>> - 26: Adapt relocation tool to generate a 64-bit relocation table.
>>>> - 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase
>>>> relocation range
>>>> from 1G to 3G (off by default).
>>>>
>>>> Performance/Size impact:
>>>>
>>>> Size of vmlinux (Default configuration):
>>>> File size:
>>>> - PIE disabled: +0.000031%
>>>> - PIE enabled: -3.210% (less relocations)
>>>> .text section:
>>>> - PIE disabled: +0.000644%
>>>> - PIE enabled: +0.837%
>>>>
>>>> Size of vmlinux (Ubuntu configuration):
>>>> File size:
>>>> - PIE disabled: -0.201%
>>>> - PIE enabled: -0.082%
>>>> .text section:
>>>> - PIE disabled: same
>>>> - PIE enabled: +1.319%
>>>>
>>>> Size of vmlinux (Default configuration + ORC):
>>>> File size:
>>>> - PIE enabled: -3.167%
>>>> .text section:
>>>> - PIE enabled: +0.814%
>>>>
>>>> Size of vmlinux (Ubuntu configuration + ORC):
>>>> File size:
>>>> - PIE enabled: -3.167%
>>>> .text section:
>>>> - PIE enabled: +1.26%
>>>>
>>>> The size increase is mainly due to not having access to the 32-bit
>>>> signed
>>>> relocation that can be used with mcmodel=kernel. A small part is due to
>>>> reduced
>>>> optimization for PIE code. This bug [1] was opened with gcc to provide a
>>>> better
>>>> code generation for kernel PIE.
>>>>
>>>> Hackbench (50% and 1600% on thread/process for pipe/sockets):
>>>> - PIE disabled: no significant change (avg +0.1% on latest test).
>>>> - PIE enabled: between -0.50% to +0.86% in average (default and
>>>> Ubuntu config).
>>>>
>>>> slab_test (average of 10 runs):
>>>> - PIE disabled: no significant change (-2% on latest run, likely
>>>> noise).
>>>> - PIE enabled: between -1% and +0.8% on latest runs.
>>>>
>>>> Kernbench (average of 10 Half and Optimal runs):
>>>> Elapsed Time:
>>>> - PIE disabled: no significant change (avg -0.239%)
>>>> - PIE enabled: average +0.07%
>>>> System Time:
>>>> - PIE disabled: no significant change (avg -0.277%)
>>>> - PIE enabled: average +0.7%
>>>>
>>>> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
>>>>
>>>> diffstat:
>>>> Documentation/x86/x86_64/mm.txt | 3
>>>> arch/x86/Kconfig | 43 ++++++
>>>> arch/x86/Makefile | 40 +++++
>>>> arch/x86/boot/boot.h | 2
>>>> arch/x86/boot/compressed/Makefile | 5
>>>> arch/x86/boot/compressed/misc.c | 10 +
>>>> arch/x86/crypto/aes-x86_64-asm_64.S | 45 ++++--
>>>> arch/x86/crypto/aesni-intel_asm.S | 14 +-
>>>> arch/x86/crypto/aesni-intel_avx-x86_64.S | 6
>>>> arch/x86/crypto/camellia-aesni-avx-asm_64.S | 42 +++---
>>>> arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 44 +++---
>>>> arch/x86/crypto/camellia-x86_64-asm_64.S | 8 -
>>>> arch/x86/crypto/cast5-avx-x86_64-asm_64.S | 50 ++++---
>>>> arch/x86/crypto/cast6-avx-x86_64-asm_64.S | 44 +++---
>>>> arch/x86/crypto/des3_ede-asm_64.S | 96 +++++++++-----
>>>> arch/x86/crypto/ghash-clmulni-intel_asm.S | 4
>>>> arch/x86/crypto/glue_helper-asm-avx.S | 4
>>>> arch/x86/crypto/glue_helper-asm-avx2.S | 6
>>>> arch/x86/entry/entry_32.S | 3
>>>> arch/x86/entry/entry_64.S | 29 ++--
>>>> arch/x86/include/asm/asm.h | 13 +
>>>> arch/x86/include/asm/bug.h | 2
>>>> arch/x86/include/asm/ftrace.h | 6
>>>> arch/x86/include/asm/jump_label.h | 8 -
>>>> arch/x86/include/asm/kvm_host.h | 6
>>>> arch/x86/include/asm/module.h | 11 +
>>>> arch/x86/include/asm/page_64_types.h | 9 +
>>>> arch/x86/include/asm/paravirt_types.h | 12 +
>>>> arch/x86/include/asm/percpu.h | 25 ++-
>>>> arch/x86/include/asm/pgtable_64_types.h | 6
>>>> arch/x86/include/asm/pm-trace.h | 2
>>>> arch/x86/include/asm/processor.h | 12 +
>>>> arch/x86/include/asm/sections.h | 8 +
>>>> arch/x86/include/asm/setup.h | 2
>>>> arch/x86/include/asm/stackprotector.h | 19 ++
>>>> arch/x86/kernel/acpi/wakeup_64.S | 31 ++--
>>>> arch/x86/kernel/asm-offsets.c | 3
>>>> arch/x86/kernel/asm-offsets_32.c | 3
>>>> arch/x86/kernel/asm-offsets_64.c | 3
>>>> arch/x86/kernel/cpu/common.c | 7 -
>>>> arch/x86/kernel/cpu/microcode/core.c | 4
>>>> arch/x86/kernel/ftrace.c | 42 +++++-
>>>> arch/x86/kernel/head64.c | 32 +++-
>>>> arch/x86/kernel/head_32.S | 3
>>>> arch/x86/kernel/head_64.S | 41 +++++-
>>>> arch/x86/kernel/kvm.c | 6
>>>> arch/x86/kernel/module.c | 182 ++++++++++++++++++++++++++-
>>>> arch/x86/kernel/module.lds | 3
>>>> arch/x86/kernel/process.c | 5
>>>> arch/x86/kernel/relocate_kernel_64.S | 8 -
>>>> arch/x86/kernel/setup_percpu.c | 2
>>>> arch/x86/kernel/vmlinux.lds.S | 13 +
>>>> arch/x86/kvm/svm.c | 4
>>>> arch/x86/lib/cmpxchg16b_emu.S | 8 -
>>>> arch/x86/mm/dump_pagetables.c | 11 +
>>>> arch/x86/power/hibernate_asm_64.S | 4
>>>> arch/x86/tools/relocs.c | 170 +++++++++++++++++++++++--
>>>> arch/x86/tools/relocs.h | 4
>>>> arch/x86/tools/relocs_common.c | 15 +-
>>>> arch/x86/xen/xen-asm.S | 12 -
>>>> arch/x86/xen/xen-head.S | 9 -
>>>> arch/x86/xen/xen-pvh.S | 13 +
>>>> drivers/base/firmware_class.c | 4
>>>> include/asm-generic/sections.h | 6
>>>> include/asm-generic/vmlinux.lds.h | 12 +
>>>> include/linux/compiler.h | 8 +
>>>> init/Kconfig | 9 +
>>>> kernel/kallsyms.c | 16 +-
>>>> kernel/trace/trace.h | 4
>>>> lib/dynamic_debug.c | 4
>>>> 70 files changed, 1032 insertions(+), 308 deletions(-)
>>>>
>>
>>
>>
>



--
Thomas
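
The "relocation truncated to fit: R_X86_64_32S" error quoted in the message above comes down to a range check: that relocation type stores a 32-bit value that must sign-extend back to the full 64-bit address. A small sketch of the check follows. The function names are hypothetical, and the two sample addresses are illustrative (the first is the usual default link address of _text on x86_64, the second is an arbitrary address just below the top 2G).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if addr can be encoded by a sign-extended 32-bit field, i.e. it
 * lies in the lowest 2G or the highest 2G of the 64-bit address space.
 */
bool fits_r_x86_64_32s(uint64_t addr)
{
	return addr <= 0x000000007fffffffULL ||
	       addr >= 0xffffffff80000000ULL;
}

/* Usage: a kernel in the top 2G fits, one relocated below it does not. */
void demo_reloc_selfcheck(void)
{
	assert(fits_r_x86_64_32s(0xffffffff81000000ULL));
	assert(!fits_r_x86_64_32s(0xffffffff7f000000ULL));
}
```

This is why code that is to run below the top 2G has to avoid absolute 32-bit references and use RIP-relative or 64-bit references instead.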

2017-10-19 19:38:08

by Luis Chamberlain

Subject: Re: [PATCH v1 15/27] compiler: Option to default to hidden symbols

On Wed, Oct 18, 2017 at 04:15:10PM -0700, Thomas Garnier wrote:
> On Thu, Oct 12, 2017 at 1:02 PM, Luis R. Rodriguez <[email protected]> wrote:
> > On Wed, Oct 11, 2017 at 01:30:15PM -0700, Thomas Garnier wrote:
> >> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> >> index e95a2631e545..6997716f73bf 100644
> >> --- a/include/linux/compiler.h
> >> +++ b/include/linux/compiler.h
> >> @@ -78,6 +78,14 @@ extern void __chk_io_ptr(const volatile void __iomem *);
> >> #include <linux/compiler-clang.h>
> >> #endif
> >>
> >> +/* Useful for Position Independent Code to reduce global references */
> >> +#ifdef CONFIG_DEFAULT_HIDDEN
> >> +#pragma GCC visibility push(hidden)
> >> +#define __default_visibility __attribute__((visibility ("default")))
> >
> > Does this still work with CONFIG_LD_DEAD_CODE_DATA_ELIMINATION ?
>
> I cannot make it work with or without this change. How is it supposed
> to be used?

Sadly, I don't think much documentation was really added as part of Nick's
commits for this feature, even though commit b67067f1176 ("kbuild: allow archs
to select link dead code/data elimination") *does* say this was documented.

Side rant: the whole CONFIG_LTO removal was merged in the same commit without
this having gone in as a separate atomic patch.

Nick, can you provide a bit more guidance about how to get this feature going
or tested on an architecture? Or are you just assuming that folks using the
linker / compiler flags will know what to do? *Some* guidance could help.

> For me with, it crashes with a bad consdev at:
> http://elixir.free-electrons.com/linux/latest/source/drivers/tty/tty_io.c#L3194

From my reading of the commit log, he had only tested it with powerpc64le;
every other architecture would have to do some work to get even as far as
booting.

It would require someone then testing Nick's patches against a working
powerpc setup to ensure we don't regress there.

> >> diff --git a/init/Kconfig b/init/Kconfig
> >> index ccb1d8daf241..b640201fcff7 100644
> >> --- a/init/Kconfig
> >> +++ b/init/Kconfig
> >> @@ -1649,6 +1649,13 @@ config PROFILING
> >> config TRACEPOINTS
> >> bool
> >>
> >> +#
> >> +# Default to hidden visibility for all symbols.
> >> +# Useful for Position Independent Code to reduce global references.
> >> +#
> >> +config DEFAULT_HIDDEN
> >> + bool
> >
> > Note it is default.
> >
> > Has 0-day ran through this git tree? It should be easy to get it added for
> > testing. Also, even though most changes are x86 based there are some generic
> > changes and I'd love a warm fuzzy this won't break odd / random builds.
> > Although 0-day does cover a lot of test cases, it only has limited run time
> > tests. There are some other test beds which also cover some more obscure
> > architectures. Having a test pass on Guenter's test bed would be nice to
> > see. For that please coordinate with Guenter if he's willing to run this
> > a test for you.
>
> Not yet, plan to give a v1.5 to Kees Cook to keep in one of his tree
> for couple weeks. I expect it will identify interesting issues.

I bet :)

Luis


2017-10-20 08:24:21

by Ingo Molnar

Subject: Re: [PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support


* Thomas Garnier <[email protected]> wrote:

> Change the assembly code to use only relative references of symbols for the
> kernel to be PIE compatible.
>
> Position Independent Executable (PIE) support will allow to extended the
> KASLR randomization range below the -2G memory limit.

> diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
> index 8739cf7795de..86fa068e5e81 100644
> --- a/arch/x86/crypto/aes-x86_64-asm_64.S
> +++ b/arch/x86/crypto/aes-x86_64-asm_64.S
> @@ -48,8 +48,12 @@
> #define R10 %r10
> #define R11 %r11
>
> +/* Hold global for PIE suport */
> +#define RBASE %r12
> +
> #define prologue(FUNC,KEY,B128,B192,r1,r2,r5,r6,r7,r8,r9,r10,r11) \
> ENTRY(FUNC); \
> + pushq RBASE; \
> movq r1,r2; \
> leaq KEY+48(r8),r9; \
> movq r10,r11; \
> @@ -74,54 +78,63 @@
> movl r6 ## E,4(r9); \
> movl r7 ## E,8(r9); \
> movl r8 ## E,12(r9); \
> + popq RBASE; \
> ret; \
> ENDPROC(FUNC);
>
> +#define round_mov(tab_off, reg_i, reg_o) \
> + leaq tab_off(%rip), RBASE; \
> + movl (RBASE,reg_i,4), reg_o;
> +
> +#define round_xor(tab_off, reg_i, reg_o) \
> + leaq tab_off(%rip), RBASE; \
> + xorl (RBASE,reg_i,4), reg_o;
> +
> #define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
> movzbl r2 ## H,r5 ## E; \
> movzbl r2 ## L,r6 ## E; \
> - movl TAB+1024(,r5,4),r5 ## E;\
> + round_mov(TAB+1024, r5, r5 ## E)\
> movw r4 ## X,r2 ## X; \
> - movl TAB(,r6,4),r6 ## E; \
> + round_mov(TAB, r6, r6 ## E) \
> roll $16,r2 ## E; \
> shrl $16,r4 ## E; \
> movzbl r4 ## L,r7 ## E; \
> movzbl r4 ## H,r4 ## E; \
> xorl OFFSET(r8),ra ## E; \
> xorl OFFSET+4(r8),rb ## E; \
> - xorl TAB+3072(,r4,4),r5 ## E;\
> - xorl TAB+2048(,r7,4),r6 ## E;\
> + round_xor(TAB+3072, r4, r5 ## E)\
> + round_xor(TAB+2048, r7, r6 ## E)\
> movzbl r1 ## L,r7 ## E; \
> movzbl r1 ## H,r4 ## E; \
> - movl TAB+1024(,r4,4),r4 ## E;\
> + round_mov(TAB+1024, r4, r4 ## E)\
> movw r3 ## X,r1 ## X; \
> roll $16,r1 ## E; \
> shrl $16,r3 ## E; \
> - xorl TAB(,r7,4),r5 ## E; \
> + round_xor(TAB, r7, r5 ## E) \
> movzbl r3 ## L,r7 ## E; \
> movzbl r3 ## H,r3 ## E; \
> - xorl TAB+3072(,r3,4),r4 ## E;\
> - xorl TAB+2048(,r7,4),r5 ## E;\
> + round_xor(TAB+3072, r3, r4 ## E)\
> + round_xor(TAB+2048, r7, r5 ## E)\
> movzbl r1 ## L,r7 ## E; \
> movzbl r1 ## H,r3 ## E; \
> shrl $16,r1 ## E; \
> - xorl TAB+3072(,r3,4),r6 ## E;\
> - movl TAB+2048(,r7,4),r3 ## E;\
> + round_xor(TAB+3072, r3, r6 ## E)\
> + round_mov(TAB+2048, r7, r3 ## E)\
> movzbl r1 ## L,r7 ## E; \
> movzbl r1 ## H,r1 ## E; \
> - xorl TAB+1024(,r1,4),r6 ## E;\
> - xorl TAB(,r7,4),r3 ## E; \
> + round_xor(TAB+1024, r1, r6 ## E)\
> + round_xor(TAB, r7, r3 ## E) \
> movzbl r2 ## H,r1 ## E; \
> movzbl r2 ## L,r7 ## E; \
> shrl $16,r2 ## E; \
> - xorl TAB+3072(,r1,4),r3 ## E;\
> - xorl TAB+2048(,r7,4),r4 ## E;\
> + round_xor(TAB+3072, r1, r3 ## E)\
> + round_xor(TAB+2048, r7, r4 ## E)\
> movzbl r2 ## H,r1 ## E; \
> movzbl r2 ## L,r2 ## E; \
> xorl OFFSET+8(r8),rc ## E; \
> xorl OFFSET+12(r8),rd ## E; \
> - xorl TAB+1024(,r1,4),r3 ## E;\
> - xorl TAB(,r2,4),r4 ## E;
> + round_xor(TAB+1024, r1, r3 ## E)\
> + round_xor(TAB, r2, r4 ## E)

This appears to be adding unconditional overhead to a function that was moved to
assembly to improve its performance.

Thanks,

Ingo
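
The overhead in question is the extra leaq per table access. The two-instruction shape of round_mov can be reproduced in user space with GCC-style inline assembly. This is only a sketch under stated assumptions: demo_tab and the function names are hypothetical, the table has four entries rather than the AES tables' size, and the compiler (not the source) chooses the addressing mode for the table operand, which is RIP-relative in a PIE build. On targets other than x86_64 the function falls back to plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-entry table standing in for the AES lookup tables. */
static const uint32_t demo_tab[4] = {
	0x11111111u, 0x22222222u, 0x33333333u, 0x44444444u,
};

/* Return demo_tab[idx]; idx must be in the range 0..3. */
uint32_t demo_lookup(uint64_t idx)
{
#if defined(__x86_64__) && defined(__GNUC__)
	uint32_t out;
	uint64_t base;

	/*
	 * Same shape as round_mov in the patch: first load the table
	 * address into a scratch register, then do the scaled-index load
	 * from that register. The original code did this in a single
	 * instruction with an absolute address, movl TAB(,idx,4).
	 */
	__asm__("leaq %2, %1\n\t"
		"movl (%1,%3,4), %0"
		: "=r"(out), "=&r"(base)
		: "m"(*(const uint32_t (*)[4])demo_tab), "r"(idx));
	return out;
#else
	/* Portable fallback: let the compiler pick the addressing. */
	return demo_tab[idx];
#endif
}

/* Usage: the lookup returns the table entries unchanged. */
void demo_lookup_selfcheck(void)
{
	assert(demo_lookup(0) == 0x11111111u);
	assert(demo_lookup(3) == 0x44444444u);
}
```

The scratch register is what forces the patch to reserve RBASE and add the pushq/popq pair in the prologue and epilogue.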

2017-10-20 08:26:46

by Ingo Molnar

Subject: Re: [PATCH v1 06/27] x86/entry/64: Adapt assembly for PIE support


* Thomas Garnier <[email protected]> wrote:

> Change the assembly code to use only relative references of symbols for the
> kernel to be PIE compatible.
>
> Position Independent Executable (PIE) support will allow to extended the
> KASLR randomization range below the -2G memory limit.
>
> Signed-off-by: Thomas Garnier <[email protected]>
> ---
> arch/x86/entry/entry_64.S | 22 +++++++++++++++-------
> 1 file changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 49167258d587..15bd5530d2ae 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -194,12 +194,15 @@ entry_SYSCALL_64_fastpath:
> ja 1f /* return -ENOSYS (already in pt_regs->ax) */
> movq %r10, %rcx
>
> + /* Ensures the call is position independent */
> + leaq sys_call_table(%rip), %r11
> +
> /*
> * This call instruction is handled specially in stub_ptregs_64.
> * It might end up jumping to the slow path. If it jumps, RAX
> * and all argument registers are clobbered.
> */
> - call *sys_call_table(, %rax, 8)
> + call *(%r11, %rax, 8)
> .Lentry_SYSCALL_64_after_fastpath_call:
>
> movq %rax, RAX(%rsp)
> @@ -334,7 +337,8 @@ ENTRY(stub_ptregs_64)
> * RAX stores a pointer to the C function implementing the syscall.
> * IRQs are on.
> */
> - cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
> + leaq .Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11
> + cmpq %r11, (%rsp)
> jne 1f
>
> /*
> @@ -1172,7 +1176,8 @@ ENTRY(error_entry)
> movl %ecx, %eax /* zero extend */
> cmpq %rax, RIP+8(%rsp)
> je .Lbstep_iret
> - cmpq $.Lgs_change, RIP+8(%rsp)
> + leaq .Lgs_change(%rip), %rcx
> + cmpq %rcx, RIP+8(%rsp)
> jne .Lerror_entry_done
>
> /*
> @@ -1383,10 +1388,10 @@ ENTRY(nmi)
> * resume the outer NMI.
> */
>
> - movq $repeat_nmi, %rdx
> + leaq repeat_nmi(%rip), %rdx
> cmpq 8(%rsp), %rdx
> ja 1f
> - movq $end_repeat_nmi, %rdx
> + leaq end_repeat_nmi(%rip), %rdx
> cmpq 8(%rsp), %rdx
> ja nested_nmi_out
> 1:
> @@ -1440,7 +1445,8 @@ nested_nmi:
> pushq %rdx
> pushfq
> pushq $__KERNEL_CS
> - pushq $repeat_nmi
> + leaq repeat_nmi(%rip), %rdx
> + pushq %rdx
>
> /* Put stack back */
> addq $(6*8), %rsp
> @@ -1479,7 +1485,9 @@ first_nmi:
> addq $8, (%rsp) /* Fix up RSP */
> pushfq /* RFLAGS */
> pushq $__KERNEL_CS /* CS */
> - pushq $1f /* RIP */
> + pushq %rax /* Support Position Independent Code */
> + leaq 1f(%rip), %rax /* RIP */
> + xchgq %rax, (%rsp) /* Restore RAX, put 1f */
> INTERRUPT_RETURN /* continues at repeat_nmi below */
> UNWIND_HINT_IRET_REGS

This patch seems to add extra overhead to the syscall fast-path even when PIE is
disabled, right?

Thanks,

Ingo

2017-10-20 08:28:00

by Ard Biesheuvel

Subject: Re: [PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support

On 20 October 2017 at 09:24, Ingo Molnar <[email protected]> wrote:
>
> * Thomas Garnier <[email protected]> wrote:
>
>> Change the assembly code to use only relative references of symbols for the
>> kernel to be PIE compatible.
>>
>> Position Independent Executable (PIE) support will allow to extended the
>> KASLR randomization range below the -2G memory limit.
>
>> diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
>> index 8739cf7795de..86fa068e5e81 100644
>> --- a/arch/x86/crypto/aes-x86_64-asm_64.S
>> +++ b/arch/x86/crypto/aes-x86_64-asm_64.S
>> @@ -48,8 +48,12 @@
>> #define R10 %r10
>> #define R11 %r11
>>
>> +/* Hold global for PIE suport */
>> +#define RBASE %r12
>> +
>> #define prologue(FUNC,KEY,B128,B192,r1,r2,r5,r6,r7,r8,r9,r10,r11) \
>> ENTRY(FUNC); \
>> + pushq RBASE; \
>> movq r1,r2; \
>> leaq KEY+48(r8),r9; \
>> movq r10,r11; \
>> @@ -74,54 +78,63 @@
>> movl r6 ## E,4(r9); \
>> movl r7 ## E,8(r9); \
>> movl r8 ## E,12(r9); \
>> + popq RBASE; \
>> ret; \
>> ENDPROC(FUNC);
>>
>> +#define round_mov(tab_off, reg_i, reg_o) \
>> + leaq tab_off(%rip), RBASE; \
>> + movl (RBASE,reg_i,4), reg_o;
>> +
>> +#define round_xor(tab_off, reg_i, reg_o) \
>> + leaq tab_off(%rip), RBASE; \
>> + xorl (RBASE,reg_i,4), reg_o;
>> +
>> #define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
>> movzbl r2 ## H,r5 ## E; \
>> movzbl r2 ## L,r6 ## E; \
>> - movl TAB+1024(,r5,4),r5 ## E;\
>> + round_mov(TAB+1024, r5, r5 ## E)\
>> movw r4 ## X,r2 ## X; \
>> - movl TAB(,r6,4),r6 ## E; \
>> + round_mov(TAB, r6, r6 ## E) \
>> roll $16,r2 ## E; \
>> shrl $16,r4 ## E; \
>> movzbl r4 ## L,r7 ## E; \
>> movzbl r4 ## H,r4 ## E; \
>> xorl OFFSET(r8),ra ## E; \
>> xorl OFFSET+4(r8),rb ## E; \
>> - xorl TAB+3072(,r4,4),r5 ## E;\
>> - xorl TAB+2048(,r7,4),r6 ## E;\
>> + round_xor(TAB+3072, r4, r5 ## E)\
>> + round_xor(TAB+2048, r7, r6 ## E)\
>> movzbl r1 ## L,r7 ## E; \
>> movzbl r1 ## H,r4 ## E; \
>> - movl TAB+1024(,r4,4),r4 ## E;\
>> + round_mov(TAB+1024, r4, r4 ## E)\
>> movw r3 ## X,r1 ## X; \
>> roll $16,r1 ## E; \
>> shrl $16,r3 ## E; \
>> - xorl TAB(,r7,4),r5 ## E; \
>> + round_xor(TAB, r7, r5 ## E) \
>> movzbl r3 ## L,r7 ## E; \
>> movzbl r3 ## H,r3 ## E; \
>> - xorl TAB+3072(,r3,4),r4 ## E;\
>> - xorl TAB+2048(,r7,4),r5 ## E;\
>> + round_xor(TAB+3072, r3, r4 ## E)\
>> + round_xor(TAB+2048, r7, r5 ## E)\
>> movzbl r1 ## L,r7 ## E; \
>> movzbl r1 ## H,r3 ## E; \
>> shrl $16,r1 ## E; \
>> - xorl TAB+3072(,r3,4),r6 ## E;\
>> - movl TAB+2048(,r7,4),r3 ## E;\
>> + round_xor(TAB+3072, r3, r6 ## E)\
>> + round_mov(TAB+2048, r7, r3 ## E)\
>> movzbl r1 ## L,r7 ## E; \
>> movzbl r1 ## H,r1 ## E; \
>> - xorl TAB+1024(,r1,4),r6 ## E;\
>> - xorl TAB(,r7,4),r3 ## E; \
>> + round_xor(TAB+1024, r1, r6 ## E)\
>> + round_xor(TAB, r7, r3 ## E) \
>> movzbl r2 ## H,r1 ## E; \
>> movzbl r2 ## L,r7 ## E; \
>> shrl $16,r2 ## E; \
>> - xorl TAB+3072(,r1,4),r3 ## E;\
>> - xorl TAB+2048(,r7,4),r4 ## E;\
>> + round_xor(TAB+3072, r1, r3 ## E)\
>> + round_xor(TAB+2048, r7, r4 ## E)\
>> movzbl r2 ## H,r1 ## E; \
>> movzbl r2 ## L,r2 ## E; \
>> xorl OFFSET+8(r8),rc ## E; \
>> xorl OFFSET+12(r8),rd ## E; \
>> - xorl TAB+1024(,r1,4),r3 ## E;\
>> - xorl TAB(,r2,4),r4 ## E;
>> + round_xor(TAB+1024, r1, r3 ## E)\
>> + round_xor(TAB, r2, r4 ## E)
>
> This appears to be adding unconditional overhead to a function that was moved to
> assembly to improve its performance.
>

I did some benchmarking on this code a while ago and, interestingly,
it was slower than the generic C implementation (on a Pentium E2200),
so we may want to consider whether we still need this driver in the
first place.

2017-10-20 14:47:44

by Thomas Garnier

Subject: Re: [PATCH v1 06/27] x86/entry/64: Adapt assembly for PIE support

On Fri, Oct 20, 2017 at 1:26 AM, Ingo Molnar <[email protected]> wrote:
>
> * Thomas Garnier <[email protected]> wrote:
>
>> Change the assembly code to use only relative references of symbols for the
>> kernel to be PIE compatible.
>>
>> Position Independent Executable (PIE) support will allow to extended the
>> KASLR randomization range below the -2G memory limit.
>>
>> Signed-off-by: Thomas Garnier <[email protected]>
>> ---
>> arch/x86/entry/entry_64.S | 22 +++++++++++++++-------
>> 1 file changed, 15 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>> index 49167258d587..15bd5530d2ae 100644
>> --- a/arch/x86/entry/entry_64.S
>> +++ b/arch/x86/entry/entry_64.S
>> @@ -194,12 +194,15 @@ entry_SYSCALL_64_fastpath:
>> ja 1f /* return -ENOSYS (already in pt_regs->ax) */
>> movq %r10, %rcx
>>
>> + /* Ensures the call is position independent */
>> + leaq sys_call_table(%rip), %r11
>> +
>> /*
>> * This call instruction is handled specially in stub_ptregs_64.
>> * It might end up jumping to the slow path. If it jumps, RAX
>> * and all argument registers are clobbered.
>> */
>> - call *sys_call_table(, %rax, 8)
>> + call *(%r11, %rax, 8)
>> .Lentry_SYSCALL_64_after_fastpath_call:
>>
>> movq %rax, RAX(%rsp)
>> @@ -334,7 +337,8 @@ ENTRY(stub_ptregs_64)
>> * RAX stores a pointer to the C function implementing the syscall.
>> * IRQs are on.
>> */
>> - cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
>> + leaq .Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11
>> + cmpq %r11, (%rsp)
>> jne 1f
>>
>> /*
>> @@ -1172,7 +1176,8 @@ ENTRY(error_entry)
>> movl %ecx, %eax /* zero extend */
>> cmpq %rax, RIP+8(%rsp)
>> je .Lbstep_iret
>> - cmpq $.Lgs_change, RIP+8(%rsp)
>> + leaq .Lgs_change(%rip), %rcx
>> + cmpq %rcx, RIP+8(%rsp)
>> jne .Lerror_entry_done
>>
>> /*
>> @@ -1383,10 +1388,10 @@ ENTRY(nmi)
>> * resume the outer NMI.
>> */
>>
>> - movq $repeat_nmi, %rdx
>> + leaq repeat_nmi(%rip), %rdx
>> cmpq 8(%rsp), %rdx
>> ja 1f
>> - movq $end_repeat_nmi, %rdx
>> + leaq end_repeat_nmi(%rip), %rdx
>> cmpq 8(%rsp), %rdx
>> ja nested_nmi_out
>> 1:
>> @@ -1440,7 +1445,8 @@ nested_nmi:
>> pushq %rdx
>> pushfq
>> pushq $__KERNEL_CS
>> - pushq $repeat_nmi
>> + leaq repeat_nmi(%rip), %rdx
>> + pushq %rdx
>>
>> /* Put stack back */
>> addq $(6*8), %rsp
>> @@ -1479,7 +1485,9 @@ first_nmi:
>> addq $8, (%rsp) /* Fix up RSP */
>> pushfq /* RFLAGS */
>> pushq $__KERNEL_CS /* CS */
>> - pushq $1f /* RIP */
>> + pushq %rax /* Support Position Independent Code */
>> + leaq 1f(%rip), %rax /* RIP */
>> + xchgq %rax, (%rsp) /* Restore RAX, put 1f */
>> INTERRUPT_RETURN /* continues at repeat_nmi below */
>> UNWIND_HINT_IRET_REGS
>
> This patch seems to add extra overhead to the syscall fast-path even when PIE is
> disabled, right?

It does add extra instructions where a single one is not possible; I
preferred that over #ifdefs, but I can change it.

>
> Thanks,
>
> Ingo



--
Thomas

2017-10-20 14:48:57

by Thomas Garnier

Subject: Re: [PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support

On Fri, Oct 20, 2017 at 1:28 AM, Ard Biesheuvel
<[email protected]> wrote:
> On 20 October 2017 at 09:24, Ingo Molnar <[email protected]> wrote:
>>
>> * Thomas Garnier <[email protected]> wrote:
>>
>>> Change the assembly code to use only relative references of symbols for the
>>> kernel to be PIE compatible.
>>>
>>> Position Independent Executable (PIE) support will allow extending the
>>> KASLR randomization range below the -2G memory limit.
>>
>>> diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
>>> index 8739cf7795de..86fa068e5e81 100644
>>> --- a/arch/x86/crypto/aes-x86_64-asm_64.S
>>> +++ b/arch/x86/crypto/aes-x86_64-asm_64.S
>>> @@ -48,8 +48,12 @@
>>> #define R10 %r10
>>> #define R11 %r11
>>>
>>> +/* Hold global for PIE support */
>>> +#define RBASE %r12
>>> +
>>> #define prologue(FUNC,KEY,B128,B192,r1,r2,r5,r6,r7,r8,r9,r10,r11) \
>>> ENTRY(FUNC); \
>>> + pushq RBASE; \
>>> movq r1,r2; \
>>> leaq KEY+48(r8),r9; \
>>> movq r10,r11; \
>>> @@ -74,54 +78,63 @@
>>> movl r6 ## E,4(r9); \
>>> movl r7 ## E,8(r9); \
>>> movl r8 ## E,12(r9); \
>>> + popq RBASE; \
>>> ret; \
>>> ENDPROC(FUNC);
>>>
>>> +#define round_mov(tab_off, reg_i, reg_o) \
>>> + leaq tab_off(%rip), RBASE; \
>>> + movl (RBASE,reg_i,4), reg_o;
>>> +
>>> +#define round_xor(tab_off, reg_i, reg_o) \
>>> + leaq tab_off(%rip), RBASE; \
>>> + xorl (RBASE,reg_i,4), reg_o;
>>> +
>>> #define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
>>> movzbl r2 ## H,r5 ## E; \
>>> movzbl r2 ## L,r6 ## E; \
>>> - movl TAB+1024(,r5,4),r5 ## E;\
>>> + round_mov(TAB+1024, r5, r5 ## E)\
>>> movw r4 ## X,r2 ## X; \
>>> - movl TAB(,r6,4),r6 ## E; \
>>> + round_mov(TAB, r6, r6 ## E) \
>>> roll $16,r2 ## E; \
>>> shrl $16,r4 ## E; \
>>> movzbl r4 ## L,r7 ## E; \
>>> movzbl r4 ## H,r4 ## E; \
>>> xorl OFFSET(r8),ra ## E; \
>>> xorl OFFSET+4(r8),rb ## E; \
>>> - xorl TAB+3072(,r4,4),r5 ## E;\
>>> - xorl TAB+2048(,r7,4),r6 ## E;\
>>> + round_xor(TAB+3072, r4, r5 ## E)\
>>> + round_xor(TAB+2048, r7, r6 ## E)\
>>> movzbl r1 ## L,r7 ## E; \
>>> movzbl r1 ## H,r4 ## E; \
>>> - movl TAB+1024(,r4,4),r4 ## E;\
>>> + round_mov(TAB+1024, r4, r4 ## E)\
>>> movw r3 ## X,r1 ## X; \
>>> roll $16,r1 ## E; \
>>> shrl $16,r3 ## E; \
>>> - xorl TAB(,r7,4),r5 ## E; \
>>> + round_xor(TAB, r7, r5 ## E) \
>>> movzbl r3 ## L,r7 ## E; \
>>> movzbl r3 ## H,r3 ## E; \
>>> - xorl TAB+3072(,r3,4),r4 ## E;\
>>> - xorl TAB+2048(,r7,4),r5 ## E;\
>>> + round_xor(TAB+3072, r3, r4 ## E)\
>>> + round_xor(TAB+2048, r7, r5 ## E)\
>>> movzbl r1 ## L,r7 ## E; \
>>> movzbl r1 ## H,r3 ## E; \
>>> shrl $16,r1 ## E; \
>>> - xorl TAB+3072(,r3,4),r6 ## E;\
>>> - movl TAB+2048(,r7,4),r3 ## E;\
>>> + round_xor(TAB+3072, r3, r6 ## E)\
>>> + round_mov(TAB+2048, r7, r3 ## E)\
>>> movzbl r1 ## L,r7 ## E; \
>>> movzbl r1 ## H,r1 ## E; \
>>> - xorl TAB+1024(,r1,4),r6 ## E;\
>>> - xorl TAB(,r7,4),r3 ## E; \
>>> + round_xor(TAB+1024, r1, r6 ## E)\
>>> + round_xor(TAB, r7, r3 ## E) \
>>> movzbl r2 ## H,r1 ## E; \
>>> movzbl r2 ## L,r7 ## E; \
>>> shrl $16,r2 ## E; \
>>> - xorl TAB+3072(,r1,4),r3 ## E;\
>>> - xorl TAB+2048(,r7,4),r4 ## E;\
>>> + round_xor(TAB+3072, r1, r3 ## E)\
>>> + round_xor(TAB+2048, r7, r4 ## E)\
>>> movzbl r2 ## H,r1 ## E; \
>>> movzbl r2 ## L,r2 ## E; \
>>> xorl OFFSET+8(r8),rc ## E; \
>>> xorl OFFSET+12(r8),rd ## E; \
>>> - xorl TAB+1024(,r1,4),r3 ## E;\
>>> - xorl TAB(,r2,4),r4 ## E;
>>> + round_xor(TAB+1024, r1, r3 ## E)\
>>> + round_xor(TAB, r2, r4 ## E)
>>
>> This appears to be adding unconditional overhead to a function that was moved to
>> assembly to improve its performance.
>>

It adds a couple of extra instructions; how much overhead that creates
is hard for me to tell. Wrapping everything in #ifdefs would increase
the code complexity.

>
> I did some benchmarking on this code a while ago and, interestingly,
> it was slower than the generic C implementation (on a Pentium E2200),
> so we may want to consider whether we still need this driver in the
> first place.

Interesting.

--
Thomas
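
One way to keep the non-PIE path unchanged without scattering #ifdefs through the round() macro would be to confine the conditional to the two helper macros. The following is a minimal sketch, not part of the posted patch. It assumes CONFIG_X86_PIE is the Kconfig symbol that enables the PIE build, and it reuses RBASE, round_mov and round_xor from the diff above.

```asm
/* Sketch only. CONFIG_X86_PIE is an assumed Kconfig symbol name. */
#ifdef CONFIG_X86_PIE

/* PIE: compute the table address RIP-relatively, then index it. */
#define round_mov(tab_off, reg_i, reg_o) \
	leaq	tab_off(%rip), RBASE; \
	movl	(RBASE,reg_i,4), reg_o;

#define round_xor(tab_off, reg_i, reg_o) \
	leaq	tab_off(%rip), RBASE; \
	xorl	(RBASE,reg_i,4), reg_o;

#else

/* Non-PIE: keep the original single-instruction absolute form. */
#define round_mov(tab_off, reg_i, reg_o) \
	movl	tab_off(,reg_i,4), reg_o;

#define round_xor(tab_off, reg_i, reg_o) \
	xorl	tab_off(,reg_i,4), reg_o;

#endif
```

With this shape the body of round() stays exactly as in the patch. The pushq/popq of RBASE in the prologue and epilogue would still run in both configurations unless they were guarded the same way.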

2017-10-20 15:20:28

by Ingo Molnar

Subject: Re: [PATCH v1 06/27] x86/entry/64: Adapt assembly for PIE support


* Thomas Garnier <[email protected]> wrote:

> >> */
> >> - cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
> >> + leaq .Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11
> >> + cmpq %r11, (%rsp)
> >> jne 1f

> > This patch seems to add extra overhead to the syscall fast-path even when PIE is
> > disabled, right?
>
> It does add extra instructions where a single one is not possible; I
> preferred that over #ifdefs, but I can change it.

So my problem is, this pattern repeats in many other places as well, but sprinkling
various pieces of assembly code with #ifdefs would be very bad as well.

I have no good idea how to solve this.

Thanks,

Ingo

2017-10-20 16:27:24

by Andy Lutomirski

Subject: Re: [PATCH v1 06/27] x86/entry/64: Adapt assembly for PIE support



> On Oct 20, 2017, at 5:20 PM, Ingo Molnar <[email protected]> wrote:
>
>
> * Thomas Garnier <[email protected]> wrote:
>
>>>> */
>>>> - cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
>>>> + leaq .Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11
>>>> + cmpq %r11, (%rsp)
>>>> jne 1f
>
>>> This patch seems to add extra overhead to the syscall fast-path even when PIE is
>>> disabled, right?
>>
>> It does add extra instructions where a single one is not possible; I
>> preferred that over #ifdefs, but I can change it.
>
> So my problem is, this pattern repeats in many other places as well, but sprinkling
> various pieces of assembly code with #ifdefs would be very bad as well.
>
> I have no good idea how to solve this.
>

How about:

.macro JMP_TO_LABEL ...


> Thanks,
>
> Ingo

2017-10-20 17:52:17

by Andy Lutomirski

Subject: Re: [PATCH v1 06/27] x86/entry/64: Adapt assembly for PIE support



> On Oct 20, 2017, at 5:20 PM, Ingo Molnar <[email protected]> wrote:
>
>
> * Thomas Garnier <[email protected]> wrote:
>
>>>> */
>>>> - cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
>>>> + leaq .Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11
>>>> + cmpq %r11, (%rsp)
>>>> jne 1f
>
>>> This patch seems to add extra overhead to the syscall fast-path even when PIE is
>>> disabled, right?
>>
>> It does add extra instructions where a single one is not possible; I
>> preferred that over #ifdefs, but I can change it.
>
> So my problem is, this pattern repeats in many other places as well, but sprinkling
> various pieces of assembly code with #ifdefs would be very bad as well.
>
> I have no good idea how to solve this.
>
> Thanks,

Ugh, brain was off. This is a bit messy. We could use a macro for this, too, I suppose.

>
> Ingo
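
The macro idea could be sketched as below for the compare-against-label case that recurs in entry_64.S. This is a hypothetical helper, not code from the series: the name CMP_LABEL_ADDR and its parameters are invented for illustration, and CONFIG_X86_PIE is assumed to be the Kconfig symbol for the PIE build.

```asm
/*
 * Hypothetical helper (illustration only): compare the quadword at \mem
 * with the address of \label. The #ifdef lives here once instead of at
 * every call site.
 */
.macro CMP_LABEL_ADDR label:req, mem:req, scratch:req
#ifdef CONFIG_X86_PIE		/* assumed Kconfig symbol name */
	/* PIE: the address must be formed relative to RIP at run time. */
	leaq	\label(%rip), \scratch
	cmpq	\scratch, \mem
#else
	/* Non-PIE: the address fits a sign-extended 32-bit immediate. */
	cmpq	$\label, \mem
#endif
.endm

/*
 * Possible use in stub_ptregs_64, replacing the leaq/cmpq pair:
 *
 *	CMP_LABEL_ADDR .Lentry_SYSCALL_64_after_fastpath_call, (%rsp), %r11
 *	jne	1f
 */
```

Both branches set the flags from the same subtraction, so the following jne behaves identically. Callers would have to treat the scratch register as clobbered in either configuration, since only the PIE branch actually writes it.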