From: Andi Kleen <[email protected]>
With gcc, toplevel assembler statements that do not mark themselves
as .text may end up in other sections. I had LTO boot crashes because
various assembler statements ended up in the middle of the initcall
section. It's also a latent problem without LTO, although it's
currently not known to cause any real problems.
According to the gcc team, this is expected behavior.
Always mark all the top level assembler statements as text
so that they switch to the right section.
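To make the pattern concrete, here is a minimal sketch (symbol name
made up, not part of this patch):

    /*
     * A top-level asm statement without a section directive inherits
     * whatever section gcc happened to be emitting, so switch to
     * .text explicitly before emitting any code.
     */
    asm(".text\n"
        ".globl my_stub\n"
        ".type my_stub, @function\n"
        "my_stub: ret\n");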
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/amd.c | 3 ++-
arch/x86/kernel/kprobes/core.c | 1 +
arch/x86/lib/error-inject.c | 1 +
3 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 01004bfb1a1b..1bcb489e07e7 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -83,7 +83,8 @@ static inline int wrmsrl_amd_safe(unsigned msr, unsigned long long val)
*/
extern __visible void vide(void);
-__asm__(".globl vide\n"
+__asm__(".text\n"
+ ".globl vide\n"
".type vide, @function\n"
".align 4\n"
"vide: ret\n");
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index a034cb808e7e..31ab91c9c4e9 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -715,6 +715,7 @@ NOKPROBE_SYMBOL(kprobe_int3_handler);
* calls trampoline_handler() runs, which calls the kretprobe's handler.
*/
asm(
+ ".text\n"
".global kretprobe_trampoline\n"
".type kretprobe_trampoline, @function\n"
"kretprobe_trampoline:\n"
diff --git a/arch/x86/lib/error-inject.c b/arch/x86/lib/error-inject.c
index 3cdf06128d13..be5b5fb1598b 100644
--- a/arch/x86/lib/error-inject.c
+++ b/arch/x86/lib/error-inject.c
@@ -6,6 +6,7 @@
asmlinkage void just_return_func(void);
asm(
+ ".text\n"
".type just_return_func, @function\n"
".globl just_return_func\n"
"just_return_func:\n"
--
2.20.1
From: Andi Kleen <[email protected]>
This function is referenced from assembler, so it needs to be marked
__visible for LTO.
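For reference, __visible expands (in include/linux/compiler_attributes.h,
if I read it right) to roughly:

    #define __visible __attribute__((externally_visible))

which tells gcc that the symbol is referenced from outside the code LTO
can see, so it must not be localized or optimized away.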
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Yi Sun <[email protected]>
Fixes: 3a025de64bf8 ("x86/hyperv: Enable PV qspinlock for Hyper-V")
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/hyperv/hv_spinlock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
index a861b0456b1a..07f21a06392f 100644
--- a/arch/x86/hyperv/hv_spinlock.c
+++ b/arch/x86/hyperv/hv_spinlock.c
@@ -56,7 +56,7 @@ static void hv_qlock_wait(u8 *byte, u8 val)
/*
* Hyper-V does not support this so far.
*/
-bool hv_vcpu_is_preempted(int vcpu)
+__visible bool hv_vcpu_is_preempted(int vcpu)
{
return false;
}
--
2.20.1
From: Andi Kleen <[email protected]>
This function is referenced from assembler, so in LTO builds it needs
to be global and visible so that it is not optimized away.
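A rough sketch of why (names made up): the only reference comes from a
top-level asm statement, which LTO does not see as a use of the symbol:

    asm(".text\n"
        "my_trampoline:\n"
        "    call trampoline_handler\n"); /* reference invisible to LTO */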
Cc: [email protected]
Acked-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 31ab91c9c4e9..1309a4eb3119 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -752,7 +752,7 @@ STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
/*
* Called from kretprobe_trampoline
*/
-static __used void *trampoline_handler(struct pt_regs *regs)
+__used __visible void *trampoline_handler(struct pt_regs *regs)
{
struct kretprobe_instance *ri = NULL;
struct hlist_head *head, empty_rp;
--
2.20.1
From: Andi Kleen <[email protected]>
This per-CPU variable is accessed from assembler code, so it needs
to be visible for LTO.
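A rough sketch of such an access (label and offset made up; the real
reference is in the kvm_vcpu_is_preempted() callee-save asm): the symbol
is spelled out by name inside the asm string, so the compiler cannot see
the use:

    asm(".text\n"
        "my_check:\n"
        "    movb steal_time+8(%rax), %al\n" /* direct symbol reference */
        "    ret\n");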
Cc: [email protected]
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/kvm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 5c93a65ee1e5..3f0cc828cc36 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -67,7 +67,7 @@ static int __init parse_no_stealacc(char *arg)
early_param("no-steal-acc", parse_no_stealacc);
static DEFINE_PER_CPU_DECRYPTED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
-static DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64);
+DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64) __visible;
static int has_steal_clock = 0;
/*
--
2.20.1
From: Andi Kleen <[email protected]>
This function is referenced from assembler, so it needs to be marked
__visible for LTO.
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Andi Kleen <[email protected]>
---
drivers/xen/time.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index 0968859c29d0..04b59dab30f5 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -144,7 +144,7 @@ void xen_get_runstate_snapshot(struct vcpu_runstate_info *res)
}
/* return true when a vcpu could run but has no real cpu to run on */
-bool xen_vcpu_stolen(int vcpu)
+__visible bool xen_vcpu_stolen(int vcpu)
{
return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable;
}
--
2.20.1
From: Andi Kleen <[email protected]>
Some of the recently added const tables use __initdata, which causes
section attribute conflicts for const data. Use __initconst instead.
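A minimal sketch of the conflict (hypothetical tables): __initdata puts
the object into .init.data, but a const object belongs in .init.rodata,
and gcc refuses to emit one section with conflicting attributes:

    static const int bad_table[]  __initdata  = { 1, 2, 3 };
    /* may trigger: "error: bad_table causes a section type conflict" */
    static const int good_table[] __initconst = { 1, 2, 3 };
    /* fine: const init data lands in .init.rodata */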
Fixes: fa1202ef2243 ("x86/speculation: Add command line control")
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/bugs.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 2da82eff0eb4..b91b3bfa5cfb 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -275,7 +275,7 @@ static const struct {
const char *option;
enum spectre_v2_user_cmd cmd;
bool secure;
-} v2_user_options[] __initdata = {
+} v2_user_options[] __initconst = {
{ "auto", SPECTRE_V2_USER_CMD_AUTO, false },
{ "off", SPECTRE_V2_USER_CMD_NONE, false },
{ "on", SPECTRE_V2_USER_CMD_FORCE, true },
@@ -419,7 +419,7 @@ static const struct {
const char *option;
enum spectre_v2_mitigation_cmd cmd;
bool secure;
-} mitigation_options[] __initdata = {
+} mitigation_options[] __initconst = {
{ "off", SPECTRE_V2_CMD_NONE, false },
{ "on", SPECTRE_V2_CMD_FORCE, true },
{ "retpoline", SPECTRE_V2_CMD_RETPOLINE, false },
@@ -658,7 +658,7 @@ static const char * const ssb_strings[] = {
static const struct {
const char *option;
enum ssb_mitigation_cmd cmd;
-} ssb_mitigation_options[] __initdata = {
+} ssb_mitigation_options[] __initconst = {
{ "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */
{ "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */
{ "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */
--
2.20.1
From: Andi Kleen <[email protected]>
LTO will happily inline __const_udelay everywhere it is used. Forcing
it noinline saves ~44k text in an LTO build.
13999560 1740864 1499136 17239560 1070e08 vmlinux-with-udelay-inline
13954764 1736768 1499136 17190668 1064f0c vmlinux-wo-udelay-inline
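(These are the usual size(1) columns: text, data, bss, dec, hex,
filename; the text column shrinks by 44796 bytes.)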
Even without LTO, I believe marking it noinline documents it correctly.
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/lib/delay.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index f5b7f1b3b6d7..b7375dc6898f 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -162,7 +162,7 @@ void __delay(unsigned long loops)
}
EXPORT_SYMBOL(__delay);
-void __const_udelay(unsigned long xloops)
+noinline void __const_udelay(unsigned long xloops)
{
unsigned long lpj = this_cpu_read(cpu_info.loops_per_jiffy) ? : loops_per_jiffy;
int d0;
--
2.20.1
From: Andi Kleen <[email protected]>
The "vide" inline assembler is only needed on 32bit kernels for old
32bit only CPUs. Ifdef it to not include it into 64bit kernels.
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/amd.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 1bcb489e07e7..fb6a64bd765f 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -82,12 +82,14 @@ static inline int wrmsrl_amd_safe(unsigned msr, unsigned long long val)
* performance at the same time..
*/
+#ifdef CONFIG_X86_32
extern __visible void vide(void);
__asm__(".text\n"
".globl vide\n"
".type vide, @function\n"
".align 4\n"
"vide: ret\n");
+#endif
static void init_amd_k5(struct cpuinfo_x86 *c)
{
--
2.20.1
From: Andi Kleen <[email protected]>
For LTO all top level assembler statements need to be global because
LTO might put them into a different assembler file than the
referencing C code.
To avoid making all the paravirt patch snippets global, replace them
with data containing the patch instructions. Since these are unlikely
to change this shouldn't be a significant maintenance burden.
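A before/after sketch with one snippet (the patch converts all of them
the same way):

    /* before: top-level asm emitting global start_/end_ labels */
    DEF_NATIVE(irq, irq_disable, "cli");

    /* after: the same instruction as plain data, no asm-level symbols */
    static const unsigned char patch_irq_irq_disable[] = { 0xfa }; /* cli */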
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/include/asm/paravirt_types.h | 6 +---
arch/x86/kernel/paravirt_patch_32.c | 33 +++++++++++----------
arch/x86/kernel/paravirt_patch_64.c | 42 +++++++++++++++------------
3 files changed, 42 insertions(+), 39 deletions(-)
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 2474e434a6f7..bb13e79d4344 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -367,12 +367,8 @@ extern struct paravirt_patch_template pv_ops;
#define paravirt_alt(insn_string) \
_paravirt_alt(insn_string, "%c[paravirt_typenum]", "%c[paravirt_clobber]")
-/* Simple instruction patching code. */
-#define NATIVE_LABEL(a,x,b) "\n\t.globl " a #x "_" #b "\n" a #x "_" #b ":\n\t"
-
#define DEF_NATIVE(ops, name, code) \
- __visible extern const char start_##ops##_##name[], end_##ops##_##name[]; \
- asm(NATIVE_LABEL("start_", ops, name) code NATIVE_LABEL("end_", ops, name))
+ const char start_##ops##_##name[] = code
unsigned paravirt_patch_ident_64(void *insnbuf, unsigned len);
unsigned paravirt_patch_default(u8 type, void *insnbuf,
diff --git a/arch/x86/kernel/paravirt_patch_32.c b/arch/x86/kernel/paravirt_patch_32.c
index de138d3912e4..9a649026d74c 100644
--- a/arch/x86/kernel/paravirt_patch_32.c
+++ b/arch/x86/kernel/paravirt_patch_32.c
@@ -2,14 +2,14 @@
#include <asm/paravirt.h>
#ifdef CONFIG_PARAVIRT_XXL
-DEF_NATIVE(irq, irq_disable, "cli");
-DEF_NATIVE(irq, irq_enable, "sti");
-DEF_NATIVE(irq, restore_fl, "push %eax; popf");
-DEF_NATIVE(irq, save_fl, "pushf; pop %eax");
-DEF_NATIVE(cpu, iret, "iret");
-DEF_NATIVE(mmu, read_cr2, "mov %cr2, %eax");
-DEF_NATIVE(mmu, write_cr3, "mov %eax, %cr3");
-DEF_NATIVE(mmu, read_cr3, "mov %cr3, %eax");
+static const unsigned char patch_irq_irq_disable[] = { 0xfa }; /* cli */
+static const unsigned char patch_irq_irq_enable[] = { 0xfb }; /* sti */
+static const unsigned char patch_irq_restore_fl[] = { 0x50, 0x9d }; /* push %eax; popf */
+static const unsigned char patch_irq_save_fl[] = { 0x9c, 0x58 }; /* pushf; pop %eax */
+static const unsigned char patch_cpu_iret[] = { 0xcf }; /* iret */
+static const unsigned char patch_mmu_read_cr2[] = { 0x0f, 0x20, 0xd0 }; /* mov %cr2, %eax */
+static const unsigned char patch_mmu_write_cr3[] = { 0x0f, 0x22, 0xd8 };/* mov %eax, %cr3 */
+static const unsigned char patch_mmu_read_cr3[] = { 0x0f, 0x20, 0xd8 }; /* mov %cr3, %eax */
unsigned paravirt_patch_ident_64(void *insnbuf, unsigned len)
{
@@ -19,8 +19,8 @@ unsigned paravirt_patch_ident_64(void *insnbuf, unsigned len)
#endif
#if defined(CONFIG_PARAVIRT_SPINLOCKS)
-DEF_NATIVE(lock, queued_spin_unlock, "movb $0, (%eax)");
-DEF_NATIVE(lock, vcpu_is_preempted, "xor %eax, %eax");
+static const unsigned char patch_lock_queued_spin_unlock[] = { 0xc6, 0x00, 0x00 }; /* movb $0, (%eax) */
+static const unsigned char patch_lock_vcpu_is_preempted[] = { 0x31, 0xc0 }; /* xor %eax, %eax */
#endif
extern bool pv_is_native_spin_unlock(void);
@@ -30,7 +30,8 @@ unsigned native_patch(u8 type, void *ibuf, unsigned long addr, unsigned len)
{
#define PATCH_SITE(ops, x) \
case PARAVIRT_PATCH(ops.x): \
- return paravirt_patch_insns(ibuf, len, start_##ops##_##x, end_##ops##_##x)
+ return paravirt_patch_insns(ibuf, len, \
+ patch_##ops##_##x, patch_##ops##_##x+sizeof(patch_##ops##_x));
switch (type) {
#ifdef CONFIG_PARAVIRT_XXL
@@ -47,15 +48,17 @@ unsigned native_patch(u8 type, void *ibuf, unsigned long addr, unsigned len)
case PARAVIRT_PATCH(lock.queued_spin_unlock):
if (pv_is_native_spin_unlock())
return paravirt_patch_insns(ibuf, len,
- start_lock_queued_spin_unlock,
- end_lock_queued_spin_unlock);
+ patch_lock_queued_spin_unlock,
+ patch_lock_queued_spin_unlock +
+ sizeof(patch_lock_queued_spin_unlock));
break;
case PARAVIRT_PATCH(lock.vcpu_is_preempted):
if (pv_is_native_vcpu_is_preempted())
return paravirt_patch_insns(ibuf, len,
- start_lock_vcpu_is_preempted,
- end_lock_vcpu_is_preempted);
+ patch_lock_vcpu_is_preempted,
+ patch_lock_vcpu_is_preempted +
+ sizeof(patch_lock_vcpu_is_preempted));
break;
#endif
diff --git a/arch/x86/kernel/paravirt_patch_64.c b/arch/x86/kernel/paravirt_patch_64.c
index 9d9e04b31077..fce6f54665d3 100644
--- a/arch/x86/kernel/paravirt_patch_64.c
+++ b/arch/x86/kernel/paravirt_patch_64.c
@@ -4,29 +4,30 @@
#include <linux/stringify.h>
#ifdef CONFIG_PARAVIRT_XXL
-DEF_NATIVE(irq, irq_disable, "cli");
-DEF_NATIVE(irq, irq_enable, "sti");
-DEF_NATIVE(irq, restore_fl, "pushq %rdi; popfq");
-DEF_NATIVE(irq, save_fl, "pushfq; popq %rax");
-DEF_NATIVE(mmu, read_cr2, "movq %cr2, %rax");
-DEF_NATIVE(mmu, read_cr3, "movq %cr3, %rax");
-DEF_NATIVE(mmu, write_cr3, "movq %rdi, %cr3");
-DEF_NATIVE(cpu, wbinvd, "wbinvd");
+static const unsigned char patch_irq_irq_disable[] = { 0xfa }; /* cli */
+static const unsigned char patch_irq_irq_enable[] = { 0xfb }; /* sti */
+static const unsigned char patch_irq_restore_fl[] = { 0x50, 0x9d}; /* pushq %rdi; popfq */
+static const unsigned char patch_irq_save_fl[] = { 0x9c, 0x58 }; /* pushfq; popq %rax */
+static const unsigned char patch_mmu_read_cr2[] = { 0x0f, 0x20, 0xd0 }; /* movq %cr2, %rax */
+static const unsigned char patch_mmu_read_cr3[] = { 0x0f, 0x22, 0xd8 }; /* movq %cr3, %rax */
+static const unsigned char patch_mmu_write_cr3[] = { 0x0f, 0x22, 0xdf }; /* movq %rdi, %cr3 */
+static const unsigned char patch_cpu_wbinvd[] = { 0x0f, 0x09 }; /* wbinvd */
-DEF_NATIVE(cpu, usergs_sysret64, "swapgs; sysretq");
-DEF_NATIVE(cpu, swapgs, "swapgs");
-DEF_NATIVE(, mov64, "mov %rdi, %rax");
+static const unsigned char patch_cpu_usergs_sysret64[] = { 0x0f, 0x01, 0xf8, 0x48, 0x0f, 0x07 };
+ /* swapgs; sysretq */
+static const unsigned char patch_cpu_swapgs[] = { 0x0f, 0x01, 0xf8 }; /* swapgs */
+static const unsigned char patch_mov64[] = { 0x48, 0x89, 0xf8 }; /* mov %rdi, %rax */
unsigned paravirt_patch_ident_64(void *insnbuf, unsigned len)
{
return paravirt_patch_insns(insnbuf, len,
- start__mov64, end__mov64);
+ start_mov64, start_mov64 + sizeof(start_mov64));
}
#endif
#if defined(CONFIG_PARAVIRT_SPINLOCKS)
-DEF_NATIVE(lock, queued_spin_unlock, "movb $0, (%rdi)");
-DEF_NATIVE(lock, vcpu_is_preempted, "xor %eax, %eax");
+static const unsigned char patch_lock_queued_spin_unlock[] = { 0xc6, 0x07, 0x00}; /* movb $0, (%rdi) */
+static const unsigned char patch_lock_vcpu_is_preempted[] = { 0x31, 0xc0 }; /* xor %eax, %eax */
#endif
extern bool pv_is_native_spin_unlock(void);
@@ -36,7 +37,8 @@ unsigned native_patch(u8 type, void *ibuf, unsigned long addr, unsigned len)
{
#define PATCH_SITE(ops, x) \
case PARAVIRT_PATCH(ops.x): \
- return paravirt_patch_insns(ibuf, len, start_##ops##_##x, end_##ops##_##x)
+ return paravirt_patch_insns(ibuf, len, start_##ops##_##x, \
+ patch_##ops##_##x + sizeof(patch_##ops##_##x));
switch (type) {
#ifdef CONFIG_PARAVIRT_XXL
@@ -55,15 +57,17 @@ unsigned native_patch(u8 type, void *ibuf, unsigned long addr, unsigned len)
case PARAVIRT_PATCH(lock.queued_spin_unlock):
if (pv_is_native_spin_unlock())
return paravirt_patch_insns(ibuf, len,
- start_lock_queued_spin_unlock,
- end_lock_queued_spin_unlock);
+ patch_lock_queued_spin_unlock,
+ patch_lock_queued_spin_unlock +
+ sizeof(patch_lock_queued_spin_unlock));
break;
case PARAVIRT_PATCH(lock.vcpu_is_preempted):
if (pv_is_native_vcpu_is_preempted())
return paravirt_patch_insns(ibuf, len,
- start_lock_vcpu_is_preempted,
- end_lock_vcpu_is_preempted);
+ patch_lock_vcpu_is_preempted,
+ patch_lock_vcpu_is_preempted +
+ sizeof(patch_lock_vcpu_is_preempted));
break;
#endif
--
2.20.1
On 30/03/2019 01:47, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> This function is referenced from assembler, so it needs to be marked
> __visible for LTO.
>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Juergen Gross <[email protected]>
Juergen
Commit-ID: 6ea26c21941cc313bc05f340019fe454900d21bd
Gitweb: https://git.kernel.org/tip/6ea26c21941cc313bc05f340019fe454900d21bd
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:41 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 19 Apr 2019 17:55:30 +0200
x86/kprobes: Make trampoline_handler() global and visible
This function is referenced from assembler, so in LTO builds it needs
to be global and visible so that it is not optimized away.
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/kprobes/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index a034cb808e7e..a7b2a5c4d969 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -751,7 +751,7 @@ STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
/*
* Called from kretprobe_trampoline
*/
-static __used void *trampoline_handler(struct pt_regs *regs)
+__used __visible void *trampoline_handler(struct pt_regs *regs)
{
struct kretprobe_instance *ri = NULL;
struct hlist_head *head, empty_rp;
Commit-ID: c03e27506a564ec7db1b179e7464835901f49751
Gitweb: https://git.kernel.org/tip/c03e27506a564ec7db1b179e7464835901f49751
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:35 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 19 Apr 2019 17:46:55 +0200
x86/asm: Mark all top level asm statements as .text
With gcc, toplevel assembler statements that do not mark themselves as .text
may end up in other sections. This causes LTO boot crashes because various
assembler statements ended up in the middle of the initcall section. It's
also a latent problem without LTO, although it's currently not known to
cause any real problems.
According to the gcc team, this is expected behavior.
Always mark all the top level assembler statements as text so that they
switch to the right section.
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/amd.c | 3 ++-
arch/x86/kernel/kprobes/core.c | 1 +
arch/x86/lib/error-inject.c | 1 +
3 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 01004bfb1a1b..1bcb489e07e7 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -83,7 +83,8 @@ static inline int wrmsrl_amd_safe(unsigned msr, unsigned long long val)
*/
extern __visible void vide(void);
-__asm__(".globl vide\n"
+__asm__(".text\n"
+ ".globl vide\n"
".type vide, @function\n"
".align 4\n"
"vide: ret\n");
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index a034cb808e7e..31ab91c9c4e9 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -715,6 +715,7 @@ NOKPROBE_SYMBOL(kprobe_int3_handler);
* calls trampoline_handler() runs, which calls the kretprobe's handler.
*/
asm(
+ ".text\n"
".global kretprobe_trampoline\n"
".type kretprobe_trampoline, @function\n"
"kretprobe_trampoline:\n"
diff --git a/arch/x86/lib/error-inject.c b/arch/x86/lib/error-inject.c
index 3cdf06128d13..be5b5fb1598b 100644
--- a/arch/x86/lib/error-inject.c
+++ b/arch/x86/lib/error-inject.c
@@ -6,6 +6,7 @@
asmlinkage void just_return_func(void);
asm(
+ ".text\n"
".type just_return_func, @function\n"
".globl just_return_func\n"
"just_return_func:\n"
Commit-ID: 02143c2931c3c0faf088c5859a10de6c2b4f2d96
Gitweb: https://git.kernel.org/tip/02143c2931c3c0faf088c5859a10de6c2b4f2d96
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:40 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 19 Apr 2019 17:58:57 +0200
x86/hyperv: Make hv_vcpu_is_preempted() visible
This function is referenced from assembler, so it needs to be marked
visible for LTO.
Fixes: 3a025de64bf8 ("x86/hyperv: Enable PV qspinlock for Hyper-V")
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Yi Sun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/hyperv/hv_spinlock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
index a861b0456b1a..07f21a06392f 100644
--- a/arch/x86/hyperv/hv_spinlock.c
+++ b/arch/x86/hyperv/hv_spinlock.c
@@ -56,7 +56,7 @@ static void hv_qlock_wait(u8 *byte, u8 val)
/*
* Hyper-V does not support this so far.
*/
-bool hv_vcpu_is_preempted(int vcpu)
+__visible bool hv_vcpu_is_preempted(int vcpu)
{
return false;
}
Commit-ID: 1de7edbb59c8f1b46071f66c5c97b8a59569eb51
Gitweb: https://git.kernel.org/tip/1de7edbb59c8f1b46071f66c5c97b8a59569eb51
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:43 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 19 Apr 2019 17:11:39 +0200
x86/cpu/bugs: Use __initconst for 'const' init data
Some of the recently added const tables use __initdata, which causes section
attribute conflicts.
Use __initconst instead.
Fixes: fa1202ef2243 ("x86/speculation: Add command line control")
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/bugs.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 2da82eff0eb4..b91b3bfa5cfb 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -275,7 +275,7 @@ static const struct {
const char *option;
enum spectre_v2_user_cmd cmd;
bool secure;
-} v2_user_options[] __initdata = {
+} v2_user_options[] __initconst = {
{ "auto", SPECTRE_V2_USER_CMD_AUTO, false },
{ "off", SPECTRE_V2_USER_CMD_NONE, false },
{ "on", SPECTRE_V2_USER_CMD_FORCE, true },
@@ -419,7 +419,7 @@ static const struct {
const char *option;
enum spectre_v2_mitigation_cmd cmd;
bool secure;
-} mitigation_options[] __initdata = {
+} mitigation_options[] __initconst = {
{ "off", SPECTRE_V2_CMD_NONE, false },
{ "on", SPECTRE_V2_CMD_FORCE, true },
{ "retpoline", SPECTRE_V2_CMD_RETPOLINE, false },
@@ -658,7 +658,7 @@ static const char * const ssb_strings[] = {
static const struct {
const char *option;
enum ssb_mitigation_cmd cmd;
-} ssb_mitigation_options[] __initdata = {
+} ssb_mitigation_options[] __initconst = {
{ "auto", SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */
{ "on", SPEC_STORE_BYPASS_CMD_ON }, /* Disable Speculative Store Bypass */
{ "off", SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */
Commit-ID: 26b31f46f036ad89de20cbbb732b76289411eddb
Gitweb: https://git.kernel.org/tip/26b31f46f036ad89de20cbbb732b76289411eddb
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:36 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 19 Apr 2019 17:47:35 +0200
x86/cpu/amd: Exclude 32bit only assembler from 64bit build
The "vide" inline assembler is only needed on 32bit kernels for old
32bit only CPUs.
Guard it with an #ifdef so it's not included in 64bit builds.
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/cpu/amd.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 1bcb489e07e7..fb6a64bd765f 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -82,12 +82,14 @@ static inline int wrmsrl_amd_safe(unsigned msr, unsigned long long val)
* performance at the same time..
*/
+#ifdef CONFIG_X86_32
extern __visible void vide(void);
__asm__(".text\n"
".globl vide\n"
".type vide, @function\n"
".align 4\n"
"vide: ret\n");
+#endif
static void init_amd_k5(struct cpuinfo_x86 *c)
{
Commit-ID: 81423c37415fe45057d64196ae0ce8e17a9c7148
Gitweb: https://git.kernel.org/tip/81423c37415fe45057d64196ae0ce8e17a9c7148
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:38 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 19 Apr 2019 17:49:47 +0200
x86/timer: Don't inline __const_udelay()
LTO will happily inline __const_udelay() everywhere it is used. Forcing it
noinline saves ~44k text in a LTO build.
13999560 1740864 1499136 17239560 1070e08 vmlinux-with-udelay-inline
13954764 1736768 1499136 17190668 1064f0c vmlinux-wo-udelay-inline
Even without LTO, this function should never be inlined.
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/lib/delay.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index f5b7f1b3b6d7..b7375dc6898f 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -162,7 +162,7 @@ void __delay(unsigned long loops)
}
EXPORT_SYMBOL(__delay);
-void __const_udelay(unsigned long xloops)
+noinline void __const_udelay(unsigned long xloops)
{
unsigned long lpj = this_cpu_read(cpu_info.loops_per_jiffy) ? : loops_per_jiffy;
int d0;
Commit-ID: d3748c8533f576969837f11e69c39d3f080c0e2b
Gitweb: https://git.kernel.org/tip/d3748c8533f576969837f11e69c39d3f080c0e2b
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:40 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 19 Apr 2019 17:52:49 +0200
x86/hyperv: Make hv_vcpu_is_preempted() visible
This function is referenced from assembler, so it needs to be marked
visible for LTO.
Fixes: 3a025de64bf8 ("x86/hyperv: Enable PV qspinlock for Hyper-V")
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Yi Sun <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/hyperv/hv_spinlock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/hyperv/hv_spinlock.c b/arch/x86/hyperv/hv_spinlock.c
index a861b0456b1a..07f21a06392f 100644
--- a/arch/x86/hyperv/hv_spinlock.c
+++ b/arch/x86/hyperv/hv_spinlock.c
@@ -56,7 +56,7 @@ static void hv_qlock_wait(u8 *byte, u8 val)
/*
* Hyper-V does not support this so far.
*/
-bool hv_vcpu_is_preempted(int vcpu)
+__visible bool hv_vcpu_is_preempted(int vcpu)
{
return false;
}
Commit-ID: 14e581c381b942ce5463a7e61326d8ce1c843be7
Gitweb: https://git.kernel.org/tip/14e581c381b942ce5463a7e61326d8ce1c843be7
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:42 -0700
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 19 Apr 2019 17:58:57 +0200
x86/kvm: Make steal_time visible
This per-CPU variable is accessed from assembler code, so it needs
to be visible for LTO.
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/kvm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 5c93a65ee1e5..3f0cc828cc36 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -67,7 +67,7 @@ static int __init parse_no_stealacc(char *arg)
early_param("no-steal-acc", parse_no_stealacc);
static DEFINE_PER_CPU_DECRYPTED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
-static DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64);
+DEFINE_PER_CPU_DECRYPTED(struct kvm_steal_time, steal_time) __aligned(64) __visible;
static int has_steal_clock = 0;
/*
On Fri, 29 Mar 2019, Andi Kleen wrote:
> For LTO all top level assembler statements need to be global because
> LTO might put them into a different assembler file than the referencing
> C code.
>
> To avoid making all the paravirt patch snippets global, replace them
> with data containing the patch instructions. Since these are unlikely
> to change this shouldn't be a significant maintenance burden.
s/with data containing/with unparseable, inconsistent and broken mess/
Unparseable:
------------
> +static const unsigned char patch_irq_save_fl[] = { 0x9c, 0x58 }; /* pushf; pop %eax */
> +static const unsigned char patch_cpu_iret[] = { 0xcf }; /* iret */
> +static const unsigned char patch_mmu_read_cr2[] = { 0x0f, 0x20, 0xd0 }; /* mov %cr2, %eax */
> +static const unsigned char patch_mmu_write_cr3[] = { 0x0f, 0x22, 0xd8 };/* mov %eax, %cr3 */
> +static const unsigned char patch_mmu_read_cr3[] = { 0x0f, 0x20, 0xd8 }; /* mov %cr3, %eax */
Overlong lines, spaces and tabs mixed, no formatting which allows easy
reading and review.
Inconsistent and unparseable:
-----------------------------
> #define PATCH_SITE(ops, x) \
> case PARAVIRT_PATCH(ops.x): \
> - return paravirt_patch_insns(ibuf, len, start_##ops##_##x, end_##ops##_##x)
> + return paravirt_patch_insns(ibuf, len, \
> + patch_##ops##_##x, patch_##ops##_##x+sizeof(patch_##ops##_x));
vs.
> + return paravirt_patch_insns(ibuf, len, start_##ops##_##x, \
> + patch_##ops##_##x + sizeof(patch_##ops##_##x));
Broken:
-------
> +static const unsigned char patch_irq_restore_fl[] = { 0x50, 0x9d}; /* pushq %rdi; popfq */
Can you spot the fail?
That probably works because Intel CPUs are so good at executing crappy
code, right? That's at least what you told me recently. If that's the proof
then I should really stop reviewing patches.
Thanks,
tglx
Commit-ID: 0e72499c3cc0cead32f88b94a02204d2b80768bf
Gitweb: https://git.kernel.org/tip/0e72499c3cc0cead32f88b94a02204d2b80768bf
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 29 Mar 2019 17:47:41 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 8 May 2019 13:13:58 +0200
x86/kprobes: Make trampoline_handler() global and visible
This function is referenced from assembler, so in LTO builds it needs
to be global and visible so that it is not optimized away.
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index cf52ee0d8711..9e4fa2484d10 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -768,7 +768,7 @@ static struct kprobe kretprobe_kprobe = {
/*
* Called from kretprobe_trampoline
*/
-static __used void *trampoline_handler(struct pt_regs *regs)
+__used __visible void *trampoline_handler(struct pt_regs *regs)
{
struct kprobe_ctlblk *kcb;
struct kretprobe_instance *ri = NULL;