2017-08-29 17:51:48

by Josh Poimboeuf

Subject: [PATCH] objtool: Handle GCC stack pointer adjustment bug

Arnd Bergmann reported the following warning with GCC 7.1.1:

fs/fs_pin.o: warning: objtool: pin_kill()+0x139: stack state mismatch: cfa1=7+88 cfa2=7+96

And the kbuild robot reported the following warnings with GCC 5.4.1:

fs/fs_pin.o: warning: objtool: pin_kill()+0x182: return with modified stack frame
fs/quota/dquot.o: warning: objtool: dquot_alloc_inode()+0x140: stack state mismatch: cfa1=7+120 cfa2=7+128
fs/quota/dquot.o: warning: objtool: dquot_free_inode()+0x11a: stack state mismatch: cfa1=7+112 cfa2=7+120

Those warnings are caused by an unusual GCC non-optimization where it
uses an intermediate register to adjust the stack pointer. It does:

lea 0x8(%rsp), %rcx
...
mov %rcx, %rsp

Instead of the obvious:

add $0x8, %rsp

It makes no sense to use an intermediate register, so I opened a GCC bug
to track it:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81813

But it's not exactly a high-priority bug and it looks like we'll be
stuck with this issue for a while. So for now we have to track register
values when they're loaded with stack pointer offsets.
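
As a rough illustration of that tracking, here is a minimal standalone
sketch. It is not the objtool code: the names (reg_val, track_state,
track_lea_sp, track_mov_to_sp), the register numbering and the 96-byte
starting stack size are made up for the example. Only the two offset
formulas are taken from the check.c hunks of this patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for objtool's types, for illustration only. */
enum { REG_CX = 2, NUM_REGS = 16 };
enum { BASE_UNDEFINED = -1, BASE_CFA = -2 };

struct reg_val {
	int base;	/* BASE_CFA if the register holds a CFA-relative address */
	int offset;	/* offset of that address from the CFA */
};

struct track_state {
	int cfa_offset;			/* CFA == %rsp + cfa_offset */
	int stack_size;
	struct reg_val vals[NUM_REGS];
};

static void track_init(struct track_state *s, int stack_size)
{
	s->cfa_offset = stack_size;
	s->stack_size = stack_size;
	for (int i = 0; i < NUM_REGS; i++)
		s->vals[i].base = BASE_UNDEFINED;
}

/* lea disp(%rsp), %reg: remember that reg == CFA - stack_size + disp */
static void track_lea_sp(struct track_state *s, int dest, int disp)
{
	assert(dest >= 0 && dest < NUM_REGS);
	s->vals[dest].base = BASE_CFA;
	s->vals[dest].offset = -s->stack_size + disp;
}

/*
 * mov %reg, %rsp: if reg holds a known CFA-relative value, the new CFA
 * offset follows from it.  Returns 0 on success, -1 if the value of the
 * source register was never tracked.
 */
static int track_mov_to_sp(struct track_state *s, int src)
{
	assert(src >= 0 && src < NUM_REGS);
	if (s->vals[src].base != BASE_CFA)
		return -1;

	s->cfa_offset = -s->vals[src].offset;
	s->stack_size = s->cfa_offset;
	return 0;
}
```

Starting from a 96-byte stack, "lea 0x8(%rsp), %rcx" records %rcx as
CFA-88, and the later "mov %rcx, %rsp" sets the CFA offset to 88. That
is the same 8-byte gap as in the "cfa1=7+88 cfa2=7+96" warning above,
assuming this is the path that produced it.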

This is kind of a big workaround for a tiny problem, but c'est la vie.
I hope to eventually create a GCC plugin to implement a big chunk of
objtool's functionality. Hopefully at that point we'll be able to
remove a lot of these GCC-isms from the objtool code.

Reported-by: Arnd Bergmann <[email protected]>
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
---
tools/objtool/arch/x86/decode.c | 94 ++++++++++++-----------------------------
tools/objtool/cfi.h | 2 +-
tools/objtool/check.c | 81 ++++++++++++++++++++++++++---------
tools/objtool/check.h | 1 +
4 files changed, 88 insertions(+), 90 deletions(-)
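
For readers not fluent in x86 encodings, the decode.c hunks below match
on the REX prefix, ModRM and SIB bytes. The following standalone sketch
shows the same two checks in isolation. The helper names (enc_fields,
split_fields, modrm_reg_number, is_lea_from_rsp, is_mov_reg_to_rsp) are
hypothetical and are not part of objtool, and unlike the patch the
sketch returns the raw hardware register number instead of going through
op_to_cfi_reg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit fields of the REX prefix and ModRM byte (names are illustrative). */
struct enc_fields {
	unsigned int rex_w, rex_r, rex_x, rex_b;
	unsigned int mod, reg, rm;
};

static struct enc_fields split_fields(uint8_t rex, uint8_t modrm)
{
	struct enc_fields f;

	/* A REX prefix, when present, is always in the range 0x40-0x4f. */
	assert(rex == 0 || (rex & 0xf0) == 0x40);

	f.rex_w = (rex >> 3) & 1;
	f.rex_r = (rex >> 2) & 1;
	f.rex_x = (rex >> 1) & 1;
	f.rex_b = rex & 1;
	f.mod = modrm >> 6;
	f.reg = (modrm >> 3) & 7;
	f.rm = modrm & 7;
	return f;
}

/* Hardware number (0-15) of the register named by the ModRM "reg" field. */
static unsigned int modrm_reg_number(uint8_t rex, uint8_t modrm)
{
	struct enc_fields f = split_fields(rex, modrm);

	return f.reg | (f.rex_r << 3);
}

/*
 * lea disp(%rsp), reg
 *
 * SIB 0x24 means scale 0, no index, base %rsp, provided that neither
 * REX.X nor REX.B extends the index/base fields.
 */
static bool is_lea_from_rsp(uint8_t rex, uint8_t opcode, uint8_t modrm,
			    uint8_t sib)
{
	struct enc_fields f = split_fields(rex, modrm);
	bool has_sib = f.mod != 3 && f.rm == 4;

	return opcode == 0x8d && has_sib && sib == 0x24 &&
	       f.rex_w && !f.rex_b && !f.rex_x;
}

/* mov reg, %rsp: register-direct ModRM (mod == 3) with rm == %rsp */
static bool is_mov_reg_to_rsp(uint8_t rex, uint8_t opcode, uint8_t modrm)
{
	struct enc_fields f = split_fields(rex, modrm);

	return opcode == 0x89 && f.rex_w && !f.rex_b &&
	       f.mod == 3 && f.rm == 4;
}
```

For example, the bytes 48 8d 4c 24 08 encode "lea 0x8(%rsp), %rcx" and
48 89 cc encodes "mov %rcx, %rsp", so the first sequence satisfies
is_lea_from_rsp() with register number 1 (%rcx) and the second satisfies
is_mov_reg_to_rsp().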

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 7841e5d31973..0e8c8ec4fd4e 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -86,8 +86,8 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
struct insn insn;
int x86_64, sign;
unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
- modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
- sib = 0;
+ rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0,
+ modrm_reg = 0, sib = 0;

x86_64 = is_x86_64(elf);
if (x86_64 == -1)
@@ -114,6 +114,7 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
rex = insn.rex_prefix.bytes[0];
rex_w = X86_REX_W(rex) >> 3;
rex_r = X86_REX_R(rex) >> 2;
+ rex_x = X86_REX_X(rex) >> 1;
rex_b = X86_REX_B(rex);
}

@@ -217,6 +218,18 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
op->dest.reg = CFI_BP;
break;
}
+
+ if (rex_w && !rex_b && modrm_mod == 3 && modrm_rm == 4) {
+
+ /* mov reg, %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
/* fallthrough */
case 0x88:
if (!rex_b &&
@@ -269,80 +282,28 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
break;

case 0x8d:
- if (rex == 0x48 && modrm == 0x65) {
+ if (sib == 0x24 && rex_w && !rex_b && !rex_x) {

- /* lea disp(%rbp), %rsp */
+ /* lea disp(%rsp), reg */
*type = INSN_STACK;
op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_BP;
+ op->src.reg = CFI_SP;
op->src.offset = insn.displacement.value;
op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
- break;
- }
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];

- if (rex == 0x48 && (modrm == 0xa4 || modrm == 0x64) &&
- sib == 0x24) {
+ } else if (rex == 0x48 && modrm == 0x65) {

- /* lea disp(%rsp), %rsp */
+ /* lea disp(%rbp), %rsp */
*type = INSN_STACK;
op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_SP;
+ op->src.reg = CFI_BP;
op->src.offset = insn.displacement.value;
op->dest.type = OP_DEST_REG;
op->dest.reg = CFI_SP;
- break;
- }

- if (rex == 0x48 && modrm == 0x2c && sib == 0x24) {
-
- /* lea (%rsp), %rbp */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG;
- op->src.reg = CFI_SP;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_BP;
- break;
- }
-
- if (rex == 0x4c && modrm == 0x54 && sib == 0x24 &&
- insn.displacement.value == 8) {
-
- /*
- * lea 0x8(%rsp), %r10
- *
- * Here r10 is the "drap" pointer, used as a stack
- * pointer helper when the stack gets realigned.
- */
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_SP;
- op->src.offset = 8;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_R10;
- break;
- }
-
- if (rex == 0x4c && modrm == 0x6c && sib == 0x24 &&
- insn.displacement.value == 16) {
-
- /*
- * lea 0x10(%rsp), %r13
- *
- * Here r13 is the "drap" pointer, used as a stack
- * pointer helper when the stack gets realigned.
- */
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_SP;
- op->src.offset = 16;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_R13;
- break;
- }
-
- if (rex == 0x49 && modrm == 0x62 &&
- insn.displacement.value == -8) {
+ } else if (rex == 0x49 && modrm == 0x62 &&
+ insn.displacement.value == -8) {

/*
* lea -0x8(%r10), %rsp
@@ -356,11 +317,9 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
op->src.offset = -8;
op->dest.type = OP_DEST_REG;
op->dest.reg = CFI_SP;
- break;
- }

- if (rex == 0x49 && modrm == 0x65 &&
- insn.displacement.value == -16) {
+ } else if (rex == 0x49 && modrm == 0x65 &&
+ insn.displacement.value == -16) {

/*
* lea -0x10(%r13), %rsp
@@ -374,7 +333,6 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
op->src.offset = -16;
op->dest.type = OP_DEST_REG;
op->dest.reg = CFI_SP;
- break;
}

break;
diff --git a/tools/objtool/cfi.h b/tools/objtool/cfi.h
index 443ab2c69992..2fe883c665c7 100644
--- a/tools/objtool/cfi.h
+++ b/tools/objtool/cfi.h
@@ -40,7 +40,7 @@
#define CFI_R14 14
#define CFI_R15 15
#define CFI_RA 16
-#define CFI_NUM_REGS 17
+#define CFI_NUM_REGS 17

struct cfi_reg {
int base;
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 3dffeb944523..f744617c9946 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -218,8 +218,10 @@ static void clear_insn_state(struct insn_state *state)

memset(state, 0, sizeof(*state));
state->cfa.base = CFI_UNDEFINED;
- for (i = 0; i < CFI_NUM_REGS; i++)
+ for (i = 0; i < CFI_NUM_REGS; i++) {
state->regs[i].base = CFI_UNDEFINED;
+ state->vals[i].base = CFI_UNDEFINED;
+ }
state->drap_reg = CFI_UNDEFINED;
state->drap_offset = -1;
}
@@ -1201,24 +1203,47 @@ static int update_insn_state(struct instruction *insn, struct insn_state *state)
switch (op->src.type) {

case OP_SRC_REG:
- if (cfa->base == op->src.reg && cfa->base == CFI_SP &&
- op->dest.reg == CFI_BP && regs[CFI_BP].base == CFI_CFA &&
- regs[CFI_BP].offset == -cfa->offset) {
-
- /* mov %rsp, %rbp */
- cfa->base = op->dest.reg;
- state->bp_scratch = false;
- } else if (state->drap) {
-
- /* drap: mov %rsp, %rbp */
- regs[CFI_BP].base = CFI_BP;
- regs[CFI_BP].offset = -state->stack_size;
- state->bp_scratch = false;
- } else if (!no_fp) {
-
- WARN_FUNC("unknown stack-related register move",
- insn->sec, insn->offset);
- return -1;
+ if (op->src.reg == CFI_SP && op->dest.reg == CFI_BP) {
+
+ if (cfa->base == CFI_SP &&
+ regs[CFI_BP].base == CFI_CFA &&
+ regs[CFI_BP].offset == -cfa->offset) {
+
+ /* mov %rsp, %rbp */
+ cfa->base = op->dest.reg;
+ state->bp_scratch = false;
+ }
+
+ else if (state->drap) {
+
+ /* drap: mov %rsp, %rbp */
+ regs[CFI_BP].base = CFI_BP;
+ regs[CFI_BP].offset = -state->stack_size;
+ state->bp_scratch = false;
+ }
+ }
+
+ else if (op->dest.reg == cfa->base) {
+
+ /* mov %reg, %rsp */
+ if (cfa->base == CFI_SP &&
+ state->vals[op->src.reg].base == CFI_CFA) {
+
+ /*
+ * This is needed for the rare case
+ * where GCC does something dumb like:
+ *
+ * lea 0x8(%rsp), %rcx
+ * ...
+ * mov %rcx, %rsp
+ */
+ cfa->offset = -state->vals[op->src.reg].offset;
+ state->stack_size = cfa->offset;
+
+ } else {
+ cfa->base = CFI_UNDEFINED;
+ cfa->offset = 0;
+ }
}

break;
@@ -1240,11 +1265,25 @@ static int update_insn_state(struct instruction *insn, struct insn_state *state)
break;
}

- if (op->dest.reg != CFI_BP && op->src.reg == CFI_SP &&
- cfa->base == CFI_SP) {
+ if (op->src.reg == CFI_SP && cfa->base == CFI_SP) {

/* drap: lea disp(%rsp), %drap */
state->drap_reg = op->dest.reg;
+
+ /*
+ * lea disp(%rsp), %reg
+ *
+ * This is needed for the rare case where GCC
+ * does something dumb like:
+ *
+ * lea 0x8(%rsp), %rcx
+ * ...
+ * mov %rcx, %rsp
+ */
+ state->vals[op->dest.reg].base = CFI_CFA;
+ state->vals[op->dest.reg].offset = \
+ -state->stack_size + op->src.offset;
+
break;
}

diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 9f113016bf8c..47d9ea70a83d 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -33,6 +33,7 @@ struct insn_state {
bool bp_scratch;
bool drap;
int drap_reg, drap_offset;
+ struct cfi_reg vals[CFI_NUM_REGS];
};

struct instruction {
--
2.13.5


Subject: [tip:x86/asm] objtool: Handle GCC stack pointer adjustment bug

Commit-ID: dd88a0a0c8615417fe6b4285769b5b772de87279
Gitweb: http://git.kernel.org/tip/dd88a0a0c8615417fe6b4285769b5b772de87279
Author: Josh Poimboeuf <[email protected]>
AuthorDate: Tue, 29 Aug 2017 12:51:03 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 30 Aug 2017 10:48:41 +0200

objtool: Handle GCC stack pointer adjustment bug

Arnd Bergmann reported the following warning with GCC 7.1.1:

fs/fs_pin.o: warning: objtool: pin_kill()+0x139: stack state mismatch: cfa1=7+88 cfa2=7+96

And the kbuild robot reported the following warnings with GCC 5.4.1:

fs/fs_pin.o: warning: objtool: pin_kill()+0x182: return with modified stack frame
fs/quota/dquot.o: warning: objtool: dquot_alloc_inode()+0x140: stack state mismatch: cfa1=7+120 cfa2=7+128
fs/quota/dquot.o: warning: objtool: dquot_free_inode()+0x11a: stack state mismatch: cfa1=7+112 cfa2=7+120

Those warnings are caused by an unusual GCC non-optimization where it
uses an intermediate register to adjust the stack pointer. It does:

lea 0x8(%rsp), %rcx
...
mov %rcx, %rsp

Instead of the obvious:

add $0x8, %rsp

It makes no sense to use an intermediate register, so I opened a GCC bug
to track it:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81813

But it's not exactly a high-priority bug and it looks like we'll be
stuck with this issue for a while. So for now we have to track register
values when they're loaded with stack pointer offsets.

This is kind of a big workaround for a tiny problem, but c'est la vie.
I hope to eventually create a GCC plugin to implement a big chunk of
objtool's functionality. Hopefully at that point we'll be able to
remove a lot of these GCC-isms from the objtool code.

Reported-by: Arnd Bergmann <[email protected]>
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/6a41a96884c725e7f05413bb7df40cfe824b2444.1504028945.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
---
tools/objtool/arch/x86/decode.c | 94 ++++++++++++-----------------------------
tools/objtool/cfi.h | 2 +-
tools/objtool/check.c | 81 ++++++++++++++++++++++++++---------
tools/objtool/check.h | 1 +
4 files changed, 88 insertions(+), 90 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 7841e5d..0e8c8ec 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -86,8 +86,8 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
struct insn insn;
int x86_64, sign;
unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
- modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
- sib = 0;
+ rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0,
+ modrm_reg = 0, sib = 0;

x86_64 = is_x86_64(elf);
if (x86_64 == -1)
@@ -114,6 +114,7 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
rex = insn.rex_prefix.bytes[0];
rex_w = X86_REX_W(rex) >> 3;
rex_r = X86_REX_R(rex) >> 2;
+ rex_x = X86_REX_X(rex) >> 1;
rex_b = X86_REX_B(rex);
}

@@ -217,6 +218,18 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
op->dest.reg = CFI_BP;
break;
}
+
+ if (rex_w && !rex_b && modrm_mod == 3 && modrm_rm == 4) {
+
+ /* mov reg, %rsp */
+ *type = INSN_STACK;
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ break;
+ }
+
/* fallthrough */
case 0x88:
if (!rex_b &&
@@ -269,80 +282,28 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
break;

case 0x8d:
- if (rex == 0x48 && modrm == 0x65) {
+ if (sib == 0x24 && rex_w && !rex_b && !rex_x) {

- /* lea disp(%rbp), %rsp */
+ /* lea disp(%rsp), reg */
*type = INSN_STACK;
op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_BP;
+ op->src.reg = CFI_SP;
op->src.offset = insn.displacement.value;
op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
- break;
- }
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];

- if (rex == 0x48 && (modrm == 0xa4 || modrm == 0x64) &&
- sib == 0x24) {
+ } else if (rex == 0x48 && modrm == 0x65) {

- /* lea disp(%rsp), %rsp */
+ /* lea disp(%rbp), %rsp */
*type = INSN_STACK;
op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_SP;
+ op->src.reg = CFI_BP;
op->src.offset = insn.displacement.value;
op->dest.type = OP_DEST_REG;
op->dest.reg = CFI_SP;
- break;
- }

- if (rex == 0x48 && modrm == 0x2c && sib == 0x24) {
-
- /* lea (%rsp), %rbp */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG;
- op->src.reg = CFI_SP;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_BP;
- break;
- }
-
- if (rex == 0x4c && modrm == 0x54 && sib == 0x24 &&
- insn.displacement.value == 8) {
-
- /*
- * lea 0x8(%rsp), %r10
- *
- * Here r10 is the "drap" pointer, used as a stack
- * pointer helper when the stack gets realigned.
- */
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_SP;
- op->src.offset = 8;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_R10;
- break;
- }
-
- if (rex == 0x4c && modrm == 0x6c && sib == 0x24 &&
- insn.displacement.value == 16) {
-
- /*
- * lea 0x10(%rsp), %r13
- *
- * Here r13 is the "drap" pointer, used as a stack
- * pointer helper when the stack gets realigned.
- */
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_SP;
- op->src.offset = 16;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_R13;
- break;
- }
-
- if (rex == 0x49 && modrm == 0x62 &&
- insn.displacement.value == -8) {
+ } else if (rex == 0x49 && modrm == 0x62 &&
+ insn.displacement.value == -8) {

/*
* lea -0x8(%r10), %rsp
@@ -356,11 +317,9 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
op->src.offset = -8;
op->dest.type = OP_DEST_REG;
op->dest.reg = CFI_SP;
- break;
- }

- if (rex == 0x49 && modrm == 0x65 &&
- insn.displacement.value == -16) {
+ } else if (rex == 0x49 && modrm == 0x65 &&
+ insn.displacement.value == -16) {

/*
* lea -0x10(%r13), %rsp
@@ -374,7 +333,6 @@ int arch_decode_instruction(struct elf *elf, struct section *sec,
op->src.offset = -16;
op->dest.type = OP_DEST_REG;
op->dest.reg = CFI_SP;
- break;
}

break;
diff --git a/tools/objtool/cfi.h b/tools/objtool/cfi.h
index 443ab2c..2fe883c 100644
--- a/tools/objtool/cfi.h
+++ b/tools/objtool/cfi.h
@@ -40,7 +40,7 @@
#define CFI_R14 14
#define CFI_R15 15
#define CFI_RA 16
-#define CFI_NUM_REGS 17
+#define CFI_NUM_REGS 17

struct cfi_reg {
int base;
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 3dffeb9..f744617 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -218,8 +218,10 @@ static void clear_insn_state(struct insn_state *state)

memset(state, 0, sizeof(*state));
state->cfa.base = CFI_UNDEFINED;
- for (i = 0; i < CFI_NUM_REGS; i++)
+ for (i = 0; i < CFI_NUM_REGS; i++) {
state->regs[i].base = CFI_UNDEFINED;
+ state->vals[i].base = CFI_UNDEFINED;
+ }
state->drap_reg = CFI_UNDEFINED;
state->drap_offset = -1;
}
@@ -1201,24 +1203,47 @@ static int update_insn_state(struct instruction *insn, struct insn_state *state)
switch (op->src.type) {

case OP_SRC_REG:
- if (cfa->base == op->src.reg && cfa->base == CFI_SP &&
- op->dest.reg == CFI_BP && regs[CFI_BP].base == CFI_CFA &&
- regs[CFI_BP].offset == -cfa->offset) {
-
- /* mov %rsp, %rbp */
- cfa->base = op->dest.reg;
- state->bp_scratch = false;
- } else if (state->drap) {
-
- /* drap: mov %rsp, %rbp */
- regs[CFI_BP].base = CFI_BP;
- regs[CFI_BP].offset = -state->stack_size;
- state->bp_scratch = false;
- } else if (!no_fp) {
-
- WARN_FUNC("unknown stack-related register move",
- insn->sec, insn->offset);
- return -1;
+ if (op->src.reg == CFI_SP && op->dest.reg == CFI_BP) {
+
+ if (cfa->base == CFI_SP &&
+ regs[CFI_BP].base == CFI_CFA &&
+ regs[CFI_BP].offset == -cfa->offset) {
+
+ /* mov %rsp, %rbp */
+ cfa->base = op->dest.reg;
+ state->bp_scratch = false;
+ }
+
+ else if (state->drap) {
+
+ /* drap: mov %rsp, %rbp */
+ regs[CFI_BP].base = CFI_BP;
+ regs[CFI_BP].offset = -state->stack_size;
+ state->bp_scratch = false;
+ }
+ }
+
+ else if (op->dest.reg == cfa->base) {
+
+ /* mov %reg, %rsp */
+ if (cfa->base == CFI_SP &&
+ state->vals[op->src.reg].base == CFI_CFA) {
+
+ /*
+ * This is needed for the rare case
+ * where GCC does something dumb like:
+ *
+ * lea 0x8(%rsp), %rcx
+ * ...
+ * mov %rcx, %rsp
+ */
+ cfa->offset = -state->vals[op->src.reg].offset;
+ state->stack_size = cfa->offset;
+
+ } else {
+ cfa->base = CFI_UNDEFINED;
+ cfa->offset = 0;
+ }
}

break;
@@ -1240,11 +1265,25 @@ static int update_insn_state(struct instruction *insn, struct insn_state *state)
break;
}

- if (op->dest.reg != CFI_BP && op->src.reg == CFI_SP &&
- cfa->base == CFI_SP) {
+ if (op->src.reg == CFI_SP && cfa->base == CFI_SP) {

/* drap: lea disp(%rsp), %drap */
state->drap_reg = op->dest.reg;
+
+ /*
+ * lea disp(%rsp), %reg
+ *
+ * This is needed for the rare case where GCC
+ * does something dumb like:
+ *
+ * lea 0x8(%rsp), %rcx
+ * ...
+ * mov %rcx, %rsp
+ */
+ state->vals[op->dest.reg].base = CFI_CFA;
+ state->vals[op->dest.reg].offset = \
+ -state->stack_size + op->src.offset;
+
break;
}

diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 9f11301..47d9ea7 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -33,6 +33,7 @@ struct insn_state {
bool bp_scratch;
bool drap;
int drap_reg, drap_offset;
+ struct cfi_reg vals[CFI_NUM_REGS];
};

struct instruction {

2017-08-30 20:14:16

by Josh Poimboeuf

Subject: Re: [tip:x86/asm] objtool: Handle GCC stack pointer adjustment bug

On Wed, Aug 30, 2017 at 12:23:24PM -0700, H. Peter Anvin wrote:
> On 08/30/17 02:43, tip-bot for Josh Poimboeuf wrote:
> >
> > Those warnings are caused by an unusual GCC non-optimization where it
> > uses an intermediate register to adjust the stack pointer. It does:
> >
> > lea 0x8(%rsp), %rcx
> > ...
> > mov %rcx, %rsp
> >
> > Instead of the obvious:
> >
> > add $0x8, %rsp
> >
> > It makes no sense to use an intermediate register, so I opened a GCC bug
> > to track it:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81813
> >
> > But it's not exactly a high-priority bug and it looks like we'll be
> > stuck with this issue for a while. So for now we have to track register
> > values when they're loaded with stack pointer offsets.
> >
>
> This seems like a good reason to try to extract this information from
> the DWARF data *if available*?

Well, I haven't ruled that out for the future, but in this case,
integrating DWARF would be a lot more work than this relatively simple
patch.

If we did go that route, it could be tricky deciding when to trust
DWARF vs. when to trust objtool's reverse engineering.

Another (vague) idea I'm thinking about is to write a GCC plugin which
annotates the object files in a way that would help objtool become more
GCC-ignorant. If it worked, this approach would be more powerful and
less error-prone than relying on DWARF.

Depending on how much work we can offload to the plugin, it might also
help make it easier to port objtool to other arches and compilers (e.g.,
clang).

I'm not 100% sold on that idea either, because it still requires objtool
to trust the compiler to some extent. But I think it would be worth it
because it would make the objtool code simpler, more portable, more
robust, and easier to maintain (so I don't always have to stay on top of
all of GCC's latest optimizations).

In the meantime, objtool's current design is working fine (for now). I
haven't found any issues it can't handle (yet).

--
Josh

2017-08-30 23:47:08

by H. Peter Anvin

Subject: Re: [tip:x86/asm] objtool: Handle GCC stack pointer adjustment bug

On 08/30/17 13:14, Josh Poimboeuf wrote:
> On Wed, Aug 30, 2017 at 12:23:24PM -0700, H. Peter Anvin wrote:
>> On 08/30/17 02:43, tip-bot for Josh Poimboeuf wrote:
>>>
>>> Those warnings are caused by an unusual GCC non-optimization where it
>>> uses an intermediate register to adjust the stack pointer. It does:
>>>
>>> lea 0x8(%rsp), %rcx
>>> ...
>>> mov %rcx, %rsp
>>>
>>> Instead of the obvious:
>>>
>>> add $0x8, %rsp
>>>
>>> It makes no sense to use an intermediate register, so I opened a GCC bug
>>> to track it:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81813
>>>
>>> But it's not exactly a high-priority bug and it looks like we'll be
>>> stuck with this issue for a while. So for now we have to track register
>>> values when they're loaded with stack pointer offsets.
>>>
>>
>> This seems like a good reason to try to extract this information from
>> the DWARF data *if available*?
>
> Well, I haven't ruled that out for the future, but in this case,
> integrating DWARF would be a lot more work than this relatively simple
> patch.
>
> If we did go that route, it could be tricky deciding when to trust
> DWARF vs. when to trust objtool's reverse engineering.
>
> Another (vague) idea I'm thinking about is to write a GCC plugin which
> annotates the object files in a way that would help objtool become more
> GCC-ignorant. If it worked, this approach would be more powerful and
> less error-prone than relying on DWARF.
>
> Depending on how much work we can offload to the plugin, it might also
> help make it easier to port objtool to other arches and compilers (e.g.,
> clang).
>
> I'm not 100% sold on that idea either, because it still requires objtool
> to trust the compiler to some extent. But I think it would be worth it
> because it would make the objtool code simpler, more portable, more
> robust, and easier to maintain (so I don't always have to stay on top of
> all of GCC's latest optimizations).
>
> In the meantime, objtool's current design is working fine (for now). I
> haven't found any issues it can't handle (yet).
>

Reverse engineering this way is at least NP-complete, and quite possibly
undecidable. A gcc plugin would tie the kernel *way* harder to gcc than
it is now, and it seems incredibly unlikely that you would come up with
something simpler and more reliable than a DWARF parser. What you *can*
do, of course, is cross-correlate the two, and *way* more importantly,
you cover assembly.

-hpa

2017-08-31 04:42:12

by Josh Poimboeuf

Subject: Re: [tip:x86/asm] objtool: Handle GCC stack pointer adjustment bug

On Wed, Aug 30, 2017 at 04:39:42PM -0700, H. Peter Anvin wrote:
> On 08/30/17 13:14, Josh Poimboeuf wrote:
> > On Wed, Aug 30, 2017 at 12:23:24PM -0700, H. Peter Anvin wrote:
> >> On 08/30/17 02:43, tip-bot for Josh Poimboeuf wrote:
> >>>
> >>> Those warnings are caused by an unusual GCC non-optimization where it
> >>> uses an intermediate register to adjust the stack pointer. It does:
> >>>
> >>> lea 0x8(%rsp), %rcx
> >>> ...
> >>> mov %rcx, %rsp
> >>>
> >>> Instead of the obvious:
> >>>
> >>> add $0x8, %rsp
> >>>
> >>> It makes no sense to use an intermediate register, so I opened a GCC bug
> >>> to track it:
> >>>
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81813
> >>>
> >>> But it's not exactly a high-priority bug and it looks like we'll be
> >>> stuck with this issue for a while. So for now we have to track register
> >>> values when they're loaded with stack pointer offsets.
> >>>
> >>
> >> This seems like a good reason to try to extract this information from
> >> the DWARF data *if available*?
> >
> > Well, I haven't ruled that out for the future, but in this case,
> > integrating DWARF would be a lot more work than this relatively simple
> > patch.
> >
> > If we did go that route, it could be tricky deciding when to trust
> > DWARF vs. when to trust objtool's reverse engineering.
> >
> > Another (vague) idea I'm thinking about is to write a GCC plugin which
> > annotates the object files in a way that would help objtool become more
> > GCC-ignorant. If it worked, this approach would be more powerful and
> > less error-prone than relying on DWARF.
> >
> > Depending on how much work we can offload to the plugin, it might also
> > help make it easier to port objtool to other arches and compilers (e.g.,
> > clang).
> >
> > I'm not 100% sold on that idea either, because it still requires objtool
> > to trust the compiler to some extent. But I think it would be worth it
> > because it would make the objtool code simpler, more portable, more
> > robust, and easier to maintain (so I don't always have to stay on top of
> > all of GCC's latest optimizations).
> >
> > In the meantime, objtool's current design is working fine (for now). I
> > haven't found any issues it can't handle (yet).
> >
>
> Reverse engineering this way is at least NP-complete, and quite possibly
> undecidable.

Well, in practice, the reverse engineering already works very well.
Much to my complete surprise, admittedly. And there are ways to "prove"
its accuracy over time with runtime sanity checks.

But I think we can agree that it would be wise to improve the current
approach.

> A gcc plugin would tie the kernel *way* harder to gcc than it is now

Actually, I would expect that moving GCC-specific pieces to a GCC plugin
would *decouple* the kernel from GCC, because there would then be a
compiler-independent interface to objtool. A similar plugin could be
written for clang. (all with the usual disclaimer: "if it works", of
course)

And I would hope/expect that the plugin would be less affected by new
and unusual GCC optimizations than objtool currently is, because it
would be much easier for the plugin to understand those changes by just
reading the RTL, instead of us trying to blindly decipher each new
pattern we find in the object code.

> and it seems incredibly unlikely that you would come up with something
> simpler and more reliable than a DWARF parser.

I'm not so sure about that. I think it's just a matter of reading RTL
from a GCC plugin and providing some hints in a special section. Of
course, the devil's in the details.

> What you *can* do, of course, is cross-correlate the two, and *way*
> more importantly, you cover assembly.

But how do you decide when to trust DWARF and when not? We can't just
say "trust DWARF in everything but inline asm" because there's no way to
delineate inline asm just by reading the object code.

And I've heard some talk of buggy DWARF output in various versions of
GCC, about which we could do nothing if we blindly trusted
DWARF. I don't know if those reports are accurate, but we would be at
the mercy of tooling bugs. And in my experience, GCC doesn't seem to
prioritize the fixing of such "minor" issues (with the above patch being
an example).

There are also some control flow quirks which objtool struggles with,
for which a GCC plugin would be *so* much nicer. Examples: noreturn
functions, switch statement jump tables, DRAP stack alignments, sibling
calls, KASAN/UBSAN/gcov issues, unreachable instructions, cold/unlikely
subfunctions, unusual stack pointer update patterns. Those are the
things that keep me up at night.

--
Josh