2023-11-29 22:18:14

by Nathan Chancellor

[permalink] [raw]
Subject: [PATCH v3 0/3] x86: Support llvm-objdump in instruction decoder selftest

I have taken up this series from Sam to try and drive this forward.

Currently, the instruction decoder selftest does not work with
llvm-objdump because chkobjdump.awk is GNU binutils specific.
chkobjdump.awk can be eliminated altogether because the minimum
supported version of GNU binutils has been bumped to 2.25.

However, with chkobjdump.awk removed, the selftest does not actually
work properly with llvm-objdump:

$ make -skj"$(nproc)" LLVM=1 defconfig
$ scripts/config -e X86_DECODER_SELFTEST
$ make -skj"$(nproc)" LLVM=1 olddefconfig bzImage
...
arch/x86/tools/insn_decoder_test: warning: objdump says 0 bytes, but insn_get_length() says 2
...

Upon inspection it turned out llvm-objdump was formatting its output
differently, which caused objdump_reformat.awk to incorrectly output its
values.

After fixing that bug, a second one was seen where the instruction
"wait" was incorrectly matched with "fwait", which again caused
insn_decoder_test to fail.

Changes in v3:
- Further commit message and cover letter formatting and improvements.
- Add patch 3, which is the ultimate catalyst for the first two changes
- Link to v2: https://lore.kernel.org/r/[email protected]/

Changes in v2:
- Coding style commit message amendments
- Link to v1: https://lore.kernel.org/r/[email protected]/

---
Nathan Chancellor (1):
x86/tools: Remove chkobjdump.awk

Samuel Zeter (2):
x86/tools: objdump_reformat.awk: Ensure regex matches fwait
x86/tools: objdump_reformat.awk: Allow for spaces

arch/x86/tools/Makefile | 2 +-
arch/x86/tools/chkobjdump.awk | 34 ----------------------------------
arch/x86/tools/objdump_reformat.awk | 4 ++--
3 files changed, 3 insertions(+), 37 deletions(-)
---
base-commit: 2cc14f52aeb78ce3f29677c2de1f06c0e91471ab
change-id: 20231129-objdump-reformat-llvm-2d71b9f40ac2

Best regards,
--
Nathan Chancellor <[email protected]>


2023-11-29 22:18:16

by Nathan Chancellor

[permalink] [raw]
Subject: [PATCH v3 2/3] x86/tools: objdump_reformat.awk: Allow for spaces

From: Samuel Zeter <[email protected]>

GNU objdump and LLVM objdump have differing output formats.
Specifically, GNU objump will format its output as: address:<tab>hex,
whereas LLVM objdump displays its output as address:<space>hex.

objdump_reformat.awk incorrectly handles this discrepancy due to
the unexpected space and as a result insn_decoder_test fails, as
its input is garbled.

The instruction line being tokenized now handles a space and colon,
or tab delimiter.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1364
Acked-by: Masami Hiramatsu <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Kees Cook <[email protected]>
Signed-off-by: Samuel Zeter <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
---
arch/x86/tools/objdump_reformat.awk | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/tools/objdump_reformat.awk b/arch/x86/tools/objdump_reformat.awk
index 276e572a6f60..a4120d907277 100644
--- a/arch/x86/tools/objdump_reformat.awk
+++ b/arch/x86/tools/objdump_reformat.awk
@@ -22,7 +22,7 @@ BEGIN {
}

/^ *[0-9a-f]+:/ {
- if (split($0, field, "\t") < 3) {
+ if (split($0, field, /: |\t/) < 3) {
# This is a continuation of the same insn.
prev_hex = prev_hex field[2]
} else {

--
2.43.0

2023-11-29 22:18:17

by Nathan Chancellor

[permalink] [raw]
Subject: [PATCH v3 3/3] x86/tools: Remove chkobjdump.awk

This check is superfluous now that the minimum version of binutils to
build the kernel is 2.25. This also fixes an error seen with
llvm-objdump because it does not support '-v' prior to LLVM 13:

llvm-objdump: error: unknown argument '-v'

Closes: https://github.com/ClangBuiltLinux/linux/issues/1362
Link: https://github.com/llvm/llvm-project/commit/dde24a87c55f82d8c7b3bf3eafb10f2b9b2b9a01
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Kees Cook <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
---
arch/x86/tools/Makefile | 2 +-
arch/x86/tools/chkobjdump.awk | 34 ----------------------------------
2 files changed, 1 insertion(+), 35 deletions(-)

diff --git a/arch/x86/tools/Makefile b/arch/x86/tools/Makefile
index 90e820ac9771..7278e2545c35 100644
--- a/arch/x86/tools/Makefile
+++ b/arch/x86/tools/Makefile
@@ -17,7 +17,7 @@ reformatter = $(srctree)/arch/x86/tools/objdump_reformat.awk
chkobjdump = $(srctree)/arch/x86/tools/chkobjdump.awk

quiet_cmd_posttest = TEST $@
- cmd_posttest = ($(OBJDUMP) -v | $(AWK) -f $(chkobjdump)) || $(OBJDUMP) -d -j .text $(objtree)/vmlinux | $(AWK) -f $(reformatter) | $(obj)/insn_decoder_test $(posttest_64bit) $(posttest_verbose)
+ cmd_posttest = $(OBJDUMP) -d -j .text $(objtree)/vmlinux | $(AWK) -f $(reformatter) | $(obj)/insn_decoder_test $(posttest_64bit) $(posttest_verbose)

quiet_cmd_sanitytest = TEST $@
cmd_sanitytest = $(obj)/insn_sanity $(posttest_64bit) -m 1000000
diff --git a/arch/x86/tools/chkobjdump.awk b/arch/x86/tools/chkobjdump.awk
deleted file mode 100644
index a4cf678cf5c8..000000000000
--- a/arch/x86/tools/chkobjdump.awk
+++ /dev/null
@@ -1,34 +0,0 @@
-# GNU objdump version checker
-#
-# Usage:
-# objdump -v | awk -f chkobjdump.awk
-BEGIN {
- # objdump version 2.19 or later is OK for the test.
- od_ver = 2;
- od_sver = 19;
-}
-
-/^GNU objdump/ {
- verstr = ""
- gsub(/\(.*\)/, "");
- for (i = 3; i <= NF; i++)
- if (match($(i), "^[0-9]")) {
- verstr = $(i);
- break;
- }
- if (verstr == "") {
- printf("Warning: Failed to find objdump version number.\n");
- exit 0;
- }
- split(verstr, ver, ".");
- if (ver[1] > od_ver ||
- (ver[1] == od_ver && ver[2] >= od_sver)) {
- exit 1;
- } else {
- printf("Warning: objdump version %s is older than %d.%d\n",
- verstr, od_ver, od_sver);
- print("Warning: Skipping posttest.");
- # Logic is inverted, because we just skip test without error.
- exit 0;
- }
-}

--
2.43.0

2023-11-29 22:18:18

by Nathan Chancellor

[permalink] [raw]
Subject: [PATCH v3 1/3] x86/tools: objdump_reformat.awk: Ensure regex matches fwait

From: Samuel Zeter <[email protected]>

If there is "wait" mnemonic in the line being parsed, it is incorrectly
handled by the script, and an extra line of "fwait" in
objdump_reformat's output is inserted. As insn_decoder_test relies upon
the formatted output, the test fails.

This is reproducible when disassembling with llvm-objdump:

Pre-processed lines:
ffffffff81033e72: 9b wait
ffffffff81033e73: 48 c7 c7 89 50 42 82 movq

After objdump_reformat.awk:
ffffffff81033e72: 9b fwait
ffffffff81033e72: wait
ffffffff81033e73: 48 c7 c7 89 50 42 82 movq

The regex match now accepts spaces or tabs, along with the "fwait"
instruction.

Closes: https://github.com/ClangBuiltLinux/linux/issues/1364
Acked-by: Masami Hiramatsu <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Tested-by: Kees Cook <[email protected]>
Signed-off-by: Samuel Zeter <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
---
arch/x86/tools/objdump_reformat.awk | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/tools/objdump_reformat.awk b/arch/x86/tools/objdump_reformat.awk
index f418c91b71f0..276e572a6f60 100644
--- a/arch/x86/tools/objdump_reformat.awk
+++ b/arch/x86/tools/objdump_reformat.awk
@@ -12,7 +12,7 @@ BEGIN {
prev_hex = ""
prev_mnemonic = ""
bad_expr = "(\\(bad\\)|^rex|^.byte|^rep(z|nz)$|^lock$|^es$|^cs$|^ss$|^ds$|^fs$|^gs$|^data(16|32)$|^addr(16|32|64))"
- fwait_expr = "^9b "
+ fwait_expr = "^9b[ \t]*fwait"
fwait_str="9b\tfwait"
}


--
2.43.0

Subject: [tip: x86/build] x86/tools: Remove chkobjdump.awk

The following commit has been merged into the x86/build branch of tip:

Commit-ID: 5225952d74d43e4c054731c74b8afd700b23a94a
Gitweb: https://git.kernel.org/tip/5225952d74d43e4c054731c74b8afd700b23a94a
Author: Nathan Chancellor <[email protected]>
AuthorDate: Wed, 29 Nov 2023 15:17:43 -07:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Thu, 30 Nov 2023 09:38:12 +01:00

x86/tools: Remove chkobjdump.awk

This check is superfluous now that the minimum version of binutils to
build the kernel is 2.25. This also fixes an error seen with
llvm-objdump because it does not support '-v' prior to LLVM 13:

llvm-objdump: error: unknown argument '-v'

Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Kees Cook <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Link: https://github.com/llvm/llvm-project/commit/dde24a87c55f82d8c7b3bf3eafb10f2b9b2b9a01
Link: https://lore.kernel.org/r/[email protected]
Closes: https://github.com/ClangBuiltLinux/linux/issues/1362
---
arch/x86/tools/Makefile | 2 +-
arch/x86/tools/chkobjdump.awk | 34 +----------------------------------
2 files changed, 1 insertion(+), 35 deletions(-)
delete mode 100644 arch/x86/tools/chkobjdump.awk

diff --git a/arch/x86/tools/Makefile b/arch/x86/tools/Makefile
index 90e820a..7278e25 100644
--- a/arch/x86/tools/Makefile
+++ b/arch/x86/tools/Makefile
@@ -17,7 +17,7 @@ reformatter = $(srctree)/arch/x86/tools/objdump_reformat.awk
chkobjdump = $(srctree)/arch/x86/tools/chkobjdump.awk

quiet_cmd_posttest = TEST $@
- cmd_posttest = ($(OBJDUMP) -v | $(AWK) -f $(chkobjdump)) || $(OBJDUMP) -d -j .text $(objtree)/vmlinux | $(AWK) -f $(reformatter) | $(obj)/insn_decoder_test $(posttest_64bit) $(posttest_verbose)
+ cmd_posttest = $(OBJDUMP) -d -j .text $(objtree)/vmlinux | $(AWK) -f $(reformatter) | $(obj)/insn_decoder_test $(posttest_64bit) $(posttest_verbose)

quiet_cmd_sanitytest = TEST $@
cmd_sanitytest = $(obj)/insn_sanity $(posttest_64bit) -m 1000000
diff --git a/arch/x86/tools/chkobjdump.awk b/arch/x86/tools/chkobjdump.awk
deleted file mode 100644
index a4cf678..0000000
--- a/arch/x86/tools/chkobjdump.awk
+++ /dev/null
@@ -1,34 +0,0 @@
-# GNU objdump version checker
-#
-# Usage:
-# objdump -v | awk -f chkobjdump.awk
-BEGIN {
- # objdump version 2.19 or later is OK for the test.
- od_ver = 2;
- od_sver = 19;
-}
-
-/^GNU objdump/ {
- verstr = ""
- gsub(/\(.*\)/, "");
- for (i = 3; i <= NF; i++)
- if (match($(i), "^[0-9]")) {
- verstr = $(i);
- break;
- }
- if (verstr == "") {
- printf("Warning: Failed to find objdump version number.\n");
- exit 0;
- }
- split(verstr, ver, ".");
- if (ver[1] > od_ver ||
- (ver[1] == od_ver && ver[2] >= od_sver)) {
- exit 1;
- } else {
- printf("Warning: objdump version %s is older than %d.%d\n",
- verstr, od_ver, od_sver);
- print("Warning: Skipping posttest.");
- # Logic is inverted, because we just skip test without error.
- exit 0;
- }
-}

Subject: [tip: x86/build] x86/tools: objdump_reformat.awk: Ensure regex matches fwait

The following commit has been merged into the x86/build branch of tip:

Commit-ID: 60c2ea7c89e375804171552d8ea53d9084ec3269
Gitweb: https://git.kernel.org/tip/60c2ea7c89e375804171552d8ea53d9084ec3269
Author: Samuel Zeter <[email protected]>
AuthorDate: Wed, 29 Nov 2023 15:17:41 -07:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Thu, 30 Nov 2023 09:38:09 +01:00

x86/tools: objdump_reformat.awk: Ensure regex matches fwait

If there is "wait" mnemonic in the line being parsed, it is incorrectly
handled by the script, and an extra line of "fwait" in
objdump_reformat's output is inserted. As insn_decoder_test relies upon
the formatted output, the test fails.

This is reproducible when disassembling with llvm-objdump:

Pre-processed lines:

ffffffff81033e72: 9b wait
ffffffff81033e73: 48 c7 c7 89 50 42 82 movq

After objdump_reformat.awk:

ffffffff81033e72: 9b fwait
ffffffff81033e72: wait
ffffffff81033e73: 48 c7 c7 89 50 42 82 movq

The regex match now accepts spaces or tabs, along with the "fwait"
instruction.

Signed-off-by: Samuel Zeter <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Tested-by: Kees Cook <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Closes: https://github.com/ClangBuiltLinux/linux/issues/1364
---
arch/x86/tools/objdump_reformat.awk | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/tools/objdump_reformat.awk b/arch/x86/tools/objdump_reformat.awk
index f418c91..276e572 100644
--- a/arch/x86/tools/objdump_reformat.awk
+++ b/arch/x86/tools/objdump_reformat.awk
@@ -12,7 +12,7 @@ BEGIN {
prev_hex = ""
prev_mnemonic = ""
bad_expr = "(\\(bad\\)|^rex|^.byte|^rep(z|nz)$|^lock$|^es$|^cs$|^ss$|^ds$|^fs$|^gs$|^data(16|32)$|^addr(16|32|64))"
- fwait_expr = "^9b "
+ fwait_expr = "^9b[ \t]*fwait"
fwait_str="9b\tfwait"
}

Subject: [tip: x86/build] x86/tools: objdump_reformat.awk: Allow for spaces

The following commit has been merged into the x86/build branch of tip:

Commit-ID: f4570ebd836363dc7722b8eb8d099b311021af13
Gitweb: https://git.kernel.org/tip/f4570ebd836363dc7722b8eb8d099b311021af13
Author: Samuel Zeter <[email protected]>
AuthorDate: Wed, 29 Nov 2023 15:17:42 -07:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Thu, 30 Nov 2023 09:38:10 +01:00

x86/tools: objdump_reformat.awk: Allow for spaces

GNU objdump and LLVM objdump have differing output formats.
Specifically, GNU objump will format its output as: address:<tab>hex,
whereas LLVM objdump displays its output as address:<space>hex.

objdump_reformat.awk incorrectly handles this discrepancy due to
the unexpected space and as a result insn_decoder_test fails, as
its input is garbled.

The instruction line being tokenized now handles a space and colon,
or tab delimiter.

Signed-off-by: Samuel Zeter <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Tested-by: Kees Cook <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Closes: https://github.com/ClangBuiltLinux/linux/issues/1364
---
arch/x86/tools/objdump_reformat.awk | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/tools/objdump_reformat.awk b/arch/x86/tools/objdump_reformat.awk
index 276e572..a4120d9 100644
--- a/arch/x86/tools/objdump_reformat.awk
+++ b/arch/x86/tools/objdump_reformat.awk
@@ -22,7 +22,7 @@ BEGIN {
}

/^ *[0-9a-f]+:/ {
- if (split($0, field, "\t") < 3) {
+ if (split($0, field, /: |\t/) < 3) {
# This is a continuation of the same insn.
prev_hex = prev_hex field[2]
} else {