2023-03-16 18:17:10

by John Moon

[permalink] [raw]
Subject: [PATCH v3 0/2] Validating UAPI backwards compatibility

The kernel community has rigorously enforced a policy of backwards
compatibility in its UAPI headers for a long time. This has allowed user
applications to enjoy stability across kernel upgrades without
recompiling. Our goal is to add tooling and documentation to help kernel
developers maintain this stability.

We see in the kernel documentation:
"Kernel headers are backwards compatible, but not forwards compatible.
This means that a program built against a C library using older kernel
headers should run on a newer kernel (although it may not have access
to new features), but a program built against newer kernel headers may
not work on an older kernel."[1]

How does the kernel community enforce this guarantee? As we understand it,
it's enforced with thorough code review and testing. Is there any tooling
outside of this being used to help the process?

Also, could documentation on UAPI maintenance (from a developer's point
of view) be expanded? Internally, we have a set of guidelines for our
kernel developers regarding UAPI compatibility techniques. If there's
interest in supplying a document on this topic with the kernel, we'd be
happy to submit a draft detailing what we have so far as a jumping off
point.

In terms of tooling, I've attached a shell script we've been using
internally to validate backwards compatibility of our UAPI headers. The
script uses libabigail's[2] tool abidiff[3] to compare a modified
header's ABI before and after a patch is applied. If an existing UAPI is
modified, the script exits non-zero. We use this script in our
continuous integration system to block changes that fail the check.

It generates output like this when a backwards-incompatible change is made
to a UAPI header:

!!! ABI differences detected in include/uapi/linux/bpf.h from HEAD~1 -> HEAD !!!

[C] 'struct bpf_insn' changed:
type size hasn't changed
1 data member change:
type of '__s32 imm' changed:
typedef name changed from __s32 to __u32 at int-ll64.h:27:1
underlying type 'int' changed:
type name changed from 'int' to 'unsigned int'
type size hasn't changed

We wanted to share this script with the community and hopefully also
receive general feedback when it comes to tooling/policy surrounding this
issue. Our hope is that the script will help kernel UAPI authors maintain
good discipline and avoid breaking userspace.

In v3, we've changed the default behavior of the script to check _all_
UAPI headers in the tree. This is to avoid possible false negatives
brought up in the v2 review. We've also refactored how it works with
architecture-specific headers. The documentation has been updated to
reflect the changes and should serve well as a guide on how things are
working now.

Thanks for the help on v1/v2 and thanks in advance for the help on v3!

[1] Documentation/kbuild/headers_install.rst
[2] https://sourceware.org/libabigail/manual/libabigail-overview.html
[3] https://sourceware.org/libabigail/manual/abidiff.html

P.S. While at Qualcomm, Jordan Crouse <[email protected]> authored the
original version of the UAPI checker script. Thanks Jordan!


John Moon (2):
check-uapi: Introduce check-uapi.sh
docs: dev-tools: Add UAPI checker documentation

Documentation/dev-tools/checkuapi.rst | 492 ++++++++++++++++++++++
Documentation/dev-tools/index.rst | 1 +
scripts/check-uapi.sh | 564 ++++++++++++++++++++++++++
3 files changed, 1057 insertions(+)
create mode 100644 Documentation/dev-tools/checkuapi.rst
create mode 100755 scripts/check-uapi.sh


base-commit: 9c1bec9c0b08abeac72ed6214b723adc224013bf
--
2.17.1



2023-03-16 18:17:21

by John Moon

[permalink] [raw]
Subject: [PATCH v3 1/2] check-uapi: Introduce check-uapi.sh

While the kernel community has been good at maintaining backwards
compatibility with kernel UAPIs, it would be helpful to have a tool
to check if a commit introduces changes that break backwards
compatibility.

To that end, introduce check-uapi.sh: a simple shell script that
checks for changes to UAPI headers using libabigail.

libabigail is "a framework which aims at helping developers and
software distributors to spot some ABI-related issues like interface
incompatibility in ELF shared libraries by performing a static
analysis of the ELF binaries at hand."

The script uses one of libabigail's tools, "abidiff", to compile the
changed header before and after the commit to detect any changes.

abidiff "compares the ABI of two shared libraries in ELF format. It
emits a meaningful report describing the differences between the two
ABIs."

The script also includes the ability to check the compatibility of
all UAPI headers across commits. This allows developers to inspect
the stability of the UAPIs over time.

Signed-off-by: John Moon <[email protected]>
---
- Default to checking all UAPI headers to avoid possible false
negative brought up in v2
- Re-introduce ARCH environment variable to select which UAPIs under
arch/*/include/uapi to check
- Add "-m" flag to check only modified files
- Add "-q" and "-v" flags to modify verbosity
- Change "-r" flag to "-p" to represent a "past reference"
- Added meaningful exit codes
- Added check for minimum libdw version for use with clang
- Fixed bug where some non-header files were being checked
- Removed "scripts/unifdef" build as it was no longer needed

scripts/check-uapi.sh | 564 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 564 insertions(+)
create mode 100755 scripts/check-uapi.sh

diff --git a/scripts/check-uapi.sh b/scripts/check-uapi.sh
new file mode 100755
index 000000000000..5d166dc9617d
--- /dev/null
+++ b/scripts/check-uapi.sh
@@ -0,0 +1,564 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+# Script to check commits for UAPI backwards compatibility
+
+set -o errexit
+set -o pipefail
+
+print_usage() {
+ name=$(basename "$0")
+ cat << EOF
+$name - check for UAPI header stability across Git commits
+
+By default, the script will check to make sure the latest commit (or current
+dirty changes) did not introduce ABI changes when compared to HEAD^1. You can
+check against additional commit ranges with the -b and -p options.
+
+The script will not check UAPI headers for architectures other than the one
+defined in ARCH.
+
+Usage: $name [-b BASE_REF] [-p PAST_REF] [-m] [-j N] [-l ERROR_LOG] [-q] [-v]
+
+Options:
+ -b BASE_REF Base git reference to use for comparison. If unspecified or empty,
+ will use any dirty changes in tree to UAPI files. If there are no
+ dirty changes, HEAD will be used.
+ -p PAST_REF Compare BASE_REF to PAST_REF (e.g. -r v6.1). If unspecified or empty,
+ will use BASE_REF^1. Must be an ancestor of BASE_REF.
+ -m Check only UAPI headers modified between PAST_REF and BASE_REF for
+ backwards compatibility.
+ -j JOBS Number of checks to run in parallel (default: number of CPU cores).
+ -l ERROR_LOG Write error log to file (default: "\$KERNEL_SOURCE/abi_error_log.txt").
+ -q Quiet operation (suppress all stdout, still print stderr).
+ -v Verbose operation (print more information about each header being checked).
+
+Environmental args:
+ ABIDIFF Custom path to abidiff binary
+ CC C compiler (default is "gcc")
+ ARCH Target architecture of C compiler (default is host arch)
+
+Exit codes:
+ $SUCCESS) Success
+ $FAIL_ABI) ABI difference detected
+ $FAIL_PREREQ) Prerequisite not met
+ $FAIL_COMPILE) Compilation error
+ $FAIL_ARCH) All modified files were for an architecture other than ARCH
+ $FAIL) General error
+EOF
+}
+
+readonly SUCCESS=0
+readonly FAIL_ABI=1
+readonly FAIL_PREREQ=2
+readonly FAIL_COMPILE=3
+readonly FAIL_ARCH=4
+readonly FAIL=5
+
+# Print to stderr
+eprintf() {
+ # shellcheck disable=SC2059
+ printf "$@" >&2
+}
+
+# Print list of UAPI files to operate on
+get_uapi_files() {
+ local -r check_all="$1"
+ local -r base_ref="$2"
+ local -r ref="$3"
+ local -r file_list="${tmp_dir}/file_list.txt"
+
+ # Return already-processed list if it's available
+ if [ -s "$file_list" ]; then
+ cat "$file_list"
+ return 0
+ fi
+
+ if [ "$check_all" = "true" ]; then
+ # Print all of the UAPI header files at the commit in question
+ # Ignore the headers that we can't test.
+ # shellcheck disable=SC2086,SC2046
+ git ls-tree --full-tree --name-only -r "${base_ref:-HEAD}" -- "${UAPI_DIRS[@]}" \
+ | grep '.h$' \
+ | sed -e 's/^/M\t/g' \
+ | grep -v $({ get_no_header_list "$base_ref"; get_no_export_list; } | xargs -- printf '-e %s ') \
+ > "$file_list"
+ else
+ if [ -z "$base_ref" ] || [ -z "$ref" ]; then
+ # shellcheck disable=SC2086
+ git diff $GIT_ARGS > "$file_list"
+ else
+ # shellcheck disable=SC2086
+ git diff "$ref" "$base_ref" $GIT_ARGS > "$file_list"
+ fi
+
+ if mismatch_arch_files=$(grep -v "arch/${ARCH}" "$file_list" | grep -o "arch/.*\.h"); then
+ eprintf "warning - detected changes to following files, but can't check them with %s compiler:\n" "$ARCH"
+ for f in $mismatch_arch_files; do
+ eprintf " - %s\n" "$f"
+ sed -i -e "\|${f}|d" "$file_list"
+ printf "warning - could not perform ABI check on %s with %s compiler\n" "$f" "$ARCH" \
+ >> "${tmp_dir}/arch_mismatches.error"
+ done
+ eprintf "\nFiltering them out and performing partial check on remaining files...\n"
+ if [ ! -s "$file_list" ]; then
+ eprintf "error - all files were filtered out, there's nothing to check!\n"
+ exit "$FAIL_ARCH"
+ fi
+ fi
+ fi
+
+ if [ ! -s "$file_list" ]; then
+ return 1
+ fi
+
+ cat "$file_list"
+}
+
+# Compile the simple test app
+do_compile() {
+ local -r inc_dir="$1"
+ local -r header="$2"
+ local -r out="$3"
+ printf "int main(void) { return 0; }\n" | \
+ "$CC" -c \
+ -o "$out" \
+ -x c \
+ -O0 \
+ -std=c90 \
+ -fno-eliminate-unused-debug-types \
+ -g \
+ "-I${inc_dir}" \
+ -include "$header" \
+ -
+}
+
+# Print the list of incompatible headers
+get_no_header_list() {
+ local -r ref="${1:-HEAD}"
+ # Start with the usr/include/Makefile to get a list of the headers
+ # that don't compile using this method.
+ {
+ # shellcheck disable=SC2016
+ printf 'all: ; @echo $(no-header-test)\n'
+ cat "usr/include/Makefile"
+ } | SRCARCH="$ARCH" make -f - | tr " " "\n" | grep -v "asm-generic"
+
+ # The makefile also skips all asm-generic files, but prints "asm-generic/%"
+ # which won't match the files in this script, so just print all the asm-generic
+ # headers manually.
+ git ls-tree --full-tree --name-only -r "$ref" -- include/uapi/asm-generic \
+ | grep '.h$' \
+ | cut -d '/' -f 2-
+
+ # Finally, print all the architecture-specific headers that are _not_ for the
+ # ARCH we're targeting
+ git ls-tree --full-tree --name-only -r "$ref" -- arch/*/include/uapi \
+ | grep '.h$' \
+ | grep -v "^arch/${ARCH}"
+}
+
+# Print the list of headers that are not exported for this architecture
+get_no_export_list() {
+ {
+ # shellcheck disable=SC2016
+ printf 'all: ; @echo $(no-export-headers)\n'
+ cat "include/uapi/Kbuild"
+ } | SRCARCH="$ARCH" srctree="$KERNEL_SRC" make -f - | tr " " "\n" | sed '/^[[:space:]]*$/d'
+}
+
+# Save the current git tree state, stashing if needed
+save_tree_state() {
+ printf "Saving current tree state... "
+ current_ref="$(git rev-parse HEAD)"
+ readonly current_ref
+ if ! git diff-index --quiet HEAD; then
+ unstash="true"
+ git stash push --quiet
+ fi
+ printf "OK\n"
+}
+
+# Restore the git tree state, unstashing if needed
+restore_tree_state() {
+ if [ -z "$current_ref" ]; then
+ return 0
+ fi
+
+ printf "Restoring current tree state... "
+ git checkout --quiet "$current_ref"
+ if [ "$unstash" = "true" ]; then
+ git stash pop --quiet
+ unstash="false"
+ fi
+ printf "OK\n"
+}
+
+# Install headers for both git refs
+install_headers() {
+ local -r check_all="$1"
+ local -r base_ref="$2"
+ local -r ref="$3"
+
+ deviated_from_current_tree="false"
+ for inst_ref in "$base_ref" "$ref"; do
+ if [ -n "$inst_ref" ]; then
+ if [ "$deviated_from_current_tree" = "false" ]; then
+ save_tree_state
+ trap 'rm -rf "$tmp_dir"; restore_tree_state;' EXIT
+ deviated_from_current_tree="true"
+ fi
+ # This script ($0) is already loaded into memory at this point,
+ # so this operation is safe
+ git checkout --quiet "$(git rev-parse "$inst_ref")"
+ fi
+
+ printf "Installing sanitized UAPI headers from %s... " "${inst_ref:-dirty tree}"
+ make ARCH="$ARCH" INSTALL_HDR_PATH="${tmp_dir}/${inst_ref}/usr" headers_install > /dev/null 2>&1
+ printf "OK\n"
+ done
+
+ restore_tree_state
+ trap 'rm -rf "$tmp_dir"' EXIT
+}
+
+# Check file list for UAPI compatibility
+check_uapi_files() {
+ local -r check_all="$1"
+ local -r base_ref="$2"
+ local -r ref="$3"
+
+ install_headers "$check_all" "$base_ref" "$ref"
+
+ local passed=0;
+ local failed=0;
+ local -a threads=()
+
+ printf "Checking changes to UAPI headers between %s and %s\n" "$ref" "${base_ref:-dirty tree}"
+ while read -r status file; do
+ if [ "${#threads[@]}" -ge "$MAX_THREADS" ]; then
+ if wait "${threads[0]}"; then
+ passed=$((passed + 1))
+ else
+ failed=$((failed + 1))
+ fi
+ threads=("${threads[@]:1}")
+ fi
+
+ check_individual_file "$base_ref" "$ref" "$status" "$file" &
+ threads+=("$!")
+ done < <(get_uapi_files "$check_all" "$base_ref" "$ref")
+
+ for t in "${threads[@]}"; do
+ if wait "$t"; then
+ passed=$((passed + 1))
+ else
+ failed=$((failed + 1))
+ fi
+ done
+
+ if [ "$check_all" = "true" ]; then
+ scope=$(printf "UAPI headers compatible with %s" "$ARCH")
+ else
+ scope=$(printf "UAPI headers modified between %s and %s" "$ref" "${base_ref:-dirty tree}")
+ fi
+
+ total="$((passed + failed))"
+ if [ "$failed" -gt 0 ]; then
+ eprintf "error - %d/%d %s appear _not_ to be backwards compatible\n" "$failed" "$total" "$scope"
+ else
+ printf "All %d %s appear to be backwards compatible\n" "$total" "$scope"
+ fi
+
+ if [ "$check_all" = "true" ]; then
+ printf "Note: UAPI headers for architectures other than %s were not checked\n" "$ARCH"
+ fi
+
+ return "$failed"
+}
+
+# Print the path to a given header in the tmp_dir
+get_header() {
+ local -r ref="$1"
+ local -r arch="$2"
+ local -r base="$3"
+
+ if [ -z "$arch" ]; then
+ printf "%s" "${tmp_dir}/${ref}/usr/${base}"
+ else
+ printf "%s" "${tmp_dir}/${ref}/usr/$(printf "%s" "$base" | cut -d '/' -f 3-)"
+ fi
+}
+
+# Check an individual file for UAPI compatibility
+check_individual_file() {
+ local -r base_ref="$1"
+ local -r ref="$2"
+ local -r status="$3"
+ local -r file="$4"
+
+ local -r base=${file/uapi\//}
+ local -r uapi_arch="$(printf "%s" "$file" | grep -o 'arch/.*/uapi' | cut -d '/' -f 2)"
+ local -r base_header=$(get_header "$base_ref" "$uapi_arch" "$base")
+ local -r ref_header=$(get_header "$ref" "$uapi_arch" "$base")
+ local -r installed_base="$(printf "%s" "$base_header" | grep -o "usr/include/.*" | cut -d '/' -f 3-)"
+
+ if [ "$status" = "D" ]; then
+ eprintf "error - UAPI header %s was incorrectly removed\n" "$file" | tee "${base_header}.error"
+ return 1
+ fi
+
+ # shellcheck disable=SC2076
+ if [[ " $(get_no_header_list | xargs) " =~ " $installed_base " ]]; then
+ printf "warning - %s cannot be tested by this script (see usr/include/Makefile)\n" "$file" | tee "${base_header}.error"
+ return 1
+ fi
+
+ # shellcheck disable=SC2076
+ if [[ " $(get_no_export_list | xargs) " =~ " $installed_base " ]]; then
+ printf "warning - %s is not exported by the %s architecture, so cannot be tested\n" "$file" "$ARCH" | tee "${base_header}.error"
+ return 1
+ fi
+
+ for h in "$base_header" "$ref_header"; do
+ if [ ! -f "$h" ]; then
+ eprintf "error - %s does not exist - cannot compare ABI\n" "$h" | tee "${h}.error"
+ return 1
+ fi
+ done
+
+ compare_abi "$file" "$base_header" "$ref_header" "$base_ref" "$ref"
+}
+
+# Perform the A/B compilation and compare output ABI
+compare_abi() {
+ local -r file="$1"
+ local -r base_header="$2"
+ local -r ref_header="$3"
+ local -r base_ref="$4"
+ local -r ref="$5"
+ local -r log="${tmp_dir}/log/$(basename "$file").log"
+
+ mkdir -p "$(dirname "$log")"
+
+ if ! do_compile "${tmp_dir}/${base_ref}/usr/include" "$base_header" "${base_header}.bin" 2> "$log"; then
+ eprintf "error - couldn't compile version of UAPI header %s at %s\n" "$file" "$base_ref"
+ cat "$log" >&2
+ exit "$FAIL_COMPILE"
+ fi
+
+ if ! do_compile "${tmp_dir}/${ref}/usr/include" "$ref_header" "${ref_header}.bin" 2> "$log"; then
+ eprintf "error - couldn't compile version of UAPI header %s at %s\n" "$file" "$ref"
+ cat "$log" >&2
+ exit "$FAIL_COMPILE"
+ fi
+
+ "$ABIDIFF" --non-reachable-types "${ref_header}.bin" "${base_header}.bin" > "$log" && ret="$?" || ret="$?"
+ if [ "$ret" -eq 0 ]; then
+ if [ "$VERBOSE" = "true" ]; then
+ printf "No ABI differences detected in %s from %s -> %s\n" "$file" "$ref" "${base_ref:-dirty tree}"
+ fi
+ else
+ # Bits in abidiff's return code can be used to determine the type of error
+ if [ $(("$ret" & 0x1)) -gt 0 ]; then
+ eprintf "error - abidiff did not run properly\n"
+ exit 1
+ fi
+
+ # If the only changes were additions (not modifications to existing APIs), then
+ # there's no problem. Ignore these diffs.
+ if grep "Unreachable types summary" "$log" | grep -q "0 removed" &&
+ grep "Unreachable types summary" "$log" | grep -q "0 changed"; then
+ return 0
+ fi
+ {
+ printf "!!! ABI differences detected in %s from %s -> %s !!!\n\n" "$file" "$ref" "${base_ref:-dirty tree}"
+ sed -e '/summary:/d' -e '/changed type/d' -e '/^$/d' -e 's/^/ /g' "$log"
+
+ if ! cmp "$ref_header" "$base_header" > /dev/null 2>&1; then
+ printf "\nHeader file diff (after headers_install):\n"
+ diff -Naur "$ref_header" "$base_header" \
+ | sed -e "s|${ref_header}|${ref}/${file}|g" \
+ -e "s|${base_header}|${base_ref:-dirty}/${file}|g"
+ printf "\n"
+ else
+ printf "\n%s did not change between %s and %s...\n" "$file" "$ref" "${base_ref:-dirty tree}"
+ printf "It's possible a change to one of the headers it includes caused this error:\n"
+ grep '^#include' "$base_header"
+ printf "\n"
+ fi
+ } | tee "${base_header}.error" >&2
+ return 1
+ fi
+}
+
+min_version_is_satisfied() {
+ local -r min_version="$1"
+ local -r version_installed="$2"
+
+ printf "%s\n%s\n" "$min_version" "$version_installed" | sort -Vc > /dev/null 2>&1
+}
+
+# Make sure we have the tools we need
+check_deps() {
+ ABIDIFF="${ABIDIFF:-abidiff}"
+ CC="${CC:-gcc}"
+ ARCH="${ARCH:-$(uname -m)}"
+ if [ "$ARCH" = "x86_64" ]; then
+ ARCH="x86"
+ fi
+
+ local -r abidiff_min_version="1.7"
+ local -r libdw_min_version_if_clang="0.171"
+
+ if ! command -v "$ABIDIFF" > /dev/null 2>&1; then
+ eprintf "error - abidiff not found!\n"
+ eprintf "Please install abigail-tools version %s or greater\n" "$abidiff_min_version"
+ eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n"
+ exit "$FAIL_PREREQ"
+ fi
+
+ local -r abidiff_version="$("$ABIDIFF" --version | cut -d ' ' -f 2)"
+ if ! min_version_is_satisfied "$abidiff_min_version" "$abidiff_version"; then
+ eprintf "error - abidiff version too old: %s\n" "$abidiff_version"
+ eprintf "Please install abigail-tools version %s or greater\n" "$abidiff_min_version"
+ eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n"
+ exit "$FAIL_PREREQ"
+ fi
+
+ if ! command -v "$CC" > /dev/null 2>&1; then
+ eprintf 'error - %s not found\n' "$CC"
+ exit "$FAIL_PREREQ"
+ fi
+
+ if "$CC" --version | grep -q clang; then
+ local -r libdw_version="$(ldconfig -v 2>/dev/null | grep -v SKIPPED | grep -m 1 -o 'libdw-[0-9]\+.[0-9]\+' | cut -c 7-)"
+ if ! min_version_is_satisfied "$libdw_min_version_if_clang" "$libdw_version"; then
+ eprintf "error - libdw version too old for use with clang: %s\n" "$libdw_version"
+ eprintf "Please install libdw from elfutils version %s or greater\n" "$libdw_min_version_if_clang"
+ eprintf "See: https://sourceware.org/elfutils/\n"
+ exit "$FAIL_PREREQ"
+ fi
+ fi
+
+ if [ ! -d "arch/${ARCH}" ]; then
+ eprintf 'error - ARCH "%s" is not a subdirectory under arch/\n' "$ARCH"
+ eprintf "Please set ARCH to one of:\n%s\n" "$(find arch -maxdepth 1 -mindepth 1 -type d -printf '%f ' | fmt)"
+ exit "$FAIL_PREREQ"
+ fi
+}
+
+run() {
+ local base_ref="$1"
+ local -r check_all="$2"
+ shift 2
+ local -r orig_args="$*"
+ if [ -z "$KERNEL_SRC" ]; then
+ KERNEL_SRC="$(realpath "$(dirname "$0")"/..)"
+ fi
+
+ cd "$KERNEL_SRC"
+
+ abi_error_log="${abi_error_log:-${KERNEL_SRC}/abi_error_log.txt}"
+
+ tmp_dir=$(mktemp -d)
+ trap 'rm -rf "$tmp_dir"' EXIT
+
+ check_deps
+
+ # Set of UAPI directories to check by default
+ UAPI_DIRS=(include/uapi arch/*/include/uapi)
+ GIT_ARGS="--name-status --no-renames --format= --diff-filter=a -- ${UAPI_DIRS[*]/%/\/*.h}"
+
+ if ! git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
+ eprintf "error - this script requires the kernel tree to be initialized with Git\n"
+ exit "$FAIL_PREREQ"
+ fi
+
+ # If there are no dirty UAPI files, use HEAD as base_ref
+ # shellcheck disable=SC2086
+ if [ -z "$base_ref" ] && git diff --exit-code $GIT_ARGS > /dev/null 2>&1; then
+ base_ref="HEAD"
+ fi
+
+ if [ -z "$past_ref" ]; then
+ if [ -n "$base_ref" ]; then
+ past_ref="${base_ref}^1"
+ else
+ past_ref="HEAD"
+ fi
+ fi
+
+ if [ -n "$past_ref" ] && ! git rev-parse --verify "$past_ref" > /dev/null 2>&1; then
+ printf 'error - invalid git reference "%s"\n' "$past_ref"
+ exit "$FAIL_PREREQ"
+ fi
+
+ if [ -n "$base_ref" ]; then
+ if ! git merge-base --is-ancestor "$past_ref" "$base_ref" > /dev/null 2>&1; then
+ printf 'error - "%s" is not an ancestor of base ref "%s"\n' "$past_ref" "$base_ref"
+ exit "$FAIL_PREREQ"
+ fi
+ fi
+
+ if [ "$check_all" != "true" ] && ! get_uapi_files "$check_all" "$base_ref" "$past_ref" > /dev/null; then
+ printf "No changes to UAPI headers were applied between %s and %s\n" "$past_ref" "$base_ref"
+ exit "$SUCCESS"
+ fi
+
+ if ! check_uapi_files "$check_all" "$base_ref" "$past_ref"; then
+ eprintf "error - UAPI header ABI check failed\n"
+ {
+ printf 'Generated by "%s %s" from git ref %s\n\n' "$0" "$orig_args" "$(git rev-parse "HEAD")"
+ find "$tmp_dir" -type f -name '*.error' -exec cat {} +
+ } > "$abi_error_log"
+ eprintf "Failure summary saved to %s\n" "$abi_error_log"
+ exit "$FAIL_ABI"
+ fi
+}
+
+main() {
+ MAX_THREADS=$(nproc)
+ VERBOSE="false"
+ local base_ref=""
+ local check_all="true"
+ local quiet="false"
+ while getopts "hb:p:mj:l:qv" opt; do
+ case $opt in
+ h)
+ print_usage
+ exit "$SUCCESS"
+ ;;
+ b)
+ base_ref="$OPTARG"
+ ;;
+ p)
+ past_ref="$OPTARG"
+ ;;
+ m)
+ check_all="false"
+ ;;
+ j)
+ MAX_THREADS="$OPTARG"
+ ;;
+ l)
+ abi_error_log="$OPTARG"
+ ;;
+ q)
+ quiet="true"
+ ;;
+ v)
+ VERBOSE="true"
+ ;;
+ *)
+ exit "$FAIL_PREREQ"
+ esac
+ done
+
+ if [ "$quiet" = "true" ]; then
+ run "$base_ref" "$check_all" "$@" > /dev/null
+ else
+ run "$base_ref" "$check_all" "$@"
+ fi
+}
+
+main "$@"
--
2.17.1


2023-03-16 18:17:26

by John Moon

[permalink] [raw]
Subject: [PATCH v3 2/2] docs: dev-tools: Add UAPI checker documentation

Add detailed documentation for scripts/check-uapi.sh.

Signed-off-by: John Moon <[email protected]>
---
- Updated with new flags/usage
- Added example of possible false negatives
- Improved formatting

Documentation/dev-tools/checkuapi.rst | 492 ++++++++++++++++++++++++++
Documentation/dev-tools/index.rst | 1 +
2 files changed, 493 insertions(+)
create mode 100644 Documentation/dev-tools/checkuapi.rst

diff --git a/Documentation/dev-tools/checkuapi.rst b/Documentation/dev-tools/checkuapi.rst
new file mode 100644
index 000000000000..0e257eff1526
--- /dev/null
+++ b/Documentation/dev-tools/checkuapi.rst
@@ -0,0 +1,492 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+============
+UAPI Checker
+============
+
+The UAPI checker (``scripts/check-uapi.sh``) is a shell script which checks
+UAPI header files for userspace backwards-compatibility across the git tree.
+
+The script can produce false positives in some cases, so developers are
+encouraged to use their best judgement when interpreting the results. Please
+refer to kernel documentation on the topic of IOCTL stability for more
+information (Documentation/process/botching-up-ioctls.rst).
+
+Options
+=======
+
+This section will describe the options ``check-uapi.sh`` can be run with.
+
+Usage::
+
+ ./scripts/check-uapi.sh [-b BASE_REF] [-p PAST_REF] [-m] [-j N] [-l ERROR_LOG] [-q] [-v]
+
+Available options::
+
+ -b BASE_REF Base git reference to use for comparison. If unspecified or empty,
+ will use any dirty changes in tree to UAPI files. If there are no
+ dirty changes, HEAD will be used.
+ -p PAST_REF Compare BASE_REF to PAST_REF (e.g. -r v6.1). If unspecified or empty,
+ will use BASE_REF^1. Must be an ancestor of BASE_REF.
+ -m Check only UAPI headers modified between PAST_REF and BASE_REF for
+ backwards compatibility.
+ -j JOBS Number of checks to run in parallel (default: number of CPU cores).
+ -l ERROR_LOG Write error log to file (default: "$KERNEL_SOURCE/abi_error_log.txt").
+ -q Quiet operation (suppress all stdout, still print stderr).
+ -v Verbose operation (print more information about each header being checked).
+
+Environmental args::
+
+ ABIDIFF Custom path to abidiff binary
+ CC C compiler (default is "gcc")
+ ARCH Target architecture of C compiler (default is host arch)
+
+Exit codes::
+
+ 0) Success
+ 1) ABI difference detected
+ 2) Prerequisite not met
+ 3) Compilation error
+ 4) All modified files were for an architecture other than ARCH
+ 5) General error
+
+Examples
+========
+
+Basic Usage
+-----------
+
+First, let's try making a change to a UAPI header file that obviously won't
+break userspace::
+
+ cat << 'EOF' | patch -l -p1
+ --- a/include/uapi/linux/acct.h
+ +++ b/include/uapi/linux/acct.h
+ @@ -21,7 +21,9 @@
+ #include <asm/param.h>
+ #include <asm/byteorder.h>
+
+ -/*
+ +#define FOO
+ +
+ +/*
+ * comp_t is a 16-bit "floating" point number with a 3-bit base 8
+ * exponent and a 13-bit fraction.
+ * comp2_t is 24-bit with 5-bit base 2 exponent and 20 bit fraction
+ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
+ EOF
+
+Now, let's use the script to validate::
+
+ % ./scripts/check-uapi.sh
+ Installing sanitized UAPI headers from dirty tree... OK
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from HEAD... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between HEAD and dirty tree
+ All 888 UAPI headers compatible with x86 appear to be backwards compatible
+ Note: UAPI headers for architectures other than x86 were not checked
+
+Let's add another change that *would* break userspace::
+
+ cat << 'EOF' | patch -l -p1
+ --- a/include/uapi/linux/bpf.h 2023-02-28 13:32:36.505591077 -0800
+ +++ b/include/uapi/linux/bpf.h 2023-02-28 13:32:57.033494020 -0800
+ @@ -73,7 +73,7 @@
+ __u8 dst_reg:4; /* dest register */
+ __u8 src_reg:4; /* source register */
+ __s16 off; /* signed offset */
+ - __s32 imm; /* signed immediate constant */
+ + __u32 imm; /* unsigned immediate constant */
+ };
+
+ /* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */
+ EOF
+
+The script should catch this incompatibility::
+
+ % ./scripts/check-uapi.sh
+ Installing sanitized UAPI headers from dirty tree... OK
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from HEAD... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between HEAD and dirty tree
+ !!! ABI differences detected in include/uapi/linux/bpf.h from HEAD -> dirty tree !!!
+
+ [C] 'struct bpf_insn' changed:
+ type size hasn't changed
+ 1 data member change:
+ type of '__s32 imm' changed:
+ typedef name changed from __s32 to __u32 at int-ll64.h:27:1
+ underlying type 'int' changed:
+ type name changed from 'int' to 'unsigned int'
+ type size hasn't changed
+
+ Header file diff (after headers_install):
+ --- HEAD/include/uapi/linux/bpf.h 2023-03-15 16:05:39.601082143 -0700
+ +++ dirty/include/uapi/linux/bpf.h 2023-03-15 16:05:37.669092565 -0700
+ @@ -73,7 +73,7 @@
+ __u8 dst_reg:4; /* dest register */
+ __u8 src_reg:4; /* source register */
+ __s16 off; /* signed offset */
+ - __s32 imm; /* signed immediate constant */
+ + __u32 imm; /* unsigned immediate constant */
+ };
+
+ /* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */
+
+ error - 1/888 UAPI headers compatible with x86 appear _not_ to be backwards compatible
+ Note: UAPI headers for architectures other than x86 were not checked
+ error - UAPI header ABI check failed
+
+The script finds the ABI breakage and reports it (along with a diff of the
+offending file).
+
+Let's commit the breaking change, then commit the good change::
+
+ % git commit -m 'Breaking UAPI change' include/uapi/linux/bpf.h
+ [detached HEAD f758e574663a] Breaking UAPI change
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+ % git commit -m 'Innocuous UAPI change' include/uapi/linux/acct.h
+ [detached HEAD 2e87df769081] Innocuous UAPI change
+ 1 file changed, 3 insertions(+), 1 deletion(-)
+
+Now, let's run the script again with no arguments::
+
+ % ./scripts/check-uapi.sh
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from HEAD... OK
+ Installing sanitized UAPI headers from HEAD^1... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between HEAD^1 and HEAD
+ All 888 UAPI headers compatible with x86 appear to be backwards compatible
+ Note: UAPI headers for architectures other than x86 were not checked
+
+It doesn't catch any breaking change because, by default, it only compares
+``HEAD`` to ``HEAD^1``. The breaking change was committed on ``HEAD~2``. If we
+wanted the search scope to go back further, we'd have to use the ``-p`` option
+to pass a different past reference to compare to. In this case, let's pass
+``-p HEAD~2`` to the script so it checks UAPI changes between ``HEAD~2`` and
+``HEAD``::
+
+ % ./scripts/check-uapi.sh -p HEAD~2
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from HEAD... OK
+ Installing sanitized UAPI headers from HEAD~2... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between HEAD~2 and HEAD
+ !!! ABI differences detected in include/uapi/linux/bpf.h from HEAD~2 -> HEAD !!!
+
+ [C] 'struct bpf_insn' changed:
+ type size hasn't changed
+ 1 data member change:
+ type of '__s32 imm' changed:
+ typedef name changed from __s32 to __u32 at int-ll64.h:27:1
+ underlying type 'int' changed:
+ type name changed from 'int' to 'unsigned int'
+ type size hasn't changed
+
+ Header file diff (after headers_install):
+ --- HEAD~2/include/uapi/linux/bpf.h 2023-03-15 16:10:39.495462638 -0700
+ +++ HEAD/include/uapi/linux/bpf.h 2023-03-15 16:10:38.919465752 -0700
+ @@ -73,7 +73,7 @@
+ __u8 dst_reg:4; /* dest register */
+ __u8 src_reg:4; /* source register */
+ __s16 off; /* signed offset */
+ - __s32 imm; /* signed immediate constant */
+ + __u32 imm; /* unsigned immediate constant */
+ };
+
+ /* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */
+
+ error - 1/888 UAPI headers compatible with x86 appear _not_ to be backwards compatible
+ Note: UAPI headers for architectures other than x86 were not checked
+ error - UAPI header ABI check failed
+
+Alternatively, you could have also ran with ``-b HEAD~``. This would set the
+base reference to ``HEAD~`` so then the script would compare it to ``HEAD~^1``.
+
+
+Architecture-specific Headers
+-----------------------------
+
+Consider this change::
+
+ cat << 'EOF' | patch -l -p1
+ --- a/arch/arm64/include/uapi/asm/sigcontext.h
+ +++ b/arch/arm64/include/uapi/asm/sigcontext.h
+ @@ -70,6 +70,7 @@ struct sigcontext {
+ struct _aarch64_ctx {
+ __u32 magic;
+ __u32 size;
+ + __u32 new_var;
+ };
+
+ #define FPSIMD_MAGIC 0x46508001
+ EOF
+
+This is a change to an arm64-specific UAPI header file. In this example, I'm
+running the script from an x86 machine with an x86 compiler, so by default,
+the script only works with x86-compatible UAPI header files::
+
+ % ./scripts/check-uapi.sh
+ Installing sanitized UAPI headers from dirty tree... OK
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from HEAD... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between HEAD and dirty tree
+ All 888 UAPI headers compatible with x86 appear to be backwards compatible
+ Note: UAPI headers for architectures other than x86 were not checked
+
+With an x86 compiler, we can't check header files in ``arch/arm64``, so the
+script doesn't even try. You can pass the ``-m`` to force the script to look at
+modified files, but in this case, you'll run into an issue::
+
+ % ./scripts/check-uapi.sh -m
+ warning - detected changes to following files, but can't check them with x86 compiler:
+ - arch/arm64/include/uapi/asm/sigcontext.h
+
+ Filtering them out and performing partial check on remaining files...
+ error - all files were filtered out, there's nothing to check!
+
+The script will attempt filtering out files that don't work with ``ARCH``, but
+if it filters all of the files out, it bails out.
+
+If we want to check the header file, we'll have to use an arm64 compiler and
+set ``ARCH`` accordingly::
+
+ % CC=aarch64-linux-gnu-gcc ARCH=arm64 ./scripts/check-uapi.sh
+ Installing sanitized UAPI headers from dirty tree... OK
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from HEAD... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between HEAD and dirty tree
+ !!! ABI differences detected in arch/arm64/include/uapi/asm/sigcontext.h from HEAD -> dirty tree !!!
+
+ [C] 'struct _aarch64_ctx' changed:
+ type size changed from 64 to 96 (in bits)
+ 1 data member insertion:
+ '__u32 new_var', at offset 64 (in bits) at sigcontext.h:73:1
+ --- snip ---
+ [C] 'struct zt_context' changed:
+ type size changed from 128 to 160 (in bits)
+ 2 data member changes (1 filtered):
+ '__u16 nregs' offset changed from 64 to 96 (in bits) (by +32 bits)
+ '__u16 __reserved[3]' offset changed from 80 to 112 (in bits) (by +32 bits)
+
+ Header file diff (after headers_install):
+ --- HEAD/arch/arm64/include/uapi/asm/sigcontext.h 2023-03-15 16:15:46.613800573 -0700
+ +++ dirty/arch/arm64/include/uapi/asm/sigcontext.h 2023-03-15 16:15:44.765810584 -0700
+ @@ -70,6 +70,7 @@
+ struct _aarch64_ctx {
+ __u32 magic;
+ __u32 size;
+ + __u32 new_var;
+ };
+
+ #define FPSIMD_MAGIC 0x46508001
+
+ error - 1/858 UAPI headers compatible with arm64 appear _not_ to be backwards compatible
+ Note: UAPI headers for architectures other than arm64 were not checked
+ error - UAPI header ABI check failed
+
+We can see with ``ARCH`` and ``CC`` set properly for the file, the ABI change
+is reported properly.
+
+
+Cross-Dependency Breakages
+--------------------------
+
+Consider this change::
+
+ cat << 'EOF' | patch -l -p1
+ --- a/include/uapi/linux/types.h
+ +++ b/include/uapi/linux/types.h
+ @@ -52,7 +52,7 @@ typedef __u32 __bitwise __wsum;
+ #define __aligned_be64 __be64 __attribute__((aligned(8)))
+ #define __aligned_le64 __le64 __attribute__((aligned(8)))
+
+ -typedef unsigned __bitwise __poll_t;
+ +typedef unsigned short __bitwise __poll_t;
+
+ #endif /* __ASSEMBLY__ */
+ #endif /* _UAPI_LINUX_TYPES_H */
+ EOF
+
+Here, we're changing a ``typedef`` in ``types.h``. Now, let's see what we get
+when we run the script with ``-m`` to only check for modified files::
+
+ % ./scripts/check-uapi.sh -m
+ Installing sanitized UAPI headers from dirty tree... OK
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from HEAD... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between HEAD and dirty tree
+ All 1 UAPI headers modified between HEAD and dirty tree appear to be backwards compatible
+
+With the ``-m`` flag, the script will only check the ABI compatibility of
+modified files. It reports back that there aren't any ABI issues with
+``types.h``. However, other UAPI headers in the tree may be broken by this
+change!
+
+When you run *without* the ``-m`` flag, *all* headers are checked::
+
+ % ./scripts/check-uapi.sh
+ Installing sanitized UAPI headers from dirty tree... OK
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from HEAD... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between HEAD and dirty tree
+ !!! ABI differences detected in include/uapi/linux/eventpoll.h from HEAD -> dirty tree !!!
+
+ [C] 'struct epoll_event' changed:
+ type size changed from 96 to 80 (in bits)
+ 2 data member changes:
+ type of '__poll_t events' changed:
+ underlying type 'unsigned int' changed:
+ type name changed from 'unsigned int' to 'unsigned short int'
+ type size changed from 32 to 16 (in bits)
+ '__u64 data' offset changed from 32 to 16 (in bits) (by -16 bits)
+
+ include/uapi/linux/eventpoll.h did not change between HEAD and dirty tree...
+ It's possible a change to one of the headers it includes caused this error:
+ #include <linux/fcntl.h>
+ #include <linux/types.h>
+
+ error - 1/888 UAPI headers compatible with x86 appear _not_ to be backwards compatible
+ Note: UAPI headers for architectures other than x86 were not checked
+ error - UAPI header ABI check failed
+
+Note that the script noticed the failing header file did not change, so it
+assumes one of its includes must have caused the breakage. Indeed, we can see
+``linux/types.h`` is used from ``eventpoll.h``.
+
+To make sure issues like this are caught, it's recommended **not** to run with
+the ``-m`` flag when doing formal validation checks (such as in a continuous
+integration system).
+
+
+Checking Historic UAPI Compatibility
+------------------------------------
+
+You can use the ``-b`` and ``-p`` options to examine different chunks of your
+git tree. For example, to check all changed UAPI header files between tags
+v6.0 and v6.1, you'd run::
+
+ % ./scripts/check-uapi.sh -b v6.1 -p v6.0
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from v6.1... OK
+ Installing sanitized UAPI headers from v6.0... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between v6.0 and v6.1
+ --- snip ---
+ error - 89/888 UAPI headers compatible with x86 appear _not_ to be backwards compatible
+ Note: UAPI headers for architectures other than x86 were not checked
+ error - UAPI header ABI check failed
+
+You'll notice that the script did detect many UAPI changes that are not
+backwards compatible. Knowing that kernel UAPIs are supposed to be stable
+forever, this is an alarming result. This brings us to the next section: false
+positives.
+
+False Positives
+===============
+
+The UAPI checker is very aggressive in detecting ABI changes, so some false
+positives may appear. For example, if you check all UAPI headers between v6.0
+and v6.1, many breakages will be flagged. Run the following::
+
+ ./scripts/check-uapi.sh -b v6.1 -p v6.0
+
+The errors will be logged to ``abi_error_log.txt``. Here, we'll find examples
+of several types of false positives.
+
+Enum Expansion
+--------------
+
+::
+
+ !!! ABI differences detected in include/uapi/linux/openvswitch.h from v6.0 -> v6.1 !!!
+
+ [C] 'enum ovs_datapath_attr' changed:
+ type size hasn't changed
+ 1 enumerator insertion:
+ 'ovs_datapath_attr::OVS_DP_ATTR_IFINDEX' value '9'
+ 1 enumerator change:
+ 'ovs_datapath_attr::__OVS_DP_ATTR_MAX' from value '9' to '10' at openvswitch.h.current:85:1
+
+In this case, an enum was expanded. Consequently, the "MAX" value was
+incremented. This is not considered a breaking change because it's assumed
+userspace programs are using the MAX value in a sane fashion.
+
+Expanding Into Reserved/Padding Fields
+--------------------------------------
+
+::
+
+ !!! ABI differences detected in include/uapi/linux/perf_event.h from v6.0 -> v6.1 !!!
+
+ [C] 'struct perf_branch_entry' changed:
+ type size hasn't changed
+ 3 data member insertions:
+ '__u64 spec', at offset 152 (in bits) at perf_event.h.current:1420:1
+ '__u64 new_type', at offset 154 (in bits) at perf_event.h.current:1421:1
+ '__u64 priv', at offset 158 (in bits) at perf_event.h.current:1422:1
+ 1 data member change:
+ '__u64 reserved' offset changed from 152 to 161 (in bits) (by +9 bits)
+
+In this case, a reserved field was expanded into. Previously, the reserved
+field occupied 40 bits in the struct. After the change, three new members
+were added that took up 9 bits, so the size of the reserved field was
+reduced to 31.
+
+As the size of the struct did not change and none of the fields a userspace
+program could have been using were removed/changed/relocated, this change is
+not considered breaking.
+
+Removals For Refactoring
+------------------------
+
+There is not an example of this in the v6.0 -> v6.1 span, but try::
+
+ % ./scripts/check-uapi.sh -b d759be8953febb6e5b5376c7d9bbf568864c6e2d -m
+ warning - detected changes to following files, but can't check them with x86 compiler:
+ - arch/alpha/include/uapi/asm/poll.h
+ - arch/ia64/include/uapi/asm/poll.h
+
+ Filtering them out and performing partial check on remaining files...
+ Saving current tree state... OK
+ Installing sanitized UAPI headers from d759be8953febb6e5b5376c7d9bbf568864c6e2d... OK
+ Installing sanitized UAPI headers from d759be8953febb6e5b5376c7d9bbf568864c6e2d^1... OK
+ Restoring current tree state... OK
+ Checking changes to UAPI headers between d759be8953febb6e5b5376c7d9bbf568864c6e2d^1 and d759be8953febb6e5b5376c7d9bbf568864c6e2d
+ error - UAPI header arch/x86/include/uapi/asm/poll.h was incorrectly removed
+ error - 1/1 UAPI headers modified between d759be8953febb6e5b5376c7d9bbf568864c6e2d^1 and d759be8953febb6e5b5376c7d9bbf568864c6e2d appear _not_ to be backwards compatible
+ error - UAPI header ABI check failed
+
+In this case, the commit was a cleanup/refactoring change that does not impact
+UAPIs. Unfortunately, the script is not smart enough to see this and raises the
+alarm because UAPI headers were removed.
+
+Summary
+-------
+
+There may be other examples of false positives that are not listed here.
+
+In the future, as tooling improves, we may be able to filter out more of these
+false positives. There may also be additional examples of false positives not
+listed here. Use your best judgement, and ideally a unit test in userspace, to
+test your UAPI changes!
+
+False Negatives
+===============
+
+The script can report false negatives when running with ``-m``. This occurs
+when a change is made to a header file which causes a break in a separate header
+file that includes it.
+
+For this reason, it's recommended not to run with the ``-m`` option unless you
+know the modified files are not being included elsewhere.
+
+For an example of this behavior, please see the Cross-Dependency Breakage
+example above.
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 6b0663075dc0..0876f5a2cf55 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -34,6 +34,7 @@ Documentation/dev-tools/testing-overview.rst
kselftest
kunit/index
ktap
+ checkuapi


.. only:: subproject and html
--
2.17.1


2023-03-21 21:20:27

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] check-uapi: Introduce check-uapi.sh

On Fri, Mar 17, 2023 at 3:17 AM John Moon <[email protected]> wrote:
>
> While the kernel community has been good at maintaining backwards
> compatibility with kernel UAPIs, it would be helpful to have a tool
> to check if a commit introduces changes that break backwards
> compatibility.
>
> To that end, introduce check-uapi.sh: a simple shell script that
> checks for changes to UAPI headers using libabigail.
>
> libabigail is "a framework which aims at helping developers and
> software distributors to spot some ABI-related issues like interface
> incompatibility in ELF shared libraries by performing a static
> analysis of the ELF binaries at hand."
>
> The script uses one of libabigail's tools, "abidiff", to compile the
> changed header before and after the commit to detect any changes.
>
> abidiff "compares the ABI of two shared libraries in ELF format. It
> emits a meaningful report describing the differences between the two
> ABIs."
>
> The script also includes the ability to check the compatibility of
> all UAPI headers across commits. This allows developers to inspect
> the stability of the UAPIs over time.
>
> Signed-off-by: John Moon <[email protected]>



I think I pointed out that not all UAPI headers are checked-in.


I applied the following.
This will remove <asm/termios.h>.
So, definitely it is a UAPI breakage, but check_all=true
mode succeeds.


diff --git a/include/uapi/asm-generic/Kbuild b/include/uapi/asm-generic/Kbuild
index ebb180aac74e..a9c88b0a8b3b 100644
--- a/include/uapi/asm-generic/Kbuild
+++ b/include/uapi/asm-generic/Kbuild
@@ -31,6 +31,6 @@ mandatory-y += stat.h
mandatory-y += statfs.h
mandatory-y += swab.h
mandatory-y += termbits.h
-mandatory-y += termios.h
+#mandatory-y += termios.h
mandatory-y += types.h
mandatory-y += unistd.h




Another false positive is a change outside the uapi directories.

For x86, syscall UAPI headers are generated in
arch/x86/entry/syscalls/.


You never know the compatibility of UAPI unless you check
all the headers that 'make headers_install' results in.


I am reluctant to read the whole code,
but I spotted possible bugs.





> +
> +readonly SUCCESS=0
> +readonly FAIL_ABI=1
> +readonly FAIL_PREREQ=2
> +readonly FAIL_COMPILE=3
> +readonly FAIL_ARCH=4
> +readonly FAIL=5
> +
> +# Print to stderr
> +eprintf() {
> + # shellcheck disable=SC2059
> + printf "$@" >&2
> +}
> +
> +# Print list of UAPI files to operate on
> +get_uapi_files() {
> + local -r check_all="$1"
> + local -r base_ref="$2"
> + local -r ref="$3"

local -r past_ref="$3"



> + local -r file_list="${tmp_dir}/file_list.txt"
> +
> + # Return already-processed list if it's available
> + if [ -s "$file_list" ]; then
> + cat "$file_list"
> + return 0
> + fi
> +
> + if [ "$check_all" = "true" ]; then
> + # Print all of the UAPI header files at the commit in question
> + # Ignore the headers that we can't test.
> + # shellcheck disable=SC2086,SC2046
> + git ls-tree --full-tree --name-only -r "${base_ref:-HEAD}" -- "${UAPI_DIRS[@]}" \
> + | grep '.h$' \
> + | sed -e 's/^/M\t/g' \
> + | grep -v $({ get_no_header_list "$base_ref"; get_no_export_list; } | xargs -- printf '-e %s ') \
> + > "$file_list"
> + else
> + if [ -z "$base_ref" ] || [ -z "$ref" ]; then


[ -z "$ref" ] is always false.



> + # shellcheck disable=SC2086
> + git diff $GIT_ARGS > "$file_list"


This is a bug.

Think about this situation:

$base_ref is empty, i.e. workspace
$ref is not HEAD but a commit in the past.


"git diff" will return the diff
between the cache and the workspace.






> + else
> + # shellcheck disable=SC2086
> + git diff "$ref" "$base_ref" $GIT_ARGS > "$file_list"
> + fi


Do you need this if...else...fi block in the first place?


git diff $ref $base_ref $GIT_ARGS > "$file_list"


will naturally work correctly for all cases.
(do not quote $base_ref, which might be empty)




> +
> + if mismatch_arch_files=$(grep -v "arch/${ARCH}" "$file_list" | grep -o "arch/.*\.h"); then
> + eprintf "warning - detected changes to following files, but can't check them with %s compiler:\n" "$ARCH"
> + for f in $mismatch_arch_files; do
> + eprintf " - %s\n" "$f"
> + sed -i -e "\|${f}|d" "$file_list"
> + printf "warning - could not perform ABI check on %s with %s compiler\n" "$f" "$ARCH" \
> + >> "${tmp_dir}/arch_mismatches.error"
> + done
> + eprintf "\nFiltering them out and performing partial check on remaining files...\n"
> + if [ ! -s "$file_list" ]; then
> + eprintf "error - all files were filtered out, there's nothing to check!\n"
> + exit "$FAIL_ARCH"
> + fi
> + fi
> + fi
> +
> + if [ ! -s "$file_list" ]; then
> + return 1
> + fi
> +
> + cat "$file_list"
> +}
> +
> +# Compile the simple test app
> +do_compile() {
> + local -r inc_dir="$1"
> + local -r header="$2"
> + local -r out="$3"
> + printf "int main(void) { return 0; }\n" | \
> + "$CC" -c \
> + -o "$out" \
> + -x c \
> + -O0 \
> + -std=c90 \
> + -fno-eliminate-unused-debug-types \
> + -g \
> + "-I${inc_dir}" \
> + -include "$header" \
> + -
> +}
> +
> +# Print the list of incompatible headers
> +get_no_header_list() {
> + local -r ref="${1:-HEAD}"
> + # Start with the usr/include/Makefile to get a list of the headers
> + # that don't compile using this method.
> + {
> + # shellcheck disable=SC2016
> + printf 'all: ; @echo $(no-header-test)\n'
> + cat "usr/include/Makefile"
> + } | SRCARCH="$ARCH" make -f - | tr " " "\n" | grep -v "asm-generic"
> +
> + # The makefile also skips all asm-generic files, but prints "asm-generic/%"
> + # which won't match the files in this script, so just print all the asm-generic
> + # headers manually.
> + git ls-tree --full-tree --name-only -r "$ref" -- include/uapi/asm-generic \
> + | grep '.h$' \
> + | cut -d '/' -f 2-
> +
> + # Finally, print all the architecture-specific headers that are _not_ for the
> + # ARCH we're targeting
> + git ls-tree --full-tree --name-only -r "$ref" -- arch/*/include/uapi \
> + | grep '.h$' \
> + | grep -v "^arch/${ARCH}"
> +}
> +
> +# Print the list of headers that are not exported for this architecture
> +get_no_export_list() {
> + {
> + # shellcheck disable=SC2016
> + printf 'all: ; @echo $(no-export-headers)\n'
> + cat "include/uapi/Kbuild"
> + } | SRCARCH="$ARCH" srctree="$KERNEL_SRC" make -f - | tr " " "\n" | sed '/^[[:space:]]*$/d'
> +}
> +
> +# Save the current git tree state, stashing if needed
> +save_tree_state() {
> + printf "Saving current tree state... "
> + current_ref="$(git rev-parse HEAD)"
> + readonly current_ref
> + if ! git diff-index --quiet HEAD; then
> + unstash="true"
> + git stash push --quiet
> + fi
> + printf "OK\n"
> +}
> +
> +# Restore the git tree state, unstashing if needed
> +restore_tree_state() {
> + if [ -z "$current_ref" ]; then
> + return 0
> + fi
> +
> + printf "Restoring current tree state... "
> + git checkout --quiet "$current_ref"
> + if [ "$unstash" = "true" ]; then
> + git stash pop --quiet
> + unstash="false"
> + fi
> + printf "OK\n"
> +}
> +
> +# Install headers for both git refs
> +install_headers() {
> + local -r check_all="$1"
> + local -r base_ref="$2"
> + local -r ref="$3"
> +
> + deviated_from_current_tree="false"
> + for inst_ref in "$base_ref" "$ref"; do


Bakeshed1: personally, I would use 'ref' as an iterator.

for ref in "$base_ref" "$past_ref"; do

I see overall inconsistency between ref vs past_ref.





> + if [ -n "$inst_ref" ]; then
> + if [ "$deviated_from_current_tree" = "false" ]; then
> + save_tree_state
> + trap 'rm -rf "$tmp_dir"; restore_tree_state;' EXIT


Bikeshed2:
Instead of updating the trap back-and-force,
I would create a function.


exit_handler () {
if [ "$deviated_from_current_tree" = true ]; then
restore_tree_state
fi

if [ -n "$tmp_dir" ]; then
rm -rf "$tmp_dir"
fi
}

trap exist_handler EXIT






> + deviated_from_current_tree="true"
> + fi
> + # This script ($0) is already loaded into memory at this point,
> + # so this operation is safe
> + git checkout --quiet "$(git rev-parse "$inst_ref")"
> + fi
> +
> + printf "Installing sanitized UAPI headers from %s... " "${inst_ref:-dirty tree}"
> + make ARCH="$ARCH" INSTALL_HDR_PATH="${tmp_dir}/${inst_ref}/usr" headers_install > /dev/null 2>&1


You can add -j${MAX_THREADS} for faster 'make headers_install'.








> + printf "OK\n"
> + done
> +
> + restore_tree_state
> + trap 'rm -rf "$tmp_dir"' EXIT
> +}
> +
> +# Check file list for UAPI compatibility
> +check_uapi_files() {
> + local -r check_all="$1"
> + local -r base_ref="$2"
> + local -r ref="$3"
> +
> + install_headers "$check_all" "$base_ref" "$ref"
> +
> + local passed=0;
> + local failed=0;
> + local -a threads=()
> +
> + printf "Checking changes to UAPI headers between %s and %s\n" "$ref" "${base_ref:-dirty tree}"
> + while read -r status file; do
> + if [ "${#threads[@]}" -ge "$MAX_THREADS" ]; then
> + if wait "${threads[0]}"; then
> + passed=$((passed + 1))
> + else
> + failed=$((failed + 1))
> + fi
> + threads=("${threads[@]:1}")
> + fi
> +
> + check_individual_file "$base_ref" "$ref" "$status" "$file" &
> + threads+=("$!")
> + done < <(get_uapi_files "$check_all" "$base_ref" "$ref")
> +
> + for t in "${threads[@]}"; do
> + if wait "$t"; then
> + passed=$((passed + 1))
> + else
> + failed=$((failed + 1))
> + fi
> + done
> +
> + if [ "$check_all" = "true" ]; then
> + scope=$(printf "UAPI headers compatible with %s" "$ARCH")
> + else
> + scope=$(printf "UAPI headers modified between %s and %s" "$ref" "${base_ref:-dirty tree}")
> + fi
> +
> + total="$((passed + failed))"
> + if [ "$failed" -gt 0 ]; then
> + eprintf "error - %d/%d %s appear _not_ to be backwards compatible\n" "$failed" "$total" "$scope"
> + else
> + printf "All %d %s appear to be backwards compatible\n" "$total" "$scope"
> + fi
> +
> + if [ "$check_all" = "true" ]; then
> + printf "Note: UAPI headers for architectures other than %s were not checked\n" "$ARCH"
> + fi
> +
> + return "$failed"
> +}
> +
> +# Print the path to a given header in the tmp_dir
> +get_header() {
> + local -r ref="$1"
> + local -r arch="$2"
> + local -r base="$3"
> +
> + if [ -z "$arch" ]; then
> + printf "%s" "${tmp_dir}/${ref}/usr/${base}"
> + else
> + printf "%s" "${tmp_dir}/${ref}/usr/$(printf "%s" "$base" | cut -d '/' -f 3-)"
> + fi
> +}
> +
> +# Check an individual file for UAPI compatibility
> +check_individual_file() {
> + local -r base_ref="$1"
> + local -r ref="$2"
> + local -r status="$3"
> + local -r file="$4"
> +
> + local -r base=${file/uapi\//}
> + local -r uapi_arch="$(printf "%s" "$file" | grep -o 'arch/.*/uapi' | cut -d '/' -f 2)"
> + local -r base_header=$(get_header "$base_ref" "$uapi_arch" "$base")
> + local -r ref_header=$(get_header "$ref" "$uapi_arch" "$base")
> + local -r installed_base="$(printf "%s" "$base_header" | grep -o "usr/include/.*" | cut -d '/' -f 3-)"
> +
> + if [ "$status" = "D" ]; then
> + eprintf "error - UAPI header %s was incorrectly removed\n" "$file" | tee "${base_header}.error"
> + return 1
> + fi
> +
> + # shellcheck disable=SC2076
> + if [[ " $(get_no_header_list | xargs) " =~ " $installed_base " ]]; then
> + printf "warning - %s cannot be tested by this script (see usr/include/Makefile)\n" "$file" | tee "${base_header}.error"
> + return 1
> + fi
> +
> + # shellcheck disable=SC2076
> + if [[ " $(get_no_export_list | xargs) " =~ " $installed_base " ]]; then
> + printf "warning - %s is not exported by the %s architecture, so cannot be tested\n" "$file" "$ARCH" | tee "${base_header}.error"
> + return 1
> + fi
> +
> + for h in "$base_header" "$ref_header"; do
> + if [ ! -f "$h" ]; then
> + eprintf "error - %s does not exist - cannot compare ABI\n" "$h" | tee "${h}.error"
> + return 1
> + fi
> + done
> +
> + compare_abi "$file" "$base_header" "$ref_header" "$base_ref" "$ref"
> +}
> +
> +# Perform the A/B compilation and compare output ABI
> +compare_abi() {
> + local -r file="$1"
> + local -r base_header="$2"
> + local -r ref_header="$3"
> + local -r base_ref="$4"
> + local -r ref="$5"
> + local -r log="${tmp_dir}/log/$(basename "$file").log"
> +
> + mkdir -p "$(dirname "$log")"
> +
> + if ! do_compile "${tmp_dir}/${base_ref}/usr/include" "$base_header" "${base_header}.bin" 2> "$log"; then
> + eprintf "error - couldn't compile version of UAPI header %s at %s\n" "$file" "$base_ref"
> + cat "$log" >&2
> + exit "$FAIL_COMPILE"
> + fi
> +
> + if ! do_compile "${tmp_dir}/${ref}/usr/include" "$ref_header" "${ref_header}.bin" 2> "$log"; then
> + eprintf "error - couldn't compile version of UAPI header %s at %s\n" "$file" "$ref"
> + cat "$log" >&2
> + exit "$FAIL_COMPILE"
> + fi
> +
> + "$ABIDIFF" --non-reachable-types "${ref_header}.bin" "${base_header}.bin" > "$log" && ret="$?" || ret="$?"
> + if [ "$ret" -eq 0 ]; then
> + if [ "$VERBOSE" = "true" ]; then
> + printf "No ABI differences detected in %s from %s -> %s\n" "$file" "$ref" "${base_ref:-dirty tree}"
> + fi
> + else
> + # Bits in abidiff's return code can be used to determine the type of error
> + if [ $(("$ret" & 0x1)) -gt 0 ]; then
> + eprintf "error - abidiff did not run properly\n"
> + exit 1
> + fi
> +
> + # If the only changes were additions (not modifications to existing APIs), then
> + # there's no problem. Ignore these diffs.
> + if grep "Unreachable types summary" "$log" | grep -q "0 removed" &&
> + grep "Unreachable types summary" "$log" | grep -q "0 changed"; then
> + return 0
> + fi
> + {
> + printf "!!! ABI differences detected in %s from %s -> %s !!!\n\n" "$file" "$ref" "${base_ref:-dirty tree}"
> + sed -e '/summary:/d' -e '/changed type/d' -e '/^$/d' -e 's/^/ /g' "$log"
> +
> + if ! cmp "$ref_header" "$base_header" > /dev/null 2>&1; then
> + printf "\nHeader file diff (after headers_install):\n"
> + diff -Naur "$ref_header" "$base_header" \
> + | sed -e "s|${ref_header}|${ref}/${file}|g" \
> + -e "s|${base_header}|${base_ref:-dirty}/${file}|g"
> + printf "\n"
> + else
> + printf "\n%s did not change between %s and %s...\n" "$file" "$ref" "${base_ref:-dirty tree}"
> + printf "It's possible a change to one of the headers it includes caused this error:\n"
> + grep '^#include' "$base_header"
> + printf "\n"
> + fi
> + } | tee "${base_header}.error" >&2
> + return 1
> + fi
> +}
> +
> +min_version_is_satisfied() {
> + local -r min_version="$1"
> + local -r version_installed="$2"
> +
> + printf "%s\n%s\n" "$min_version" "$version_installed" | sort -Vc > /dev/null 2>&1
> +}
> +
> +# Make sure we have the tools we need
> +check_deps() {
> + ABIDIFF="${ABIDIFF:-abidiff}"
> + CC="${CC:-gcc}"
> + ARCH="${ARCH:-$(uname -m)}"
> + if [ "$ARCH" = "x86_64" ]; then
> + ARCH="x86"
> + fi
> +
> + local -r abidiff_min_version="1.7"
> + local -r libdw_min_version_if_clang="0.171"
> +
> + if ! command -v "$ABIDIFF" > /dev/null 2>&1; then
> + eprintf "error - abidiff not found!\n"
> + eprintf "Please install abigail-tools version %s or greater\n" "$abidiff_min_version"
> + eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n"
> + exit "$FAIL_PREREQ"
> + fi
> +
> + local -r abidiff_version="$("$ABIDIFF" --version | cut -d ' ' -f 2)"
> + if ! min_version_is_satisfied "$abidiff_min_version" "$abidiff_version"; then
> + eprintf "error - abidiff version too old: %s\n" "$abidiff_version"
> + eprintf "Please install abigail-tools version %s or greater\n" "$abidiff_min_version"
> + eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n"
> + exit "$FAIL_PREREQ"
> + fi
> +
> + if ! command -v "$CC" > /dev/null 2>&1; then
> + eprintf 'error - %s not found\n' "$CC"
> + exit "$FAIL_PREREQ"
> + fi
> +
> + if "$CC" --version | grep -q clang; then
> + local -r libdw_version="$(ldconfig -v 2>/dev/null | grep -v SKIPPED | grep -m 1 -o 'libdw-[0-9]\+.[0-9]\+' | cut -c 7-)"
> + if ! min_version_is_satisfied "$libdw_min_version_if_clang" "$libdw_version"; then
> + eprintf "error - libdw version too old for use with clang: %s\n" "$libdw_version"
> + eprintf "Please install libdw from elfutils version %s or greater\n" "$libdw_min_version_if_clang"
> + eprintf "See: https://sourceware.org/elfutils/\n"
> + exit "$FAIL_PREREQ"
> + fi
> + fi
> +
> + if [ ! -d "arch/${ARCH}" ]; then
> + eprintf 'error - ARCH "%s" is not a subdirectory under arch/\n' "$ARCH"
> + eprintf "Please set ARCH to one of:\n%s\n" "$(find arch -maxdepth 1 -mindepth 1 -type d -printf '%f ' | fmt)"
> + exit "$FAIL_PREREQ"
> + fi
> +}
> +
> +run() {
> + local base_ref="$1"
> + local -r check_all="$2"
> + shift 2
> + local -r orig_args="$*"
> + if [ -z "$KERNEL_SRC" ]; then
> + KERNEL_SRC="$(realpath "$(dirname "$0")"/..)"
> + fi
> +
> + cd "$KERNEL_SRC"
> +
> + abi_error_log="${abi_error_log:-${KERNEL_SRC}/abi_error_log.txt}"
> +
> + tmp_dir=$(mktemp -d)
> + trap 'rm -rf "$tmp_dir"' EXIT
> +
> + check_deps
> +
> + # Set of UAPI directories to check by default
> + UAPI_DIRS=(include/uapi arch/*/include/uapi)
> + GIT_ARGS="--name-status --no-renames --format= --diff-filter=a -- ${UAPI_DIRS[*]/%/\/*.h}"
> +
> + if ! git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
> + eprintf "error - this script requires the kernel tree to be initialized with Git\n"
> + exit "$FAIL_PREREQ"
> + fi
> +
> + # If there are no dirty UAPI files, use HEAD as base_ref
> + # shellcheck disable=SC2086
> + if [ -z "$base_ref" ] && git diff --exit-code $GIT_ARGS > /dev/null 2>&1; then


I think this behavior is difficult to understand.

I think it is clearer to do 'git diff' for the entire tree
when you determine the base_ref.





> + base_ref="HEAD"
> + fi
> +
> + if [ -z "$past_ref" ]; then
> + if [ -n "$base_ref" ]; then
> + past_ref="${base_ref}^1"
> + else
> + past_ref="HEAD"
> + fi
> + fi



So, past_ref is set to either ${base_ref}^1 or HEAD here.


No quotes are needed for HEAD.

past_ref=HEAD

is enough.



> +
> + if [ -n "$past_ref" ] && ! git rev-parse --verify "$past_ref" > /dev/null 2>&1; then


[ -n "$past_ref" ] is always true since past_ref is set to something.








> + printf 'error - invalid git reference "%s"\n' "$past_ref"
> + exit "$FAIL_PREREQ"
> + fi
> +
> + if [ -n "$base_ref" ]; then
> + if ! git merge-base --is-ancestor "$past_ref" "$base_ref" > /dev/null 2>&1; then
> + printf 'error - "%s" is not an ancestor of base ref "%s"\n' "$past_ref" "$base_ref"
> + exit "$FAIL_PREREQ"
> + fi
> + fi
> +
> + if [ "$check_all" != "true" ] && ! get_uapi_files "$check_all" "$base_ref" "$past_ref" > /dev/null; then
> + printf "No changes to UAPI headers were applied between %s and %s\n" "$past_ref" "$base_ref"
> + exit "$SUCCESS"
> + fi
> +
> + if ! check_uapi_files "$check_all" "$base_ref" "$past_ref"; then
> + eprintf "error - UAPI header ABI check failed\n"
> + {
> + printf 'Generated by "%s %s" from git ref %s\n\n' "$0" "$orig_args" "$(git rev-parse "HEAD")"
> + find "$tmp_dir" -type f -name '*.error' -exec cat {} +
> + } > "$abi_error_log"
> + eprintf "Failure summary saved to %s\n" "$abi_error_log"
> + exit "$FAIL_ABI"
> + fi
> +}
> +
> +main() {
> + MAX_THREADS=$(nproc)
> + VERBOSE="false"
> + local base_ref=""
> + local check_all="true"
> + local quiet="false"
> + while getopts "hb:p:mj:l:qv" opt; do
> + case $opt in
> + h)
> + print_usage
> + exit "$SUCCESS"
> + ;;
> + b)
> + base_ref="$OPTARG"
> + ;;
> + p)
> + past_ref="$OPTARG"
> + ;;
> + m)
> + check_all="false"
> + ;;
> + j)
> + MAX_THREADS="$OPTARG"
> + ;;
> + l)
> + abi_error_log="$OPTARG"
> + ;;
> + q)
> + quiet="true"
> + ;;
> + v)
> + VERBOSE="true"
> + ;;
> + *)
> + exit "$FAIL_PREREQ"
> + esac
> + done
> +
> + if [ "$quiet" = "true" ]; then
> + run "$base_ref" "$check_all" "$@" > /dev/null
> + else
> + run "$base_ref" "$check_all" "$@"
> + fi
> +}
> +
> +main "$@"
> --
> 2.17.1
>


--
Best Regards
Masahiro Yamada

2023-03-22 05:01:34

by John Moon

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] check-uapi: Introduce check-uapi.sh

On 3/21/2023 2:19 PM, Masahiro Yamada wrote:

>
> I think I pointed out that not all UAPI headers are checked-in.
>
>
> I applied the following.
> This will remove <asm/termios.h>.
> So, definitely it is a UAPI breakage, but check_all=true
> mode succeeds.
>
>
> diff --git a/include/uapi/asm-generic/Kbuild b/include/uapi/asm-generic/Kbuild
> index ebb180aac74e..a9c88b0a8b3b 100644
> --- a/include/uapi/asm-generic/Kbuild
> +++ b/include/uapi/asm-generic/Kbuild
> @@ -31,6 +31,6 @@ mandatory-y += stat.h
> mandatory-y += statfs.h
> mandatory-y += swab.h
> mandatory-y += termbits.h
> -mandatory-y += termios.h
> +#mandatory-y += termios.h
> mandatory-y += types.h
> mandatory-y += unistd.h
>
>
>
>
> Another false positive is a change outside the uapi directories.
>
> For x86, syscall UAPI headers are generated in
> arch/x86/entry/syscalls/.
>
>
> You never know the compatibility of UAPI unless you check
> all the headers that 'make headers_install' results in.
>

Excellent points. We'll make another pass looking exclusively at the
"make headers_install" trees for the two refs. That should be more
reliable, require less code, and catch issues like these.

>
> I am reluctant to read the whole code,
> but I spotted possible bugs.
>
>
>
>
>
>> +
>> +readonly SUCCESS=0
>> +readonly FAIL_ABI=1
>> +readonly FAIL_PREREQ=2
>> +readonly FAIL_COMPILE=3
>> +readonly FAIL_ARCH=4
>> +readonly FAIL=5
>> +
>> +# Print to stderr
>> +eprintf() {
>> + # shellcheck disable=SC2059
>> + printf "$@" >&2
>> +}
>> +
>> +# Print list of UAPI files to operate on
>> +get_uapi_files() {
>> + local -r check_all="$1"
>> + local -r base_ref="$2"
>> + local -r ref="$3"
>
> local -r past_ref="$3"
>
>
>
>> + local -r file_list="${tmp_dir}/file_list.txt"
>> +
>> + # Return already-processed list if it's available
>> + if [ -s "$file_list" ]; then
>> + cat "$file_list"
>> + return 0
>> + fi
>> +
>> + if [ "$check_all" = "true" ]; then
>> + # Print all of the UAPI header files at the commit in question
>> + # Ignore the headers that we can't test.
>> + # shellcheck disable=SC2086,SC2046
>> + git ls-tree --full-tree --name-only -r "${base_ref:-HEAD}" -- "${UAPI_DIRS[@]}" \
>> + | grep '.h$' \
>> + | sed -e 's/^/M\t/g' \
>> + | grep -v $({ get_no_header_list "$base_ref"; get_no_export_list; } | xargs -- printf '-e %s ') \
>> + > "$file_list"
>> + else
>> + if [ -z "$base_ref" ] || [ -z "$ref" ]; then
>
>
> [ -z "$ref" ] is always false.
>
>
>
>> + # shellcheck disable=SC2086
>> + git diff $GIT_ARGS > "$file_list"
>
>
> This is a bug.
>
> Think about this situation:
>
> $base_ref is empty, i.e. workspace
> $ref is not HEAD but a commit in the past.
>
>
> "git diff" will return the diff
> between the cache and the workspace.
>
>
>
>
>
>
>> + else
>> + # shellcheck disable=SC2086
>> + git diff "$ref" "$base_ref" $GIT_ARGS > "$file_list"
>> + fi
>
>
> Do you need this if...else...fi block in the first place?
>
>
> git diff $ref $base_ref $GIT_ARGS > "$file_list"
>
>
> will naturally work correctly for all cases.
> (do not quote $base_ref, which might be empty)
>
>

Great suggestion, that's far more elegant. Thank you!

>
>
>> +
>> + if mismatch_arch_files=$(grep -v "arch/${ARCH}" "$file_list" | grep -o "arch/.*\.h"); then
>> + eprintf "warning - detected changes to following files, but can't check them with %s compiler:\n" "$ARCH"
>> + for f in $mismatch_arch_files; do
>> + eprintf " - %s\n" "$f"
>> + sed -i -e "\|${f}|d" "$file_list"
>> + printf "warning - could not perform ABI check on %s with %s compiler\n" "$f" "$ARCH" \
>> + >> "${tmp_dir}/arch_mismatches.error"
>> + done
>> + eprintf "\nFiltering them out and performing partial check on remaining files...\n"
>> + if [ ! -s "$file_list" ]; then
>> + eprintf "error - all files were filtered out, there's nothing to check!\n"
>> + exit "$FAIL_ARCH"
>> + fi
>> + fi
>> + fi
>> +
>> + if [ ! -s "$file_list" ]; then
>> + return 1
>> + fi
>> +
>> + cat "$file_list"
>> +}
>> +
>> +# Compile the simple test app
>> +do_compile() {
>> + local -r inc_dir="$1"
>> + local -r header="$2"
>> + local -r out="$3"
>> + printf "int main(void) { return 0; }\n" | \
>> + "$CC" -c \
>> + -o "$out" \
>> + -x c \
>> + -O0 \
>> + -std=c90 \
>> + -fno-eliminate-unused-debug-types \
>> + -g \
>> + "-I${inc_dir}" \
>> + -include "$header" \
>> + -
>> +}
>> +
>> +# Print the list of incompatible headers
>> +get_no_header_list() {
>> + local -r ref="${1:-HEAD}"
>> + # Start with the usr/include/Makefile to get a list of the headers
>> + # that don't compile using this method.
>> + {
>> + # shellcheck disable=SC2016
>> + printf 'all: ; @echo $(no-header-test)\n'
>> + cat "usr/include/Makefile"
>> + } | SRCARCH="$ARCH" make -f - | tr " " "\n" | grep -v "asm-generic"
>> +
>> + # The makefile also skips all asm-generic files, but prints "asm-generic/%"
>> + # which won't match the files in this script, so just print all the asm-generic
>> + # headers manually.
>> + git ls-tree --full-tree --name-only -r "$ref" -- include/uapi/asm-generic \
>> + | grep '.h$' \
>> + | cut -d '/' -f 2-
>> +
>> + # Finally, print all the architecture-specific headers that are _not_ for the
>> + # ARCH we're targeting
>> + git ls-tree --full-tree --name-only -r "$ref" -- arch/*/include/uapi \
>> + | grep '.h$' \
>> + | grep -v "^arch/${ARCH}"
>> +}
>> +
>> +# Print the list of headers that are not exported for this architecture
>> +get_no_export_list() {
>> + {
>> + # shellcheck disable=SC2016
>> + printf 'all: ; @echo $(no-export-headers)\n'
>> + cat "include/uapi/Kbuild"
>> + } | SRCARCH="$ARCH" srctree="$KERNEL_SRC" make -f - | tr " " "\n" | sed '/^[[:space:]]*$/d'
>> +}
>> +
>> +# Save the current git tree state, stashing if needed
>> +save_tree_state() {
>> + printf "Saving current tree state... "
>> + current_ref="$(git rev-parse HEAD)"
>> + readonly current_ref
>> + if ! git diff-index --quiet HEAD; then
>> + unstash="true"
>> + git stash push --quiet
>> + fi
>> + printf "OK\n"
>> +}
>> +
>> +# Restore the git tree state, unstashing if needed
>> +restore_tree_state() {
>> + if [ -z "$current_ref" ]; then
>> + return 0
>> + fi
>> +
>> + printf "Restoring current tree state... "
>> + git checkout --quiet "$current_ref"
>> + if [ "$unstash" = "true" ]; then
>> + git stash pop --quiet
>> + unstash="false"
>> + fi
>> + printf "OK\n"
>> +}
>> +
>> +# Install headers for both git refs
>> +install_headers() {
>> + local -r check_all="$1"
>> + local -r base_ref="$2"
>> + local -r ref="$3"
>> +
>> + deviated_from_current_tree="false"
>> + for inst_ref in "$base_ref" "$ref"; do
>
>
> Bakeshed1: personally, I would use 'ref' as an iterator.
>
> for ref in "$base_ref" "$past_ref"; do
>
> I see overall inconsistency between ref vs past_ref.
>
>

Right, I will clean that up in v4.

>
>
>
>> + if [ -n "$inst_ref" ]; then
>> + if [ "$deviated_from_current_tree" = "false" ]; then
>> + save_tree_state
>> + trap 'rm -rf "$tmp_dir"; restore_tree_state;' EXIT
>
>
> Bikeshed2:
> Instead of updating the trap back-and-force,
> I would create a function.
>
>
> exit_handler () {
> if [ "$deviated_from_current_tree" = true ]; then
> restore_tree_state
> fi
>
> if [ -n "$tmp_dir" ]; then
> rm -rf "$tmp_dir"
> fi
> }
>
> trap exist_handler EXIT
>
>

Good idea, will add in v4.

>
>
>
>
>> + deviated_from_current_tree="true"
>> + fi
>> + # This script ($0) is already loaded into memory at this point,
>> + # so this operation is safe
>> + git checkout --quiet "$(git rev-parse "$inst_ref")"
>> + fi
>> +
>> + printf "Installing sanitized UAPI headers from %s... " "${inst_ref:-dirty tree}"
>> + make ARCH="$ARCH" INSTALL_HDR_PATH="${tmp_dir}/${inst_ref}/usr" headers_install > /dev/null 2>&1
>
>
> You can add -j${MAX_THREADS} for faster 'make headers_install'.
>
>

Thanks for the tip. Will add in v4.

>
>
>
>
>
>
>> + printf "OK\n"
>> + done
>> +
>> + restore_tree_state
>> + trap 'rm -rf "$tmp_dir"' EXIT
>> +}
>> +
>> +# Check file list for UAPI compatibility
>> +check_uapi_files() {
>> + local -r check_all="$1"
>> + local -r base_ref="$2"
>> + local -r ref="$3"
>> +
>> + install_headers "$check_all" "$base_ref" "$ref"
>> +
>> + local passed=0;
>> + local failed=0;
>> + local -a threads=()
>> +
>> + printf "Checking changes to UAPI headers between %s and %s\n" "$ref" "${base_ref:-dirty tree}"
>> + while read -r status file; do
>> + if [ "${#threads[@]}" -ge "$MAX_THREADS" ]; then
>> + if wait "${threads[0]}"; then
>> + passed=$((passed + 1))
>> + else
>> + failed=$((failed + 1))
>> + fi
>> + threads=("${threads[@]:1}")
>> + fi
>> +
>> + check_individual_file "$base_ref" "$ref" "$status" "$file" &
>> + threads+=("$!")
>> + done < <(get_uapi_files "$check_all" "$base_ref" "$ref")
>> +
>> + for t in "${threads[@]}"; do
>> + if wait "$t"; then
>> + passed=$((passed + 1))
>> + else
>> + failed=$((failed + 1))
>> + fi
>> + done
>> +
>> + if [ "$check_all" = "true" ]; then
>> + scope=$(printf "UAPI headers compatible with %s" "$ARCH")
>> + else
>> + scope=$(printf "UAPI headers modified between %s and %s" "$ref" "${base_ref:-dirty tree}")
>> + fi
>> +
>> + total="$((passed + failed))"
>> + if [ "$failed" -gt 0 ]; then
>> + eprintf "error - %d/%d %s appear _not_ to be backwards compatible\n" "$failed" "$total" "$scope"
>> + else
>> + printf "All %d %s appear to be backwards compatible\n" "$total" "$scope"
>> + fi
>> +
>> + if [ "$check_all" = "true" ]; then
>> + printf "Note: UAPI headers for architectures other than %s were not checked\n" "$ARCH"
>> + fi
>> +
>> + return "$failed"
>> +}
>> +
>> +# Print the path to a given header in the tmp_dir
>> +get_header() {
>> + local -r ref="$1"
>> + local -r arch="$2"
>> + local -r base="$3"
>> +
>> + if [ -z "$arch" ]; then
>> + printf "%s" "${tmp_dir}/${ref}/usr/${base}"
>> + else
>> + printf "%s" "${tmp_dir}/${ref}/usr/$(printf "%s" "$base" | cut -d '/' -f 3-)"
>> + fi
>> +}
>> +
>> +# Check an individual file for UAPI compatibility
>> +check_individual_file() {
>> + local -r base_ref="$1"
>> + local -r ref="$2"
>> + local -r status="$3"
>> + local -r file="$4"
>> +
>> + local -r base=${file/uapi\//}
>> + local -r uapi_arch="$(printf "%s" "$file" | grep -o 'arch/.*/uapi' | cut -d '/' -f 2)"
>> + local -r base_header=$(get_header "$base_ref" "$uapi_arch" "$base")
>> + local -r ref_header=$(get_header "$ref" "$uapi_arch" "$base")
>> + local -r installed_base="$(printf "%s" "$base_header" | grep -o "usr/include/.*" | cut -d '/' -f 3-)"
>> +
>> + if [ "$status" = "D" ]; then
>> + eprintf "error - UAPI header %s was incorrectly removed\n" "$file" | tee "${base_header}.error"
>> + return 1
>> + fi
>> +
>> + # shellcheck disable=SC2076
>> + if [[ " $(get_no_header_list | xargs) " =~ " $installed_base " ]]; then
>> + printf "warning - %s cannot be tested by this script (see usr/include/Makefile)\n" "$file" | tee "${base_header}.error"
>> + return 1
>> + fi
>> +
>> + # shellcheck disable=SC2076
>> + if [[ " $(get_no_export_list | xargs) " =~ " $installed_base " ]]; then
>> + printf "warning - %s is not exported by the %s architecture, so cannot be tested\n" "$file" "$ARCH" | tee "${base_header}.error"
>> + return 1
>> + fi
>> +
>> + for h in "$base_header" "$ref_header"; do
>> + if [ ! -f "$h" ]; then
>> + eprintf "error - %s does not exist - cannot compare ABI\n" "$h" | tee "${h}.error"
>> + return 1
>> + fi
>> + done
>> +
>> + compare_abi "$file" "$base_header" "$ref_header" "$base_ref" "$ref"
>> +}
>> +
>> +# Perform the A/B compilation and compare output ABI
>> +compare_abi() {
>> + local -r file="$1"
>> + local -r base_header="$2"
>> + local -r ref_header="$3"
>> + local -r base_ref="$4"
>> + local -r ref="$5"
>> + local -r log="${tmp_dir}/log/$(basename "$file").log"
>> +
>> + mkdir -p "$(dirname "$log")"
>> +
>> + if ! do_compile "${tmp_dir}/${base_ref}/usr/include" "$base_header" "${base_header}.bin" 2> "$log"; then
>> + eprintf "error - couldn't compile version of UAPI header %s at %s\n" "$file" "$base_ref"
>> + cat "$log" >&2
>> + exit "$FAIL_COMPILE"
>> + fi
>> +
>> + if ! do_compile "${tmp_dir}/${ref}/usr/include" "$ref_header" "${ref_header}.bin" 2> "$log"; then
>> + eprintf "error - couldn't compile version of UAPI header %s at %s\n" "$file" "$ref"
>> + cat "$log" >&2
>> + exit "$FAIL_COMPILE"
>> + fi
>> +
>> + "$ABIDIFF" --non-reachable-types "${ref_header}.bin" "${base_header}.bin" > "$log" && ret="$?" || ret="$?"
>> + if [ "$ret" -eq 0 ]; then
>> + if [ "$VERBOSE" = "true" ]; then
>> + printf "No ABI differences detected in %s from %s -> %s\n" "$file" "$ref" "${base_ref:-dirty tree}"
>> + fi
>> + else
>> + # Bits in abidiff's return code can be used to determine the type of error
>> + if [ $(("$ret" & 0x1)) -gt 0 ]; then
>> + eprintf "error - abidiff did not run properly\n"
>> + exit 1
>> + fi
>> +
>> + # If the only changes were additions (not modifications to existing APIs), then
>> + # there's no problem. Ignore these diffs.
>> + if grep "Unreachable types summary" "$log" | grep -q "0 removed" &&
>> + grep "Unreachable types summary" "$log" | grep -q "0 changed"; then
>> + return 0
>> + fi
>> + {
>> + printf "!!! ABI differences detected in %s from %s -> %s !!!\n\n" "$file" "$ref" "${base_ref:-dirty tree}"
>> + sed -e '/summary:/d' -e '/changed type/d' -e '/^$/d' -e 's/^/ /g' "$log"
>> +
>> + if ! cmp "$ref_header" "$base_header" > /dev/null 2>&1; then
>> + printf "\nHeader file diff (after headers_install):\n"
>> + diff -Naur "$ref_header" "$base_header" \
>> + | sed -e "s|${ref_header}|${ref}/${file}|g" \
>> + -e "s|${base_header}|${base_ref:-dirty}/${file}|g"
>> + printf "\n"
>> + else
>> + printf "\n%s did not change between %s and %s...\n" "$file" "$ref" "${base_ref:-dirty tree}"
>> + printf "It's possible a change to one of the headers it includes caused this error:\n"
>> + grep '^#include' "$base_header"
>> + printf "\n"
>> + fi
>> + } | tee "${base_header}.error" >&2
>> + return 1
>> + fi
>> +}
>> +
>> +min_version_is_satisfied() {
>> + local -r min_version="$1"
>> + local -r version_installed="$2"
>> +
>> + printf "%s\n%s\n" "$min_version" "$version_installed" | sort -Vc > /dev/null 2>&1
>> +}
>> +
>> +# Make sure we have the tools we need
>> +check_deps() {
>> + ABIDIFF="${ABIDIFF:-abidiff}"
>> + CC="${CC:-gcc}"
>> + ARCH="${ARCH:-$(uname -m)}"
>> + if [ "$ARCH" = "x86_64" ]; then
>> + ARCH="x86"
>> + fi
>> +
>> + local -r abidiff_min_version="1.7"
>> + local -r libdw_min_version_if_clang="0.171"
>> +
>> + if ! command -v "$ABIDIFF" > /dev/null 2>&1; then
>> + eprintf "error - abidiff not found!\n"
>> + eprintf "Please install abigail-tools version %s or greater\n" "$abidiff_min_version"
>> + eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n"
>> + exit "$FAIL_PREREQ"
>> + fi
>> +
>> + local -r abidiff_version="$("$ABIDIFF" --version | cut -d ' ' -f 2)"
>> + if ! min_version_is_satisfied "$abidiff_min_version" "$abidiff_version"; then
>> + eprintf "error - abidiff version too old: %s\n" "$abidiff_version"
>> + eprintf "Please install abigail-tools version %s or greater\n" "$abidiff_min_version"
>> + eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n"
>> + exit "$FAIL_PREREQ"
>> + fi
>> +
>> + if ! command -v "$CC" > /dev/null 2>&1; then
>> + eprintf 'error - %s not found\n' "$CC"
>> + exit "$FAIL_PREREQ"
>> + fi
>> +
>> + if "$CC" --version | grep -q clang; then
>> + local -r libdw_version="$(ldconfig -v 2>/dev/null | grep -v SKIPPED | grep -m 1 -o 'libdw-[0-9]\+.[0-9]\+' | cut -c 7-)"
>> + if ! min_version_is_satisfied "$libdw_min_version_if_clang" "$libdw_version"; then
>> + eprintf "error - libdw version too old for use with clang: %s\n" "$libdw_version"
>> + eprintf "Please install libdw from elfutils version %s or greater\n" "$libdw_min_version_if_clang"
>> + eprintf "See: https://sourceware.org/elfutils/\n"
>> + exit "$FAIL_PREREQ"
>> + fi
>> + fi
>> +
>> + if [ ! -d "arch/${ARCH}" ]; then
>> + eprintf 'error - ARCH "%s" is not a subdirectory under arch/\n' "$ARCH"
>> + eprintf "Please set ARCH to one of:\n%s\n" "$(find arch -maxdepth 1 -mindepth 1 -type d -printf '%f ' | fmt)"
>> + exit "$FAIL_PREREQ"
>> + fi
>> +}
>> +
>> +run() {
>> + local base_ref="$1"
>> + local -r check_all="$2"
>> + shift 2
>> + local -r orig_args="$*"
>> + if [ -z "$KERNEL_SRC" ]; then
>> + KERNEL_SRC="$(realpath "$(dirname "$0")"/..)"
>> + fi
>> +
>> + cd "$KERNEL_SRC"
>> +
>> + abi_error_log="${abi_error_log:-${KERNEL_SRC}/abi_error_log.txt}"
>> +
>> + tmp_dir=$(mktemp -d)
>> + trap 'rm -rf "$tmp_dir"' EXIT
>> +
>> + check_deps
>> +
>> + # Set of UAPI directories to check by default
>> + UAPI_DIRS=(include/uapi arch/*/include/uapi)
>> + GIT_ARGS="--name-status --no-renames --format= --diff-filter=a -- ${UAPI_DIRS[*]/%/\/*.h}"
>> +
>> + if ! git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
>> + eprintf "error - this script requires the kernel tree to be initialized with Git\n"
>> + exit "$FAIL_PREREQ"
>> + fi
>> +
>> + # If there are no dirty UAPI files, use HEAD as base_ref
>> + # shellcheck disable=SC2086
>> + if [ -z "$base_ref" ] && git diff --exit-code $GIT_ARGS > /dev/null 2>&1; then
>
>
> I think this behavior is difficult to understand.
>
> I think it is clearer to do 'git diff' for the entire tree
> when you determine the base_ref.
>
>

Okay. With the model of only looking at "make headers_install" output, I
imagine this bit will change anyway.

>
>
>
>> + base_ref="HEAD"
>> + fi
>> +
>> + if [ -z "$past_ref" ]; then
>> + if [ -n "$base_ref" ]; then
>> + past_ref="${base_ref}^1"
>> + else
>> + past_ref="HEAD"
>> + fi
>> + fi
>
>
>
> So, past_ref is set to either ${base_ref}^1 or HEAD here.
>
>
> No quotes are needed for HEAD.
>
> past_ref=HEAD
>
> is enough.
>
>
>
>> +
>> + if [ -n "$past_ref" ] && ! git rev-parse --verify "$past_ref" > /dev/null 2>&1; then
>
>
> [ -n "$past_ref" ] is always true since past_ref is set to something.
>
>

Good point, must have missed this in an earlier refactor. Will fix in v4.

>
>
>
>
>
>
>> + printf 'error - invalid git reference "%s"\n' "$past_ref"
>> + exit "$FAIL_PREREQ"
>> + fi
>> +
>> + if [ -n "$base_ref" ]; then
>> + if ! git merge-base --is-ancestor "$past_ref" "$base_ref" > /dev/null 2>&1; then
>> + printf 'error - "%s" is not an ancestor of base ref "%s"\n' "$past_ref" "$base_ref"
>> + exit "$FAIL_PREREQ"
>> + fi
>> + fi
>> +
>> + if [ "$check_all" != "true" ] && ! get_uapi_files "$check_all" "$base_ref" "$past_ref" > /dev/null; then
>> + printf "No changes to UAPI headers were applied between %s and %s\n" "$past_ref" "$base_ref"
>> + exit "$SUCCESS"
>> + fi
>> +
>> + if ! check_uapi_files "$check_all" "$base_ref" "$past_ref"; then
>> + eprintf "error - UAPI header ABI check failed\n"
>> + {
>> + printf 'Generated by "%s %s" from git ref %s\n\n' "$0" "$orig_args" "$(git rev-parse "HEAD")"
>> + find "$tmp_dir" -type f -name '*.error' -exec cat {} +
>> + } > "$abi_error_log"
>> + eprintf "Failure summary saved to %s\n" "$abi_error_log"
>> + exit "$FAIL_ABI"
>> + fi
>> +}
>> +
>> +main() {
>> + MAX_THREADS=$(nproc)
>> + VERBOSE="false"
>> + local base_ref=""
>> + local check_all="true"
>> + local quiet="false"
>> + while getopts "hb:p:mj:l:qv" opt; do
>> + case $opt in
>> + h)
>> + print_usage
>> + exit "$SUCCESS"
>> + ;;
>> + b)
>> + base_ref="$OPTARG"
>> + ;;
>> + p)
>> + past_ref="$OPTARG"
>> + ;;
>> + m)
>> + check_all="false"
>> + ;;
>> + j)
>> + MAX_THREADS="$OPTARG"
>> + ;;
>> + l)
>> + abi_error_log="$OPTARG"
>> + ;;
>> + q)
>> + quiet="true"
>> + ;;
>> + v)
>> + VERBOSE="true"
>> + ;;
>> + *)
>> + exit "$FAIL_PREREQ"
>> + esac
>> + done
>> +
>> + if [ "$quiet" = "true" ]; then
>> + run "$base_ref" "$check_all" "$@" > /dev/null
>> + else
>> + run "$base_ref" "$check_all" "$@"
>> + fi
>> +}
>> +
>> +main "$@"
>> --
>> 2.17.1
>>
>
>
> --
> Best Regards
> Masahiro Yamada

Thanks for the review!