2023-04-07 20:36:45

by John Moon

[permalink] [raw]
Subject: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

While the kernel community has been good at maintaining backwards
compatibility with kernel UAPIs, it would be helpful to have a tool
to check if a commit introduces changes that break backwards
compatibility.

To that end, introduce check-uapi.sh: a simple shell script that
checks for changes to UAPI headers using libabigail.

libabigail is "a framework which aims at helping developers and
software distributors to spot some ABI-related issues like interface
incompatibility in ELF shared libraries by performing a static
analysis of the ELF binaries at hand."

The script uses one of libabigail's tools, "abidiff", to compile the
changed header before and after the commit to detect any changes.

abidiff "compares the ABI of two shared libraries in ELF format. It
emits a meaningful report describing the differences between the two
ABIs."

The script also includes the ability to check the compatibility of
all UAPI headers across commits. This allows developers to inspect
the stability of the UAPIs over time.

Signed-off-by: John Moon <[email protected]>
---
- Improved git tree restore operation by going back to a user's
branch if they were on one. This is better than returning to a
detached commit.
- Fixed bitwise AND check syntax which failed with earlier versions
of Bash.
- Removed suppression of "make headers_install" stderr for better
user feedback.
- Made handling of -q flag more elegant.

scripts/check-uapi.sh | 489 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 489 insertions(+)
create mode 100755 scripts/check-uapi.sh

diff --git a/scripts/check-uapi.sh b/scripts/check-uapi.sh
new file mode 100755
index 000000000000..755187f27be5
--- /dev/null
+++ b/scripts/check-uapi.sh
@@ -0,0 +1,489 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+# Script to check commits for UAPI backwards compatibility
+
+set -o errexit
+set -o pipefail
+
+print_usage() {
+ name=$(basename "$0")
+ cat << EOF
+$name - check for UAPI header stability across Git commits
+
+By default, the script will check to make sure the latest commit (or current
+dirty changes) did not introduce ABI changes when compared to HEAD^1. You can
+check against additional commit ranges with the -b and -p options.
+
+The script will not check UAPI headers for architectures other than the one
+defined in ARCH.
+
+Usage: $name [-b BASE_REF] [-p PAST_REF] [-j N] [-l ERROR_LOG] [-q] [-v]
+
+Options:
+ -b BASE_REF Base git reference to use for comparison. If unspecified or empty,
+ will use any dirty changes in tree to UAPI files. If there are no
+ dirty changes, HEAD will be used.
+ -p PAST_REF Compare BASE_REF to PAST_REF (e.g. -p v6.1). If unspecified or empty,
+ will use BASE_REF^1. Must be an ancestor of BASE_REF. Only headers
+ that exist on PAST_REF will be checked for compatibility.
+ -j JOBS Number of checks to run in parallel (default: number of CPU cores).
+ -l ERROR_LOG Write error log to file (default: no error log is generated).
+ -q Quiet operation (suppress all stdout, still print stderr).
+ -v Verbose operation (print more information about each header being checked).
+
+Environmental args:
+ ABIDIFF Custom path to abidiff binary
+ CC C compiler (default is "gcc")
+ ARCH Target architecture of C compiler (default is host arch)
+
+Exit codes:
+ $SUCCESS) Success
+ $FAIL_ABI) ABI difference detected
+ $FAIL_PREREQ) Prerequisite not met
+ $FAIL_COMPILE) Compilation error
+EOF
+}
+
+readonly SUCCESS=0
+readonly FAIL_ABI=1
+readonly FAIL_PREREQ=2
+readonly FAIL_COMPILE=3
+
+# Print to stderr
+eprintf() {
+ # shellcheck disable=SC2059
+ printf "$@" >&2
+}
+
+# Check if git tree is dirty
+tree_is_dirty() {
+ if git diff --quiet; then
+ return 1
+ else
+ return 0
+ fi
+}
+
+# Get list of files installed in $ref
+get_file_list() {
+ local -r ref="$1"
+ local -r tree="$(get_header_tree "$ref")"
+
+ # Print all installed headers, filtering out ones that can't be compiled
+ find "$tree" -type f -name '*.h' -printf '%P\n' | grep -v -f "$INCOMPAT_LIST"
+}
+
+# Add to the list of incompatible headers
+add_to_incompat_list() {
+ local -r ref="$1"
+
+ # Start with the usr/include/Makefile to get a list of the headers
+ # that don't compile using this method.
+ if [ ! -f usr/include/Makefile ]; then
+ eprintf "error - no usr/include/Makefile present at %s\n" "$ref"
+ eprintf "Note: usr/include/Makefile was added in the v5.3 kernel release\n"
+ exit "$FAIL_PREREQ"
+ fi
+ {
+ # shellcheck disable=SC2016
+ printf 'all: ; @echo $(no-header-test)\n'
+ cat usr/include/Makefile
+ } | SRCARCH="$ARCH" make -f - | tr " " "\n" | grep -v "asm-generic" >> "$INCOMPAT_LIST"
+
+ # The makefile also skips all asm-generic files, but prints "asm-generic/%"
+ # which won't work for our grep match. Instead, print something grep will match.
+ printf "asm-generic/.*\.h\n" >> "$INCOMPAT_LIST"
+
+ sort -u -o "$INCOMPAT_LIST" "$INCOMPAT_LIST"
+}
+
+# Compile the simple test app
+do_compile() {
+ local -r inc_dir="$1"
+ local -r header="$2"
+ local -r out="$3"
+ printf "int main(void) { return 0; }\n" | \
+ "$CC" -c \
+ -o "$out" \
+ -x c \
+ -O0 \
+ -std=c90 \
+ -fno-eliminate-unused-debug-types \
+ -g \
+ "-I${inc_dir}" \
+ -include "$header" \
+ -
+}
+
+# Save the current git tree state, stashing if needed
+save_tree_state() {
+ printf "Saving current tree state... "
+ current_ref="$(git symbolic-ref --short HEAD 2> /dev/null || git rev-parse HEAD)"
+ readonly current_ref
+ if tree_is_dirty; then
+ unstash="true"
+ git stash push --quiet
+ fi
+ printf "OK\n"
+}
+
+# Restore the git tree state, unstashing if needed
+restore_tree_state() {
+ if [ -z "$current_ref" ]; then
+ return 0
+ fi
+
+ printf "Restoring current tree state... "
+ git checkout --quiet "$current_ref"
+ if [ "$unstash" = "true" ]; then
+ git stash pop --quiet
+ unstash="false"
+ fi
+ printf "OK\n"
+}
+
+# Handle exit cleanup
+exit_handler() {
+ if [ "$DEVIATED_FROM_CURRENT_TREE" = "true" ]; then
+ restore_tree_state
+ fi
+
+ rm -rf "$TMP_DIR"
+}
+
+# Install headers for both git refs
+install_headers() {
+ local -r base_ref="$1"
+ local -r past_ref="$2"
+
+ DEVIATED_FROM_CURRENT_TREE="false"
+ for ref in "$base_ref" "$past_ref"; do
+ if [ -n "$ref" ]; then
+ if [ "$DEVIATED_FROM_CURRENT_TREE" = "false" ]; then
+ save_tree_state
+ DEVIATED_FROM_CURRENT_TREE="true"
+ fi
+ # This script ($0) is already loaded into memory at this point,
+ # so this operation is safe
+ git checkout --quiet "$(git rev-parse "$ref")"
+ fi
+
+ printf "Installing sanitized UAPI headers from %s... " "${ref:-dirty tree}"
+ make -j "$MAX_THREADS" ARCH="$ARCH" INSTALL_HDR_PATH="${TMP_DIR}/${ref}/usr" headers_install > /dev/null
+ printf "OK\n"
+
+ # Add to list of incompatible headers while we have $ref checked out
+ add_to_incompat_list "$ref" "$INCOMPAT_LIST"
+ done
+
+ restore_tree_state
+ DEVIATED_FROM_CURRENT_TREE="false"
+}
+
+# Print the path to the headers_install tree for a given ref
+get_header_tree() {
+ local -r ref="$1"
+ printf "%s" "${TMP_DIR}/${ref}/usr"
+}
+
+# Check file list for UAPI compatibility
+check_uapi_files() {
+ local -r base_ref="$1"
+ local -r past_ref="$2"
+
+ local passed=0;
+ local failed=0;
+ local -a threads=()
+
+ printf "Checking changes to UAPI headers between %s and %s\n" "$past_ref" "${base_ref:-dirty tree}"
+ # Loop over all UAPI headers that were installed by $past_ref (if they only exist on $base_ref,
+ # there's no way they're broken and no way to compare anyway)
+ while read -r file; do
+ if [ "${#threads[@]}" -ge "$MAX_THREADS" ]; then
+ if wait "${threads[0]}"; then
+ passed=$((passed + 1))
+ else
+ failed=$((failed + 1))
+ fi
+ threads=("${threads[@]:1}")
+ fi
+
+ check_individual_file "$base_ref" "$past_ref" "$file" &
+ threads+=("$!")
+ done < <(get_file_list "$past_ref")
+
+ for t in "${threads[@]}"; do
+ if wait "$t"; then
+ passed=$((passed + 1))
+ else
+ failed=$((failed + 1))
+ fi
+ done
+
+ total="$((passed + failed))"
+ if [ "$failed" -gt 0 ]; then
+ eprintf "error - %d/%d UAPI headers compatible with %s appear _not_ to be backwards compatible\n" "$failed" "$total" "$ARCH"
+ else
+ printf "All %d UAPI headers compatible with %s appear to be backwards compatible\n" "$total" "$ARCH"
+ fi
+
+ return "$failed"
+}
+
+# Check an individual file for UAPI compatibility
+check_individual_file() {
+ local -r base_ref="$1"
+ local -r past_ref="$2"
+ local -r file="$3"
+
+ local -r base_header="$(get_header_tree "$base_ref")/${file}"
+ local -r past_header="$(get_header_tree "$past_ref")/${file}"
+
+ if [ ! -f "$base_header" ]; then
+ printf "error - UAPI header %s was incorrectly removed\n" "$file" | tee "${base_header}.error" >&2
+ return 1
+ fi
+
+ compare_abi "$file" "$base_header" "$past_header" "$base_ref" "$past_ref"
+}
+
+# Perform the A/B compilation and compare output ABI
+compare_abi() {
+ local -r file="$1"
+ local -r base_header="$2"
+ local -r past_header="$3"
+ local -r base_ref="$4"
+ local -r past_ref="$5"
+ local -r log="${TMP_DIR}/log/${file}.log"
+
+ mkdir -p "$(dirname "$log")"
+
+ if ! do_compile "$(get_header_tree "$base_ref")/include" "$base_header" "${base_header}.bin" 2> "$log"; then
+ eprintf "error - couldn't compile version of UAPI header %s at %s\n" "$file" "$base_ref"
+ cat "$log" >&2
+ exit "$FAIL_COMPILE"
+ fi
+
+ if ! do_compile "$(get_header_tree "$past_ref")/include" "$past_header" "${past_header}.bin" 2> "$log"; then
+ eprintf "error - couldn't compile version of UAPI header %s at %s\n" "$file" "$past_ref"
+ cat "$log" >&2
+ exit "$FAIL_COMPILE"
+ fi
+
+ local ret=0
+ "$ABIDIFF" --non-reachable-types "${past_header}.bin" "${base_header}.bin" > "$log" || ret="$?"
+ if [ "$ret" -eq 0 ]; then
+ if [ "$VERBOSE" = "true" ]; then
+ printf "No ABI differences detected in %s from %s -> %s\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ fi
+ else
+ # Bits in abidiff's return code can be used to determine the type of error
+ if [ $((ret & 0x1)) -gt 0 ]; then
+ eprintf "error - abidiff did not run properly\n"
+ exit 1
+ fi
+
+ # If the only changes were additions (not modifications to existing APIs), then
+ # there's no problem. Ignore these diffs.
+ if grep "Unreachable types summary" "$log" | grep -q "0 removed" &&
+ grep "Unreachable types summary" "$log" | grep -q "0 changed"; then
+ return 0
+ fi
+ {
+ printf "!!! ABI differences detected in %s from %s -> %s !!!\n\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ sed -e '/summary:/d' -e '/changed type/d' -e '/^$/d' -e 's/^/ /g' "$log"
+
+ if ! cmp "$past_header" "$base_header" > /dev/null 2>&1; then
+ printf "\nHeader file diff (after headers_install):\n"
+ diff -Naur "$past_header" "$base_header" \
+ | sed -e "s|${past_header}|${past_ref}/${file}|g" \
+ -e "s|${base_header}|${base_ref:-dirty}/${file}|g"
+ printf "\n"
+ else
+ printf "\n%s did not change between %s and %s...\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ printf "It's possible a change to one of the headers it includes caused this error:\n"
+ grep '^#include' "$base_header"
+ printf "\n"
+ fi
+ } | tee "${base_header}.error" >&2
+ return 1
+ fi
+}
+
+min_version_is_satisfied() {
+ local -r min_version="$1"
+ local -r version_installed="$2"
+
+ printf "%s\n%s\n" "$min_version" "$version_installed" | sort -Vc > /dev/null 2>&1
+}
+
+# Make sure we have the tools we need and the arguments make sense
+check_deps() {
+ ABIDIFF="${ABIDIFF:-abidiff}"
+ CC="${CC:-gcc}"
+ ARCH="${ARCH:-$(uname -m)}"
+ if [ "$ARCH" = "x86_64" ]; then
+ ARCH="x86"
+ fi
+
+ local -r abidiff_min_version="1.7"
+ local -r libdw_min_version_if_clang="0.171"
+
+ if ! command -v "$ABIDIFF" > /dev/null 2>&1; then
+ eprintf "error - abidiff not found!\n"
+ eprintf "Please install abigail-tools version %s or greater\n" "$abidiff_min_version"
+ eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n"
+ return 1
+ fi
+
+ local -r abidiff_version="$("$ABIDIFF" --version | cut -d ' ' -f 2)"
+ if ! min_version_is_satisfied "$abidiff_min_version" "$abidiff_version"; then
+ eprintf "error - abidiff version too old: %s\n" "$abidiff_version"
+ eprintf "Please install abigail-tools version %s or greater\n" "$abidiff_min_version"
+ eprintf "See: https://sourceware.org/libabigail/manual/libabigail-overview.html\n"
+ return 1
+ fi
+
+ if ! command -v "$CC" > /dev/null 2>&1; then
+ eprintf 'error - %s not found\n' "$CC"
+ return 1
+ fi
+
+ if "$CC" --version | grep -q clang; then
+ local -r libdw_version="$(ldconfig -v 2>/dev/null | grep -v SKIPPED | grep -m 1 -o 'libdw-[0-9]\+.[0-9]\+' | cut -c 7-)"
+ if ! min_version_is_satisfied "$libdw_min_version_if_clang" "$libdw_version"; then
+ eprintf "error - libdw version too old for use with clang: %s\n" "$libdw_version"
+ eprintf "Please install libdw from elfutils version %s or greater\n" "$libdw_min_version_if_clang"
+ eprintf "See: https://sourceware.org/elfutils/\n"
+ return 1
+ fi
+ fi
+
+ if [ ! -d "arch/${ARCH}" ]; then
+ eprintf 'error - ARCH "%s" is not a subdirectory under arch/\n' "$ARCH"
+ eprintf "Please set ARCH to one of:\n%s\n" "$(find arch -maxdepth 1 -mindepth 1 -type d -printf '%f ' | fmt)"
+ return 1
+ fi
+
+ if ! git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
+ eprintf "error - this script requires the kernel tree to be initialized with Git\n"
+ return 1
+ fi
+
+ if ! git rev-parse --verify "$past_ref" > /dev/null 2>&1; then
+ printf 'error - invalid git reference "%s"\n' "$past_ref"
+ return 1
+ fi
+
+ if [ -n "$base_ref" ]; then
+ if ! git merge-base --is-ancestor "$past_ref" "$base_ref" > /dev/null 2>&1; then
+ printf 'error - "%s" is not an ancestor of base ref "%s"\n' "$past_ref" "$base_ref"
+ return 1
+ fi
+ if [ "$(git rev-parse "$base_ref")" = "$(git rev-parse "$past_ref")" ]; then
+ printf 'error - "%s" and "%s" are the same reference\n' "$past_ref" "$base_ref"
+ return 1
+ fi
+ fi
+}
+
+run() {
+ local base_ref="$1"
+ local past_ref="$2"
+ local abi_error_log="$3"
+ shift 3
+
+ if [ -z "$KERNEL_SRC" ]; then
+ KERNEL_SRC="$(realpath "$(dirname "$0")"/..)"
+ fi
+
+ cd "$KERNEL_SRC"
+
+ if [ -z "$base_ref" ] && ! tree_is_dirty; then
+ base_ref=HEAD
+ fi
+
+ if [ -z "$past_ref" ]; then
+ if [ -n "$base_ref" ]; then
+ past_ref="${base_ref}^1"
+ else
+ past_ref=HEAD
+ fi
+ fi
+
+ if ! check_deps; then
+ exit "$FAIL_PREREQ"
+ fi
+
+ TMP_DIR=$(mktemp -d)
+ readonly TMP_DIR
+ trap 'exit_handler' EXIT
+
+ readonly INCOMPAT_LIST="${TMP_DIR}/incompat_list.txt"
+ touch "$INCOMPAT_LIST"
+
+ # Run make install_headers for both refs
+ install_headers "$base_ref" "$past_ref"
+
+ # Check for any differences in the installed header trees
+ if diff -r -q "$(get_header_tree "$base_ref")" "$(get_header_tree "$past_ref")" > /dev/null 2>&1; then
+ printf "No changes to UAPI headers were applied between %s and %s\n" "$past_ref" "${base_ref:-dirty tree}"
+ exit "$SUCCESS"
+ fi
+
+ if ! check_uapi_files "$base_ref" "$past_ref"; then
+ eprintf "error - UAPI header ABI check failed\n"
+ if [ -n "$abi_error_log" ]; then
+ {
+ printf 'Generated by "%s %s" from git ref %s\n\n' "$0" "$*" "$(git rev-parse HEAD)"
+ find "$TMP_DIR" -type f -name '*.error' -exec cat {} +
+ } > "$abi_error_log"
+ eprintf "Failure summary saved to %s\n" "$abi_error_log"
+ fi
+ exit "$FAIL_ABI"
+ fi
+}
+
+main() {
+ MAX_THREADS=$(nproc)
+ VERBOSE="false"
+ local base_ref=""
+ local quiet="false"
+ while getopts "hb:p:mj:l:qv" opt; do
+ case $opt in
+ h)
+ print_usage
+ exit "$SUCCESS"
+ ;;
+ b)
+ base_ref="$OPTARG"
+ ;;
+ p)
+ past_ref="$OPTARG"
+ ;;
+ j)
+ MAX_THREADS="$OPTARG"
+ ;;
+ l)
+ abi_error_log="$OPTARG"
+ ;;
+ q)
+ quiet="true"
+ ;;
+ v)
+ VERBOSE="true"
+ ;;
+ *)
+ exit "$FAIL_PREREQ"
+ esac
+ done
+
+
+ if [ "$quiet" = "true" ]; then
+ exec > /dev/null
+ fi
+
+ run "$base_ref" "$past_ref" "$abi_error_log" "$@"
+}
+
+main "$@"
--
2.17.1


2023-04-10 10:10:47

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On Sat, Apr 8, 2023 at 5:35 AM John Moon <[email protected]> wrote:
>
> While the kernel community has been good at maintaining backwards
> compatibility with kernel UAPIs, it would be helpful to have a tool
> to check if a commit introduces changes that break backwards
> compatibility.
>
> To that end, introduce check-uapi.sh: a simple shell script that
> checks for changes to UAPI headers using libabigail.
>
> libabigail is "a framework which aims at helping developers and
> software distributors to spot some ABI-related issues like interface
> incompatibility in ELF shared libraries by performing a static
> analysis of the ELF binaries at hand."
>
> The script uses one of libabigail's tools, "abidiff", to compile the
> changed header before and after the commit to detect any changes.
>
> abidiff "compares the ABI of two shared libraries in ELF format. It
> emits a meaningful report describing the differences between the two
> ABIs."
>
> The script also includes the ability to check the compatibility of
> all UAPI headers across commits. This allows developers to inspect
> the stability of the UAPIs over time.
>
> Signed-off-by: John Moon <[email protected]>



BTW, is there anybody (except the submitters) who loves this tool?
(or anybody who has ever evaluated this?)

I am the only person who pointed out something.



I put some more comments from users' point of view.





[1] The number of lines of the error log.


According to this tool, it looks like we broke a lot of UAPI
headers in the previous MW (between v6.2 and v6.3-rc1).

Not only in the previous MW, we always broke so many UAPI
headers, this script says.


The following command outputs 2380 lines.


$ ./scripts/check-uapi.sh -q -b v6.3-rc1 -p v6.2 2>&1 | wc
2380 9438 104953


I do not know how many are real breakages, and
how many are false positives.

Anyway, I will not check the 2380 lines of the error log.


And, please note that the help message explains '-q' is the *quiet* mode.
It is the opposite. This is super noisy.




[2] Be careful!


While testing the patch submissions of this,
I messed up my repository multiple times.

The script takes some time because it builds many objects
internally.

However, once this script starts running, you must not hit Ctrl-C.
If you do it, your repository will be sprinkled with a ton
of untracked files.

Apply this patch, and run "./scripts/check-uapi.sh -p v6.0"
and hit Ctrl-C.

Repeat it a couple of times, and "git status" will show you
something horrible.


You will never know when git is checking out a commit
because this script hides it by 'git checkout --quiet'.


So, this tool should show a caveat at least.


'git checkout' should not be hidden, or
maybe a caveat message should be shown.


CAVEAT
This tool runs 'git checkout' a couple of times internally.
If you interrupt it, your worktree might be messed up.












--
Best Regards
Masahiro Yamada

2023-04-10 18:47:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On Mon, Apr 10, 2023 at 07:03:05PM +0900, Masahiro Yamada wrote:
> On Sat, Apr 8, 2023 at 5:35 AM John Moon <[email protected]> wrote:
> >
> > While the kernel community has been good at maintaining backwards
> > compatibility with kernel UAPIs, it would be helpful to have a tool
> > to check if a commit introduces changes that break backwards
> > compatibility.
> >
> > To that end, introduce check-uapi.sh: a simple shell script that
> > checks for changes to UAPI headers using libabigail.
> >
> > libabigail is "a framework which aims at helping developers and
> > software distributors to spot some ABI-related issues like interface
> > incompatibility in ELF shared libraries by performing a static
> > analysis of the ELF binaries at hand."
> >
> > The script uses one of libabigail's tools, "abidiff", to compile the
> > changed header before and after the commit to detect any changes.
> >
> > abidiff "compares the ABI of two shared libraries in ELF format. It
> > emits a meaningful report describing the differences between the two
> > ABIs."
> >
> > The script also includes the ability to check the compatibility of
> > all UAPI headers across commits. This allows developers to inspect
> > the stability of the UAPIs over time.
> >
> > Signed-off-by: John Moon <[email protected]>
>
>
>
> BTW, is there anybody (except the submitters) who loves this tool?
> (or anybody who has ever evaluated this?)

I evaluated the first one, and yes, I do want this, but I haven't tested
it out yet, sorry.

I get patches for header files all the time and hand-verifying that they
don't break the abi is a pain at times

> According to this tool, it looks like we broke a lot of UAPI
> headers in the previous MW (between v6.2 and v6.3-rc1).

That's not ok, and needs to be fixed, otherwise this is useless as no
one can rely on it at all.

> The script takes some time because it builds many objects
> internally.
>
> However, once this script starts running, you must not hit Ctrl-C.
> If you do it, your repository will be sprinkled with a ton
> of untracked files.

That needs to be unwound and fixed.

> CAVEAT
> This tool runs 'git checkout' a couple of times internally.
> If you interrupt it, your worktree might be messed up.

ctrl-c can be properly caught and the git state needs to be restored for
this to be able to be accepted.

thanks,

greg k-h

2023-04-10 23:53:09

by John Moon

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On 4/10/2023 11:45 AM, Greg Kroah-Hartman wrote:
> On Mon, Apr 10, 2023 at 07:03:05PM +0900, Masahiro Yamada wrote:
>> On Sat, Apr 8, 2023 at 5:35 AM John Moon <[email protected]> wrote:
>>>
>>> While the kernel community has been good at maintaining backwards
>>> compatibility with kernel UAPIs, it would be helpful to have a tool
>>> to check if a commit introduces changes that break backwards
>>> compatibility.
>>>
>>> To that end, introduce check-uapi.sh: a simple shell script that
>>> checks for changes to UAPI headers using libabigail.
>>>
>>> libabigail is "a framework which aims at helping developers and
>>> software distributors to spot some ABI-related issues like interface
>>> incompatibility in ELF shared libraries by performing a static
>>> analysis of the ELF binaries at hand."
>>>
>>> The script uses one of libabigail's tools, "abidiff", to compile the
>>> changed header before and after the commit to detect any changes.
>>>
>>> abidiff "compares the ABI of two shared libraries in ELF format. It
>>> emits a meaningful report describing the differences between the two
>>> ABIs."
>>>
>>> The script also includes the ability to check the compatibility of
>>> all UAPI headers across commits. This allows developers to inspect
>>> the stability of the UAPIs over time.
>>>
>>> Signed-off-by: John Moon <[email protected]>
>>
>>
>>
>> BTW, is there anybody (except the submitters) who loves this tool?
>> (or anybody who has ever evaluated this?)
>
> I evaluated the first one, and yes, I do want this, but I haven't tested
> it out yet, sorry.
>
> I get patches for header files all the time and hand-verifying that they
> don't break the abi is a pain at times
>

Agreed, this is the way we're using the tool internally. It's great as a
quick spot-check on a change.

>> According to this tool, it looks like we broke a lot of UAPI
>> headers in the previous MW (between v6.2 and v6.3-rc1).
>
> That's not ok, and needs to be fixed, otherwise this is useless as no
> one can rely on it at all.
>

Right, there are several classes of false positives that we've
documented and when examining thousands of commits at time, it'll flag
many things.

For some comparison, if you run checkpatch on the same changeset
(v6.2..v6.3-rc1), you get 995 errors and 7,313 warnings. Still,
checkpatch is helpful for spot-checks.

"./scripts/check-uapi.sh -b v6.3-rc1 -p v6.2" flags 36 out of the 911
files checked. Of those 36, 19 fell into the currently documented false
positive categories:

Enum expansion: 17
Expanding into padded/reserved fields: 2

Beyond those, the tool appears to be flagging legitimate breakages.

Some fit into the definition of "intentional breakages" where support is
being dropped or something is being refactored:

File removals:
- include/uapi/drm/i810_drm.h
- include/uapi/drm/mga_drm.h
- include/uapi/drm/r128_drm.h
- include/uapi/drm/savage_drm.h
- include/uapi/drm/sis_drm.h
- include/uapi/drm/via_drm.h
- include/uapi/linux/meye.h

File moves:
- include/uapi/misc/habanalabs.h

Removal of struct:
- include/uapi/linux/uuid.h (5e6a51787fef)
- include/uapi/linux/mei.h (failed due to uuid.h)
- include/uapi/linux/ublk_cmd.h (failed due to uuid.h)

Others do not seem to be intentional:

Addition/use of flex arrays:
- include/uapi/linux/rseq.h (f7b01bb0b57f)
- include/uapi/scsi/scsi_bsg_mpi3mr.h (c6f2e6b6eaaf)

Type change:
- include/uapi/scsi/scsi_bsg_ufs.h (3f5145a615238)

Additions into existing struct:
- include/uapi/drm/amdgpu_drm.h (b299221faf9b)
- include/uapi/linux/perf_event.h (09519ec3b19e)
- include/uapi/linux/virtio_blk.h (95bfec41bd3d)

Is there something I'm missing that makes these changes false positives?
If so, I'd be happy to add on to the documentation and work towards a
way to filter them out.

In the mean time, we will start a thread on the libabigail mailing list
to see if there's a way to add flags such as --ignore-enum-expansion,
--ignore-expansion-into-reserved-fields, etc. Enum expansion seems to be
making up the largest portion of false positives, so would be the best
thing to filter out.

>> The script takes some time because it builds many objects
>> internally.
>>
>> However, once this script starts running, you must not hit Ctrl-C.
>> If you do it, your repository will be sprinkled with a ton
>> of untracked files.
>
> That needs to be unwound and fixed.
>
>> CAVEAT
>> This tool runs 'git checkout' a couple of times internally.
>> If you interrupt it, your worktree might be messed up.
>
> ctrl-c can be properly caught and the git state needs to be restored for
> this to be able to be accepted.
>

Yes, this can be taken care of.

> thanks,
>
> greg k-h

2023-04-11 06:36:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On Mon, Apr 10, 2023 at 04:32:49PM -0700, John Moon wrote:
> > > According to this tool, it looks like we broke a lot of UAPI
> > > headers in the previous MW (between v6.2 and v6.3-rc1).
> >
> > That's not ok, and needs to be fixed, otherwise this is useless as no
> > one can rely on it at all.
> >
>
> Right, there are several classes of false positives that we've documented
> and when examining thousands of commits at time, it'll flag many things.
>
> For some comparison, if you run checkpatch on the same changeset
> (v6.2..v6.3-rc1), you get 995 errors and 7,313 warnings. Still, checkpatch
> is helpful for spot-checks.

checkpatch.pl does not matter, it is a "hint", and many patches
explicitly ignore it (think about patches in the staging tree, you could
fix up one checkpatch issue for a line, but ignore another one as you
are not supposed to mix them up.)

Also for some subsystems, checkpatch does not matter because their
codebase is old and follows different rules. And in some places,
checkpatch is just wrong, because it's a perl script and can not really
parse code.

So NEVER use that as a comparison to the user/kernel abi please. It's a
false comparison.

> "./scripts/check-uapi.sh -b v6.3-rc1 -p v6.2" flags 36 out of the 911 files
> checked. Of those 36, 19 fell into the currently documented false positive
> categories:
>
> Enum expansion: 17
> Expanding into padded/reserved fields: 2
>
> Beyond those, the tool appears to be flagging legitimate breakages.
>
> Some fit into the definition of "intentional breakages" where support is
> being dropped or something is being refactored:
>
> File removals:
> - include/uapi/drm/i810_drm.h
> - include/uapi/drm/mga_drm.h
> - include/uapi/drm/r128_drm.h
> - include/uapi/drm/savage_drm.h
> - include/uapi/drm/sis_drm.h
> - include/uapi/drm/via_drm.h
> - include/uapi/linux/meye.h
>
> File moves:
> - include/uapi/misc/habanalabs.h
>
> Removal of struct:
> - include/uapi/linux/uuid.h (5e6a51787fef)
> - include/uapi/linux/mei.h (failed due to uuid.h)
> - include/uapi/linux/ublk_cmd.h (failed due to uuid.h)
>
> Others do not seem to be intentional:
>
> Addition/use of flex arrays:
> - include/uapi/linux/rseq.h (f7b01bb0b57f)
> - include/uapi/scsi/scsi_bsg_mpi3mr.h (c6f2e6b6eaaf)

That is not a breakage, that's a tool problem.

> Type change:
> - include/uapi/scsi/scsi_bsg_ufs.h (3f5145a615238)

Again, not a real breakage, size is still the same.

> Additions into existing struct:
> - include/uapi/drm/amdgpu_drm.h (b299221faf9b)
> - include/uapi/linux/perf_event.h (09519ec3b19e)
> - include/uapi/linux/virtio_blk.h (95bfec41bd3d)

Adding data to the end of a structure is a well-known way to extend the
api, in SOME instances if it is used properly.

So again, not a break.

> Is there something I'm missing that makes these changes false positives? If
> so, I'd be happy to add on to the documentation and work towards a way to
> filter them out.
>
> In the mean time, we will start a thread on the libabigail mailing list to
> see if there's a way to add flags such as --ignore-enum-expansion,
> --ignore-expansion-into-reserved-fields, etc. Enum expansion seems to be
> making up the largest portion of false positives, so would be the best thing
> to filter out.

Increasing enums is in no way an abi break unless the size of the
structure changes.

Using reserved fields too is not a breakage.

So yes, it looks like the tooling needs some work in order for us to be
able to use this properly, digging through false positives like this is
going to make it not used at all.

thanks,

greg k-h

2023-04-11 18:42:44

by John Moon

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On 4/10/2023 11:34 PM, Greg Kroah-Hartman wrote:
> On Mon, Apr 10, 2023 at 04:32:49PM -0700, John Moon wrote:
>>>> According to this tool, it looks like we broke a lot of UAPI
>>>> headers in the previous MW (between v6.2 and v6.3-rc1).
>>>
>>> That's not ok, and needs to be fixed, otherwise this is useless as no
>>> one can rely on it at all.
>>>
>>
>> Right, there are several classes of false positives that we've documented
>> and when examining thousands of commits at time, it'll flag many things.
>>
>> For some comparison, if you run checkpatch on the same changeset
>> (v6.2..v6.3-rc1), you get 995 errors and 7,313 warnings. Still, checkpatch
>> is helpful for spot-checks.
>
> checkpatch.pl does not matter, it is a "hint", and many patches
> explicitly ignore it (think about patches in the staging tree, you could
> fix up one checkpatch issue for a line, but ignore another one as you
> are not supposed to mix them up.)
>
> Also for some subsystems, checkpatch does not matter because their
> codebase is old and follows different rules. And in some places,
> checkpatch is just wrong, because it's a perl script and can not really
> parse code.
>
> So NEVER use that as a comparison to the user/kernel abi please. It's a
> false comparison.
>

Fair enough. I was just trying to frame this tool as a "hint" as well. :)

>>
>> Others do not seem to be intentional:
>>
>> Addition/use of flex arrays:
>> - include/uapi/linux/rseq.h (f7b01bb0b57f)
>> - include/uapi/scsi/scsi_bsg_mpi3mr.h (c6f2e6b6eaaf)
>
> That is not a breakage, that's a tool problem.
>
>> Type change:
>> - include/uapi/scsi/scsi_bsg_ufs.h (3f5145a615238)
>
> Again, not a real breakage, size is still the same.
>

Would you find the tool more useful if it simply filtered out all
instances where the size of the type did not change? This would filter
out the following which the tool currently flags:

- enum expansions
- reserved field expansions
- expansions of a struct with a flex array at the end
- type changes
- re-ordering of existing members
- ...others?

These changes aren't _always_ safe, but if you assume the kernel
developer is doing something reasonable, then maybe it's okay. Maybe we
could hide these checks behind something like a "--pedantic" flag?

This logic is actually trivial to add. Filtering out issues where the
type size stays the same brings us down to 14 failures in that same MW
changeset. 8 of them are file removals and the 6 remaining flags are
additions to existing structs or intentional removals of APIs.

LMK what you think and I can work this into a v6 patch.

>> Additions into existing struct:
>> - include/uapi/drm/amdgpu_drm.h (b299221faf9b)
>> - include/uapi/linux/perf_event.h (09519ec3b19e)
>> - include/uapi/linux/virtio_blk.h (95bfec41bd3d)
>
> Adding data to the end of a structure is a well-known way to extend the
> api, in SOME instances if it is used properly.
>
> So again, not a break.
>

I don't know of a way the tool could be smart enough to decide if the
additions are done properly in the proper instances. Seems appropriate
that the tool would flag these changes for closer human review.

Thanks,
John

2023-04-12 06:18:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On Tue, Apr 11, 2023 at 11:36:48AM -0700, John Moon wrote:
> > >
> > > Others do not seem to be intentional:
> > >
> > > Addition/use of flex arrays:
> > > - include/uapi/linux/rseq.h (f7b01bb0b57f)
> > > - include/uapi/scsi/scsi_bsg_mpi3mr.h (c6f2e6b6eaaf)
> >
> > That is not a breakage, that's a tool problem.
> >
> > > Type change:
> > > - include/uapi/scsi/scsi_bsg_ufs.h (3f5145a615238)
> >
> > Again, not a real breakage, size is still the same.
> >
>
> Would you find the tool more useful if it simply filtered out all instances
> where the size of the type did not change? This would filter out the
> following which the tool currently flags:
>
> - enum expansions
> - reserved field expansions
> - expansions of a struct with a flex array at the end
> - type changes
> - re-ordering of existing members
> - ...others?

Obviously not, as some of those are real breakages, and some are not at
all.

Please understand what is an abi breakage. Adding new enums is not.
Using a reserved field is not. Reording existing members IS.

> These changes aren't _always_ safe, but if you assume the kernel developer
> is doing something reasonable, then maybe it's okay. Maybe we could hide
> these checks behind something like a "--pedantic" flag?

Again, no, that list above has totally different things in it, some are
completly safe, others totally break the abi. Do NOT lump them all
together as that is wrong.

thanks,

greg k-h

2023-04-12 16:44:47

by John Moon

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On 4/11/2023 11:14 PM, Greg Kroah-Hartman wrote:
>> Would you find the tool more useful if it simply filtered out all instances
>> where the size of the type did not change? This would filter out the
>> following which the tool currently flags:
>>
>> - enum expansions
>> - reserved field expansions
>> - expansions of a struct with a flex array at the end
>> - type changes
>> - re-ordering of existing members
>> - ...others?
>
> Obviously not, as some of those are real breakages, and some are not at
> all.
>
> Please understand what is an abi breakage. Adding new enums is not.
> Using a reserved field is not. Reording existing members IS.
>

Yes, understood that method would miss certain classes of breakages. I
was suggesting it as a way to improve the signal-to-noise ratio of the
tool since we don't currently have an algorithm for determining
breakages with 100% accuracy.

We'll work internally and with the libabigail team to improve the story
here and get back to you. Thanks for the discussion!

- John

2023-04-12 16:45:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On Wed, Apr 12, 2023 at 09:37:16AM -0700, John Moon wrote:
> On 4/11/2023 11:14 PM, Greg Kroah-Hartman wrote:
> > > Would you find the tool more useful if it simply filtered out all instances
> > > where the size of the type did not change? This would filter out the
> > > following which the tool currently flags:
> > >
> > > - enum expansions
> > > - reserved field expansions
> > > - expansions of a struct with a flex array at the end
> > > - type changes
> > > - re-ordering of existing members
> > > - ...others?
> >
> > Obviously not, as some of those are real breakages, and some are not at
> > all.
> >
> > Please understand what is an abi breakage. Adding new enums is not.
> > Using a reserved field is not. Reording existing members IS.
> >
>
> Yes, understood that method would miss certain classes of breakages. I was
> suggesting it as a way to improve the signal-to-noise ratio of the tool
> since we don't currently have an algorithm for determining breakages with
> 100% accuracy.

Why not? You know the different types of things here based on the
differences between the dwarf data, and they fall into different
categories, and those different categories mean different things.

If you have questions as to which type of change is allowed and which is
not, just ask us, the rules are not complex, nor impossible to describe,
otherwise we wouldn't have a stable api at all, right?

thanks,

greg k-h

2023-04-13 14:46:19

by Mark Wielaard

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

Hi,

On Wed, 2023-04-12 at 09:37 -0700, John Moon via Libabigail wrote:
> On 4/11/2023 11:14 PM, Greg Kroah-Hartman wrote:
> > > Would you find the tool more useful if it simply filtered out all instances
> > > where the size of the type did not change? This would filter out the
> > > following which the tool currently flags:
> > >
> > > - enum expansions
> > > - reserved field expansions
> > > - expansions of a struct with a flex array at the end
> > > - type changes
> > > - re-ordering of existing members
> > > - ...others?
> >
> > Obviously not, as some of those are real breakages, and some are not at
> > all.
> >
> > Please understand what is an abi breakage. Adding new enums is not.
> > Using a reserved field is not. Reording existing members IS.
> >
>
> Yes, understood that method would miss certain classes of breakages. I
> was suggesting it as a way to improve the signal-to-noise ratio of the
> tool since we don't currently have an algorithm for determining
> breakages with 100% accuracy.

Note that you can check the exit code of libabigail's abidiff to see
whether something is an incompatible abi change or not, see:
https://sourceware.org/libabigail/manual/abidiff.html#return-values

You can also of course use suppressions to instruct abidiff to avoid
reporting changes involving certain ABI artifacts:
https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppr-spec-label

Cheers,

Mark

2023-04-13 17:06:31

by Nicolas Schier

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On Mon, Apr 10, 2023 at 07:03:05PM +0900 Masahiro Yamada wrote:
[...]
> [2] Be careful!
>
>
> While testing the patch submissions of this,
> I messed up my repository multiple times.
>
> The script takes some time because it builds many objects
> internally.
>
> However, once this script starts running, you must not hit Ctrl-C.
> If you do it, your repository will be sprinkled with a ton
> of untracked files.
>
> Apply this patch, and run "./scripts/check-uapi.sh -p v6.0"
> and hit Ctrl-C.
>
> Repeat it a couple of times, and "git status" will show you
> something horrible.
>
>
> You will never know when git is checking out a commit
> because this script hides it by 'git checkout --quiet'.
>
>
> So, this tool should show a caveat at least.
>
>
> 'git checkout' should not be hidden, or
> maybe a caveat message should be shown.
>
>
> CAVEAT
> This tool runs 'git checkout' a couple of times internally.
> If you interrupt it, your worktree might be messed up.

John, did you consider using git export for obtaining a source tree that
can be used for headers_install and the following comparison, instead of
git stash+checkout? AFACS the script does not depend on any other git
functionality than providing a specific version of the source tree. I am
pretty sure that leaving the original working copy in its original state
is supporting the script's popularity.

Kind regards,
Nicolas

2023-04-13 17:09:10

by John Moon

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On 4/12/2023 9:43 AM, Greg Kroah-Hartman wrote:
> On Wed, Apr 12, 2023 at 09:37:16AM -0700, John Moon wrote:
>> On 4/11/2023 11:14 PM, Greg Kroah-Hartman wrote:
>>>> Would you find the tool more useful if it simply filtered out all instances
>>>> where the size of the type did not change? This would filter out the
>>>> following which the tool currently flags:
>>>>
>>>> - enum expansions
>>>> - reserved field expansions
>>>> - expansions of a struct with a flex array at the end
>>>> - type changes
>>>> - re-ordering of existing members
>>>> - ...others?
>>>
>>> Obviously not, as some of those are real breakages, and some are not at
>>> all.
>>>
>>> Please understand what is an abi breakage. Adding new enums is not.
>>> Using a reserved field is not. Reording existing members IS.
>>>
>>
>> Yes, understood that method would miss certain classes of breakages. I was
>> suggesting it as a way to improve the signal-to-noise ratio of the tool
>> since we don't currently have an algorithm for determining breakages with
>> 100% accuracy.
>
> Why not? You know the different types of things here based on the
> differences between the dwarf data, and they fall into different
> categories, and those different categories mean different things.
>
> If you have questions as to which type of change is allowed and which is
> not, just ask us, the rules are not complex, nor impossible to describe,
> otherwise we wouldn't have a stable api at all, right?
>

Right, it's currently a limitation of parsing the abidiff output.

Even in trivial situations like an enum expansion, the tool knows that a
variant was added and another variant had its offset changed. There's
not a good way to say for sure that the variant whose offset changed is
a "*_MAX" variant. So if we simply ignored enum expansion, we'd miss
breakages like this:

enum foo {
FLAG_A,
+ FLAG_B,
FLAG_C,
FLAG_MAX
}

Maybe we can ignore an enum expansion if only the last variant's offset
changed, but then we'd miss cases where enums don't have a MAX variant.
Maybe we could limit the check to last variant's offset whose name
contains string "MAX", but what if someone calls it "LAST" instead? It
gets fragile.

Or situations like expanding into reserved fields. How can we detect the
difference between this:

struct foo {
__u32 x;
__u32 y;
+ __u32 z;
+ __u8 padding[12];
- __u8 padding[16];
}

And this:

struct foo {
__u32 x;
__u32 y;
+ __u32 z;
+ char codename[4]; /* Takes "NAME" */
- char codename[8]; /* Takes "CODENAME" */
}

Maybe we grep for "pad" or "reserved", but again... fragile.

Another idea is to add some sort of in-line comment to give the checker
a hint that the field is intentionally unstable. It could be implicit
for "*_MAX" enum variants or "*padding" at the end of structs, but if
you wanted to have something like "end[]" (like in the rseq change), you
could add /* ABI-unstable */ next to it and the script would ignore it.

Beyond those issues, we have non-trivial situations like when it's safe
to add members to a struct. We know the kernel will zero-extend
mismatches between kernel and userspace, but how do we know the driver
properly handles the case of an old userspace passing an old struct?

So far, we've erred on the side of flagging it if it _could_ be a break
and relied on the reviewer to make the final determination.





2023-04-13 17:24:38

by Giuliano Procida

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

Hi all.

On Thu, 13 Apr 2023 at 15:37, Mark Wielaard <[email protected]> wrote:
>
> Hi,
>
> On Wed, 2023-04-12 at 09:37 -0700, John Moon via Libabigail wrote:
> > On 4/11/2023 11:14 PM, Greg Kroah-Hartman wrote:
> > > > Would you find the tool more useful if it simply filtered out all instances
> > > > where the size of the type did not change? This would filter out the
> > > > following which the tool currently flags:
> > > >
> > > > - enum expansions
> > > > - reserved field expansions
> > > > - expansions of a struct with a flex array at the end
> > > > - type changes
> > > > - re-ordering of existing members
> > > > - ...others?
> > >
> > > Obviously not, as some of those are real breakages, and some are not at
> > > all.
> > >
> > > Please understand what is an abi breakage. Adding new enums is not.
> > > Using a reserved field is not. Reording existing members IS.
> > >
> >
> > Yes, understood that method would miss certain classes of breakages. I
> > was suggesting it as a way to improve the signal-to-noise ratio of the
> > tool since we don't currently have an algorithm for determining
> > breakages with 100% accuracy.
>
> Note that you can check the exit code of libabigail's abidiff to see
> whether something is an incompatible abi change or not, see:
> https://sourceware.org/libabigail/manual/abidiff.html#return-values
>
> You can also of course use suppressions to instruct abidiff to avoid
> reporting changes involving certain ABI artifacts:
> https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppr-spec-label

libabigail's abidiff already hides certain differences by default.
You can turn this behaviour off with --harmless.

Note that abidiff without --harmless treats certain ABI differences
asymmetrically,
hiding them one way around but not the other.

The ABI diff tool I designed for Android always treats differences symmetrically
and will only suppress certain kinds of diff if specially requested
(which we don't
do any more in production). [Technically, we also ignore qualifier changes on
function parameter and return types, but we achieve that by stripping them out
unconditionally.]

Once we get around to UAPI monitoring, we'll do the same there. We can always
review the SNR later.

Regards,
Giuliano.

> Cheers,
>
> Mark
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>

2023-04-13 17:25:09

by John Moon

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On 4/13/2023 7:37 AM, Mark Wielaard wrote:
> Hi,
>
> On Wed, 2023-04-12 at 09:37 -0700, John Moon via Libabigail wrote:
>> On 4/11/2023 11:14 PM, Greg Kroah-Hartman wrote:
>>>> Would you find the tool more useful if it simply filtered out all instances
>>>> where the size of the type did not change? This would filter out the
>>>> following which the tool currently flags:
>>>>
>>>> - enum expansions
>>>> - reserved field expansions
>>>> - expansions of a struct with a flex array at the end
>>>> - type changes
>>>> - re-ordering of existing members
>>>> - ...others?
>>>
>>> Obviously not, as some of those are real breakages, and some are not at
>>> all.
>>>
>>> Please understand what is an abi breakage. Adding new enums is not.
>>> Using a reserved field is not. Reording existing members IS.
>>>
>>
>> Yes, understood that method would miss certain classes of breakages. I
>> was suggesting it as a way to improve the signal-to-noise ratio of the
>> tool since we don't currently have an algorithm for determining
>> breakages with 100% accuracy.
>
> Note that you can check the exit code of libabigail's abidiff to see
> whether something is an incompatible abi change or not, see:
> https://sourceware.org/libabigail/manual/abidiff.html#return-values
>
> You can also of course use suppressions to instruct abidiff to avoid
> reporting changes involving certain ABI artifacts:
> https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppr-spec-label
>
> Cheers,
>
> Mark

Checking the ABIDIFF_ABI_INCOMPATIBLE_CHANGE flag in the return code is
a good idea, but checking it doesn't change what the tool is currently
outputting (i.e. the flag is set for all the changes currently
reported). I think this is because of some filtering we're doing based
on grepping stdout, but checking the return code would be more stable.

The suppressions may work for some cases, but I fear they would be too
eager in other cases. Looking at the docs, I'm not sure how we could
express something like:

"suppress changed enumerators if they end in 'MAX' or 'LAST' and appear
at the end of the enumeration"

or

"suppress data member insertions into a struct if the last member in the
struct has its size reduced by sizeof(new_member) and is named 'pad' or
'reserved'"

They're complicated cases to detect in a general way.

Thanks,
John

2023-04-13 17:55:28

by John Moon

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On 4/13/2023 10:03 AM, Nicolas Schier wrote:
> On Mon, Apr 10, 2023 at 07:03:05PM +0900 Masahiro Yamada wrote:
>> CAVEAT
>> This tool runs 'git checkout' a couple of times internally.
>> If you interrupt it, your worktree might be messed up.
>
> John, did you consider using git export for obtaining a source tree that
> can be used for headers_install and the following comparison, instead of
> git stash+checkout? AFACS the script does not depend on any other git
> functionality than providing a specific version of the source tree. I am
> pretty sure that leaving the original working copy in its original state
> is supporting the script's popularity.
>
> Kind regards,
> Nicolas
>

I hadn't considered that. Great suggestion! I'll use this method in the
next rev. Thanks!

2023-04-13 18:24:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] check-uapi: Introduce check-uapi.sh

On Thu, Apr 13, 2023 at 10:07:23AM -0700, John Moon wrote:
> On 4/12/2023 9:43 AM, Greg Kroah-Hartman wrote:
> > On Wed, Apr 12, 2023 at 09:37:16AM -0700, John Moon wrote:
> > > On 4/11/2023 11:14 PM, Greg Kroah-Hartman wrote:
> > > > > Would you find the tool more useful if it simply filtered out all instances
> > > > > where the size of the type did not change? This would filter out the
> > > > > following which the tool currently flags:
> > > > >
> > > > > - enum expansions
> > > > > - reserved field expansions
> > > > > - expansions of a struct with a flex array at the end
> > > > > - type changes
> > > > > - re-ordering of existing members
> > > > > - ...others?
> > > >
> > > > Obviously not, as some of those are real breakages, and some are not at
> > > > all.
> > > >
> > > > Please understand what is an abi breakage. Adding new enums is not.
> > > > Using a reserved field is not. Reording existing members IS.
> > > >
> > >
> > > Yes, understood that method would miss certain classes of breakages. I was
> > > suggesting it as a way to improve the signal-to-noise ratio of the tool
> > > since we don't currently have an algorithm for determining breakages with
> > > 100% accuracy.
> >
> > Why not? You know the different types of things here based on the
> > differences between the dwarf data, and they fall into different
> > categories, and those different categories mean different things.
> >
> > If you have questions as to which type of change is allowed and which is
> > not, just ask us, the rules are not complex, nor impossible to describe,
> > otherwise we wouldn't have a stable api at all, right?
> >
>
> Right, it's currently a limitation of parsing the abidiff output.
>
> Even in trivial situations like an enum expansion, the tool knows that a
> variant was added and another variant had its offset changed. There's not a
> good way to say for sure that the variant whose offset changed is a "*_MAX"
> variant. So if we simply ignored enum expansion, we'd miss breakages like
> this:
>
> enum foo {
> FLAG_A,
> + FLAG_B,
> FLAG_C,
> FLAG_MAX
> }
>
> Maybe we can ignore an enum expansion if only the last variant's offset
> changed, but then we'd miss cases where enums don't have a MAX variant.
> Maybe we could limit the check to last variant's offset whose name contains
> string "MAX", but what if someone calls it "LAST" instead? It gets fragile.

That's what the regexes are for, you can make them on a per-file basis,
right?

> Or situations like expanding into reserved fields. How can we detect the
> difference between this:
>
> struct foo {
> __u32 x;
> __u32 y;
> + __u32 z;
> + __u8 padding[12];
> - __u8 padding[16];
> }
>
> And this:
>
> struct foo {
> __u32 x;
> __u32 y;
> + __u32 z;
> + char codename[4]; /* Takes "NAME" */
> - char codename[8]; /* Takes "CODENAME" */
> }
>
> Maybe we grep for "pad" or "reserved", but again... fragile.

Again, regexes.

But if this is too fragile, then yes, it's going to be useless as those
are obviously allowed changes and you are giving us a tool that would
say they are forbidden.

> Another idea is to add some sort of in-line comment to give the checker a
> hint that the field is intentionally unstable. It could be implicit for
> "*_MAX" enum variants or "*padding" at the end of structs, but if you wanted
> to have something like "end[]" (like in the rseq change), you could add /*
> ABI-unstable */ next to it and the script would ignore it.

The abi isn't "unstable", it's "extensible". Those are two very
different things.

> Beyond those issues, we have non-trivial situations like when it's safe to
> add members to a struct. We know the kernel will zero-extend mismatches
> between kernel and userspace, but how do we know the driver properly handles
> the case of an old userspace passing an old struct?

Then flag it to make sure as the "driver" is the "kernel".

> So far, we've erred on the side of flagging it if it _could_ be a break and
> relied on the reviewer to make the final determination.

But don't give us loads of "this could be broken" if it really isn't
please.

You have decades of code history to run the tool on to get these things
worked out. Please do so before expecting us to use it and complain
about things it flags that are not actual breakages.

In it's current form, would you use this tool if you were the maintainer
of a subsystem?

thanks,

greg k-h

2023-07-20 16:34:38

by Giuliano Procida

[permalink] [raw]
Subject: [PATCH] scripts/check-uapi.sh: add stgdiff support

Hi John.

I spent a few minutes adding stgdiff support to the script. It's
really just for illustration purposes.

As I think you know, STG doesn't yet exist as a project outside of
AOSP. Nevertheless, this may be useful to you as-is.

STG has quite a different philosophy to libabigil in terms of
filtering out certain kinds of differences. Some of the things (like
enum enumerator additions) are not considered harmless. The reasoning
behind this is basically...
https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)

However, it does have --ignore interface_addition (and the related
--ignore type_definition_addition) which can be used to detect whether
one ABI is a subset of another.

I am looking at adding support for macro definitions (gcc -g3) to STG
which will then let us cover significantly more of the UAPI surface.

Unfortunately, there are some headers which use anonymous enums to
define constants (e.g. and ironically BTF). ABI tracking these types
would require something a bit hacky. Or we could just name them.

Regards,
Giuliano.

Signed-off-by: Giuliano Procida <[email protected]>
---
scripts/check-uapi.sh | 102 ++++++++++++++++++++++++++++--------------
1 file changed, 69 insertions(+), 33 deletions(-)

diff --git a/scripts/check-uapi.sh b/scripts/check-uapi.sh
index 755187f27be5..982666b48f3b 100755
--- a/scripts/check-uapi.sh
+++ b/scripts/check-uapi.sh
@@ -32,6 +32,7 @@ Options:
-v Verbose operation (print more information about each header being checked).

Environmental args:
+ STGDIFF Custom path to stgdiff binary - use stgdiff instead of abidiff
ABIDIFF Custom path to abidiff binary
CC C compiler (default is "gcc")
ARCH Target architecture of C compiler (default is host arch)
@@ -270,43 +271,78 @@ compare_abi() {
exit "$FAIL_COMPILE"
fi

- local ret=0
- "$ABIDIFF" --non-reachable-types "${past_header}.bin" "${base_header}.bin" > "$log" || ret="$?"
- if [ "$ret" -eq 0 ]; then
- if [ "$VERBOSE" = "true" ]; then
- printf "No ABI differences detected in %s from %s -> %s\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ if [ "$STGDIFF" ]; then
+ local ret=0
+ "$STGDIFF" --types --ignore interface_addition --elf "${past_header}.bin" "${base_header}.bin" --format small --output "$log" || ret="$?"
+ if [ "$ret" -eq 0 ]; then
+ if [ "$VERBOSE" = "true" ]; then
+ printf "No ABI differences detected in %s from %s -> %s\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ fi
+ else
+ # stgdiff's return code can be used to determine the type of error
+ if [ $((ret & 0x1)) -gt 0 ]; then
+ eprintf "error - stgdiff failed\n"
+ exit 1
+ fi
+
+ {
+ printf "!!! ABI differences detected in %s from %s -> %s !!!\n\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ sed -e 's/^/ /g' "$log"
+
+ if ! cmp "$past_header" "$base_header" > /dev/null 2>&1; then
+ printf "\nHeader file diff (after headers_install):\n"
+ diff -Naur "$past_header" "$base_header" \
+ | sed -e "s|${past_header}|${past_ref}/${file}|g" \
+ -e "s|${base_header}|${base_ref:-dirty}/${file}|g"
+ printf "\n"
+ else
+ printf "\n%s did not change between %s and %s...\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ printf "It's possible a change to one of the headers it includes caused this error:\n"
+ grep '^#include' "$base_header"
+ printf "\n"
+ fi
+ } | tee "${base_header}.error" >&2
+ return 1
fi
else
- # Bits in abidiff's return code can be used to determine the type of error
- if [ $((ret & 0x1)) -gt 0 ]; then
- eprintf "error - abidiff did not run properly\n"
- exit 1
- fi
+ local ret=0
+ "$ABIDIFF" --non-reachable-types "${past_header}.bin" "${base_header}.bin" > "$log" || ret="$?"
+ if [ "$ret" -eq 0 ]; then
+ if [ "$VERBOSE" = "true" ]; then
+ printf "No ABI differences detected in %s from %s -> %s\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ fi
+ else
+ # Bits in abidiff's return code can be used to determine the type of error
+ if [ $((ret & 0x1)) -gt 0 ]; then
+ eprintf "error - abidiff did not run properly\n"
+ exit 1
+ fi

- # If the only changes were additions (not modifications to existing APIs), then
- # there's no problem. Ignore these diffs.
- if grep "Unreachable types summary" "$log" | grep -q "0 removed" &&
- grep "Unreachable types summary" "$log" | grep -q "0 changed"; then
- return 0
- fi
- {
- printf "!!! ABI differences detected in %s from %s -> %s !!!\n\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
- sed -e '/summary:/d' -e '/changed type/d' -e '/^$/d' -e 's/^/ /g' "$log"
-
- if ! cmp "$past_header" "$base_header" > /dev/null 2>&1; then
- printf "\nHeader file diff (after headers_install):\n"
- diff -Naur "$past_header" "$base_header" \
- | sed -e "s|${past_header}|${past_ref}/${file}|g" \
- -e "s|${base_header}|${base_ref:-dirty}/${file}|g"
- printf "\n"
- else
- printf "\n%s did not change between %s and %s...\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
- printf "It's possible a change to one of the headers it includes caused this error:\n"
- grep '^#include' "$base_header"
- printf "\n"
+ # If the only changes were additions (not modifications to existing APIs), then
+ # there's no problem. Ignore these diffs.
+ if grep "Unreachable types summary" "$log" | grep -q "0 removed" &&
+ grep "Unreachable types summary" "$log" | grep -q "0 changed"; then
+ return 0
fi
- } | tee "${base_header}.error" >&2
- return 1
+ {
+ printf "!!! ABI differences detected in %s from %s -> %s !!!\n\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ sed -e '/summary:/d' -e '/changed type/d' -e '/^$/d' -e 's/^/ /g' "$log"
+
+ if ! cmp "$past_header" "$base_header" > /dev/null 2>&1; then
+ printf "\nHeader file diff (after headers_install):\n"
+ diff -Naur "$past_header" "$base_header" \
+ | sed -e "s|${past_header}|${past_ref}/${file}|g" \
+ -e "s|${base_header}|${base_ref:-dirty}/${file}|g"
+ printf "\n"
+ else
+ printf "\n%s did not change between %s and %s...\n" "$file" "$past_ref" "${base_ref:-dirty tree}"
+ printf "It's possible a change to one of the headers it includes caused this error:\n"
+ grep '^#include' "$base_header"
+ printf "\n"
+ fi
+ } | tee "${base_header}.error" >&2
+ return 1
+ fi
fi
}

--
2.41.0.255.g8b1d071c50-goog


2023-07-22 20:18:38

by Trilok Soni

[permalink] [raw]
Subject: Re: [PATCH] scripts/check-uapi.sh: add stgdiff support

On 7/20/2023 9:10 AM, Giuliano Procida wrote:
> Hi John.
>
> I spent a few minutes adding stgdiff support to the script. It's
> really just for illustration purposes.
>
> As I think you know, STG doesn't yet exist as a project outside of
> AOSP. Nevertheless, this may be useful to you as-is.
>
> STG has quite a different philosophy to libabigil in terms of
> filtering out certain kinds of differences. Some of the things (like
> enum enumerator additions) are not considered harmless. The reasoning
> behind this is basically...
> https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)
>
> However, it does have --ignore interface_addition (and the related
> --ignore type_definition_addition) which can be used to detect whether
> one ABI is a subset of another.
>
> I am looking at adding support for macro definitions (gcc -g3) to STG
> which will then let us cover significantly more of the UAPI surface.
>
> Unfortunately, there are some headers which use anonymous enums to
> define constants (e.g. and ironically BTF). ABI tracking these types
> would require something a bit hacky. Or we could just name them.

Thank you Giuliano for trying the script w/ stg. We will review the
modifications below.

Just to update everyone here that John is looking into the libabigail
changes to reduce the false positives as discussed earlier in the email
thread. There is some progress on the libabigail mailing list. Once we
have enough changes made in the libabigail, John will update here with
his results.

We have also submitted abstract in LPC 2023 Android MC as well about the
UAPI checker. We hope to make a good progress before LPC.

---Trilok Soni