Hi Greg,
This series adds a new feature to get_abi.pl that optionally checks for
existing symbols under /sys that don't match any "What:" inside Documentation/ABI.
Such a feature is very useful for detecting missing ABI documentation.
This version brings a major speedup, and it fixes a few corner cases when
matching regexes that end with ".*" or \d+.
Patch 1 changes the get_abi.pl logic to handle multiple What: lines, in
order to make the script more robust;
Patch 2 adds the basic logic. It runs really quickly (up to 2
seconds), but it doesn't use sysfs softlinks.
Patch 3 adds support for parsing softlinks. It makes the script a
lot slower: it takes a couple of minutes to process all the
sysfs files. It could be optimized in the future by using a graph,
but, for now, let's keep it simple.
Patch 4 adds an optional parameter to allow filtering the results
using a regex given by the user. When this parameter is used
(which should be the normal use case), it will only try to find softlinks
if the sysfs node matches the regex.
Patch 5 improves the report by no longer ignoring What: entries that
end with a wildcard.
Patch 6 is a minor speedup. On a Dell Precision 5820, after patch 6,
results are:
$ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
real 2m35.563s
user 2m34.346s
sys 0m1.220s
7595 undefined
896 undefined_symbols
Patch 7 makes a *huge* speedup: it basically replaces an O(n^3)
search for links with logic that handles symlinks using BFS. It
also addresses a corner case that was causing the 'msi-irqs/\d+' regex to
be misparsed.
After patch 7, it is 11 times faster:
$ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
real 0m14.137s
user 0m12.795s
sys 0m1.348s
7030 undefined
794 undefined_symbols
(the difference in the number of undefined symbols is due to the fix that
makes it properly handle the 'msi-irqs/\d+' regex)
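To illustrate the idea behind patch 7, here is a minimal, simplified sketch of storing the sysfs tree as nested hashes and using a BFS to attach the extra names that a symlink gives to every node below its target. It is not the code from the patch: the helper names add_path and add_link, as well as the example PCI paths, are assumptions made only for this illustration.

```perl
use strict;
use warnings;

# Directory tree stored as nested hashes. Every node keeps the list of
# names it can be reached by under the reserved "__name" key.
my %root;

# Hypothetical helper: add a path to the tree and return its node.
sub add_path {
    my ($path) = @_;
    my $node = \%root;
    my $name = "";
    foreach my $part (grep { length } split m{/}, $path) {
        $name .= "/$part";
        $node->{$part} //= { "__name" => [ $name ] };
        $node = $node->{$part};
    }
    return $node;
}

# Hypothetical helper: walk everything below $target breadth-first and
# record the extra name each node gets when reached through $link.
sub add_link {
    my ($target, $link) = @_;

    # Locate the node the symlink points to; give up if it is unknown
    my $start = \%root;
    foreach my $part (grep { length } split m{/}, $target) {
        $start = $start->{$part} or return;
    }

    my @queue = ($start);
    my %seen = ($start => 1);    # keyed by the stringified reference

    while (my $node = shift @queue) {
        foreach my $child (sort grep { $_ ne "__name" } keys %$node) {
            my $next = $node->{$child};
            next if $seen{$next}++;

            # Derive the alias from the canonical (first) name
            my $alias = $next->{"__name"}[0];
            push @{ $next->{"__name"} }, $alias
                if $alias =~ s{^\Q$target\E/}{$link/};
            push @queue, $next;
        }
    }
}

# Made-up example paths, for illustration only
my $node = add_path("/sys/devices/pci0000:00/0000:00:01.0/msi_bus");
add_link("/sys/devices/pci0000:00/0000:00:01.0",
         "/sys/bus/pci/devices/0000:00:01.0");
print "$_\n" foreach @{ $node->{"__name"} };
```

Each symlink costs one walk over the subtree below its target, after which every file already carries all of its alternative names, so the matching step no longer has to scan the whole alias list for every file.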
-
While this series is independent from Documentation/ABI changes, it
works best when applied from this tree, which also contains ABI fixes
and a couple of additions of frequently missed symbols on my machine:
https://git.kernel.org/pub/scm/linux/kernel/git/mchehab/devel.git/log/?h=get_undefined_abi_v3
-
v3:
- Fixed parsing issues with the 'msi-irqs/\d+' regex;
- Added BFS graph logic to resolve symlinks in sysfs;
v2:
- multiple What: for the same description are now properly handled;
- some special cases are now better handled;
- some bugs got fixed.
The full series, with the ABI changes and some ABI improvements can be found
at:
https://git.kernel.org/pub/scm/linux/kernel/git/mchehab/devel.git/commit/?h=get_undefined&id=1838d8fb149170f6c19feda0645d6c3157f46f4f
Mauro Carvalho Chehab (7):
scripts: get_abi.pl: Better handle multiple What parameters
scripts: get_abi.pl: Check for missing symbols at the ABI specs
scripts: get_abi.pl: detect softlinks
scripts: get_abi.pl: add an option to filter undefined results
scripts: get_abi.pl: don't skip what that ends with wildcards
scripts: get_abi.pl: Ignore fs/cgroup sysfs nodes earlier
scripts: get_abi.pl: add a graph to speedup the undefined algorithm
scripts/get_abi.pl | 327 ++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 320 insertions(+), 7 deletions(-)
--
2.31.1
Check for the symbols that exist under /sys but aren't
defined in Documentation/ABI.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---
scripts/get_abi.pl | 90 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 88 insertions(+), 2 deletions(-)
diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index cfc107df59f4..78364c4c4967 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -13,7 +13,9 @@ my $help = 0;
my $man = 0;
my $debug = 0;
my $enable_lineno = 0;
+my $show_warnings = 1;
my $prefix="Documentation/ABI";
+my $sysfs_prefix="/sys";
#
# If true, assumes that the description is formatted with ReST
@@ -36,7 +38,7 @@ pod2usage(2) if (scalar @ARGV < 1 || @ARGV > 2);
my ($cmd, $arg) = @ARGV;
-pod2usage(2) if ($cmd ne "search" && $cmd ne "rest" && $cmd ne "validate");
+pod2usage(2) if ($cmd ne "search" && $cmd ne "rest" && $cmd ne "validate" && $cmd ne "undefined");
pod2usage(2) if ($cmd eq "search" && !$arg);
require Data::Dumper if ($debug);
@@ -50,6 +52,8 @@ my %symbols;
sub parse_error($$$$) {
my ($file, $ln, $msg, $data) = @_;
+ return if (!$show_warnings);
+
$data =~ s/\s+$/\n/;
print STDERR "Warning: file $file#$ln:\n\t$msg";
@@ -521,11 +525,88 @@ sub search_symbols {
}
}
+# Exclude /sys/kernel/debug and /sys/kernel/tracing from the search path
+sub skip_debugfs {
+ if (($File::Find::dir =~ m,^/sys/kernel,)) {
+ return grep {!/(debug|tracing)/ } @_;
+ }
+
+ if (($File::Find::dir =~ m,^/sys/fs,)) {
+ return grep {!/(pstore|bpf|fuse)/ } @_;
+ }
+
+ return @_
+}
+
+my %leaf;
+
+my $escape_symbols = qr { ([\x01-\x08\x0e-\x1f\x21-\x29\x2b-\x2d\x3a-\x40\x7b-\xff]) }x;
+sub parse_existing_sysfs {
+ my $file = $File::Find::name;
+
+ my $mode = (stat($file))[2];
+ return if ($mode & S_IFDIR);
+
+ my $leave = $file;
+ $leave =~ s,.*/,,;
+
+ if (defined($leaf{$leave})) {
+ # FIXME: need to check if the path makes sense
+ my $what = $leaf{$leave};
+
+ $what =~ s/,/ /g;
+
+ $what =~ s/\<[^\>]+\>/.*/g;
+ $what =~ s/\{[^\}]+\}/.*/g;
+ $what =~ s/\[[^\]]+\]/.*/g;
+ $what =~ s,/\.\.\./,/.*/,g;
+ $what =~ s,/\*/,/.*/,g;
+
+ $what =~ s/\s+/ /g;
+
+ # Escape all other symbols
+ $what =~ s/$escape_symbols/\\$1/g;
+
+ foreach my $i (split / /,$what) {
+ if ($file =~ m#^$i$#) {
+# print "$file: $i: OK!\n";
+ return;
+ }
+ }
+
+ print "$file: $leave is defined at $what\n";
+
+ return;
+ }
+
+ print "$file not found.\n";
+}
+
+sub undefined_symbols {
+ foreach my $w (sort keys %data) {
+ foreach my $what (split /\xac /,$w) {
+ my $leave = $what;
+ $leave =~ s,.*/,,;
+
+ if (defined($leaf{$leave})) {
+ $leaf{$leave} .= " " . $what;
+ } else {
+ $leaf{$leave} = $what;
+ }
+ }
+ }
+
+ find({wanted =>\&parse_existing_sysfs, preprocess =>\&skip_debugfs, no_chdir => 1}, $sysfs_prefix);
+}
+
# Ensure that the prefix will always end with a slash
# While this is not needed for find, it makes the patch nicer
# with --enable-lineno
$prefix =~ s,/?$,/,;
+if ($cmd eq "undefined" || $cmd eq "search") {
+ $show_warnings = 0;
+}
#
# Parses all ABI files located at $prefix dir
#
@@ -536,7 +617,9 @@ print STDERR Data::Dumper->Dump([\%data], [qw(*data)]) if ($debug);
#
# Handles the command
#
-if ($cmd eq "search") {
+if ($cmd eq "undefined") {
+ undefined_symbols;
+} elsif ($cmd eq "search") {
search_symbols;
} else {
if ($cmd eq "rest") {
@@ -575,6 +658,9 @@ B<rest> - output the ABI in ReST markup language
B<validate> - validate the ABI contents
+B<undefined> - existing symbols at the system that aren't
+ defined at Documentation/ABI
+
=back
=head1 OPTIONS
--
2.31.1
Using a comma here is problematic, as some What: expressions
may already contain a comma. So, use the \xac character instead.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---
scripts/get_abi.pl | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index d7aa82094296..cfc107df59f4 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -129,12 +129,12 @@ sub parse_abi {
push @{$symbols{$content}->{file}}, " $file:" . ($ln - 1);
if ($tag =~ m/what/) {
- $what .= ", " . $content;
+ $what .= "\xac" . $content;
} else {
if ($what) {
parse_error($file, $ln, "What '$what' doesn't have a description", "") if (!$data{$what}->{description});
- foreach my $w(split /, /, $what) {
+ foreach my $w(split /\xac/, $what) {
$symbols{$w}->{xref} = $what;
};
}
@@ -239,7 +239,7 @@ sub parse_abi {
if ($what) {
parse_error($file, $ln, "What '$what' doesn't have a description", "") if (!$data{$what}->{description});
- foreach my $w(split /, /,$what) {
+ foreach my $w(split /\xac/,$what) {
$symbols{$w}->{xref} = $what;
};
}
@@ -328,7 +328,7 @@ sub output_rest {
printf ".. _%s:\n\n", $data{$what}->{label};
- my @names = split /, /,$w;
+ my @names = split /\xac/,$w;
my $len = 0;
foreach my $name (@names) {
--
2.31.1
In order to speed up the parser and store less data, handle
fs/cgroup exceptions a lot earlier.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---
scripts/get_abi.pl | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index fe83f295600c..aa0a751563ba 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -550,6 +550,10 @@ my @files;
my $escape_symbols = qr { ([\x01-\x08\x0e-\x1f\x21-\x29\x2b-\x2d\x3a-\x40\x7b-\xfe]) }x;
sub parse_existing_sysfs {
my $file = $File::Find::name;
+
+ # Ignore cgroup and firmware
+ return if ($file =~ m#^/sys/(fs/cgroup|firmware)/#);
+
my $mode = (lstat($file))[2];
my $abs_file = abs_path($file);
@@ -570,9 +574,6 @@ sub parse_existing_sysfs {
sub check_undefined_symbols {
foreach my $file (sort @files) {
- # Ignore cgroup and firmware
- next if ($file =~ m#^/sys/(fs/cgroup|firmware)/#);
-
my $defined = 0;
my $exact = 0;
my $whats = "";
--
2.31.1
The search algorithm used inside check_undefined_symbols
has an optimization: it seeks only whats that have the same
leaf name. This helps not only to speed up the search, but
it also allows providing a hint about a partial match.
There's a drawback, however: when "what:" finishes with a
wildcard, the logic will skip the what, reporting it as
"not found".
Fix it by grouping the remaining cases all together, and
disabling any hints for such cases.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---
scripts/get_abi.pl | 74 +++++++++++++++++++++++++++-------------------
1 file changed, 43 insertions(+), 31 deletions(-)
diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index f5f2f664e336..fe83f295600c 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -589,44 +589,47 @@ sub check_undefined_symbols {
$found_string = 1;
}
+ if ($leave =~ /^\d+$/ || !defined($leaf{$leave})) {
+ $leave = "others";
+ }
+
print "--> $file\n" if ($found_string && $hint);
- if (defined($leaf{$leave})) {
- my $what = $leaf{$leave};
- $whats .= " $what" if (!($whats =~ m/$what/));
+ my $what = $leaf{$leave};
+ $whats .= " $what" if (!($whats =~ m/$what/));
- foreach my $w (split / /, $what) {
- if ($file =~ m#^$w$#) {
- $exact = 1;
- last;
- }
+ foreach my $w (split / /, $what) {
+ if ($file =~ m#^$w$#) {
+ $exact = 1;
+ last;
}
- # Check for aliases
- #
- # TODO: this algorithm is O(w * n²). It can be
- # improved in the future in order to handle it
- # faster, by changing parse_existing_sysfs to
- # store the sysfs inside a tree, at the expense
- # on making the code less readable and/or using some
- # additional perl library.
- foreach my $a (keys %aliases) {
- my $new = $aliases{$a};
- my $len = length($new);
+ }
+ # Check for aliases
+ #
+ # TODO: this algorithm is O(w * n²). It can be
+ # improved in the future in order to handle it
+ # faster, by changing parse_existing_sysfs to
+ # store the sysfs inside a tree, at the expense
+ # on making the code less readable and/or using some
+ # additional perl library.
+ foreach my $a (keys %aliases) {
+ my $new = $aliases{$a};
+ my $len = length($new);
- if (substr($file, 0, $len) eq $new) {
- my $newf = $a . substr($file, $len);
+ if (substr($file, 0, $len) eq $new) {
+ my $newf = $a . substr($file, $len);
- print " $newf\n" if ($found_string && $hint);
- foreach my $w (split / /, $what) {
- if ($newf =~ m#^$w$#) {
- $exact = 1;
- last;
- }
+ print " $newf\n" if ($found_string && $hint);
+ foreach my $w (split / /, $what) {
+ if ($newf =~ m#^$w$#) {
+ $exact = 1;
+ last;
}
}
}
-
- $defined++;
}
+
+ $defined++;
+
next if ($exact);
# Ignore some sysfs nodes
@@ -637,7 +640,7 @@ sub check_undefined_symbols {
# is not easily parseable.
next if ($file =~ m#/parameters/#);
- if ($hint && $defined) {
+ if ($hint && $defined && $leave ne "others") {
print "$leave at $path might be one of:$whats\n" if (!$search_string || $found_string);
next;
}
@@ -699,7 +702,16 @@ sub undefined_symbols {
my $leave = $what;
$leave =~ s,.*/,,;
- next if ($leave =~ m/^\.\*/ || $leave eq "");
+ # $leave is used to improve search performance at
+ # check_undefined_symbols, as the algorithm there can seek
+ # for a small number of "what". It also allows giving a
+ # hint about a leave with the same name somewhere else.
+ # However, there are a few occurences where the leave is
+ # either a wildcard or a number. Just group such cases
+ # altogether.
+ if ($leave =~ m/^\.\*/ || $leave eq "" || $leave =~ /^\d+$/) {
+ $leave = "others" ;
+ }
# Escape all other symbols
$what =~ s/$escape_symbols/\\$1/g;
--
2.31.1
Searching for symlinks is an expensive operation with the current
logic, as it is on the order of O(n^3). In practice, running the
check takes 2-3 minutes to check all symbols.
Fix it by storing the directory tree in a graph, and using
a Breadth First Search (BFS) to find the links for each sysfs node.
With this improvement, it can now report issues in ~11 seconds
on my machine.
It comes with a price, though: there are more symbols reported
as undefined after this change. I suspect it is due to some
sysfs circular loops that are dropped by BFS. Despite this
increase, it seems that the reports are now more coherent.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---
scripts/get_abi.pl | 188 ++++++++++++++++++++++++++++++---------------
1 file changed, 127 insertions(+), 61 deletions(-)
diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index aa0a751563ba..c52a1cf0f49d 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -546,6 +546,73 @@ sub dont_parse_special_attributes {
my %leaf;
my %aliases;
my @files;
+my %root;
+
+sub graph_add_file {
+ my $file = shift;
+ my $type = shift;
+
+ my $dir = $file;
+ $dir =~ s,^(.*/).*,$1,;
+ $file =~ s,.*/,,;
+
+ my $name;
+ my $file_ref = \%root;
+ foreach my $edge(split "/", $dir) {
+ $name .= "$edge/";
+ if (!defined ${$file_ref}{$edge}) {
+ ${$file_ref}{$edge} = { };
+ }
+ $file_ref = \%{$$file_ref{$edge}};
+ ${$file_ref}{"__name"} = [ $name ];
+ }
+ $name .= "$file";
+ ${$file_ref}{$file} = {
+ "__name" => [ $name ]
+ };
+
+ return \%{$$file_ref{$file}};
+}
+
+sub graph_add_link {
+ my $file = shift;
+ my $link = shift;
+
+ # Traverse graph to find the reference
+ my $file_ref = \%root;
+ foreach my $edge(split "/", $file) {
+ $file_ref = \%{$$file_ref{$edge}} || die "Missing node!";
+ }
+
+ # do a BFS
+
+ my @queue;
+ my %seen;
+ my $base_name;
+ my $st;
+
+ push @queue, $file_ref;
+ $seen{$start}++;
+
+ while (@queue) {
+ my $v = shift @queue;
+ my @child = keys(%{$v});
+
+ foreach my $c(@child) {
+ next if $seen{$$v{$c}};
+ next if ($c eq "__name");
+
+ # Add new name
+ my $name = @{$$v{$c}{"__name"}}[0];
+ if ($name =~ s#^$file/#$link/#) {
+ push @{$$v{$c}{"__name"}}, $name;
+ }
+ # Add child to the queue and mark as seen
+ push @queue, $$v{$c};
+ $seen{$c}++;
+ }
+ }
+}
my $escape_symbols = qr { ([\x01-\x08\x0e-\x1f\x21-\x29\x2b-\x2d\x3a-\x40\x7b-\xfe]) }x;
sub parse_existing_sysfs {
@@ -568,19 +635,50 @@ sub parse_existing_sysfs {
return if (defined($data{$file}));
return if (defined($data{$abs_file}));
- push @files, $abs_file;
+ push @files, graph_add_file($abs_file, "file");
+}
+
+sub get_leave($)
+{
+ my $what = shift;
+ my $leave;
+
+ my $l = $what;
+ my $stop = 1;
+
+ $leave = $l;
+ $leave =~ s,/$,,;
+ $leave =~ s,.*/,,;
+ $leave =~ s/[\(\)]//g;
+
+ # $leave is used to improve search performance at
+ # check_undefined_symbols, as the algorithm there can seek
+ # for a small number of "what". It also allows giving a
+ # hint about a leave with the same name somewhere else.
+ # However, there are a few occurences where the leave is
+ # either a wildcard or a number. Just group such cases
+ # altogether.
+ if ($leave =~ m/^\.\*/ || $leave eq "" || $leave =~ /^\d+$/) {
+ $leave = "others";
+ }
+
+ return $leave;
}
sub check_undefined_symbols {
- foreach my $file (sort @files) {
+ foreach my $file_ref (sort @files) {
+ my @names = @{$$file_ref{"__name"}};
+ my $file = $names[0];
my $defined = 0;
my $exact = 0;
- my $whats = "";
my $found_string;
- my $leave = $file;
- $leave =~ s,.*/,,;
+ my $leave = get_leave($file);
+ if (!defined($leaf{$leave})) {
+ $leave = "others";
+ }
+ my $what = $leaf{$leave};
my $path = $file;
$path =~ s,(.*/).*,$1,;
@@ -590,41 +688,12 @@ sub check_undefined_symbols {
$found_string = 1;
}
- if ($leave =~ /^\d+$/ || !defined($leaf{$leave})) {
- $leave = "others";
- }
-
- print "--> $file\n" if ($found_string && $hint);
- my $what = $leaf{$leave};
- $whats .= " $what" if (!($whats =~ m/$what/));
-
- foreach my $w (split / /, $what) {
- if ($file =~ m#^$w$#) {
- $exact = 1;
- last;
- }
- }
- # Check for aliases
- #
- # TODO: this algorithm is O(w * n²). It can be
- # improved in the future in order to handle it
- # faster, by changing parse_existing_sysfs to
- # store the sysfs inside a tree, at the expense
- # on making the code less readable and/or using some
- # additional perl library.
- foreach my $a (keys %aliases) {
- my $new = $aliases{$a};
- my $len = length($new);
-
- if (substr($file, 0, $len) eq $new) {
- my $newf = $a . substr($file, $len);
-
- print " $newf\n" if ($found_string && $hint);
- foreach my $w (split / /, $what) {
- if ($newf =~ m#^$w$#) {
- $exact = 1;
- last;
- }
+ foreach my $a (@names) {
+ print "--> $a\n" if ($found_string && $hint);
+ foreach my $w (split /\xac/, $what) {
+ if ($a =~ m#^$w$#) {
+ $exact = 1;
+ last;
}
}
}
@@ -641,8 +710,13 @@ sub check_undefined_symbols {
# is not easily parseable.
next if ($file =~ m#/parameters/#);
- if ($hint && $defined && $leave ne "others") {
- print "$leave at $path might be one of:$whats\n" if (!$search_string || $found_string);
+ if ($hint && $defined && (!$search_string || $found_string)) {
+ $what =~ s/\xac/\n\t/g;
+ if ($leave ne "others") {
+ print " more likely regexes:\n\t$what\n";
+ } else {
+ print " tested regexes:\n\t$what\n";
+ }
next;
}
print "$file not found.\n" if (!$search_string || $found_string);
@@ -656,8 +730,10 @@ sub undefined_symbols {
no_chdir => 1
}, $sysfs_prefix);
+ $leaf{"others"} = "";
+
foreach my $w (sort keys %data) {
- foreach my $what (split /\xac /,$w) {
+ foreach my $what (split /\xac/,$w) {
next if (!($what =~ m/^$sysfs_prefix/));
# Convert what into regular expressions
@@ -700,19 +776,7 @@ sub undefined_symbols {
# (this happens on a few IIO definitions)
$what =~ s,\s*\=.*$,,;
- my $leave = $what;
- $leave =~ s,.*/,,;
-
- # $leave is used to improve search performance at
- # check_undefined_symbols, as the algorithm there can seek
- # for a small number of "what". It also allows giving a
- # hint about a leave with the same name somewhere else.
- # However, there are a few occurences where the leave is
- # either a wildcard or a number. Just group such cases
- # altogether.
- if ($leave =~ m/^\.\*/ || $leave eq "" || $leave =~ /^\d+$/) {
- $leave = "others" ;
- }
+ my $leave = get_leave($what);
# Escape all other symbols
$what =~ s/$escape_symbols/\\$1/g;
@@ -722,17 +786,14 @@ sub undefined_symbols {
$what =~ s/\xff/\\d+/g;
-
# Special case: IIO ABI which a parenthesis.
$what =~ s/sqrt(.*)/sqrt\(.*\)/;
- $leave =~ s/[\(\)]//g;
-
my $added = 0;
foreach my $l (split /\|/, $leave) {
if (defined($leaf{$l})) {
- next if ($leaf{$l} =~ m/$what/);
- $leaf{$l} .= " " . $what;
+ next if ($leaf{$l} =~ m/\b$what\b/);
+ $leaf{$l} .= "\xac" . $what;
$added = 1;
} else {
$leaf{$l} = $what;
@@ -745,6 +806,11 @@ sub undefined_symbols {
}
}
+ # Take links into account
+ foreach my $link (keys %aliases) {
+ my $abs_file = $aliases{$link};
+ graph_add_link($abs_file, $link);
+ }
check_undefined_symbols;
}
--
2.31.1
The way sysfs works is that the same leaf may be present under
/sys/devices, /sys/bus, /sys/class, etc., linked via
symlinks.
To make it harder to parse, the ABI definition usually refers
to only one of those locations.
So, improve the logic in order to retrieve the symlinks.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---
scripts/get_abi.pl | 207 ++++++++++++++++++++++++++++++++++++---------
1 file changed, 165 insertions(+), 42 deletions(-)
diff --git a/scripts/get_abi.pl b/scripts/get_abi.pl
index 78364c4c4967..b0ca4f4e56f2 100755
--- a/scripts/get_abi.pl
+++ b/scripts/get_abi.pl
@@ -8,8 +8,10 @@ use Pod::Usage;
use Getopt::Long;
use File::Find;
use Fcntl ':mode';
+use Cwd 'abs_path';
my $help = 0;
+my $hint = 0;
my $man = 0;
my $debug = 0;
my $enable_lineno = 0;
@@ -28,6 +30,7 @@ GetOptions(
"rst-source!" => \$description_is_rst,
"dir=s" => \$prefix,
'help|?' => \$help,
+ "show-hints" => \$hint,
man => \$man
) or pod2usage(2);
@@ -526,7 +529,7 @@ sub search_symbols {
}
# Exclude /sys/kernel/debug and /sys/kernel/tracing from the search path
-sub skip_debugfs {
+sub dont_parse_special_attributes {
if (($File::Find::dir =~ m,^/sys/kernel,)) {
return grep {!/(debug|tracing)/ } @_;
}
@@ -539,64 +542,178 @@ sub skip_debugfs {
}
my %leaf;
+my %aliases;
+my @files;
-my $escape_symbols = qr { ([\x01-\x08\x0e-\x1f\x21-\x29\x2b-\x2d\x3a-\x40\x7b-\xff]) }x;
+my $escape_symbols = qr { ([\x01-\x08\x0e-\x1f\x21-\x29\x2b-\x2d\x3a-\x40\x7b-\xfe]) }x;
sub parse_existing_sysfs {
my $file = $File::Find::name;
+ my $mode = (lstat($file))[2];
+ my $abs_file = abs_path($file);
- my $mode = (stat($file))[2];
- return if ($mode & S_IFDIR);
-
- my $leave = $file;
- $leave =~ s,.*/,,;
-
- if (defined($leaf{$leave})) {
- # FIXME: need to check if the path makes sense
- my $what = $leaf{$leave};
-
- $what =~ s/,/ /g;
-
- $what =~ s/\<[^\>]+\>/.*/g;
- $what =~ s/\{[^\}]+\}/.*/g;
- $what =~ s/\[[^\]]+\]/.*/g;
- $what =~ s,/\.\.\./,/.*/,g;
- $what =~ s,/\*/,/.*/,g;
-
- $what =~ s/\s+/ /g;
-
- # Escape all other symbols
- $what =~ s/$escape_symbols/\\$1/g;
-
- foreach my $i (split / /,$what) {
- if ($file =~ m#^$i$#) {
-# print "$file: $i: OK!\n";
- return;
- }
- }
-
- print "$file: $leave is defined at $what\n";
-
+ if (S_ISLNK($mode)) {
+ $aliases{$file} = $abs_file;
return;
}
- print "$file not found.\n";
+ return if (S_ISDIR($mode));
+
+ # Trivial: file is defined exactly the same way at ABI What:
+ return if (defined($data{$file}));
+ return if (defined($data{$abs_file}));
+
+ push @files, $abs_file;
+}
+
+sub check_undefined_symbols {
+ foreach my $file (sort @files) {
+
+ # sysfs-module is special, as its definitions are inside
+ # a text. For now, just ignore them.
+ next if ($file =~ m#^/sys/module/#);
+
+ # Ignore cgroup and firmware
+ next if ($file =~ m#^/sys/(fs/cgroup|firmware)/#);
+
+ my $defined = 0;
+ my $exact = 0;
+ my $whats = "";
+
+ my $leave = $file;
+ $leave =~ s,.*/,,;
+
+ my $path = $file;
+ $path =~ s,(.*/).*,$1,;
+
+ if (defined($leaf{$leave})) {
+ my $what = $leaf{$leave};
+ $whats .= " $what" if (!($whats =~ m/$what/));
+
+ foreach my $w (split / /, $what) {
+ if ($file =~ m#^$w$#) {
+ $exact = 1;
+ last;
+ }
+ }
+ # Check for aliases
+ #
+ # TODO: this algorithm is O(w * n²). It can be
+ # improved in the future in order to handle it
+ # faster, by changing parse_existing_sysfs to
+ # store the sysfs inside a tree, at the expense
+ # on making the code less readable and/or using some
+ # additional perl library.
+ foreach my $a (keys %aliases) {
+ my $new = $aliases{$a};
+ my $len = length($new);
+
+ if (substr($file, 0, $len) eq $new) {
+ my $newf = $a . substr($file, $len);
+
+ foreach my $w (split / /, $what) {
+ if ($newf =~ m#^$w$#) {
+ $exact = 1;
+ last;
+ }
+ }
+ }
+ }
+
+ $defined++;
+ }
+ next if ($exact);
+
+ # Ignore some sysfs nodes
+ next if ($file =~ m#/(sections|notes)/#);
+
+ # Would need to check at
+ # Documentation/admin-guide/kernel-parameters.txt, but this
+ # is not easily parseable.
+ next if ($file =~ m#/parameters/#);
+
+ if ($hint && $defined) {
+ print "$leave at $path might be one of:$whats\n";
+ next;
+ }
+ print "$file not found.\n";
+ }
}
sub undefined_symbols {
+ find({
+ wanted =>\&parse_existing_sysfs,
+ preprocess =>\&dont_parse_special_attributes,
+ no_chdir => 1
+ }, $sysfs_prefix);
+
foreach my $w (sort keys %data) {
foreach my $what (split /\xac /,$w) {
+ next if (!($what =~ m/^$sysfs_prefix/));
+
+ # Convert what into regular expressions
+
+ $what =~ s,/\.\.\./,/*/,g;
+ $what =~ s,\*,.*,g;
+
+ # Temporarily change [0-9]+ type of patterns
+ $what =~ s/\[0\-9\]\+/\xff/g;
+
+ # Temporarily change [\d+-\d+] type of patterns
+ $what =~ s/\[0\-\d+\]/\xff/g;
+ $what =~ s/\[(\d+)\]/\xf4$1\xf5/g;
+
+ # Temporarily change [0-9] type of patterns
+ $what =~ s/\[(\d)\-(\d)\]/\xf4$1-$2\xf5/g;
+
+ # Handle multiple option patterns
+ $what =~ s/[\{\<\[]([\w_]+)(?:[,|]+([\w_]+)){1,}[\}\>\]]/($1|$2)/g;
+
+ # Handle wildcards
+ $what =~ s/\<[^\>]+\>/.*/g;
+ $what =~ s/\{[^\}]+\}/.*/g;
+ $what =~ s/\[[^\]]+\]/.*/g;
+
+ $what =~ s/[XYZ]/.*/g;
+
+ # Recover [0-9] type of patterns
+ $what =~ s/\xf4/[/g;
+ $what =~ s/\xf5/]/g;
+
+ # Remove duplicated spaces
+ $what =~ s/\s+/ /g;
+
+ # Special case: this ABI has a parenthesis on it
+ $what =~ s/sqrt\(x^2\+y^2\+z^2\)/sqrt\(x^2\+y^2\+z^2\)/;
+
+ # Special case: drop comparition as in:
+ # What: foo = <something>
+ # (this happens on a few IIO definitions)
+ $what =~ s,\s*\=.*$,,;
+
my $leave = $what;
$leave =~ s,.*/,,;
- if (defined($leaf{$leave})) {
- $leaf{$leave} .= " " . $what;
- } else {
- $leaf{$leave} = $what;
+ next if ($leave =~ m/^\.\*/ || $leave eq "");
+
+ # Escape all other symbols
+ $what =~ s/$escape_symbols/\\$1/g;
+ $what =~ s/\\\\/\\/g;
+ $what =~ s/\\([\[\]\(\)\|])/$1/g;
+ $what =~ s/(\d+)\\(-\d+)/$1$2/g;
+
+ $leave =~ s/[\(\)]//g;
+
+ foreach my $l (split /\|/, $leave) {
+ if (defined($leaf{$l})) {
+ next if ($leaf{$l} =~ m/$what/);
+ $leaf{$l} .= " " . $what;
+ } else {
+ $leaf{$l} = $what;
+ }
}
}
}
-
- find({wanted =>\&parse_existing_sysfs, preprocess =>\&skip_debugfs, no_chdir => 1}, $sysfs_prefix);
+ check_undefined_symbols;
}
# Ensure that the prefix will always end with a slash
@@ -646,7 +763,8 @@ abi_book.pl - parse the Linux ABI files and produce a ReST book.
=head1 SYNOPSIS
B<abi_book.pl> [--debug] [--enable-lineno] [--man] [--help]
- [--(no-)rst-source] [--dir=<dir>] <COMAND> [<ARGUMENT>]
+ [--(no-)rst-source] [--dir=<dir>] [--show-hints]
+ <COMAND> [<ARGUMENT>]
Where <COMMAND> can be:
@@ -688,6 +806,11 @@ Enable output of #define LINENO lines.
Put the script in verbose mode, useful for debugging. Can be called multiple
times, to increase verbosity.
+=item B<--show-hints>
+
+Show hints about possible definitions for the missing ABI symbols.
+Used only when B<undefined>.
+
=item B<--help>
Prints a brief help message and exits.
--
2.31.1
On Sat, Sep 18, 2021 at 11:52:10AM +0200, Mauro Carvalho Chehab wrote:
> [...]
I've taken all of these, but get_abi.pl seems to be stuck in an endless
loop or something. I gave up and stopped it after 14 minutes. It had
stopped printing out anything after finding all of the pci attributes
that are not documented :)
Anything I can do to help debug this?
thanks,
greg k-h
Em Tue, 21 Sep 2021 18:52:42 +0200
Greg Kroah-Hartman <[email protected]> escreveu:
> On Sat, Sep 18, 2021 at 11:52:10AM +0200, Mauro Carvalho Chehab wrote:
> > Hi Greg,
> >
> > Add a new feature at get_abi.pl to optionally check for existing symbols
> > under /sys that won't match a "What:" inside Documentation/ABI.
> >
> > Such feature is very useful to detect missing documentation for ABI.
> >
> > This series brings a major speedup, plus it fixes a few border cases when
> > matching regexes that end with a ".*" or \d+.
> >
> > patch 1 changes get_abi.pl logic to handle multiple What: lines, in
> > order to make the script more robust;
> >
> > patch 2 adds the basic logic. It runs really quicky (up to 2
> > seconds), but it doesn't use sysfs softlinks.
> >
> > Patch 3 adds support for parsing softlinks. It makes the script a
> > lot slower, making it take a couple of minutes to process the entire
> > sysfs files. It could be optimized in the future by using a graph,
> > but, for now, let's keep it simple.
> >
> > Patch 4 adds an optional parameter to allow filtering the results
> > using a regex given by the user. When this parameter is used
> > (which should be the normal usecase), it will only try to find softlinks
> > if the sysfs node matches a regex.
> >
> > Patch 5 improves the report by avoiding it to ignore What: that
> > ends with a wildcard.
> >
> > Patch 6 is a minor speedup. On a Dell Precision 5820, after patch 6,
> > results are:
> >
> > $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
> >
> > real 2m35.563s
> > user 2m34.346s
> > sys 0m1.220s
> > 7595 undefined
> > 896 undefined_symbols
> >
> > Patch 7 makes a *huge* speedup: it basically switches a linear O(n^3)
> > search for links by a logic which handle symlinks using BFS. It
> > also addresses a border case that was making 'msi-irqs/\d+' regex to
> > be misparsed.
> >
> > After patch 7, it is 11 times faster:
> >
> > $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
> >
> > real 0m14.137s
> > user 0m12.795s
> > sys 0m1.348s
> > 7030 undefined
> > 794 undefined_symbols
> >
> > (the difference in the number of undefined symbols is due to the fix that
> > makes it properly handle the 'msi-irqs/\d+' regex)
> >
> > -
> >
> > While this series is independent from Documentation/ABI changes, it
> > works best when applied from this tree, which also contains ABI fixes
> > and a couple of additions of frequently missed symbols on my machine:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/mchehab/devel.git/log/?h=get_undefined_abi_v3
>
> I've taken all of these, but get_abi.pl seems to be stuck in an endless
> loop or something. I gave up and stopped it after 14 minutes. It had
> stopped printing out anything after finding all of the pci attributes
> that are not documented :)
It is probably not an endless loop; there are just too many vars to
check on your system, which could make it really slow.
The way the search algorithm works is that it reduces the number of regex
expressions that will be checked for a given file entry at sysfs. It
does that by looking at the devnode name. For instance, when it checks
this file:
/sys/bus/pci/drivers/iosf_mbi_pci/bind
The logic will seek only the "What:" expressions that end with "bind".
Currently, there are just two What expressions for it[1]:
What: /sys/bus/fsl\-mc/drivers/.*/bind
What: /sys/bus/pci/drivers/.*/bind
It will then run an O(n²) algorithm to seek:
foreach my $a (@names) {
    foreach my $w (split /\xac/, $what) {
        # Anchor the regex so that the whole sysfs path must match
        if ($a =~ m#^$w$#) {
            $exact = 1;
            last;
        }
    }
}
This runs quickly when there are few regexes to seek. There are,
however, some What: expressions that end with a wildcard. Those are
harder to process. Right now, they're all grouped together, which
makes them slower. Most of the processing time is spent on those.
I'm working right now on some strategy to also speed up the search
for them. Once I get something better, I'll send a patch series.
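To make the description above more concrete, here is a minimal sketch of the leaf-name bucketing idea. It is not the code from get_abi.pl: the helper names build_index and is_documented are hypothetical, and the wildcard-terminated entry in the example is only illustrative.

```perl
use strict;
use warnings;

# Hypothetical helpers; this is NOT the actual get_abi.pl implementation.

# Group What: regexes by their last path component. A regex whose last
# component is not a plain name (for instance ".*" or "\d+") cannot be
# keyed that way, so all of those share a single bucket (the slow case).
sub build_index {
    my @whats = @_;
    my (%by_leaf, @wildcard);

    foreach my $w (@whats) {
        my ($leaf) = $w =~ m#/([^/]*)$#;
        if (defined $leaf && $leaf =~ m/^\w+$/) {
            push @{$by_leaf{$leaf}}, $w;
        } else {
            push @wildcard, $w;
        }
    }
    return (\%by_leaf, \@wildcard);
}

# Return 1 if $path fully matches a regex from its leaf bucket or from
# the shared wildcard bucket, 0 otherwise.
sub is_documented {
    my ($path, $by_leaf, $wildcard) = @_;
    my ($leaf) = $path =~ m#/([^/]*)$#;
    my @candidates = (@{$by_leaf->{$leaf} // []}, @$wildcard);

    foreach my $w (@candidates) {
        return 1 if $path =~ m#^$w$#;
    }
    return 0;
}

my ($by_leaf, $wildcard) = build_index(
    '/sys/bus/fsl\-mc/drivers/.*/bind',
    '/sys/bus/pci/drivers/.*/bind',
    '/sys/devices/.*/msi\-irqs/\d+',    # illustrative wildcard-terminated entry
);

foreach my $path ('/sys/bus/pci/drivers/iosf_mbi_pci/bind',
                  '/sys/bus/i2c/drivers/dummy/bind') {
    printf "%s: %s\n", $path,
        is_documented($path, $by_leaf, $wildcard) ? 'documented' : 'not found';
}
```

Every lookup still has to walk the whole wildcard bucket, which is why entries ending with a wildcard dominate the run time in a scheme like this one.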
--
[1] On a side note, there are currently some problems with the What:
definitions for bind/unbind, as:
- it doesn't match all PCI devices;
- it doesn't match ACPI and other buses that also export
bind/unbind.
>
> Anything I can do to help debug this?
>
There are two parameters that can help to identify the issue:
a) You can add a "--show-hints" parameter. This turns on some
prints that may help to identify what the script is doing.
It is not really a debug option, but it helps to identify
when some regexes are failing.
b) You can limit the What expressions that will be parsed with:
--search-string <something>
You can combine both. For instance, if you want to make it
a lot more verbose, you could run it as:
./scripts/get_abi.pl undefined --search-string /sys --show-hints
The script will then print all regexes that will be checked, and when
actually checking for the missing vars, it will print all names for
a given entry at sysfs.
So, if you want to know how an i2c bind has been validated, you
could do:
$ ./scripts/get_abi.pl undefined --search-string i2c/.*/bind --show-hints
--> /sys/bus/i2c/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-3/i2c-14/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-4/i2c-15/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0036/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0037/driver/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-2/i2c-13/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0050/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-10/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-5/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-3/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-1/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-1/i2c-12/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0037/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-8/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-9/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:15.2/i2c_designware.2/i2c-2/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:15.0/i2c_designware.0/i2c-0/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0036/driver/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-7/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-6/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/dummy/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-11/subsystem/drivers/dummy/bind
more likely regexes:
/sys/bus/fsl\-mc/drivers/.*/bind
/sys/bus/pci/drivers/.*/bind
--> /sys/bus/i2c/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-3/i2c-14/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-4/i2c-15/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0036/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-2/i2c-13/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0050/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-10/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-5/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-3/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-1/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-1/i2c-12/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0037/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-8/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-9/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:15.2/i2c_designware.2/i2c-2/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:15.0/i2c_designware.0/i2c-0/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-7/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-6/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/axp20x-i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-11/subsystem/drivers/axp20x-i2c/bind
more likely regexes:
/sys/bus/fsl\-mc/drivers/.*/bind
/sys/bus/pci/drivers/.*/bind
--> /sys/bus/i2c/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-3/i2c-14/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-4/i2c-15/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0036/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-2/i2c-13/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0050/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-10/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-5/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-3/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-1/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-1/i2c-12/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0037/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-8/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-9/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:15.2/i2c_designware.2/i2c-2/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:15.0/i2c_designware.0/i2c-0/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-7/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-6/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/smbus_alert/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-11/subsystem/drivers/smbus_alert/bind
--> /sys/module/i2c_smbus/drivers/i2c:smbus_alert/bind
more likely regexes:
/sys/bus/fsl\-mc/drivers/.*/bind
/sys/bus/pci/drivers/.*/bind
--> /sys/bus/i2c/drivers/ee1004/bind
--> /sys/module/ee1004/drivers/i2c:ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-3/i2c-14/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-4/i2c-15/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0036/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-2/i2c-13/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0050/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-10/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-5/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-3/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-1/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-1/i2c-12/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0037/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-8/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-9/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:15.2/i2c_designware.2/i2c-2/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:15.0/i2c_designware.0/i2c-0/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-7/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-6/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-11/subsystem/drivers/ee1004/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0050/driver/bind
more likely regexes:
/sys/bus/fsl\-mc/drivers/.*/bind
/sys/bus/pci/drivers/.*/bind
--> /sys/bus/i2c/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-3/i2c-14/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-4/i2c-15/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0036/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-2/i2c-13/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0050/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-10/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-5/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-3/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-1/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-1/i2c-12/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0037/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-8/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-9/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:15.2/i2c_designware.2/i2c-2/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:15.0/i2c_designware.0/i2c-0/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-7/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-6/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/intel_soc_pmic_i2c/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-11/subsystem/drivers/intel_soc_pmic_i2c/bind
more likely regexes:
/sys/bus/fsl\-mc/drivers/.*/bind
/sys/bus/pci/drivers/.*/bind
--> /sys/bus/i2c/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-3/i2c-14/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-4/i2c-15/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0036/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-2/i2c-13/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0050/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-10/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-5/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-3/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-1/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-1/i2c-12/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0037/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-8/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-9/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:15.2/i2c_designware.2/i2c-2/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:15.0/i2c_designware.0/i2c-0/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-7/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-6/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/tps68470/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-11/subsystem/drivers/tps68470/bind
more likely regexes:
/sys/bus/fsl\-mc/drivers/.*/bind
/sys/bus/pci/drivers/.*/bind
--> /sys/bus/i2c/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-3/i2c-14/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-4/i2c-15/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0036/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-2/i2c-13/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0050/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-10/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-5/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-3/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-1/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1/card1-DP-1/i2c-12/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/16-0037/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-8/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-9/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:15.2/i2c_designware.2/i2c-2/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:15.0/i2c_designware.0/i2c-0/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:1f.4/i2c-16/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-7/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-6/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/CHT Whiskey Cove PMIC/bind
--> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/i2c-11/subsystem/drivers/CHT Whiskey Cove PMIC/bind
more likely regexes:
/sys/bus/fsl\-mc/drivers/.*/bind
/sys/bus/pci/drivers/.*/bind
Btw, for the above example, I already have a patch addressing it
(see enclosed). I intend to submit it in a newer patch series.
Thanks,
Mauro
[PATCH] ABI: sysfs-bus-pci: add alternative What fields
There are some PCI ABI that aren't shown under:
/sys/bus/pci/drivers/.../
Because they're registered with a different class. That's
the case of, for instance:
/sys/bus/i2c/drivers/CHT Whiskey Cove PMIC/unbind
This one is not present under /sys/bus/pci:
$ find /sys/bus/pci -name 'CHT Whiskey Cove PMIC'
Although clearly this is provided by a PCI driver:
/sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/CHT Whiskey Cove PMIC/unbind
So, add an alternate What location in order to match bind/unbind
to such devices.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 1da4c8db3a9e..f4efbcb0b18c 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -1,4 +1,5 @@
What: /sys/bus/pci/drivers/.../bind
+What: /sys/devices/pciX/.../bind
Date: December 2003
Contact: [email protected]
Description:
@@ -14,6 +15,7 @@ Description:
(Note: kernels before 2.6.28 may require echo -n).
What: /sys/bus/pci/drivers/.../unbind
+What: /sys/devices/pciX/.../unbind
Date: December 2003
Contact: [email protected]
Description:
@@ -29,6 +31,7 @@ Description:
(Note: kernels before 2.6.28 may require echo -n).
What: /sys/bus/pci/drivers/.../new_id
+What: /sys/devices/pciX/.../new_id
Date: December 2003
Contact: [email protected]
Description:
@@ -47,6 +50,7 @@ Description:
# echo "8086 10f5" > /sys/bus/pci/drivers/foo/new_id
What: /sys/bus/pci/drivers/.../remove_id
+What: /sys/devices/pciX/.../remove_id
Date: February 2009
Contact: Chris Wright <[email protected]>
Description:
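As an illustration of why the extra What: lines help, the sketch below converts a What: entry into a regex and checks it against one of the paths from the hints output. Only two conversion rules are sketched: "..." becomes ".*" and "-" gets escaped, as seen in the hints output, while treating the trailing "X" in "pciX" as a wildcard is an assumption. The helper name what_to_regex is hypothetical and the real script's conversion rules may differ.

```perl
use strict;
use warnings;

# Hypothetical helper; the real conversion rules in get_abi.pl may differ.
sub what_to_regex {
    my ($what) = @_;
    my @out;

    foreach my $part (split m#/#, $what, -1) {
        if ($part eq '...') {
            # "..." stands for any run of path components
            push @out, '.*';
        } elsif ($part =~ m/^(\w*[a-z])X$/) {
            # ASSUMPTION: a trailing "X" (as in "pciX") is a placeholder
            push @out, quotemeta($1) . '.*';
        } else {
            # Everything else is literal; quotemeta() turns "-" into "\-"
            push @out, quotemeta($part);
        }
    }
    return join('/', @out);
}

my $path = '/sys/devices/pci0000:00/0000:00:02.0/i2c-4/subsystem/drivers/'
         . 'CHT Whiskey Cove PMIC/bind';

foreach my $what ('/sys/bus/pci/drivers/.../bind',
                  '/sys/devices/pciX/.../bind') {
    my $re = what_to_regex($what);
    printf "%-32s %s\n", $re,
        ($path =~ m#^$re$# ? 'matches' : 'does not match');
}
```

Under those assumptions only the second, newly added form matches the i2c driver path that hangs below a PCI device.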
On Tue, Sep 21, 2021 at 08:16:33PM +0200, Mauro Carvalho Chehab wrote:
> On Tue, 21 Sep 2021 18:52:42 +0200
> Greg Kroah-Hartman <[email protected]> wrote:
>
> > I've taken all of these, but get_abi.pl seems to be stuck in an endless
> > loop or something. I gave up and stopped it after 14 minutes. It had
> > stopped printing out anything after finding all of the pci attributes
> > that are not documented :)
>
> It is probably not an endless loop, just there are too many vars to
> check on your system, which could make it really slow.
Ah, yes, I ran it overnight and got the following:
$ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
real 29m39.503s
user 29m37.556s
sys 0m0.851s
26669 undefined
765 undefined_symbols
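For reference, the perl/sort/uniq part of that pipeline can be sketched as a small standalone script. It assumes that each report line has the form "<path> not found", which is what the regex in the one-liner expects; the sample lines are made up for illustration, and the helper names count_leaves and report are hypothetical.

```perl
use strict;
use warnings;

# Count how often each leaf name shows up in "<path> not found" lines.
# The regex is the same one used in the perl -ne one-liner.
sub count_leaves {
    my @lines = @_;
    my %count;

    foreach my $line (@lines) {
        $count{$1}++ if $line =~ m#.*/(\S+) not found#;
    }
    return \%count;
}

# Format the counts, most frequent first (ties sorted by name, which
# may differ from the tie order given by "sort -nr").
sub report {
    my ($count) = @_;
    return map { sprintf("%7d %s", $count->{$_}, $_) }
           sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count;
}

# Made-up sample input, only to show the expected line shape
my @sample = (
    '/sys/bus/i2c/drivers/dummy/bind not found.',
    '/sys/bus/i2c/drivers/dummy/unbind not found.',
    '/sys/bus/i2c/drivers/ee1004/bind not found.',
);

print "$_\n" foreach report(count_leaves(@sample));
```

This is why the second number is much smaller than the first: many undefined paths collapse into the same leaf name.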
> The way the search algorithm works is that it reduces the number of regex
> expressions that will be checked for a given file entry at sysfs. It
> does that by looking at the devnode name. For instance, when it checks
> this file:
>
> /sys/bus/pci/drivers/iosf_mbi_pci/bind
>
> The logic will seek only the "What:" expressions that end with "bind".
> Currently, there are just two What expressions for it[1]:
>
> What: /sys/bus/fsl\-mc/drivers/.*/bind
> What: /sys/bus/pci/drivers/.*/bind
>
> It will then run an O(n²) algorithm to seek:
>
> foreach my $a (@names) {
>     foreach my $w (split /\xac/, $what) {
>         if ($a =~ m#^$w$#) {
>             $exact = 1;
>             last;
>         }
>     }
> }
>
> This runs quickly when there are few regexes to seek. There are,
> however, some What: expressions that end with a wildcard. Those are
> harder to process. Right now, they're all grouped together, which
> makes them slower. Most of the processing time is spent on those.
>
> I'm working right now on some strategy to also speed up the search
> for them. Once I get something better, I'll send a patch series.
>
> --
>
> [1] On a side note, there are currently some problems with the What:
> definitions for bind/unbind, as:
>
> - it doesn't match all PCI devices;
> - it doesn't match ACPI and other buses that also export
> bind/unbind.
>
> >
> > Anything I can do to help debug this?
> >
>
> There are two parameters that can help to identify the issue:
>
> a) You can add a "--show-hints" parameter. This turns on some
> prints that may help to identify what the script is doing.
> It is not really a debug option, but it helps to identify
> when some regexes are failing.
>
> b) You can limit the What expressions that will be parsed with:
> --search-string <something>
>
> You can combine both. For instance, if you want to make it
> a lot more verbose, you could run it as:
>
> ./scripts/get_abi.pl undefined --search-string /sys --show-hints
Let me run this and time stamp it to see where it is getting hung up on.
Give it another 30 minutes :)
thanks,
greg k-h
On Wed, Sep 22, 2021 at 07:43:42AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Sep 21, 2021 at 08:16:33PM +0200, Mauro Carvalho Chehab wrote:
> > On Tue, 21 Sep 2021 18:52:42 +0200
> > Greg Kroah-Hartman <[email protected]> wrote:
> >
> > > I've taken all of these, but get_abi.pl seems to be stuck in an endless
> > > loop or something. I gave up and stopped it after 14 minutes. It had
> > > stopped printing out anything after finding all of the pci attributes
> > > that are not documented :)
> >
> > It is probably not an endless loop, just there are too many vars to
> > check on your system, which could make it really slow.
>
> Ah, yes, I ran it overnight and got the following:
>
> $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
>
> real 29m39.503s
> user 29m37.556s
> sys 0m0.851s
> 26669 undefined
> 765 undefined_symbols
>
> > The way the search algorithm works is that it reduces the number of regex
> > expressions that will be checked for a given file entry at sysfs. It
> > does that by looking at the devnode name. For instance, when it checks
> > this file:
> >
> > /sys/bus/pci/drivers/iosf_mbi_pci/bind
> >
> > The logic will seek only the "What:" expressions that end with "bind".
> > Currently, there are just two What expressions for it[1]:
> >
> > What: /sys/bus/fsl\-mc/drivers/.*/bind
> > What: /sys/bus/pci/drivers/.*/bind
> >
> > It will then run an O(n²) algorithm to seek:
> >
> > foreach my $a (@names) {
> >     foreach my $w (split /\xac/, $what) {
> >         if ($a =~ m#^$w$#) {
> >             $exact = 1;
> >             last;
> >         }
> >     }
> > }
> >
> > This runs quickly when there are few regexes to seek. There are,
> > however, some What: expressions that end with a wildcard. Those are
> > harder to process. Right now, they're all grouped together, which
> > makes them slower. Most of the processing time is spent on those.
> >
> > I'm working right now on some strategy to also speed up the search
> > for them. Once I get something better, I'll send a patch series.
> >
> > --
> >
> > [1] On a side note, there are currently some problems with the What:
> > definitions for bind/unbind, as:
> >
> > - it doesn't match all PCI devices;
> > - it doesn't match ACPI and other buses that also export
> > bind/unbind.
> >
> > >
> > > Anything I can do to help debug this?
> > >
> >
> > There are two parameters that can help to identify the issue:
> >
> > a) You can add a "--show-hints" parameter. This turns on some
> > prints that may help to identify what the script is doing.
> > It is not really a debug option, but it helps to identify
> > when some regexes are failing.
> >
> > b) You can limit the What expressions that will be parsed with:
> > --search-string <something>
> >
> > You can combine both. For instance, if you want to make it
> > a lot more verbose, you could run it as:
> >
> > ./scripts/get_abi.pl undefined --search-string /sys --show-hints
>
> Let me run this and time stamp it to see where it is getting hung up on.
> Give it another 30 minutes :)
Hm, that didn't make too much sense as to what it was stalled on. I've
attached the compressed file if you are curious.
Anyway, this is all in my tree now, and I'll gladly take patches to make
it go faster :)
thanks,
greg k-h
On Wed, 22 Sep 2021 08:22:33 +0200
Greg Kroah-Hartman <[email protected]> wrote:
> On Wed, Sep 22, 2021 at 07:43:42AM +0200, Greg Kroah-Hartman wrote:
> > On Tue, Sep 21, 2021 at 08:16:33PM +0200, Mauro Carvalho Chehab wrote:
> > > On Tue, 21 Sep 2021 18:52:42 +0200
> > > Greg Kroah-Hartman <[email protected]> wrote:
> > >
> > > > On Sat, Sep 18, 2021 at 11:52:10AM +0200, Mauro Carvalho Chehab wrote:
> > > > > Hi Greg,
> > > > >
> > > > > Add a new feature at get_abi.pl to optionally check for existing symbols
> > > > > under /sys that won't match a "What:" inside Documentation/ABI.
> > > > >
> > > > > Such feature is very useful to detect missing documentation for ABI.
> > > > >
> > > > > This series brings a major speedup, plus it fixes a few border cases when
> > > > > matching regexes that end with a ".*" or \d+.
> > > > >
> > > > > Patch 1 changes get_abi.pl logic to handle multiple What: lines, in
> > > > > order to make the script more robust;
> > > > >
> > > > > Patch 2 adds the basic logic. It runs really quickly (up to 2
> > > > > seconds), but it doesn't use sysfs softlinks.
> > > > >
> > > > > Patch 3 adds support for parsing softlinks. It makes the script a
> > > > > lot slower, making it take a couple of minutes to process the entire
> > > > > sysfs files. It could be optimized in the future by using a graph,
> > > > > but, for now, let's keep it simple.
> > > > >
> > > > > Patch 4 adds an optional parameter to allow filtering the results
> > > > > using a regex given by the user. When this parameter is used
> > > > > (which should be the normal usecase), it will only try to find softlinks
> > > > > if the sysfs node matches a regex.
> > > > >
> > > > > Patch 5 improves the report by no longer ignoring What: entries that
> > > > > end with a wildcard.
> > > > >
> > > > > Patch 6 is a minor speedup. On a Dell Precision 5820, after patch 6,
> > > > > results are:
> > > > >
> > > > > $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
> > > > >
> > > > > real 2m35.563s
> > > > > user 2m34.346s
> > > > > sys 0m1.220s
> > > > > 7595 undefined
> > > > > 896 undefined_symbols
> > > > >
> > > > > Patch 7 makes a *huge* speedup: it basically replaces an O(n^3)
> > > > > search for links with logic that handles symlinks using BFS. It
> > > > > also addresses a corner case that caused the 'msi-irqs/\d+' regex to
> > > > > be misparsed.
> > > > >
> > > > > After patch 7, it is 11 times faster:
> > > > >
> > > > > $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
> > > > >
> > > > > real 0m14.137s
> > > > > user 0m12.795s
> > > > > sys 0m1.348s
> > > > > 7030 undefined
> > > > > 794 undefined_symbols
> > > > >
> > > > > (the difference in the number of undefined symbols is due to the fix
> > > > > that makes it properly handle the 'msi-irqs/\d+' regex)
> > > > >
> > > > > -
> > > > >
> > > > > While this series is independent from the Documentation/ABI changes, it
> > > > > works best when applied from this tree, which also contains ABI fixes
> > > > > and a couple of additions for symbols frequently reported as missing
> > > > > on my machine:
> > > > >
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mchehab/devel.git/log/?h=get_undefined_abi_v3
> > > >
> > > > I've taken all of these, but get_abi.pl seems to be stuck in an endless
> > > > loop or something. I gave up and stopped it after 14 minutes. It had
> > > > stopped printing out anything after finding all of the pci attributes
> > > > that are not documented :)
> > >
> > > It is probably not an endless loop; there are just too many vars to
> > > check on your system, which could make it really slow.
> >
> > Ah, yes, I ran it overnight and got the following:
> >
> > $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
> >
> > real 29m39.503s
> > user 29m37.556s
> > sys 0m0.851s
> > 26669 undefined
> > 765 undefined_symbols
> >
> > > The search algorithm works by reducing the number of regular
> > > expressions that will be checked for a given file entry in sysfs. It
> > > does that by looking at the devnode name. For instance, when it checks
> > > this file:
> > >
> > > /sys/bus/pci/drivers/iosf_mbi_pci/bind
> > >
> > > The logic will seek only the "What:" expressions that end with "bind".
> > > Currently, there are just two What expressions for it[1]:
> > >
> > > What: /sys/bus/fsl\-mc/drivers/.*/bind
> > > What: /sys/bus/pci/drivers/.*/bind
> > >
> > > It will then run an O(n²) algorithm to seek:
> > >
> > > foreach my $a (@names) {
> > >     # try each candidate What: regex ($what holds them \xac-separated)
> > >     foreach my $w (split /\xac/, $what) {
> > >         if ($a =~ m#^$w$#) {
> > >             $exact = 1;
> > >             last;
> > >         }
> > >     }
> > > }
> > >
> > > This runs quickly when there are only a few regexes to check. There are,
> > > however, some What: expressions that end with a wildcard. Those are
> > > harder to process. Right now, they're all grouped together, which
> > > makes them slower. Most of the processing time is spent on those.
> > >
> > > I'm working right now on some strategy to also speed up the search
> > > for them. Once I get something better, I'll send a patch series.
> > >
> > > --
> > >
> > > [1] On a side note, there are currently some problems with the What:
> > > definitions for bind/unbind, as:
> > >
> > > - it doesn't match all PCI devices;
> > > - it doesn't match ACPI and other buses that also export
> > > bind/unbind.
> > >
> > > >
> > > > Anything I can do to help debug this?
> > > >
> > >
> > > There are two parameters that can help to identify the issue:
> > >
> > > a) You can add a "--show-hints" parameter. This turns on some
> > > prints that may help to identify what the script is doing.
> > > It is not really a debug option, but it helps to identify
> > > when some regexes are failing.
> > >
> > > b) You can limit the What expressions that will be parsed with:
> > > --search-string <something>
> > >
> > > You can combine both. For instance, if you want to make it
> > > a lot more verbose, you could run it as:
> > >
> > > ./scripts/get_abi.pl undefined --search-string /sys --show-hints
> >
> > Let me run this and time stamp it to see where it is getting hung up on.
> > Give it another 30 minutes :)
>
> Hm, that didn't make too much sense as to what it was stalled on. I've
> attached the compressed file if you are curious.
Hmm...
[07:52:44] --> /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.1/iommu/amd-iommu/cap
[08:07:52] --> /sys/devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:05.0/iommu/amd-iommu/cap
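For reference, the delay between those two lines can be computed straight
from the timestamped log. A minimal sketch, assuming every line starts with
a [HH:MM:SS] stamp as in the excerpt above and that the log does not cross
midnight; the gaps helper name is made up for illustration:

```shell
#!/bin/sh
# gaps: hypothetical helper. Reads a log whose lines start with
# "[HH:MM:SS]" and prints, for every stamped line after the first, the
# number of seconds elapsed since the previous one, a tab, and the line.
gaps() {
    awk '
    /^\[[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\]/ {
        split(substr($0, 2, 8), t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (seen)
            printf "%d\t%s\n", now - prev, $0
        prev = now
        seen = 1
    }'
}

# The two lines quoted above:
gaps <<'EOF'
[07:52:44] --> /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.1/iommu/amd-iommu/cap
[08:07:52] --> /sys/devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:05.0/iommu/amd-iommu/cap
EOF
```

That gives 908 seconds, a bit over 15 minutes, between two consecutive
lines. Piping a full log through "gaps | sort -rn | head" would list the
slowest steps first.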
It seems it took quite a while handling the iommu cap files, which is weird, as
it should be checking just 3 What expressions:
[07:43:06] What: /sys/class/iommu/.*/amd\-iommu/cap
[07:43:06] What: /sys/class/iommu/.*/intel\-iommu/cap
[07:43:06] What: /sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap
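To make the selection step concrete: the script only tries the What:
expressions whose last path component matches the name of the sysfs file
being checked. A rough shell sketch of that pre-filter follows. It is not
the actual get_abi.pl code; the candidates helper and the four-entry list
are only there for illustration:

```shell
#!/bin/sh
# candidates: hypothetical helper. Reads What: expressions on stdin and
# prints only those whose last path component equals the leaf name of
# the sysfs file given as $1.
candidates() {
    leaf=${1##*/}
    awk -F/ -v leaf="$leaf" '$NF == leaf'
}

# One "bind" expression plus the three "cap" expressions listed above.
candidates /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.1/iommu/amd-iommu/cap <<'EOF'
/sys/bus/pci/drivers/.*/bind
/sys/class/iommu/.*/amd\-iommu/cap
/sys/class/iommu/.*/intel\-iommu/cap
/sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap
EOF
```

Only the three expressions ending in "cap" are printed; the "bind" one is
never tried for this file. Expressions that end with a wildcard have no
fixed leaf name, so this sketch does not cover them.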
Maybe the machine ran short of memory while running the script, causing
swapping. Still, it would be weird for that to happen there, as the hashes
and arrays used by the script are all allocated before it starts the
search logic. Here, the allocation part takes ~2 seconds.
At least on my Dell Precision 5820 (12 cpu threads), the amount of memory it
uses is not huge:
$ /usr/bin/time -v ./scripts/get_abi.pl undefined >/dev/null
Command being timed: "./scripts/get_abi.pl undefined"
User time (seconds): 12.68
System time (seconds): 1.29
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.98
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 212608
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 52003
Voluntary context switches: 1
Involuntary context switches: 56
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Unfortunately, I don't have any amd-based machine here, but I'll
try to run it later on a big arm server and see how it behaves.
> Anyway, this is all in my tree now, and I'll gladly take patches to make
> it go faster :)
Ok!
Thanks,
Mauro
On Wed, Sep 22, 2021 at 09:36:09AM +0200, Mauro Carvalho Chehab wrote:
> On Wed, 22 Sep 2021 08:22:33 +0200
> Greg Kroah-Hartman <[email protected]> wrote:
>
> > On Wed, Sep 22, 2021 at 07:43:42AM +0200, Greg Kroah-Hartman wrote:
> > > On Tue, Sep 21, 2021 at 08:16:33PM +0200, Mauro Carvalho Chehab wrote:
> > > > On Tue, 21 Sep 2021 18:52:42 +0200
> > > > Greg Kroah-Hartman <[email protected]> wrote:
> > > >
> > > > > On Sat, Sep 18, 2021 at 11:52:10AM +0200, Mauro Carvalho Chehab wrote:
> > > > > > Hi Greg,
> > > > > >
> > > > > > Add a new feature at get_abi.pl to optionally check for existing symbols
> > > > > > under /sys that won't match a "What:" inside Documentation/ABI.
> > > > > >
> > > > > > Such feature is very useful to detect missing documentation for ABI.
> > > > > >
> > > > > > This series brings a major speedup, plus it fixes a few border cases when
> > > > > > matching regexes that end with a ".*" or \d+.
> > > > > >
> > > > > > Patch 1 changes get_abi.pl logic to handle multiple What: lines, in
> > > > > > order to make the script more robust;
> > > > > >
> > > > > > Patch 2 adds the basic logic. It runs really quickly (up to 2
> > > > > > seconds), but it doesn't use sysfs softlinks.
> > > > > >
> > > > > > Patch 3 adds support for parsing softlinks. It makes the script a
> > > > > > lot slower, making it take a couple of minutes to process the entire
> > > > > > sysfs files. It could be optimized in the future by using a graph,
> > > > > > but, for now, let's keep it simple.
> > > > > >
> > > > > > Patch 4 adds an optional parameter to allow filtering the results
> > > > > > using a regex given by the user. When this parameter is used
> > > > > > (which should be the normal usecase), it will only try to find softlinks
> > > > > > if the sysfs node matches a regex.
> > > > > >
> > > > > > Patch 5 improves the report by no longer ignoring What: entries that
> > > > > > end with a wildcard.
> > > > > >
> > > > > > Patch 6 is a minor speedup. On a Dell Precision 5820, after patch 6,
> > > > > > results are:
> > > > > >
> > > > > > $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
> > > > > >
> > > > > > real 2m35.563s
> > > > > > user 2m34.346s
> > > > > > sys 0m1.220s
> > > > > > 7595 undefined
> > > > > > 896 undefined_symbols
> > > > > >
> > > > > > Patch 7 makes a *huge* speedup: it basically replaces an O(n^3)
> > > > > > search for links with logic that handles symlinks using BFS. It
> > > > > > also addresses a corner case that caused the 'msi-irqs/\d+' regex to
> > > > > > be misparsed.
> > > > > >
> > > > > > After patch 7, it is 11 times faster:
> > > > > >
> > > > > > $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
> > > > > >
> > > > > > real 0m14.137s
> > > > > > user 0m12.795s
> > > > > > sys 0m1.348s
> > > > > > 7030 undefined
> > > > > > 794 undefined_symbols
> > > > > >
> > > > > > (the difference in the number of undefined symbols is due to the fix
> > > > > > that makes it properly handle the 'msi-irqs/\d+' regex)
> > > > > >
> > > > > > -
> > > > > >
> > > > > > While this series is independent from the Documentation/ABI changes, it
> > > > > > works best when applied from this tree, which also contains ABI fixes
> > > > > > and a couple of additions for symbols frequently reported as missing
> > > > > > on my machine:
> > > > > >
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mchehab/devel.git/log/?h=get_undefined_abi_v3
> > > > >
> > > > > I've taken all of these, but get_abi.pl seems to be stuck in an endless
> > > > > loop or something. I gave up and stopped it after 14 minutes. It had
> > > > > stopped printing out anything after finding all of the pci attributes
> > > > > that are not documented :)
> > > >
> > > > It is probably not an endless loop; there are just too many vars to
> > > > check on your system, which could make it really slow.
> > >
> > > Ah, yes, I ran it overnight and got the following:
> > >
> > > $ time ./scripts/get_abi.pl undefined |sort >undefined && cat undefined| perl -ne 'print "$1\n" if (m#.*/(\S+) not found#)'|sort|uniq -c|sort -nr >undefined_symbols; wc -l undefined; wc -l undefined_symbols
> > >
> > > real 29m39.503s
> > > user 29m37.556s
> > > sys 0m0.851s
> > > 26669 undefined
> > > 765 undefined_symbols
> > >
> > > > The search algorithm works by reducing the number of regular
> > > > expressions that will be checked for a given file entry in sysfs. It
> > > > does that by looking at the devnode name. For instance, when it checks
> > > > this file:
> > > >
> > > > /sys/bus/pci/drivers/iosf_mbi_pci/bind
> > > >
> > > > The logic will seek only the "What:" expressions that end with "bind".
> > > > Currently, there are just two What expressions for it[1]:
> > > >
> > > > What: /sys/bus/fsl\-mc/drivers/.*/bind
> > > > What: /sys/bus/pci/drivers/.*/bind
> > > >
> > > > It will then run an O(n²) algorithm to seek:
> > > >
> > > > foreach my $a (@names) {
> > > >     # try each candidate What: regex ($what holds them \xac-separated)
> > > >     foreach my $w (split /\xac/, $what) {
> > > >         if ($a =~ m#^$w$#) {
> > > >             $exact = 1;
> > > >             last;
> > > >         }
> > > >     }
> > > > }
> > > >
> > > > This runs quickly when there are only a few regexes to check. There are,
> > > > however, some What: expressions that end with a wildcard. Those are
> > > > harder to process. Right now, they're all grouped together, which
> > > > makes them slower. Most of the processing time is spent on those.
> > > >
> > > > I'm working right now on some strategy to also speed up the search
> > > > for them. Once I get something better, I'll send a patch series.
> > > >
> > > > --
> > > >
> > > > [1] On a side note, there are currently some problems with the What:
> > > > definitions for bind/unbind, as:
> > > >
> > > > - it doesn't match all PCI devices;
> > > > - it doesn't match ACPI and other buses that also export
> > > > bind/unbind.
> > > >
> > > > >
> > > > > Anything I can do to help debug this?
> > > > >
> > > >
> > > > There are two parameters that can help to identify the issue:
> > > >
> > > > a) You can add a "--show-hints" parameter. This turns on some
> > > > prints that may help to identify what the script is doing.
> > > > It is not really a debug option, but it helps to identify
> > > > when some regexes are failing.
> > > >
> > > > b) You can limit the What expressions that will be parsed with:
> > > > --search-string <something>
> > > >
> > > > You can combine both. For instance, if you want to make it
> > > > a lot more verbose, you could run it as:
> > > >
> > > > ./scripts/get_abi.pl undefined --search-string /sys --show-hints
> > >
> > > Let me run this and time stamp it to see where it is getting hung up on.
> > > Give it another 30 minutes :)
> >
> > Hm, that didn't make too much sense as to what it was stalled on. I've
> > attached the compressed file if you are curious.
>
> Hmm...
>
> [07:52:44] --> /sys/devices/pci0000:40/0000:40:01.3/0000:4a:00.1/iommu/amd-iommu/cap
> [08:07:52] --> /sys/devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:05.0/iommu/amd-iommu/cap
>
> It seems it took quite a while handling the iommu cap files, which is weird, as
> it should be checking just 3 What expressions:
>
> [07:43:06] What: /sys/class/iommu/.*/amd\-iommu/cap
> [07:43:06] What: /sys/class/iommu/.*/intel\-iommu/cap
> [07:43:06] What: /sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap
>
> Maybe the machine ran short of memory while running the script, causing
> swapping. Still, it would be weird for that to happen there, as the hashes
> and arrays used by the script are all allocated before it starts the
> search logic. Here, the allocation part takes ~2 seconds.
No memory starvation here, this thing is a beast:
$ free -h
total used free shared buff/cache available
Mem: 251Gi 36Gi 13Gi 402Mi 202Gi 212Gi
Swap: 4.0Gi 182Mi 3.8Gi
$ nproc
64
> At least on my Dell Precision 5820 (12 cpu threads), the amount of memory it
> uses is not huge:
>
> $ /usr/bin/time -v ./scripts/get_abi.pl undefined >/dev/null
> Command being timed: "./scripts/get_abi.pl undefined"
> User time (seconds): 12.68
> System time (seconds): 1.29
> Percent of CPU this job got: 99%
> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.98
> Average shared text size (kbytes): 0
> Average unshared data size (kbytes): 0
> Average stack size (kbytes): 0
> Average total size (kbytes): 0
> Maximum resident set size (kbytes): 212608
> Average resident set size (kbytes): 0
> Major (requiring I/O) page faults: 0
> Minor (reclaiming a frame) page faults: 52003
> Voluntary context switches: 1
> Involuntary context switches: 56
> Swaps: 0
> File system inputs: 0
> File system outputs: 0
> Socket messages sent: 0
> Socket messages received: 0
> Signals delivered: 0
> Page size (bytes): 4096
> Exit status: 0
>
> Unfortunately, I don't have any amd-based machine here, but I'll
> try to run it later on a big arm server and see how it behaves.
I'll run that and get back to you in 30 minutes :)
thanks,
greg k-h
On Wed, Sep 22, 2021 at 10:11:02AM +0200, Greg Kroah-Hartman wrote:
> On Wed, Sep 22, 2021 at 09:36:09AM +0200, Mauro Carvalho Chehab wrote:
> > It seems it took quite a while handling the iommu cap files, which is weird, as
> > it should be checking just 3 What expressions:
> >
> > [07:43:06] What: /sys/class/iommu/.*/amd\-iommu/cap
> > [07:43:06] What: /sys/class/iommu/.*/intel\-iommu/cap
> > [07:43:06] What: /sys/devices/pci.*.*.*.*\:.*.*/0000\:.*.*\:.*.*..*/dma/dma.*chan.*/quickdata/cap
> >
> > Maybe the machine ran short of memory while running the script, causing
> > swapping. Still, it would be weird for that to happen there, as the hashes
> > and arrays used by the script are all allocated before it starts the
> > search logic. Here, the allocation part takes ~2 seconds.
>
> No memory starvation here, this thing is a beast:
> $ free -h
> total used free shared buff/cache available
> Mem: 251Gi 36Gi 13Gi 402Mi 202Gi 212Gi
> Swap: 4.0Gi 182Mi 3.8Gi
>
> $ nproc
> 64
>
>
> > At least on my Dell Precision 5820 (12 cpu threads), the amount of memory it
> > uses is not huge:
> >
> > $ /usr/bin/time -v ./scripts/get_abi.pl undefined >/dev/null
> > Command being timed: "./scripts/get_abi.pl undefined"
> > User time (seconds): 12.68
> > System time (seconds): 1.29
> > Percent of CPU this job got: 99%
> > Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.98
> > Average shared text size (kbytes): 0
> > Average unshared data size (kbytes): 0
> > Average stack size (kbytes): 0
> > Average total size (kbytes): 0
> > Maximum resident set size (kbytes): 212608
> > Average resident set size (kbytes): 0
> > Major (requiring I/O) page faults: 0
> > Minor (reclaiming a frame) page faults: 52003
> > Voluntary context switches: 1
> > Involuntary context switches: 56
> > Swaps: 0
> > File system inputs: 0
> > File system outputs: 0
> > Socket messages sent: 0
> > Socket messages received: 0
> > Signals delivered: 0
> > Page size (bytes): 4096
> > Exit status: 0
> >
> > Unfortunately, I don't have any amd-based machine here, but I'll
> > try to run it later on a big arm server and see how it behaves.
>
> I'll run that and get back to you in 30 minutes :)
$ /usr/bin/time -v ./scripts/get_abi.pl undefined > /dev/null
Command being timed: "./scripts/get_abi.pl undefined"
User time (seconds): 1756.94
System time (seconds): 0.76
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 29:18.94
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 228116
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 55862
Voluntary context switches: 1
Involuntary context switches: 17205
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
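
Putting the two reports side by side, with the numbers copied from the
outputs above (the ratio helper is only for illustration):

```shell
#!/bin/sh
# ratio: hypothetical helper printing $1 / $2 with two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# User time (seconds): this box vs. the Dell Precision 5820
ratio 1756.94 12.68
# Maximum resident set size (kbytes): this box vs. the Dell Precision 5820
ratio 228116 212608
```

This prints 138.56 and 1.07: roughly 139 times more CPU time for about 7%
more memory. That is consistent with no swapping here; the extra time is
user CPU time spent in the script itself.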