scripts/clang-tools/gen_compile_commands.py incorrectly assumes that
each .mod file only contains one line. In fact, such files contain one
entry per line, and for some subsystems, there can be many, many lines.
For example, Nouveau has 762 entries, but only the first entry was being
processed. This problem causes clangd to fail to provide references and
definitions for valid files that are part of the current kernel
configuration.
This problem only occurs when using Kbuild to generate, like this:
make CC=clang compile_commands.json
It does not occur if you just run gen_compile_commands.py "bare", like
this (below):
scripts/clang-tools/gen_compile_commands.py/gen_compile_commands.py .
Fix this by fully processing each .mod file. This fix causes the number
of build commands that clangd finds in my kernel build (these numbers
are heavily dependent upon .config), from 2848 to 5292, which is an 85%
increase.
Fixes: ecca4fea1ede4 ("gen_compile_commands: support *.o, *.a, modules.order in positional argument")
Cc: Masahiro Yamada <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
scripts/clang-tools/gen_compile_commands.py | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
index 1d1bde1fd45e..53590e886889 100755
--- a/scripts/clang-tools/gen_compile_commands.py
+++ b/scripts/clang-tools/gen_compile_commands.py
@@ -157,10 +157,11 @@ def cmdfiles_for_modorder(modorder):
if ext != '.ko':
sys.exit('{}: module path must end with .ko'.format(ko))
mod = base + '.mod'
- # The first line of *.mod lists the objects that compose the module.
+ # Read from *.mod, to get a list of objects that compose the module.
with open(mod) as m:
- for obj in m.readline().split():
- yield to_cmdfile(obj)
+ for line in m.readlines():
+ for obj in line.split():
+ yield to_cmdfile(obj)
def process_line(root_directory, command_prefix, file_path):
--
2.36.1
On Wed, Jun 15, 2022 at 3:33 PM John Hubbard <[email protected]> wrote:
>
> scripts/clang-tools/gen_compile_commands.py incorrectly assumes that
> each .mod file only contains one line.
Thanks for catching this.
That assumption was correct until recently.
The first line contained member objects.
The second line, if CONFIG_TRIM_UNUSED_KSYMS=y, contained unresolved symbols
Commit 9413e7640564 ("kbuild: split the second line of *.mod into *.usyms")
changed the format of *.mod so member objects are listed per-line.
> In fact, such files contain one
> entry per line, and for some subsystems, there can be many, many lines.
> For example, Nouveau has 762 entries, but only the first entry was being
> processed. This problem causes clangd to fail to provide references and
> definitions for valid files that are part of the current kernel
> configuration.
>
> This problem only occurs when using Kbuild to generate, like this:
>
> make CC=clang compile_commands.json
>
> It does not occur if you just run gen_compile_commands.py "bare", like
> this (below):
>
> scripts/clang-tools/gen_compile_commands.py/gen_compile_commands.py .
>
> Fix this by fully processing each .mod file. This fix causes the number
> of build commands that clangd finds in my kernel build (these numbers
> are heavily dependent upon .config), from 2848 to 5292, which is an 85%
> increase.
>
> Fixes: ecca4fea1ede4 ("gen_compile_commands: support *.o, *.a, modules.order in positional argument")
This should be
Fixes: 9413e7640564 ("kbuild: split the second line of *.mod into *.usyms")
Can you update the commit log?
> Cc: Masahiro Yamada <[email protected]>
> Cc: Nick Desaulniers <[email protected]>
> Signed-off-by: John Hubbard <[email protected]>
> ---
> scripts/clang-tools/gen_compile_commands.py | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> index 1d1bde1fd45e..53590e886889 100755
> --- a/scripts/clang-tools/gen_compile_commands.py
> +++ b/scripts/clang-tools/gen_compile_commands.py
> @@ -157,10 +157,11 @@ def cmdfiles_for_modorder(modorder):
> if ext != '.ko':
> sys.exit('{}: module path must end with .ko'.format(ko))
> mod = base + '.mod'
> - # The first line of *.mod lists the objects that compose the module.
> + # Read from *.mod, to get a list of objects that compose the module.
> with open(mod) as m:
> - for obj in m.readline().split():
> - yield to_cmdfile(obj)
> + for line in m.readlines():
for line in m:
is simpler, (and maybe will work more efficiently).
One more note, the 'line' iterator is shadowing (overwriting)
the outer 'line' iterator, which has been used a few lines above.
with open(modorder) as f:
for line in f:
Maybe, it is safer to use a different name for the inner iterator
because shadowing does not work in Python.
> + for obj in line.split():
This loop is unneeded because each line
contains only one word.
.rstpip() will do.
To sum up, this part can be simpler,
for example like this:
# Read from *.mod, to get a list of objects that compose the module.
with open(mod) as m:
for line2 in m:
yield to_cmdfile(line2.rstrip())
> + yield to_cmdfile(obj)
>
>
> def process_line(root_directory, command_prefix, file_path):
> --
> 2.36.1
>
--
Best Regards
Masahiro Yamada
On 6/15/22 02:23, Masahiro Yamada wrote:
>> Fixes: ecca4fea1ede4 ("gen_compile_commands: support *.o, *.a, modules.order in positional argument")
>
> This should be
>
> Fixes: 9413e7640564 ("kbuild: split the second line of *.mod into *.usyms")
>
>
> Can you update the commit log?
>
Done.
>
>> Cc: Masahiro Yamada <[email protected]>
>> Cc: Nick Desaulniers <[email protected]>
>> Signed-off-by: John Hubbard <[email protected]>
>> ---
>> scripts/clang-tools/gen_compile_commands.py | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
>> index 1d1bde1fd45e..53590e886889 100755
>> --- a/scripts/clang-tools/gen_compile_commands.py
>> +++ b/scripts/clang-tools/gen_compile_commands.py
>> @@ -157,10 +157,11 @@ def cmdfiles_for_modorder(modorder):
>> if ext != '.ko':
>> sys.exit('{}: module path must end with .ko'.format(ko))
>> mod = base + '.mod'
>> - # The first line of *.mod lists the objects that compose the module.
>> + # Read from *.mod, to get a list of objects that compose the module.
>> with open(mod) as m:
>> - for obj in m.readline().split():
>> - yield to_cmdfile(obj)
>> + for line in m.readlines():
>
>
> for line in m:
>
> is simpler, (and maybe will work more efficiently).
aha, yes.
>
>
> One more note, the 'line' iterator is shadowing (overwriting)
> the outer 'line' iterator, which has been used a few lines above.
I overlooked that point, good catch.
>
> with open(modorder) as f:
> for line in f:
>
>
>
> Maybe, it is safer to use a different name for the inner iterator
> because shadowing does not work in Python.
>
>
>
>
>
>> + for obj in line.split():
>
> This loop is unneeded because each line
> contains only one word.
> .rstpip() will do.
>
>
>
> To sum up, this part can be simpler,
> for example like this:
>
> # Read from *.mod, to get a list of objects that compose the module.
> with open(mod) as m:
> for line2 in m:
> yield to_cmdfile(line2.rstrip())
>
Yes, I'll send out a v2 with all of this. Maybe "mod_line" or something
instead of "line2", but otherwise as you've recommended.
thanks,
--
John Hubbard
NVIDIA