2012-06-22 10:46:46

by Robert P. J. Day

Subject: finding unused header files


inspired by that last post that located an unused header file under
arch/h8300, i ran my "find_unused_headers.sh" script on the same
sub-directory to see what would show up.

the script is stupidly conservative and didn't identify that shm.h
header since *somewhere* in the entire kernel source tree, someone was
including a file called "shm.h" -- not even for the same architecture.
like i said, stupidly conservative.

but it did find this:

$ ../s/find_unused_headers.sh arch/h8300
===== target_time.h =====
./arch/h8300/include/asm/target_time.h
$

and i see nothing anywhere in the entire tree that includes a
target_time.h header under any circumstances.
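
The basename-only check described above can be sketched roughly as follows. This is not the actual find_unused_headers.sh (its contents are not shown in this thread); it is a minimal Python illustration that assumes the tree is supplied as a mapping from file path to file text. The function names and the toy tree are hypothetical.

```python
import posixpath
import re

# Matches '#include "x.h"' and '#include <x.h>', allowing spaces after '#'.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def included_basenames(sources):
    """Collect the basename of every include target found in `sources`.

    `sources` maps a file path to that file's text (an assumption made
    here so the sketch does not need a real checkout).
    """
    names = set()
    for text in sources.values():
        for target in INCLUDE_RE.findall(text):
            names.add(posixpath.basename(target))
    return names


def unused_by_basename(sources, subdir):
    """Return headers under `subdir` whose basename is included nowhere."""
    used = included_basenames(sources)
    prefix = subdir.rstrip("/") + "/"
    return sorted(
        path for path in sources
        if path.startswith(prefix)
        and path.endswith(".h")
        and posixpath.basename(path) not in used
    )


# Toy tree with hypothetical contents, for illustration only.
tree = {
    "arch/h8300/include/asm/shm.h": "",
    "arch/h8300/include/asm/target_time.h": "",
    "ipc/shm.c": '#include <linux/shm.h>\n',
}
print(unused_by_basename(tree, "arch/h8300"))
```

Because only basenames are compared, the include of linux/shm.h anywhere in the toy tree is enough to keep the h8300 shm.h off the list, which is the "stupidly conservative" behaviour mentioned above.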

i should probably do another run of these scripts some day, just to
see what turns up.

rday

--

========================================================================
Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday
========================================================================


2012-06-22 11:03:28

by Paul Bolle

Subject: Re: finding unused header files

On Fri, 2012-06-22 at 06:46 -0400, Robert P. J. Day wrote:
> inspired by that last post that located an unused header file under
> arch/h8300, i ran my "find_unused_headers.sh" script on the same
> sub-directory to see what would show up.
>
> the script is stupidly conservative and didn't identify that shm.h
> header since *somewhere* in the entire kernel source tree, someone was
> including a file called "shm.h" -- not even for the same architecture.
> like i said, stupidly conservative.
>
> but it did find this:
>
> $ ../s/find_unused_headers.sh arch/h8300
> ===== target_time.h =====
> ./arch/h8300/include/asm/target_time.h
> $
>
> and i see nothing anywhere in the entire tree that includes a
> target_time.h header under any circumstances.

See my patch "h8300: delete target_time.h"
(https://lkml.org/lkml/2012/6/8/518). I don't remember getting a reply
to that message.

> i should probably do another run of these scripts some day, just to
> see what turns up.

The script I use currently shows almost 200 (potentially) unused
headers. It's not very sophisticated and errs on the safe side (i.e.,
the tree must show absolutely no sign of any file including the
header). I'm not sure whether a sophisticated approach is actually
feasible.

In the last few weeks I sent patches to remove a few dozen of those 200
headers.


Paul Bolle

2012-06-22 11:32:09

by Pádraig Brady

Subject: Re: finding unused header files

On 06/22/2012 11:46 AM, Robert P. J. Day wrote:
>
> inspired by that last post that located an unused header file under
> arch/h8300, i ran my "find_unused_headers.sh" script on the same
> sub-directory to see what would show up.
>
> the script is stupidly conservative and didn't identify that shm.h
> header since *somewhere* in the entire kernel source tree, someone was
> including a file called "shm.h" -- not even for the same architecture.
> like i said, stupidly conservative.
>
> but it did find this:
>
> $ ../s/find_unused_headers.sh arch/h8300
> ===== target_time.h =====
> ./arch/h8300/include/asm/target_time.h
> $
>
> and i see nothing anywhere in the entire tree that includes a
> target_time.h header under any circumstances.
>
> i should probably do another run of these scripts some day, just to
> see what turns up.
>
> rday
>

See also:
http://code.google.com/p/include-what-you-use/

It's quite awkward (it relies on an LLVM/clang source tree, for example)
and currently requires human interpretation.

cheers,
Pádraig.

2012-06-22 11:58:35

by Paul Bolle

Subject: Re: finding unused header files

On Fri, 2012-06-22 at 06:46 -0400, Robert P. J. Day wrote:
> the script is stupidly conservative and didn't identify that shm.h
> header since *somewhere* in the entire kernel source tree, someone was
> including a file called "shm.h" -- not even for the same architecture.
> like i said, stupidly conservative.

The approach I settled on is, in short, to generate two sets:
I) a set of all header files in the tree;
II) a set of all "paths" used in all the preprocessor include directives
in the tree.
You compare these sets (sort of) backwards, that is, starting from the
end of each path and using as little of the paths in both sets as you
can get away with. So sometimes you have a header "foo.h" and there's
not a single include ending in "foo.h": that's an unused header, and
it's easy to spot. If a header isn't caught that easily, you check
whether comparing longer stretches of the paths turns up a mismatch in
the earlier path components, which also marks the header as unused.
All rather obvious, I guess.
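
One rough reading of that two-set comparison, as a minimal Python sketch. This is not Paul's actual script: it assumes the tree is given as a mapping from file path to file text, and it compares the whole include path against the tail of each header path in one pass rather than growing the compared part step by step. All names are hypothetical, and the handling of "../" includes (keep only the part after the last "..") is an assumption chosen to stay on the safe side.

```python
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def include_targets(sources):
    """Set II: every path named in an include directive."""
    targets = set()
    for text in sources.values():
        targets.update(INCLUDE_RE.findall(text))
    return targets


def matches_tail(header, target):
    """True if `target` equals the trailing path components of `header`."""
    parts = [p for p in target.split("/") if p not in ("", ".")]
    if ".." in parts:
        # Assumption: keep only what follows the last "..", so relative
        # includes still count as a possible use of the header.
        last = len(parts) - 1 - parts[::-1].index("..")
        parts = parts[last + 1:]
    tail = header.split("/")
    return 0 < len(parts) <= len(tail) and tail[-len(parts):] == parts


def unused_by_path(sources):
    """Headers (set I) that no include path (set II) could refer to."""
    headers = sorted(p for p in sources if p.endswith(".h"))
    targets = include_targets(sources)
    return [h for h in headers
            if not any(matches_tail(h, t) for t in targets)]


# Toy tree with hypothetical contents, for illustration only.
tree = {
    "arch/h8300/include/asm/shm.h": "",
    "include/linux/shm.h": "",
    "include/linux/foo.h": "",
    "ipc/shm.c": '#include <linux/shm.h>\n#include "foo.h"\n',
}
print(unused_by_path(tree))
```

In the toy tree, the include of linux/shm.h no longer shields asm/shm.h, because the path components before the basename differ. A bare "foo.h" include still matches any foo.h in the tree, so the check stays conservative.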

There's a bit more to it - e.g., the plain "/asm/" headers generated to
include headers in "/asm-generic/" - but that's the gist of it. Not all
unused headers will be spotted using that approach, but I guess that's a
20/80 rule: 20% of the effort, 80% of the results.

But, whatever approach you take, that's the easy part. The hard part is
peeking at the (history of the) tree to see what happened: was that
header simply never used, did it end up orphaned after changes in other
files, or was it added recently and should we expect a file using that
header to show up in the near future? Figuring all that out turned out
to be time consuming.


Paul Bolle