2013-10-24 05:23:36

by Michael Opdenacker

[permalink] [raw]
Subject: [PATCH] scripts/checkkconfig.py: find unused Kconfig parameters

This is the first version of a script to look for
Kconfig parameters which are still defined but no longer
used in the kernel source code.

The script may be extended in the future to perform
more checks. This explains why a rather generic name was chosen.

Several issues have already been reported and fixed
thanks to the use of this script. It is time to share it now!

Signed-off-by: Michael Opdenacker <[email protected]>
---
scripts/checkkconfig.py | 131 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 131 insertions(+)
create mode 100755 scripts/checkkconfig.py

diff --git a/scripts/checkkconfig.py b/scripts/checkkconfig.py
new file mode 100755
index 0000000..4155656
--- /dev/null
+++ b/scripts/checkkconfig.py
@@ -0,0 +1,131 @@
+#!/usr/bin/env python
+#
+# Copyright 2013 Free Electrons
+# Michael Opdenacker <[email protected]>
+#
+# Look for issues in Kconfig files
+#
+# This first version supports:
+# - Looking for unused Kconfig parameters
+#
+# Usage: scripts/checkkconfig.py
+#
+# Licensed under the terms of the GNU GPL License version 2
+
+import fnmatch
+import os
+import subprocess
+
+########################################################
+# Parse Kconfig files
+########################################################
+
+def parse_kconfig(file):
+
+ global has_select, choice_type, source_file
+
+ current_param = ''
+ in_choice = False
+
+ with open(file, 'r') as f:
+ for line in f:
+ words = line.split()
+
+ if len(words) != 0:
+ key = words[0]
+
+ if len(words) == 1:
+
+ if key == 'choice':
+ in_choice = True
+ elif key == 'endchoice':
+ in_choice = False
+
+ elif len(words) > 1:
+
+ if key == 'config' or key == 'menuconfig':
+ current_param = words[1]
+
+ has_select[current_param] = False
+ choice_type[current_param] = in_choice
+ source_file[current_param] = file
+
+ elif key == 'select' and words[1] != '':
+
+ has_select[current_param] = True
+
+########################################################
+# Find occurrences of a parameter
+########################################################
+
+def count_param(param):
+
+ global source_file, bad_params_in_file
+
+ if os.path.isdir('.git'):
+ # Use git grep when available
+ count = subprocess.check_output('git grep ' + param + '| grep -v defconfig | wc -l', shell=True)
+ else:
+ # Fallback to regular grep
+ count = subprocess.check_output('grep -R ' + param + ' . | grep -v defconfig | wc -l', shell=True)
+
+ num = int(count.strip())
+
+ if num == 1:
+ 'WARNING: parameter ' + param + ' is used nowhere'
+
+ file=source_file[param]
+
+ if bad_params_in_file.has_key(file):
+ bad_params_in_file[file].append(param)
+ else:
+ bad_params_in_file[file]=[param]
+
+
+########################################################
+# Main program
+########################################################
+
+global has_select, choice_type, source_file, bad_params_in_file
+
+has_select = dict()
+choice_type = dict()
+source_file = dict()
+bad_params_in_file = dict()
+
+kconfig_files = []
+
+for root, dirnames, filenames in os.walk('.'):
+ for filename in fnmatch.filter(filenames, 'Kconfig*'):
+ kconfig_files.append(os.path.join(root, filename))
+
+print 'INFO: processing Kconfig files...'
+
+for f in kconfig_files:
+ parse_kconfig(f)
+
+total = len(has_select)
+count = 0
+old_percentage = -1
+
+for param in has_select.keys():
+
+ # Progress information... running the script can take hours!
+
+ count += 1
+ percentage = int((100*count)/total)
+
+ if percentage > old_percentage:
+ print 'INFO: checking kconfig parameter %d / %d (%d %%)' % (count, total, percentage)
+ old_percentage = percentage
+
+ # Ignore parameters with select dependencies
+ # Also ignore parameters inside 'choice ... endchoice'
+ # All of them are valid, even if they have only one instance
+
+ if not(has_select[param]) and not(choice_type[param]):
+ count_param(param)
+
+for file in bad_params_in_file.keys():
+ print 'File: ' + file
+ print bad_params_in_file[file]
--
1.8.1.2


2013-10-24 07:30:54

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] scripts/checkkconfig.py: find unused Kconfig parameters

On Thu, 2013-10-24 at 07:23 +0200, Michael Opdenacker wrote:
> This is the first version of a script to look for
> Kconfig parameters which are still defined but no longer
> used in the kernel source code.
[]
> diff --git a/scripts/checkkconfig.py b/scripts/checkkconfig.py
[]
> +def count_param(param):
> +
> + global source_file, bad_params_in_file
> +
> + if os.path.isdir('.git'):
> + # Use git grep when available
> + count = subprocess.check_output('git grep ' + param + '| grep -v defconfig | wc -l', shell=True)
> + else:
> + # Fallback to regular grep
> + count = subprocess.check_output('grep -R ' + param + ' . | grep -v defconfig | wc -l', shell=True)

Doesn't the grep need -w?

Also, the regular grep could probably use something like
'grep -R -w --max-count=2 --include="*.[chS]"'

2013-10-24 09:39:57

by Michal Marek

[permalink] [raw]
Subject: Re: [PATCH] scripts/checkkconfig.py: find unused Kconfig parameters

On 24.10.2013 07:23, Michael Opdenacker wrote:
> + count = subprocess.check_output('git grep ' + param + '| grep -v defconfig | wc -l', shell=True)

Python 2.6 does not have subprocess.check_output:

$ ./scripts/checkkconfig.py
INFO: processing Kconfig files...
INFO: checking kconfig parameter 1 / 13706 (0 %)
Traceback (most recent call last):
File "./scripts/checkkconfig.py", line 127, in <module>
count_param(param)
File "./scripts/checkkconfig.py", line 67, in count_param
count = subprocess.check_output('git grep ' + param + '| grep -v
defconfig | wc -l', shell=True)
AttributeError: 'module' object has no attribute 'check_output'

Michal

2013-10-25 13:38:55

by Michael Opdenacker

[permalink] [raw]
Subject: Re: [PATCH] scripts/checkkconfig.py: find unused Kconfig parameters

Hi Joe,

Thank you very much for your review!

On 10/24/2013 09:30 AM, Joe Perches wrote:
> On Thu, 2013-10-24 at 07:23 +0200, Michael Opdenacker wrote:
>
>> +def count_param(param):
>> +
>> + global source_file, bad_params_in_file
>> +
>> + if os.path.isdir('.git'):
>> + # Use git grep when available
>> + count = subprocess.check_output('git grep ' + param + '| grep -v defconfig | wc -l', shell=True)
>> + else:
>> + # Fallback to regular grep
>> + count = subprocess.check_output('grep -R ' + param + ' . | grep -v defconfig | wc -l', shell=True)
> Doesn't the grep need -w?
Using "-w" is a good idea, and this way I can eliminate false negatives
(for example finding matches for "PRINTK_TIME" when I'm looking for
plain "PRINTK"). The only this is that I then need to look for both
"PARAM" (in Kconfig files, in case the parameter is just used for
dependency management), and for "CONFIG_PARAM" (in source files).
>
> Also, the regular grep could probably use something like
> 'grep -R -w --max-count=2 --include="*.[chS]"'
I can definitely use "--max-count=2".

'--include="*.[chS]"' is more problematic because it excludes Kconfig
files. I'll use "-I" instead to just ignore binary files.

I'll soon send an update taking this into account.

Don't hesitate to send more comments.

Thanks again,

Michael.

--
Michael Opdenacker, CEO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
+33 484 258 098

2013-10-25 13:45:18

by Michal Marek

[permalink] [raw]
Subject: Re: [PATCH] scripts/checkkconfig.py: find unused Kconfig parameters

On 25.10.2013 15:38, Michael Opdenacker wrote:
> Hi Joe,
>
> Thank you very much for your review!
>
> On 10/24/2013 09:30 AM, Joe Perches wrote:
>> On Thu, 2013-10-24 at 07:23 +0200, Michael Opdenacker wrote:
>>
>>> +def count_param(param):
>>> +
>>> + global source_file, bad_params_in_file
>>> +
>>> + if os.path.isdir('.git'):
>>> + # Use git grep when available
>>> + count = subprocess.check_output('git grep ' + param + '| grep -v defconfig | wc -l', shell=True)
>>> + else:
>>> + # Fallback to regular grep
>>> + count = subprocess.check_output('grep -R ' + param + ' . | grep -v defconfig | wc -l', shell=True)
>> Doesn't the grep need -w?
> Using "-w" is a good idea, and this way I can eliminate false negatives
> (for example finding matches for "PRINTK_TIME" when I'm looking for
> plain "PRINTK"). The only this is that I then need to look for both
> "PARAM" (in Kconfig files, in case the parameter is just used for
> dependency management), and for "CONFIG_PARAM" (in source files).

You can process Kconfig files and source files separately. And you can
speed up the script a lot, if you simply record all CONFIG_* strings
found in the sources, and then compare this to the set of config options
defined in Kconfig. Then you can also easily implement the opposite
check, which is currently done by scripts/checkkconfigsymbols.sh.

Michal

2013-10-29 18:06:27

by Yann E. MORIN

[permalink] [raw]
Subject: Re: [PATCH] scripts/checkkconfig.py: find unused Kconfig parameters

Michael, All,

On 2013-10-24 07:23 +0200, Michael Opdenacker spake thusly:
> This is the first version of a script to look for
> Kconfig parameters which are still defined but no longer
> used in the kernel source code.
>
> The script may be extended in the future to perform
> more checks. This explains why a rather generic name was chosen.
>
> Several issues have already been reported and fixed
> thanks to the use of this script. It is time to share it now!

I'm not much of a Python guy, so I'll gleefully rely on the previous
comments to guide you in your endeavour.

/me is feeling poetical, tonight! ;-)

I however have a suggestion, see below.

> Signed-off-by: Michael Opdenacker <[email protected]>
> ---
> scripts/checkkconfig.py | 131 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 131 insertions(+)
> create mode 100755 scripts/checkkconfig.py
>
> diff --git a/scripts/checkkconfig.py b/scripts/checkkconfig.py
> new file mode 100755
> index 0000000..4155656
> --- /dev/null
> +++ b/scripts/checkkconfig.py
> @@ -0,0 +1,131 @@
[--SNIP--]
> +def count_param(param):
> +
> + global source_file, bad_params_in_file
> +
> + if os.path.isdir('.git'):
> + # Use git grep when available
> + count = subprocess.check_output('git grep ' + param + '| grep -v defconfig | wc -l', shell=True)
> + else:
> + # Fallback to regular grep
> + count = subprocess.check_output('grep -R ' + param + ' . | grep -v defconfig | wc -l', shell=True)
> +
> + num = int(count.strip())
> +
> + if num == 1:
> + 'WARNING: parameter ' + param + ' is used nowhere'
> +
> + file=source_file[param]
> +
> + if bad_params_in_file.has_key(file):
> + bad_params_in_file[file].append(param)
> + else:
> + bad_params_in_file[file]=[param]
> +
> +
> +########################################################
> +# Main program
> +########################################################
[--SNIP--]
> +for param in has_select.keys():
> +
> + # Progress information... running the script can take hours!

I guess all these grep spawning are what makes it slow.

I wonder if it would not be possible to invert the loop (in pseudo
Python code):

for f in all_interesting_files:
read f in memory
for s in all_symbols:
if symbol is in f:
remove f from all_symbols
break the inner-most loop

This way:
- you scan the tree only once
- as soon as a symbol is matched, it is removed, thus decreasing
the amount of checks done in further loops.

Regards,
Yann E. MORIN.

--
.-----------------.--------------------.------------------.--------------------.
| Yann E. MORIN | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software Designer | \ / CAMPAIGN | ___ |
| +33 223 225 172 `------------.-------: X AGAINST | \e/ There is no |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL | v conspiracy. |
'------------------------------^-------^------------------^--------------------'

2013-10-30 03:07:47

by Michael Opdenacker

[permalink] [raw]
Subject: Re: [PATCH] scripts/checkkconfig.py: find unused Kconfig parameters

Hi Michal,

Thank you very much for your reviews!

On 10/25/2013 03:45 PM, Michal Marek wrote:

> You can process Kconfig files and source files separately. And you can
> speed up the script a lot, if you simply record all CONFIG_* strings
> found in the sources, and then compare this to the set of config options
> defined in Kconfig. Then you can also easily implement the opposite
> check, which is currently done by scripts/checkkconfigsymbols.sh.
I like the idea. CONFIG_* strings at least should be easy to identify in
the source code indeed. I'll also have to find all the parameters
referred to in Kconfig files (namely as dependencies), but this should
be worth the effort.

The whole program will be a bit more complex, but it will be orders of
magnitude faster!

Thanks again,

Michael.

--
Michael Opdenacker, CEO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
+33 484 258 098

2013-10-30 03:14:17

by Michael Opdenacker

[permalink] [raw]
Subject: Re: [PATCH] scripts/checkkconfig.py: find unused Kconfig parameters

Hi Yann,

Thank you for your review!

On 10/29/2013 07:06 PM, Yann E. MORIN wrote:

> I guess all these grep spawning are what makes it slow.
>
> I wonder if it would not be possible to invert the loop (in pseudo
> Python code):
>
> for f in all_interesting_files:
> read f in memory
> for s in all_symbols:
> if symbol is in f:
> remove f from all_symbols
> break the inner-most loop
You probably meant "remove s from all_symbols", but I got it.
>
> This way:
> - you scan the tree only once
> - as soon as a symbol is matched, it is removed, thus decreasing
> the amount of checks done in further loops.
This sounds like a very good idea, thanks! I hope to have time to
implement a mix of it and Michal's proposal later this week, or at least
next week.

Thanks again,

Michael.

--
Michael Opdenacker, CEO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
+33 484 258 098