2020-01-07 17:04:34

by Wen Yang

[permalink] [raw]
Subject: [PATCH v2] coccinelle: semantic patch to check for inappropriate do_div() calls

do_div() does a 64-by-32 division.
When the divisor is unsigned long, u64, or s64,
do_div() truncates it to 32 bits, this means it
can test non-zero and be truncated to zero for division.
This semantic patch is inspired by Mateusz Guzik's patch:
commit b0ab99e7736a ("sched: Fix possible divide by zero in avg_atom() calculation")

Signed-off-by: Wen Yang <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Gilles Muller <[email protected]>
Cc: Nicolas Palix <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Matthias Maennich <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
v2:
- add a special case for constants and checking whether the value is obviously safe and no warning is needed.
- fix 'WARNING:' twice in each case.
- extend the warning to say "consider using div64_xxx instead".

scripts/coccinelle/misc/do_div.cocci | 169 +++++++++++++++++++++++++++
1 file changed, 169 insertions(+)
create mode 100644 scripts/coccinelle/misc/do_div.cocci

diff --git a/scripts/coccinelle/misc/do_div.cocci b/scripts/coccinelle/misc/do_div.cocci
new file mode 100644
index 000000000000..0fd904b9157f
--- /dev/null
+++ b/scripts/coccinelle/misc/do_div.cocci
@@ -0,0 +1,169 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/// do_div() does a 64-by-32 division.
+/// When the divisor is long, unsigned long, u64, or s64,
+/// do_div() truncates it to 32 bits, this means it can test
+/// non-zero and be truncated to 0 for division on 64bit platforms.
+///
+//# This makes an effort to find those inappropriate do_div() calls.
+//
+// Confidence: Moderate
+// Copyright: (C) 2020 Wen Yang, Alibaba.
+// Comments:
+// Options: --no-includes --include-headers
+
+virtual context
+virtual org
+virtual report
+
+@initialize:python@
+@@
+
+def get_digit_type_and_value(str):
+ is_digit = False
+ value = 0
+
+ try:
+ if (str.isdigit()):
+ is_digit = True
+ value = int(str, 0)
+ elif (str.upper().endswith('ULL')):
+ is_digit = True
+ value = int(str[:-3], 0)
+ elif (str.upper().endswith('LL')):
+ is_digit = True
+ value = int(str[:-2], 0)
+ elif (str.upper().endswith('UL')):
+ is_digit = True
+ value = int(str[:-2], 0)
+ elif (str.upper().endswith('L')):
+ is_digit = True
+ value = int(str[:-1], 0)
+ elif (str.upper().endswith('U')):
+ is_digit = True
+ value = int(str[:-1], 0)
+ except Exception as e:
+ print('Error:',e)
+ is_digit = False
+ value = 0
+ finally:
+ return is_digit, value
+
+def construct_warnings(str, suggested_fun):
+ msg="WARNING: do_div() does a 64-by-32 division, please consider using %s instead."
+ is_digit, value = get_digit_type_and_value(str)
+ if (is_digit):
+ if (value >= 0x100000000):
+ return msg %(suggested_fun)
+ else:
+ return None
+ else:
+ return msg % suggested_fun
+
+@depends on context@
+expression f;
+long l;
+unsigned long ul;
+u64 ul64;
+s64 sl64;
+
+@@
+(
+* do_div(f, l);
+|
+* do_div(f, ul);
+|
+* do_div(f, ul64);
+|
+* do_div(f, sl64);
+)
+
+@r depends on (org || report)@
+expression f;
+long l;
+unsigned long ul;
+position p;
+u64 ul64;
+s64 sl64;
+@@
+(
+do_div@p(f, l);
+|
+do_div@p(f, ul);
+|
+do_div@p(f, ul64);
+|
+do_div@p(f, sl64);
+)
+
+@script:python depends on org@
+p << r.p;
+ul << r.ul;
+@@
+
+warnings = construct_warnings(ul, "div64_ul")
+if warnings != None:
+ coccilib.org.print_todo(p[0], warnings)
+
+@script:python depends on org@
+p << r.p;
+l << r.l;
+@@
+
+warnings = construct_warnings(l, "div64_long")
+if warnings != None:
+ coccilib.org.print_todo(p[0], warnings)
+
+@script:python depends on org@
+p << r.p;
+ul64 << r.ul64;
+@@
+
+warnings = construct_warnings(ul64, "div64_u64")
+if warnings != None:
+ coccilib.org.print_todo(p[0], warnings)
+
+@script:python depends on org@
+p << r.p;
+sl64 << r.sl64;
+@@
+
+warnings = construct_warnings(sl64, "div64_s64")
+if warnings != None:
+ coccilib.org.print_todo(p[0], warnings)
+
+
+@script:python depends on report@
+p << r.p;
+ul << r.ul;
+@@
+
+warnings = construct_warnings(ul, "div64_ul")
+if warnings != None:
+ coccilib.report.print_report(p[0], warnings)
+
+@script:python depends on report@
+p << r.p;
+l << r.l;
+@@
+
+warnings = construct_warnings(l, "div64_long")
+if warnings != None:
+ coccilib.report.print_report(p[0], warnings)
+
+@script:python depends on report@
+p << r.p;
+sl64 << r.sl64;
+@@
+
+warnings = construct_warnings(sl64, "div64_s64")
+if warnings != None:
+ coccilib.report.print_report(p[0], warnings)
+
+@script:python depends on report@
+p << r.p;
+ul64 << r.ul64;
+@@
+
+warnings = construct_warnings(ul64, "div64_u64")
+if warnings != None:
+ coccilib.report.print_report(p[0], warnings)
--
2.23.0


2020-01-07 17:26:33

by Julia Lawall

[permalink] [raw]
Subject: Re: [PATCH v2] coccinelle: semantic patch to check for inappropriate do_div() calls

> +@depends on context@
> +expression f;
> +long l;
> +unsigned long ul;
> +u64 ul64;
> +s64 sl64;
> +
> +@@
> +(
> +* do_div(f, l);
> +|
> +* do_div(f, ul);
> +|
> +* do_div(f, ul64);
> +|
> +* do_div(f, sl64);
> +)

This part is not really ideal. For the reports, you filter for the
constants, but here you don't do anything. You can put some python code
in the matching of the metavariables:

unsigned long ul : script:python() { whatever you want to check on ul };

Then it will only match if the condition is satisfied.

julia

2020-01-09 11:39:22

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH v2] coccinelle: semantic patch to check for inappropriate do_div() calls

> - fix 'WARNING:' twice in each case.

A duplicate text was removed.


> +@script:python depends on org@
> +p << r.p;
> +ul << r.ul;

I interpret this variable assignment in the way that it will work only
if the corresponding branch of the SmPL disjunction was actually matched
by the referenced SmPL rule.
Thus I suggest now to split the source code search pattern into
four separate rules.


> +warnings = construct_warnings(ul, "div64_ul")

* Can two identifiers be nicer without an “s”?

* Would it be sufficient to pass a shorter identification
(without the prefix “div64_”)?

Regards,
Markus

2020-01-09 11:43:58

by Julia Lawall

[permalink] [raw]
Subject: Re: [PATCH v2] coccinelle: semantic patch to check for inappropriate do_div() calls



On Thu, 9 Jan 2020, Markus Elfring wrote:

> > +@script:python depends on org@
> > +p << r.p;
> > +ul << r.ul;
>
> I interpret this variable assignment in the way that it will work only
> if the corresponding branch of the SmPL disjunction was actually matched
> by the referenced SmPL rule.
> Thus I suggest now to split the source code search pattern into
> four separate rules.

Why?

julia

2020-01-09 14:50:03

by Markus Elfring

[permalink] [raw]
Subject: Re: [v2] coccinelle: semantic patch to check for inappropriate do_div() calls

>> Thus I suggest now to split the source code search pattern into
>> four separate rules.
>
> Why?

Does the Coccinelle software ensure that a variable like “r.ul” contains
really useful data even if the expected branch of the SmPL disjunction
was occasionally not matched?

Regards,
Markus

2020-01-10 10:02:28

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH v2] coccinelle: semantic patch to check for inappropriate do_div() calls

> +@initialize:python@

> +def construct_warnings(str, suggested_fun):

This function will be used only for the operation modes “org” and “report”.
Thus I suggest to replace the specification “initialize” by a corresponding dependency
which is already applied for the SmPL rule “r”.


Can subsequent SmPL disjunctions become more succinct?


The passing of function name variants contains a bit of duplicate Python code.
Will a feature request like “Support for SmPL rule groups” become more interesting
for the shown use case?
https://github.com/coccinelle/coccinelle/issues/164

Regards,
Markus

2020-01-10 12:35:49

by Julia Lawall

[permalink] [raw]
Subject: Re: [PATCH v2] coccinelle: semantic patch to check for inappropriate do_div() calls



On Fri, 10 Jan 2020, Markus Elfring wrote:

> > +@initialize:python@
> …
> > +def construct_warnings(str, suggested_fun):
>
> This function will be used only for the operation modes “org” and “report”.
> Thus I suggest to replace the specification “initialize” by a corresponding dependency
> which is already applied for the SmPL rule “r”.
>
>
> Can subsequent SmPL disjunctions become more succinct?
>
>
> The passing of function name variants contains a bit of duplicate Python code.
> Will a feature request like “Support for SmPL rule groups” become more interesting
> for the shown use case?
> https://github.com/coccinelle/coccinelle/issues/164

The code is fine as it is in these respects.

julia

2020-01-10 13:13:35

by Wen Yang

[permalink] [raw]
Subject: Re: [PATCH v2] coccinelle: semantic patch to check for inappropriate do_div() calls



On 2020/1/8 1:25 上午, Julia Lawall wrote:
>> +@depends on context@
>> +expression f;
>> +long l;
>> +unsigned long ul;
>> +u64 ul64;
>> +s64 sl64;
>> +
>> +@@
>> +(
>> +* do_div(f, l);
>> +|
>> +* do_div(f, ul);
>> +|
>> +* do_div(f, ul64);
>> +|
>> +* do_div(f, sl64);
>> +)
>
> This part is not really ideal. For the reports, you filter for the
> constants, but here you don't do anything. You can put some python code
> in the matching of the metavariables:
>
> unsigned long ul : script:python() { whatever you want to check on ul };
>
> Then it will only match if the condition is satisfied.
>
> julia
>

OK, thank you very much.
We'll fix it soon.

--
Best Wishes,
Wen

2020-01-10 15:48:43

by Markus Elfring

[permalink] [raw]
Subject: Re: [v2] coccinelle: semantic code search for inappropriate do_div() calls

> The code is fine as it is in these respects.

I have noticed additional software development opportunities.
Thus I became curious if further collateral evolution will happen.

Regards,
Markus