2020-01-10 13:17:02

by Wen Yang

Subject: [PATCH v3] coccinelle: semantic patch to check for inappropriate do_div() calls

do_div() does a 64-by-32 division.
When the divisor is unsigned long, u64, or s64,
do_div() truncates it to 32 bits. A divisor can therefore
pass a non-zero test and still be truncated to zero for the division.
This semantic patch is inspired by Mateusz Guzik's patch:
commit b0ab99e7736a ("sched: Fix possible divide by zero in avg_atom() calculation")

Signed-off-by: Wen Yang <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Gilles Muller <[email protected]>
Cc: Nicolas Palix <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Matthias Maennich <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
v3:
- also filter out safe consts for context mode.
- cleanup code.

v2:
- add a special case for constants and checking whether the value is obviously safe and no warning is needed.
- fix 'WARNING:' twice in each case.
- extend the warning to say "consider using div64_xxx instead".

scripts/coccinelle/misc/do_div.cocci | 155 +++++++++++++++++++++++++++
1 file changed, 155 insertions(+)
create mode 100644 scripts/coccinelle/misc/do_div.cocci

diff --git a/scripts/coccinelle/misc/do_div.cocci b/scripts/coccinelle/misc/do_div.cocci
new file mode 100644
index 000000000000..79db083c5208
--- /dev/null
+++ b/scripts/coccinelle/misc/do_div.cocci
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/// do_div() does a 64-by-32 division.
+/// When the divisor is long, unsigned long, u64, or s64,
+/// do_div() truncates it to 32 bits, this means it can test
+/// non-zero and be truncated to 0 for division on 64bit platforms.
+///
+//# This makes an effort to find those inappropriate do_div() calls.
+//
+// Confidence: Moderate
+// Copyright: (C) 2020 Wen Yang, Alibaba.
+// Comments:
+// Options: --no-includes --include-headers
+
+virtual context
+virtual org
+virtual report
+
+@initialize:python@
+@@
+
+def get_digit_type_and_value(str):
+    is_digit = False
+    value = 0
+
+    try:
+        if (str.isdigit()):
+            is_digit = True
+            value = int(str, 0)
+        elif (str.upper().endswith('ULL')):
+            is_digit = True
+            value = int(str[:-3], 0)
+        elif (str.upper().endswith('LL')):
+            is_digit = True
+            value = int(str[:-2], 0)
+        elif (str.upper().endswith('UL')):
+            is_digit = True
+            value = int(str[:-2], 0)
+        elif (str.upper().endswith('L')):
+            is_digit = True
+            value = int(str[:-1], 0)
+        elif (str.upper().endswith('U')):
+            is_digit = True
+            value = int(str[:-1], 0)
+    except Exception as e:
+        print('Error:',e)
+        is_digit = False
+        value = 0
+    finally:
+        return is_digit, value
+
+def filter_out_safe_constants(str):
+    is_digit, value = get_digit_type_and_value(str)
+    if (is_digit):
+        if (value >= 0x100000000):
+            return True
+        else:
+            return False
+    else:
+        return True
+
+def construct_warnings(suggested_fun):
+    msg="WARNING: do_div() does a 64-by-32 division, please consider using %s instead."
+    return msg % suggested_fun
+
+@depends on context@
+expression f;
+long l: script:python() { filter_out_safe_constants(l) };
+unsigned long ul : script:python() { filter_out_safe_constants(ul) };
+u64 ul64 : script:python() { filter_out_safe_constants(ul64) };
+s64 sl64 : script:python() { filter_out_safe_constants(sl64) };
+
+@@
+(
+* do_div(f, l);
+|
+* do_div(f, ul);
+|
+* do_div(f, ul64);
+|
+* do_div(f, sl64);
+)
+
+@r depends on (org || report)@
+expression f;
+position p;
+long l: script:python() { filter_out_safe_constants(l) };
+unsigned long ul : script:python() { filter_out_safe_constants(ul) };
+u64 ul64 : script:python() { filter_out_safe_constants(ul64) };
+s64 sl64 : script:python() { filter_out_safe_constants(sl64) };
+@@
+(
+do_div@p(f, l);
+|
+do_div@p(f, ul);
+|
+do_div@p(f, ul64);
+|
+do_div@p(f, sl64);
+)
+
+@script:python depends on org@
+p << r.p;
+ul << r.ul;
+@@
+
+coccilib.org.print_todo(p[0], construct_warnings("div64_ul"))
+
+@script:python depends on org@
+p << r.p;
+l << r.l;
+@@
+
+coccilib.org.print_todo(p[0], construct_warnings("div64_long"))
+
+@script:python depends on org@
+p << r.p;
+ul64 << r.ul64;
+@@
+
+coccilib.org.print_todo(p[0], construct_warnings("div64_u64"))
+
+@script:python depends on org@
+p << r.p;
+sl64 << r.sl64;
+@@
+
+coccilib.org.print_todo(p[0], construct_warnings("div64_s64"))
+
+@script:python depends on report@
+p << r.p;
+ul << r.ul;
+@@
+
+coccilib.report.print_report(p[0], construct_warnings("div64_ul"))
+
+@script:python depends on report@
+p << r.p;
+l << r.l;
+@@
+
+coccilib.report.print_report(p[0], construct_warnings("div64_long"))
+
+@script:python depends on report@
+p << r.p;
+sl64 << r.sl64;
+@@
+
+coccilib.report.print_report(p[0], construct_warnings("div64_s64"))
+
+@script:python depends on report@
+p << r.p;
+ul64 << r.ul64;
+@@
+
+coccilib.report.print_report(p[0], construct_warnings("div64_u64"))
--
2.23.0
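
To make the failure mode described in the commit message concrete, here is a minimal Python sketch. It is not kernel code: it assumes that do_div() behaves as if the divisor were narrowed to an unsigned 32-bit value before dividing, and it mirrors the 0x100000000 threshold used by filter_out_safe_constants() in the patch. The names truncate_to_u32, model_do_div, parse_c_integer, and needs_warning are hypothetical helpers introduced only for illustration.

```python
U32_MASK = 0xFFFFFFFF  # low 32 bits that survive when a wider divisor is narrowed


def truncate_to_u32(divisor):
    """Model the implicit narrowing of a 64-bit divisor to 32 bits."""
    return divisor & U32_MASK


def model_do_div(dividend, divisor):
    """Return (quotient, remainder) as a 64-by-32 division would compute them.

    Illustrative model only, not the kernel macro.
    """
    base = truncate_to_u32(divisor)
    if base == 0:
        raise ZeroDivisionError("divisor is zero after truncation to 32 bits")
    return dividend // base, dividend % base


def parse_c_integer(literal):
    """Parse a C integer literal with an optional U/L/UL/LL/ULL suffix.

    Returns None when the text is not a plain integer constant.
    Assumption: C-style octal such as "010" is not handled by this sketch.
    """
    digits = literal.upper().rstrip("UL")
    try:
        return int(digits, 0)
    except ValueError:
        return None


def needs_warning(literal):
    """Same decision as the patch: warn unless a constant fits in 32 bits."""
    value = parse_c_integer(literal)
    if value is None:
        return True  # not a constant, so it cannot be proven safe
    return value >= 0x100000000
```

A divisor of 0x100000000 is non-zero as a 64-bit value but becomes 0 after truncation, which is why constants at or above that value are still reported while smaller constants are filtered out.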


2020-01-10 16:37:02

by Markus Elfring

Subject: Re: [PATCH v3] coccinelle: semantic patch to check for inappropriate do_div() calls

> +@initialize:python@

> +def filter_out_safe_constants(str):

> +def construct_warnings(str, suggested_fun):

* I suggest once more to adjust the dependency specifications for the usage
of these functions by SmPL rules.

* Can the local variable “msg” be omitted?


> +coccilib.org.print_todo(p[0], construct_warnings("div64_ul"))

I suggest again to move the prefix “div64_” into the string literal
of the function implementation.


The SmPL code for two disjunctions could become shorter.

Regards,
Markus

2020-01-11 05:09:14

by Wen Yang

Subject: Re: [PATCH v3] coccinelle: semantic patch to check for inappropriate do_div() calls



On 2020/1/11 12:35 AM, Markus Elfring wrote:
>> +@initialize:python@
> …
>> +def filter_out_safe_constants(str):
> …
>> +def construct_warnings(str, suggested_fun):
>
> * I suggest once more to adjust the dependency specifications for the usage
> of these functions by SmPL rules.
>

Most of the functions here are for all operation modes.


> * Can the local variable “msg” be omitted?
>
>
>> +coccilib.org.print_todo(p[0], construct_warnings("div64_ul"))
>
> I suggest again to move the prefix “div64_” into the string literal
> of the function implementation.
>

“div64_ul” indicates the function name we recommend.
As shown in the patch:

+def construct_warnings(suggested_fun):
+    msg="WARNING: do_div() does a 64-by-32 division, please consider using %s instead."
+    return msg % suggested_fun
...
+coccilib.org.print_todo(p[0], construct_warnings("div64_ul"))

If we delete the prefix "div64_", it may reduce readability.
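
For reference, the two variants under discussion could be sketched in Python as follows. The first mirrors the function from the patch. The second, construct_warnings_suffix, is a hypothetical name for what the review comment appears to ask for: the "div64_" prefix is moved into the string literal and the local variable is dropped. This is only a sketch of the trade-off, not a proposed change.

```python
def construct_warnings(suggested_fun):
    """Variant from the patch: the caller passes the full function name."""
    msg = ("WARNING: do_div() does a 64-by-32 division, "
           "please consider using %s instead.")
    return msg % suggested_fun


def construct_warnings_suffix(suffix):
    """Hypothetical variant: the caller passes only the part after 'div64_'."""
    return ("WARNING: do_div() does a 64-by-32 division, "
            "please consider using div64_%s instead.") % suffix
```

Both produce the same message. The difference is only whether a call site reads construct_warnings("div64_ul"), which shows the recommended function name in full, or construct_warnings_suffix("ul").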

>
> The SmPL code for two disjunctions could become shorter.
>

You are presumably suggesting the following modification:
+@@
+*do_div(f, \( l \| ul \| ul64 \| sl64 \) );

We agree with Julia:
I don't see any point to this. The code matched will be the same in both
cases. The original code is quite readable, without the ugly \( etc.

--
Regards,
Wen

> Regards,
> Markus
>

2020-01-11 07:32:05

by Markus Elfring

Subject: Re: [v3] coccinelle: semantic patch to check for inappropriate do_div() calls

>> * I suggest once more to adjust the dependency specifications for the usage
>>   of these functions by SmPL rules.
>
> Most of the functions here are for all operation modes.

I have a different understanding of this software.

You added the information “also filter out safe consts for context mode”
to the patch change log.


>> * Can the local variable “msg” be omitted?

I would appreciate another fine-tuning also at this place.


>>> +coccilib.org.print_todo(p[0], construct_warnings("div64_ul"))
>>
>> I suggest again to move the prefix “div64_” into the string literal
>> of the function implementation.
>
> “div64_ul” indicates the function name we recommend.

The intention can be fine.


> If we delete the prefix "div64_",

I suggest using the text in another place.


> it may reduce readability.

I find another code variant sufficiently readable as well.


> +*do_div(f, \( l \| ul \| ul64 \| sl64 \) );
>
> We agree with Julia:
> I don't see any point to this.

Can the avoidance of duplicate source code (according to SmPL disjunctions)
trigger positive effects on run time characteristics and software maintenance?

Regards,
Markus

2020-01-11 07:45:41

by Julia Lawall

Subject: Re: [v3] coccinelle: semantic patch to check for inappropriate do_div() calls

> > +*do_div(f, \( l \| ul \| ul64 \| sl64 \) );
> >
> > We agree with Julia:
> > I don't see any point to this.
>
> Can the avoidance of duplicate source code (according to SmPL disjunctions)
> trigger positive effects on run time characteristics and software maintenance?

Markus. Please stop asking this question. You are bothering people with
this advice, why don't _you_ figure out once and for all whether the change
that you suggest has any "positive effects on the run time
characteristics"? Hint: it will not. You don't even have to run Coccinelle
to see that. Just use spatch --parse-cocci on your two suggestions and you
will see that they expand to the same thing. Coccinelle has a pass that
propagates disjunctions at the sub-statement level to the statement level.

julia
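
The effect described above can be illustrated with a toy Python model. This is not how Coccinelle is implemented: expand_disjunctions is a hypothetical helper that only mimics lifting a sub-statement disjunction to one alternative per statement, to show why the compact form and the written-out form end up describing the same set of statements.

```python
from itertools import product


def expand_disjunctions(template, choices):
    """Expand every combination of alternatives into a full statement.

    template uses str.format placeholders; choices maps each placeholder
    name to the list of alternatives allowed at that position.
    """
    names = list(choices)
    return [template.format(**dict(zip(names, combo)))
            for combo in product(*(choices[name] for name in names))]


# Compact form: one statement with a disjunction in the divisor position.
compact = expand_disjunctions("do_div(f, {divisor});",
                              {"divisor": ["l", "ul", "ul64", "sl64"]})

# Written-out form: one statement per alternative, as in the patch.
explicit = ["do_div(f, l);", "do_div(f, ul);",
            "do_div(f, ul64);", "do_div(f, sl64);"]
```

In this model the two lists are identical. To check the real tool, run spatch --parse-cocci on both variants, as suggested above.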

2020-01-11 08:09:00

by Markus Elfring

Subject: Re: [v3] coccinelle: semantic patch to check for inappropriate do_div() calls

>> Can the avoidance of duplicate source code (according to SmPL disjunctions)
>> trigger positive effects on run time characteristics and software maintenance?
>
> Markus. Please stop asking this question.

This will not happen for a while.


> You are bothering people with this advice,

I present just another view.


> why don't _you_ figure out once and for all whether the change
> that you suggest has any "positive effects on the run time characteristics"?
> Hint: it will not.

* How much attention do you give to the software development principle
"Don't repeat yourself"?

* Can the file size of a SmPL script matter a bit?


> Coccinelle has a pass that propagates disjunctions at the sub-statement level
> to the statement level.

This data processing can probably trigger further development considerations.

Regards,
Markus

2020-01-11 15:37:39

by Julia Lawall

Subject: Re: [PATCH v3] coccinelle: semantic patch to check for inappropriate do_div() calls


On Fri, 10 Jan 2020, Wen Yang wrote:

> do_div() does a 64-by-32 division.
> When the divisor is unsigned long, u64, or s64,
> do_div() truncates it to 32 bits. A divisor can therefore
> pass a non-zero test and still be truncated to zero for the division.
> This semantic patch is inspired by Mateusz Guzik's patch:
> commit b0ab99e7736a ("sched: Fix possible divide by zero in avg_atom() calculation")
>
> Signed-off-by: Wen Yang <[email protected]>

Acked-by: Julia Lawall <[email protected]>

This looks good to me.

A small detail is that you don't need the parentheses in:

@r depends on (org || report)@

julia

> Cc: Julia Lawall <[email protected]>
> Cc: Gilles Muller <[email protected]>
> Cc: Nicolas Palix <[email protected]>
> Cc: Michal Marek <[email protected]>
> Cc: Matthias Maennich <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Cc: Masahiro Yamada <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> v3:
> - also filter out safe consts for context mode.
> - cleanup code.
>
> v2:
> - add a special case for constants and checking whether the value is obviously safe and no warning is needed.
> - fix 'WARNING:' twice in each case.
> - extend the warning to say "consider using div64_xxx instead".
>
> scripts/coccinelle/misc/do_div.cocci | 155 +++++++++++++++++++++++++++
> 1 file changed, 155 insertions(+)
> create mode 100644 scripts/coccinelle/misc/do_div.cocci
>
> diff --git a/scripts/coccinelle/misc/do_div.cocci b/scripts/coccinelle/misc/do_div.cocci
> new file mode 100644
> index 000000000000..79db083c5208
> --- /dev/null
> +++ b/scripts/coccinelle/misc/do_div.cocci
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/// do_div() does a 64-by-32 division.
> +/// When the divisor is long, unsigned long, u64, or s64,
> +/// do_div() truncates it to 32 bits, this means it can test
> +/// non-zero and be truncated to 0 for division on 64bit platforms.
> +///
> +//# This makes an effort to find those inappropriate do_div() calls.
> +//
> +// Confidence: Moderate
> +// Copyright: (C) 2020 Wen Yang, Alibaba.
> +// Comments:
> +// Options: --no-includes --include-headers
> +
> +virtual context
> +virtual org
> +virtual report
> +
> +@initialize:python@
> +@@
> +
> +def get_digit_type_and_value(str):
> +    is_digit = False
> +    value = 0
> +
> +    try:
> +        if (str.isdigit()):
> +            is_digit = True
> +            value = int(str, 0)
> +        elif (str.upper().endswith('ULL')):
> +            is_digit = True
> +            value = int(str[:-3], 0)
> +        elif (str.upper().endswith('LL')):
> +            is_digit = True
> +            value = int(str[:-2], 0)
> +        elif (str.upper().endswith('UL')):
> +            is_digit = True
> +            value = int(str[:-2], 0)
> +        elif (str.upper().endswith('L')):
> +            is_digit = True
> +            value = int(str[:-1], 0)
> +        elif (str.upper().endswith('U')):
> +            is_digit = True
> +            value = int(str[:-1], 0)
> +    except Exception as e:
> +        print('Error:',e)
> +        is_digit = False
> +        value = 0
> +    finally:
> +        return is_digit, value
> +
> +def filter_out_safe_constants(str):
> +    is_digit, value = get_digit_type_and_value(str)
> +    if (is_digit):
> +        if (value >= 0x100000000):
> +            return True
> +        else:
> +            return False
> +    else:
> +        return True
> +
> +def construct_warnings(suggested_fun):
> +    msg="WARNING: do_div() does a 64-by-32 division, please consider using %s instead."
> +    return msg % suggested_fun
> +
> +@depends on context@
> +expression f;
> +long l: script:python() { filter_out_safe_constants(l) };
> +unsigned long ul : script:python() { filter_out_safe_constants(ul) };
> +u64 ul64 : script:python() { filter_out_safe_constants(ul64) };
> +s64 sl64 : script:python() { filter_out_safe_constants(sl64) };
> +
> +@@
> +(
> +* do_div(f, l);
> +|
> +* do_div(f, ul);
> +|
> +* do_div(f, ul64);
> +|
> +* do_div(f, sl64);
> +)
> +
> +@r depends on (org || report)@
> +expression f;
> +position p;
> +long l: script:python() { filter_out_safe_constants(l) };
> +unsigned long ul : script:python() { filter_out_safe_constants(ul) };
> +u64 ul64 : script:python() { filter_out_safe_constants(ul64) };
> +s64 sl64 : script:python() { filter_out_safe_constants(sl64) };
> +@@
> +(
> +do_div@p(f, l);
> +|
> +do_div@p(f, ul);
> +|
> +do_div@p(f, ul64);
> +|
> +do_div@p(f, sl64);
> +)
> +
> +@script:python depends on org@
> +p << r.p;
> +ul << r.ul;
> +@@
> +
> +coccilib.org.print_todo(p[0], construct_warnings("div64_ul"))
> +
> +@script:python depends on org@
> +p << r.p;
> +l << r.l;
> +@@
> +
> +coccilib.org.print_todo(p[0], construct_warnings("div64_long"))
> +
> +@script:python depends on org@
> +p << r.p;
> +ul64 << r.ul64;
> +@@
> +
> +coccilib.org.print_todo(p[0], construct_warnings("div64_u64"))
> +
> +@script:python depends on org@
> +p << r.p;
> +sl64 << r.sl64;
> +@@
> +
> +coccilib.org.print_todo(p[0], construct_warnings("div64_s64"))
> +
> +@script:python depends on report@
> +p << r.p;
> +ul << r.ul;
> +@@
> +
> +coccilib.report.print_report(p[0], construct_warnings("div64_ul"))
> +
> +@script:python depends on report@
> +p << r.p;
> +l << r.l;
> +@@
> +
> +coccilib.report.print_report(p[0], construct_warnings("div64_long"))
> +
> +@script:python depends on report@
> +p << r.p;
> +sl64 << r.sl64;
> +@@
> +
> +coccilib.report.print_report(p[0], construct_warnings("div64_s64"))
> +
> +@script:python depends on report@
> +p << r.p;
> +ul64 << r.ul64;
> +@@
> +
> +coccilib.report.print_report(p[0], construct_warnings("div64_u64"))
> --
> 2.23.0
>
>

2020-01-12 08:33:43

by Markus Elfring

Subject: Re: [PATCH v3] coccinelle: semantic patch to check for inappropriate do_div() calls

> This semantic patch is inspired by Mateusz Guzik's patch:

Does such a wording mean also that you would like to support the operation mode “patch”
by this SmPL script?

Regards,
Markus

2020-01-12 08:44:02

by Julia Lawall

Subject: Re: [PATCH v3] coccinelle: semantic patch to check for inappropriate do_div() calls


On Sun, 12 Jan 2020, Markus Elfring wrote:

> > This semantic patch is inspired by Mateusz Guzik's patch:
>
> Does such a wording mean also that you would like to support the operation mode “patch”
> by this SmPL script?

I see no reason why such a wording would imply such a thing.

julia

2020-01-12 08:51:44

by Markus Elfring

Subject: Re: [v3] coccinelle: semantic patch to check for inappropriate do_div() calls

> I see no reason why such a wording would imply such a thing.

Thus I suggest once more to improve the distinction between patching
and the description of source code searches in the commit message.

Regards,
Markus