2019-02-14 11:20:35

by Wen Yang

[permalink] [raw]
Subject: [PATCH v4] coccinelle: semantic patch for missing put_device()

The of_find_device_by_node() takes a reference to the underlying device
structure, we should release that reference.
The implementation of this semantic patch is:
In a function, for variables returned by calling of_find_device_by_node(),
a, if it is released by a function such as
put_device()/of_dev_put()/platform_device_put() after the last use,
it is considered that there is no reference leak;
b, if it is passed back to the caller via
dev_get_drvdata()/platform_get_drvdata()/get_device(), etc., the
reference will be released in other functions, and the current function
also considers that there is no reference leak;
c, for the rest of the situation, the current function should release the
reference by calling put_device, this patch will report the
corresponding error message.

By using this semantic patch, we have found some object reference leaks,
such as:
commit 11907e9d3533 ("ASoC: fsl-asoc-card: fix object reference leaks in
fsl_asoc_card_probe")
commit a12085d13997 ("mtd: rawnand: atmel: fix possible object reference leak")
commit 11493f26856a ("mtd: rawnand: jz4780: fix possible object reference leak")

There are still dozens of reference leaks in the current kernel code.

Further, for the case of b, the object returned to other functions may also
have a reference leak, we will continue to develop other cocci scripts to
further check the reference leak.

Signed-off-by: Wen Yang <[email protected]>
Reviewed-by: Julia Lawall <[email protected]>
Reviewed-by: Markus Elfring <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Gilles Muller <[email protected]>
Cc: Nicolas Palix <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Markus Elfring <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Wen Yang <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
v4->v3:
- add Masahiro Yamada
- omit a blank line
- split the long message parameter
- reduce the number of metavariables
- Describe the implementation of the semantic patch,
explain the scenarios it can detect,
and further software development considerations.
v3->v2:
- reduction of a bit of redundant C code within SmPL search specifications.
- consider the message construction without using the extra Python variable “msg”
v2->v1:
- put exists after search, and then drop the when exists below.
- should not use the same e as in the when's below.
- Make a new type metavariable and use it to put a cast on the result of
platform_get_drvdata.

scripts/coccinelle/free/put_device.cocci | 55 ++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)
create mode 100644 scripts/coccinelle/free/put_device.cocci

diff --git a/scripts/coccinelle/free/put_device.cocci b/scripts/coccinelle/free/put_device.cocci
new file mode 100644
index 0000000..6fd79cf
--- /dev/null
+++ b/scripts/coccinelle/free/put_device.cocci
@@ -0,0 +1,55 @@
+/// Find missing put_device for every of_find_device_by_node.
+///
+// Confidence: Moderate
+// Copyright: (C) 2018-2019 Wen Yang, ZTE. GPLv2.
+// Comments:
+// Options: --no-includes --include-headers
+
+virtual report
+virtual org
+
+@search exists@
+local idexpression id;
+expression x,e,e1;
+position p1,p2;
+type T,T1,T2,T3;
+@@
+
+id = of_find_device_by_node@p1(x)
+... when != e = id
+if (id == NULL || ...) { ... return ...; }
+... when != put_device(&id->dev)
+ when != platform_device_put(id)
+ when != of_dev_put(id)
+ when != if (id) { ... put_device(&id->dev) ... }
+ when != e1 = (T)id
+ when != e1 = &id->dev
+ when != e1 = get_device(&id->dev)
+ when != e1 = (T1)platform_get_drvdata(id)
+(
+ return
+( id
+| (T2)dev_get_drvdata(&id->dev)
+| (T3)platform_get_drvdata(id)
+);
+| return@p2 ...;
+)
+
+@script:python depends on report@
+p1 << search.p1;
+p2 << search.p2;
+@@
+
+coccilib.report.print_report(p2[0],
+ "ERROR: missing put_device;"
+ + " call of_find_device_by_node on line "
+ + p1[0].line
+ + " and return without releasing.")
+
+@script:python depends on org@
+p1 << search.p1;
+p2 << search.p2;
+@@
+
+cocci.print_main("of_find_device_by_node", p1)
+cocci.print_secs("needed put_device", p2)
--
2.9.5



2019-02-14 17:44:11

by Markus Elfring

[permalink] [raw]
Subject: Re: [PATCH v4] coccinelle: semantic patch for missing put_device()

> The implementation of this semantic patch is:

Thanks for your extension of such a commit message.


I would interpret the provided SmPL code in the way that it will not generate
adjusted (patched) C code so far.
Source code search results will be presented by the operation mode “report” or “org”.

How do you think about to exchange the word “patch” by “code search”
at affected places (and in the subject) then?


> In a function, for variables returned by calling of_find_device_by_node(),

Do variables really get returned?

The provided pointer should usually be stored somewhere.


> c, for the rest of the situation, the current function should release the
> reference by calling put_device, this patch will report the
> corresponding error message.

* Do you expect that the desired object release should be performed only in
the same function implementation?

* Would you like to pick any software development challenges up around
inter-procedural data flow (or even escape) analysis for the shown use case?


> Further, for the case of b, the object returned to other functions may also
> have a reference leak, we will continue to develop other cocci scripts to
> further check the reference leak.

I am curious on how these approaches will evolve further.


> +// Copyright: (C) 2018-2019 Wen Yang, ZTE. GPLv2.

Would you like to add a SPDX identifier?


> +coccilib.report.print_report(p2[0],

Thanks for a nicer indentation here.


> + "ERROR: missing put_device;"

Will change confidence considerations result in another fine-tuning for this message?


> + + " call of_find_device_by_node on line "

I find that such a split string literal can be unwanted.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?id=1f947a7a011fcceb14cb912f5481a53b18f1879a#n94


> + + " and return without releasing.")

Possible rewording?

+ + " but without a corresponding object release within this function.")


Regards,
Markus

2019-02-15 15:16:20

by Julia Lawall

[permalink] [raw]
Subject: Re: [PATCH v4] coccinelle: semantic patch for missing put_device()



On Fri, 15 Feb 2019, [email protected] wrote:

> > How do you think about to exchange the word “patch” by “code search”
> > at affected places (and in the subject) then?
>
> Thanks, we‘ll fix it.
>
> >> In a function, for variables returned by calling of_find_device_by_node(),
> > Do variables really get returned?
> > The provided pointer should usually be stored somewhere.
>
> Thank you very much, we will consider this situation and submit a next version to fix it.

I don't know what Markus is talking about here, so I'm not sure that a
change is needed.

>
> > * Would you like to pick any software development challenges up around
> > inter-procedural data flow (or even escape) analysis for the shown use case?
>
> We are very interested in doing this work, but currently coccinelle may
> not support data flow analysis, and we hope to contribute a little.
>
> > Would you like to add a SPDX identifier?
>
> OK, we will add a SPDX identifierfix soon.
>
> >> + "ERROR: missing put_device;"
> >Will change confidence considerations result in another fine-tuning for this message?
>
> Thank you, we will change "ERROR" to "WARNING".

I think ERROR is fine. If it is a real positive than it is a real
problem. Warning is for things that look ugly, but don't have any impact
on the execution.

julia

> >> + + " call of_find_device_by_node on line "
> >I find that such a split string literal can be unwanted.
>
> Thank you, we will fix it soon.
>
> >> + + " and return without releasing.")
> >Possible rewording?
> >+ + " but without a corresponding object release within this function.")
>
> Thanks, we will modify it according to your suggestion.
>
> Regards,
> Wen

2019-02-15 15:19:23

by Julia Lawall

[permalink] [raw]
Subject: Re: [PATCH v4] coccinelle: semantic patch for missing put_device()



On Fri, 15 Feb 2019, [email protected] wrote:

> Hi Julia, thank you very much.
>
> > > >> In a function, for variables returned by calling of_find_device_by_node(),
> > > > Do variables really get returned?
> > > > The provided pointer should usually be stored somewhere.
> > >
> > > Thank you very much, we will consider this situation and submit a next version to fix it.
> >
> > I don't know what Markus is talking about here, so I'm not sure that a
> > change is needed.
>
> I think Markus means that we need to deal with two situations:
> 1, The return value of of_find_device_by_node () is assigned to a variable, such as:
> pdev = of_find_device_by_node(np);
> 2, The return value of of_find_device_by_node() is assigned to a variable in a structure, such as:
> dev->pdev = of_find_device_by_node(args.np);
>
> So I plan to modify the following to capture both cases:
> -local idexpression id;
> +expression id;

I'm not sure that this is a good idea. There is likely no need for a put
in the latter case.

julia

> ...
> id = of_find_device_by_node@p1(x)
>
> > > >> + "ERROR: missing put_device;"
> > > >Will change confidence considerations result in another fine-tuning for this message?
> > >
> > > Thank you, we will change "ERROR" to "WARNING".
> >
> > I think ERROR is fine. If it is a real positive than it is a real
> > problem. Warning is for things that look ugly, but don't have any impact
> > on the execution.
>
> OK, I will keep it.
> Thanks.
>
> Regards,
> Wen

2019-02-15 15:24:24

by Markus Elfring

[permalink] [raw]
Subject: Re: [v4] coccinelle: semantic patch for missing put_device()

>>>> In a function, for variables returned by calling of_find_device_by_node(),
>>> Do variables really get returned?
>>> The provided pointer should usually be stored somewhere.
>>
>> Thank you very much, we will consider this situation and submit a next version to fix it.
>
> I don't know what Markus is talking about here,

I find that my feedback contained details for two items.

1. I suggested another adjustment for a wording in the evolving
commit description.

2. Should any more case distinctions be taken into account for the data
storage of function return values so that the shown source code search
approach can become safer?


> so I'm not sure that a change is needed.

I am curious if there is a need for further clarifications then.


>>>> + "ERROR: missing put_device;"
>>> Will change confidence considerations result in another fine-tuning for this message?
>>
>> Thank you, we will change "ERROR" to "WARNING".
>
> I think ERROR is fine.

I have got a different software development view for the possible
severity information.


> If it is a real positive than it is a real problem.

I can agree to this information.

You specified also a precondition.
How should be achieved that the source code analysis will not point
false positives out?


> Warning is for things that look ugly,

I am also curious on how views will evolve around such ugliness.


> but don't have any impact on the execution.

I find such a restriction questionable.


Would you like to share any more ideas for safe data flow analysis
(eventually also together with your software)?

Regards,
Markus

2019-02-15 15:35:25

by Markus Elfring

[permalink] [raw]
Subject: Re: [v4] coccinelle: semantic patch for missing put_device()

>> So I plan to modify the following to capture both cases:
>> -local idexpression id;
>> +expression id;
>
> I'm not sure that this is a good idea.

Why have you got doubts here?


> There is likely no need for a put in the latter case.

I have got understanding difficulties for such information.
What did you try to express with this sentence finally?

Regards,
Markus

2019-02-15 15:36:16

by Julia Lawall

[permalink] [raw]
Subject: Re: [v4] coccinelle: semantic patch for missing put_device()



On Fri, 15 Feb 2019, Markus Elfring wrote:

> >> So I plan to modify the following to capture both cases:
> >> -local idexpression id;
> >> +expression id;
> >
> > I'm not sure that this is a good idea.
>
> Why have you got doubts here?
>
>
> > There is likely no need for a put in the latter case.
>
> I have got understanding difficulties for such information.
> What did you try to express with this sentence finally?

The whole goal of the semantic patch is to ensure that put_device is
called when needed. If the value is stored in a structure, then someone
else will likely take care of calling put_device later when the structure
is destroyed.

julia

2019-02-15 15:36:39

by Markus Elfring

[permalink] [raw]
Subject: Re: [v4] coccinelle: semantic patch for missing put_device()

> The whole goal of the semantic patch is to ensure that put_device is
> called when needed.

Thanks for this clarification.

The software development goal seems to be clear to some degree.


> If the value is stored in a structure,

Will any further means become relevant for the discussed data processing?


> then someone else will likely take care of calling put_device later
> when the structure is destroyed.

Such a view is reasonable.

Are there any additional source code analysis and software development challenges
to consider for safer resource management?

Regards,
Markus