In common init function, when run into err branch, we didn`t
use kfree to release kzmalloc area, this may bring in memleak
Signed-off-by: Bernard Zhao <[email protected]>
---
drivers/clk/meson/meson8b.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/meson/meson8b.c b/drivers/clk/meson/meson8b.c
index 34a70c4b4899..0f07d5a4cd16 100644
--- a/drivers/clk/meson/meson8b.c
+++ b/drivers/clk/meson/meson8b.c
@@ -3687,6 +3687,7 @@ static void __init meson8b_clkc_init_common(struct device_node *np,
if (ret) {
pr_err("%s: Failed to register clkc reset controller: %d\n",
__func__, ret);
+ kfree(rstc);
return;
}
@@ -3710,8 +3711,10 @@ static void __init meson8b_clkc_init_common(struct device_node *np,
continue;
ret = of_clk_hw_register(np, clk_hw_onecell_data->hws[i]);
- if (ret)
+ if (ret) {
+ kfree(rstc);
return;
+ }
}
meson8b_cpu_nb_data.cpu_clk = clk_hw_onecell_data->hws[CLKID_CPUCLK];
@@ -3727,13 +3730,16 @@ static void __init meson8b_clkc_init_common(struct device_node *np,
if (ret) {
pr_err("%s: failed to register the CPU clock notifier\n",
__func__);
+ kfree(rstc);
return;
}
ret = of_clk_add_hw_provider(np, of_clk_hw_onecell_get,
clk_hw_onecell_data);
- if (ret)
+ if (ret) {
pr_err("%s: failed to register clock provider\n", __func__);
+ kfree(rstc);
+ }
}
static void __init meson8_clkc_init(struct device_node *np)
--
2.26.2
On Wed 29 Apr 2020 at 05:14, Bernard Zhao <[email protected]> wrote:
> In common init function, when run into err branch, we didn`t
> use kfree to release kzmalloc area, this may bring in memleak
Thx for reporting this Bernard.
I'm not a fan of adding kfree everywhere. I'd much prefer a label and
clear error exit path.
That being said, the allocation is probably not the only thing that
needs to be undone in case of error. I guess this is due to conversion
to CLK_OF_DECLARE_DRIVER() which forced to drop all the devm_
This was done because the clock controller was required early in the
boot sequence.
There is 2 paths to properly solve this:
1) Old school: manually undo everything with every error exit condition
Doable but probably a bit messy
2) Convert back the driver to a real platform driver and use devm_.
We would still need the controller to register early but I wonder if
we could use the same method as drivers/clk/mediatek/clk-mt2701.c and
use arch_initcall() ?
Martin, you did the initial conversion, what do you think of option 2 ?
Would it still answer the problem you were trying to solve back then ?
One added benefit of option 2 is we could drop CLK_OF_DECLARE_DRIVER().
We could even do the same in for the other SoCs, which I suppose would
avoid a fair amount of probe deferral.
>
> Signed-off-by: Bernard Zhao <[email protected]>
> ---
> drivers/clk/meson/meson8b.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/clk/meson/meson8b.c b/drivers/clk/meson/meson8b.c
> index 34a70c4b4899..0f07d5a4cd16 100644
> --- a/drivers/clk/meson/meson8b.c
> +++ b/drivers/clk/meson/meson8b.c
> @@ -3687,6 +3687,7 @@ static void __init meson8b_clkc_init_common(struct device_node *np,
> if (ret) {
> pr_err("%s: Failed to register clkc reset controller: %d\n",
> __func__, ret);
> + kfree(rstc);
> return;
> }
>
> @@ -3710,8 +3711,10 @@ static void __init meson8b_clkc_init_common(struct device_node *np,
> continue;
>
> ret = of_clk_hw_register(np, clk_hw_onecell_data->hws[i]);
> - if (ret)
> + if (ret) {
> + kfree(rstc);
> return;
> + }
> }
>
> meson8b_cpu_nb_data.cpu_clk = clk_hw_onecell_data->hws[CLKID_CPUCLK];
> @@ -3727,13 +3730,16 @@ static void __init meson8b_clkc_init_common(struct device_node *np,
> if (ret) {
> pr_err("%s: failed to register the CPU clock notifier\n",
> __func__);
> + kfree(rstc);
> return;
> }
>
> ret = of_clk_add_hw_provider(np, of_clk_hw_onecell_get,
> clk_hw_onecell_data);
> - if (ret)
> + if (ret) {
> pr_err("%s: failed to register clock provider\n", __func__);
> + kfree(rstc);
> + }
> }
>
> static void __init meson8_clkc_init(struct device_node *np)
Hi Jerome,
On Wed, Apr 29, 2020 at 2:37 PM Jerome Brunet <[email protected]> wrote:
>
>
> On Wed 29 Apr 2020 at 05:14, Bernard Zhao <[email protected]> wrote:
>
> > In common init function, when run into err branch, we didn`t
> > use kfree to release kzmalloc area, this may bring in memleak
>
> Thx for reporting this Bernard.
> I'm not a fan of adding kfree everywhere. I'd much prefer a label and
> clear error exit path.
>
> That being said, the allocation is probably not the only thing that
> needs to be undone in case of error. I guess this is due to conversion
> to CLK_OF_DECLARE_DRIVER() which forced to drop all the devm_
> This was done because the clock controller was required early in the
> boot sequence.
>
> There is 2 paths to properly solve this:
> 1) Old school: manually undo everything with every error exit condition
> Doable but probably a bit messy
> 2) Convert back the driver to a real platform driver and use devm_.
> We would still need the controller to register early but I wonder if
> we could use the same method as drivers/clk/mediatek/clk-mt2701.c and
> use arch_initcall() ?
>
> Martin, you did the initial conversion, what do you think of option 2 ?
I tried it with the attached patch
unfortunately my "m8b_clkc_test_probe" is still run too late
> Would it still answer the problem you were trying to solve back then ?
I'm afraid it does not:
- the resets are needed early for SMP initialization
- the clocks are needed even earlier for timer registration (we have
both, the ARM TWD timer and some Amlogic custom timer. both have clock
inputs)
> One added benefit of option 2 is we could drop CLK_OF_DECLARE_DRIVER().
> We could even do the same in for the other SoCs, which I suppose would
> avoid a fair amount of probe deferral.
it would be great, indeed
but this will only work once timer initialization and SMP boot can
happen at a later stage
If the clock controller registration fails the board won't boot. Yes,
cleaning up memory is good, but in this specific case it will add a
couple of extra CPU cycles before the kernel is dead
So, if we want to ignore that fact then I agree with your first option
(undoing things the "old school" way).
Martin
From: Martin Blumenstingl <[email protected]>
Date: 2020-04-30 01:43:55
To: Jerome Brunet <[email protected]>
Cc: Bernard Zhao <[email protected]>,Neil Armstrong <[email protected]>,Stephen Boyd <[email protected]>,Kevin Hilman <[email protected]>,[email protected],[email protected],[email protected],[email protected]
Subject: Re: [PATCH] clk/meson: fixes memleak issue in init err branch>Hi Jerome,
>
>On Wed, Apr 29, 2020 at 2:37 PM Jerome Brunet <[email protected]> wrote:
>>
>>
>> On Wed 29 Apr 2020 at 05:14, Bernard Zhao <[email protected]> wrote:
>>
>> > In common init function, when run into err branch, we didn`t
>> > use kfree to release kzmalloc area, this may bring in memleak
>>
>> Thx for reporting this Bernard.
>> I'm not a fan of adding kfree everywhere. I'd much prefer a label and
>> clear error exit path.
>>
>> That being said, the allocation is probably not the only thing that
>> needs to be undone in case of error. I guess this is due to conversion
>> to CLK_OF_DECLARE_DRIVER() which forced to drop all the devm_
>> This was done because the clock controller was required early in the
>> boot sequence.
>>
>> There is 2 paths to properly solve this:
>> 1) Old school: manually undo everything with every error exit condition
>> Doable but probably a bit messy
>> 2) Convert back the driver to a real platform driver and use devm_.
>> We would still need the controller to register early but I wonder if
>> we could use the same method as drivers/clk/mediatek/clk-mt2701.c and
>> use arch_initcall() ?
>>
>> Martin, you did the initial conversion, what do you think of option 2 ?
>I tried it with the attached patch
>unfortunately my "m8b_clkc_test_probe" is still run too late
>
>> Would it still answer the problem you were trying to solve back then ?
>I'm afraid it does not:
>- the resets are needed early for SMP initialization
>- the clocks are needed even earlier for timer registration (we have
>both, the ARM TWD timer and some Amlogic custom timer. both have clock
>inputs)
>
>> One added benefit of option 2 is we could drop CLK_OF_DECLARE_DRIVER().
>> We could even do the same in for the other SoCs, which I suppose would
>> avoid a fair amount of probe deferral.
>it would be great, indeed
>but this will only work once timer initialization and SMP boot can
>happen at a later stage
>
>If the clock controller registration fails the board won't boot. Yes,
>cleaning up memory is good, but in this specific case it will add a
>couple of extra CPU cycles before the kernel is dead
>So, if we want to ignore that fact then I agree with your first option
>(undoing things the "old school" way).
>
Hi
I am not sure whether my understanding is correct.
I feels that the failure of our module can not block the entire kernel from starting.
Maybe we should throw an exception, clear the status, as "old way",
and continue to execute the kernel.
And if our module requires that the kernel cannot continue to run when the exception occurs,
then we do not need to return in the error branch, also we do not need to kfree, just BUG_ON().
Regards,
Bernard
>Martin
Hi Bernhard,
On Thu, Apr 30, 2020 at 5:33 AM 赵军奎 <[email protected]> wrote:
[...]
> I am not sure whether my understanding is correct.
> I feels that the failure of our module can not block the entire kernel from starting.
> Maybe we should throw an exception, clear the status, as "old way",
> and continue to execute the kernel.
I added a HACK right at the beginning of meson8b_clkc_init_common where I added:
if (true)
return; // HACK: simulate an error
when I then try to boot the resulting kernel there's silence after
u-boot is done:
Starting kernel ...
(output stops here)
This is expected, because the UART clock is provided by this clock
controller. And since we didn't register it there will be no serial
output.
Then I added "earlycon" to the boot args and attached the output to this mail.
The result is still not great, because only the early console is
working. Also timer initialization (smp_twd) as well as SMP startup
failed.
All other peripherals also failed to probe, because almost everything
on this SoC depends on this main clock controller (the two exceptions
that I know are: the always-on remote processor and the DDR clock
controller).
I hope that this explains the behavior when the main clock and reset
controllers are not registered (either due to my hack or if there's an
actual error).
Please let me know if you have more questions.
Now I would like to hear your suggestion, how to handle errors in this driver!
Regards
Martin