Unlike the other data structures provided during registration, the
thermal core takes a copy of the thermal_zone_params provided to it and
stores that copy in the thermal_zone_device, taking care to free it on
unregistration. This is done because the parameters will be modified at
runtime.

Unfortunately the thermal_of code assumes that the params structure it
provides will be used throughout the lifetime of the device and, since
the params are dynamically allocated based on the bindings, it attempts
to free it on unregistration. This results in not only leaking the
original params but also double freeing the copy the core made, leading
to memory corruption.

Fix this by instead freeing the params parsed from the DT during
registration.

This issue causes instability on systems where thermal zones are
unregistered, and is especially visible on systems where some zones
provided by a device have no trip points, such as Allwinner systems.
For example, with current mainline an arm64 defconfig is unbootable on
Pine64 Plus and LibreTech Tritium is massively unstable. These issues
have been there for a while and have been made more prominent by recent
memory management changes.

Fixes: 3fd6d6e2b4e80 ("thermal/of: Rework the thermal device tree initialization")
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]
---
drivers/thermal/thermal_of.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 6fb14e521197..0af11cdfa2c1 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -442,13 +442,11 @@ static int thermal_of_unbind(struct thermal_zone_device *tz,
 static void thermal_of_zone_unregister(struct thermal_zone_device *tz)
 {
 	struct thermal_trip *trips = tz->trips;
-	struct thermal_zone_params *tzp = tz->tzp;
 	struct thermal_zone_device_ops *ops = tz->ops;
 
 	thermal_zone_device_disable(tz);
 	thermal_zone_device_unregister(tz);
 	kfree(trips);
-	kfree(tzp);
 	kfree(ops);
 }
 
@@ -530,6 +528,9 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
 		goto out_kfree_tzp;
 	}
 
+	/* The core will take a copy of tzp, free our copy here. */
+	kfree(tzp);
+
 	ret = thermal_zone_device_enable(tz);
 	if (ret) {
 		pr_err("Failed to enabled thermal zone '%s', id=%d: %d\n",
---
base-commit: fdf0eaf11452d72945af31804e2a1048ee1b574c
change-id: 20230722-thermal-fix-of-memory-corruption-73c023f8612b
Best regards,
--
Mark Brown <[email protected]>
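
The ownership rule described in the patch can be illustrated with a small
userspace sketch. Everything below is a hypothetical stand-in written for
illustration only: struct zone_params, struct zone_device, core_register(),
core_unregister(), of_register(), of_unregister() and the live_allocs counter
are not kernel names, and malloc()/free() stand in for the kernel allocator.
The sketch assumes only what the description states: the core duplicates the
params at registration and frees its own copy at unregistration, so the caller
should release its parsed params right after registering and must not free
tz->tzp later.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures, not the real definitions. */
struct zone_params {
	int sustainable_power;
	int slope;
};

struct zone_device {
	struct zone_params *tzp;	/* private copy owned by the "core" */
};

static int live_allocs;		/* outstanding allocations, for the demo only */

static void *demo_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* The "core" duplicates the caller's params instead of keeping the pointer. */
static struct zone_device *core_register(const struct zone_params *tzp)
{
	struct zone_device *tz = demo_alloc(sizeof(*tz));

	if (!tz)
		return NULL;

	tz->tzp = demo_alloc(sizeof(*tz->tzp));
	if (!tz->tzp) {
		demo_free(tz);
		return NULL;
	}
	memcpy(tz->tzp, tzp, sizeof(*tzp));
	return tz;
}

/* The "core" frees its own copy on unregistration. */
static void core_unregister(struct zone_device *tz)
{
	demo_free(tz->tzp);
	demo_free(tz);
}

/* Caller side, following the patch: drop the parsed params after registering. */
static struct zone_device *of_register(int sustainable_power)
{
	struct zone_params *tzp = demo_alloc(sizeof(*tzp));
	struct zone_device *tz;

	if (!tzp)
		return NULL;
	tzp->sustainable_power = sustainable_power;
	tzp->slope = 1;

	tz = core_register(tzp);
	demo_free(tzp);		/* the core holds its own copy now */
	return tz;
}

static void of_unregister(struct zone_device *tz)
{
	core_unregister(tz);	/* no free of tz->tzp here: the core owns it */
}
```

Calling of_register() followed by of_unregister() should leave live_allocs at
zero. Freeing tz->tzp again in of_unregister(), as the pre-patch code
effectively did, would release the core's copy a second time while the
caller's original allocation was never released.
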
Hi Mark,
On 23/07/2023 01:26, Mark Brown wrote:
> Unlike the other data structures provided during registration the
> thermal core takes a copy of the thermal_zone_params provided to it and
> stores that copy in the thermal_zone_device, taking care to free it on
> unregistration. This is done because the parameters will be modified at
> runtime.
>
> Unfortunately the thermal_of code assumes that the params structure it
> provides will be used throughout the lifetime of the device and since
> the params are dynamically allocated based on the bindings it attempts
> to free it on unregistration. This results in not only leaking the
> original params but also double freeing the copy the core made, leading
> to memory corruption.
>
> Fix this by instead freeing the params parsed from the DT during
> registration.
>
> This issue causes instability on systems where thermal zones are
> unregistered, and is especially visible on systems where some zones
> provided by a device have no trip points, such as Allwinner systems.
> For example with current mainline an arm64 defconfig is unbootable on
> Pine64 Plus and LibreTech Tritium is massively unstable. These issues
> have been there for a while and have been made more prominent by recent
> memory management changes.
>
> Fixes: 3fd6d6e2b4e80 ("thermal/of: Rework the thermal device tree initialization")
> Signed-off-by: Mark Brown <[email protected]>
> Cc: [email protected]
I think this issue has been fixed by:
https://lore.kernel.org/all/[email protected]/
Rafael, did you pick it up?
On Sun, Jul 23, 2023 at 11:57:52AM +0200, Daniel Lezcano wrote:
> On 23/07/2023 01:26, Mark Brown wrote:
> I think this issue has been fixed by:
> https://lore.kernel.org/all/[email protected]/
Yes, that should fix the same issue.
> Rafael ? Did you pick it up ?
There was a message on the thread saying the patches have been applied
for v6.5 but I can't see them in either mainline or -next.
On Sun, Jul 23, 2023 at 4:32 PM Mark Brown <[email protected]> wrote:
>
> On Sun, Jul 23, 2023 at 11:57:52AM +0200, Daniel Lezcano wrote:
> > On 23/07/2023 01:26, Mark Brown wrote:
>
> > I think this issue has been fixed by:
>
> > https://lore.kernel.org/all/[email protected]/
>
> Yes, that should fix the same issue.
>
> > Rafael ? Did you pick it up ?
>
> There was a message on the thread saying the patches have been applied
> for v6.5 but I can't see them in either mainline or -next.
They should be there in linux-next (as of today).
Surely, they are present in my linux-next branch.
On Wed, Jul 26, 2023 at 08:51:20PM +0200, Rafael J. Wysocki wrote:
> On Wed, Jul 26, 2023 at 8:47 PM Mark Brown <[email protected]> wrote:
> > > Surely, they are present in my linux-next branch.
> > Are they queued as fixes?
> They are.
> > It'd be really good to get these into v6.5,
> > they're rendering the Allwinner platforms I have unusable.
> I'm going to send a pull request with them tomorrow or on Friday.
Ah, excellent - thanks!
On Wed, Jul 26, 2023 at 8:47 PM Mark Brown <[email protected]> wrote:
>
> On Wed, Jul 26, 2023 at 08:42:39PM +0200, Rafael J. Wysocki wrote:
> > On Sun, Jul 23, 2023 at 4:32 PM Mark Brown <[email protected]> wrote:
>
> > > There was a message on the thread saying the patches have been applied
> > > for v6.5 but I can't see them in either mainline or -next.
>
> > They should be there in linux-next (as of today).
>
> Yes, they're there now. They weren't at time of writing the above (on
> Sunday).
>
> > Surely, they are present in my linux-next branch.
>
> Are they queued as fixes?
They are.
> It'd be really good to get these into v6.5,
> they're rendering the Allwinner platforms I have unusable.
I'm going to send a pull request with them tomorrow or on Friday.
On Wed, Jul 26, 2023 at 08:42:39PM +0200, Rafael J. Wysocki wrote:
> On Sun, Jul 23, 2023 at 4:32 PM Mark Brown <[email protected]> wrote:
> > There was a message on the thread saying the patches have been applied
> > for v6.5 but I can't see them in either mainline or -next.
> They should be there in linux-next (as of today).
Yes, they're there now. They weren't at time of writing the above (on
Sunday).
> Surely, they are present in my linux-next branch.
Are they queued as fixes? It'd be really good to get these into v6.5,
they're rendering the Allwinner platforms I have unusable.