From: Rafael J. Wysocki <[email protected]>
Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
run in parallel with thermal_debug_cdev_state_update() and it may free
the struct thermal_debugfs object used by the latter after it has been
checked against NULL.
If that happens, thermal_debug_cdev_state_update() will access memory
that has been freed already causing the kernel to crash.
Address this by using cdev->lock in thermal_debug_cdev_remove() around
the cdev->debugfs value check (in case the same cdev is removed at the
same time in two differet threads) and its reset to NULL.
Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
Cc :6.8+ <[email protected]> # 6.8+
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/thermal/thermal_debugfs.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -503,15 +503,21 @@ void thermal_debug_cdev_add(struct therm
*/
void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
{
- struct thermal_debugfs *thermal_dbg = cdev->debugfs;
+ struct thermal_debugfs *thermal_dbg;
+ mutex_lock(&cdev->lock);
+
+ thermal_dbg = cdev->debugfs;
if (!thermal_dbg)
return;
+ cdev->debugfs = NULL;
+
+ mutex_unlock(&cdev->lock);
+
mutex_lock(&thermal_dbg->lock);
thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
- cdev->debugfs = NULL;
mutex_unlock(&thermal_dbg->lock);
On 4/25/24 14:57, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
> run in parallel with thermal_debug_cdev_state_update() and it may free
> the struct thermal_debugfs object used by the latter after it has been
> checked against NULL.
>
> If that happens, thermal_debug_cdev_state_update() will access memory
> that has been freed already causing the kernel to crash.
>
> Address this by using cdev->lock in thermal_debug_cdev_remove() around
> the cdev->debugfs value check (in case the same cdev is removed at the
> same time in two differet threads) and its reset to NULL.
s/differet/different/
>
> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> Cc :6.8+ <[email protected]> # 6.8+
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/thermal/thermal_debugfs.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> Index: linux-pm/drivers/thermal/thermal_debugfs.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_debugfs.c
> +++ linux-pm/drivers/thermal/thermal_debugfs.c
> @@ -503,15 +503,21 @@ void thermal_debug_cdev_add(struct therm
> */
> void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
> {
> - struct thermal_debugfs *thermal_dbg = cdev->debugfs;
> + struct thermal_debugfs *thermal_dbg;
>
> + mutex_lock(&cdev->lock);
> +
> + thermal_dbg = cdev->debugfs;
> if (!thermal_dbg)
mutex_unlock(&cdev->lock) missing here
> return;
>
> + cdev->debugfs = NULL;
> +
> + mutex_unlock(&cdev->lock);
> +
> mutex_lock(&thermal_dbg->lock);
>
> thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
> - cdev->debugfs = NULL;
>
> mutex_unlock(&thermal_dbg->lock);
>
>
>
>
>
On Fri, Apr 26, 2024 at 12:05 AM Lukasz Luba <[email protected]> wrote:
>
>
>
> On 4/25/24 14:57, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
> > run in parallel with thermal_debug_cdev_state_update() and it may free
> > the struct thermal_debugfs object used by the latter after it has been
> > checked against NULL.
> >
> > If that happens, thermal_debug_cdev_state_update() will access memory
> > that has been freed already causing the kernel to crash.
> >
> > Address this by using cdev->lock in thermal_debug_cdev_remove() around
> > the cdev->debugfs value check (in case the same cdev is removed at the
> > same time in two differet threads) and its reset to NULL.
>
> s/differet/different/
>
> >
> > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> > Cc :6.8+ <[email protected]> # 6.8+
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > ---
> > drivers/thermal/thermal_debugfs.c | 10 ++++++++--
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > Index: linux-pm/drivers/thermal/thermal_debugfs.c
> > ===================================================================
> > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c
> > +++ linux-pm/drivers/thermal/thermal_debugfs.c
> > @@ -503,15 +503,21 @@ void thermal_debug_cdev_add(struct therm
> > */
> > void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
> > {
> > - struct thermal_debugfs *thermal_dbg = cdev->debugfs;
> > + struct thermal_debugfs *thermal_dbg;
> >
> > + mutex_lock(&cdev->lock);
> > +
> > + thermal_dbg = cdev->debugfs;
> > if (!thermal_dbg)
>
> mutex_unlock(&cdev->lock) missing here
Good catch, thanks!
Ho-hum, I'm not sure why I haven't added it here ...
I'll send a v2 of this patch shortly.
> > return;
> >
> > + cdev->debugfs = NULL;
> > +
> > + mutex_unlock(&cdev->lock);
> > +
> > mutex_lock(&thermal_dbg->lock);
> >
> > thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
> > - cdev->debugfs = NULL;
> >
> > mutex_unlock(&thermal_dbg->lock);
> >
> >
> >
> >
> >
>
From: Rafael J. Wysocki <[email protected]>
Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
run in parallel with thermal_debug_cdev_state_update() and it may free
the struct thermal_debugfs object used by the latter after it has been
checked against NULL.
If that happens, thermal_debug_cdev_state_update() will access memory
that has been freed already causing the kernel to crash.
Address this by using cdev->lock in thermal_debug_cdev_remove() around
the cdev->debugfs value check (in case the same cdev is removed at the
same time in two different threads) and its reset to NULL.
Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
Cc :6.8+ <[email protected]> # 6.8+
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
v1 -> v2: Add missing mutex_unlock() (Lukasz).
---
drivers/thermal/thermal_debugfs.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -505,15 +505,23 @@ void thermal_debug_cdev_add(struct therm
*/
void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
{
- struct thermal_debugfs *thermal_dbg = cdev->debugfs;
+ struct thermal_debugfs *thermal_dbg;
- if (!thermal_dbg)
+ mutex_lock(&cdev->lock);
+
+ thermal_dbg = cdev->debugfs;
+ if (!thermal_dbg) {
+ mutex_unlock(&cdev->lock);
return;
+ }
+
+ cdev->debugfs = NULL;
+
+ mutex_unlock(&cdev->lock);
mutex_lock(&thermal_dbg->lock);
thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
- cdev->debugfs = NULL;
mutex_unlock(&thermal_dbg->lock);
On 4/26/24 10:28, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
> run in parallel with thermal_debug_cdev_state_update() and it may free
> the struct thermal_debugfs object used by the latter after it has been
> checked against NULL.
>
> If that happens, thermal_debug_cdev_state_update() will access memory
> that has been freed already causing the kernel to crash.
>
> Address this by using cdev->lock in thermal_debug_cdev_remove() around
> the cdev->debugfs value check (in case the same cdev is removed at the
> same time in two different threads) and its reset to NULL.
>
> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> Cc :6.8+ <[email protected]> # 6.8+
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
>
> v1 -> v2: Add missing mutex_unlock() (Lukasz).
>
> ---
> drivers/thermal/thermal_debugfs.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> Index: linux-pm/drivers/thermal/thermal_debugfs.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_debugfs.c
> +++ linux-pm/drivers/thermal/thermal_debugfs.c
> @@ -505,15 +505,23 @@ void thermal_debug_cdev_add(struct therm
> */
> void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
> {
> - struct thermal_debugfs *thermal_dbg = cdev->debugfs;
> + struct thermal_debugfs *thermal_dbg;
>
> - if (!thermal_dbg)
> + mutex_lock(&cdev->lock);
> +
> + thermal_dbg = cdev->debugfs;
> + if (!thermal_dbg) {
> + mutex_unlock(&cdev->lock);
> return;
> + }
> +
> + cdev->debugfs = NULL;
> +
> + mutex_unlock(&cdev->lock);
>
> mutex_lock(&thermal_dbg->lock);
>
> thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
> - cdev->debugfs = NULL;
>
> mutex_unlock(&thermal_dbg->lock);
>
>
>
>
>
It looks good now
Reviewed-by: Lukasz Luba <[email protected]>
On Fri, Apr 26, 2024 at 11:35 AM Lukasz Luba <[email protected]> wrote:
>
>
>
> On 4/26/24 10:28, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
> > run in parallel with thermal_debug_cdev_state_update() and it may free
> > the struct thermal_debugfs object used by the latter after it has been
> > checked against NULL.
> >
> > If that happens, thermal_debug_cdev_state_update() will access memory
> > that has been freed already causing the kernel to crash.
> >
> > Address this by using cdev->lock in thermal_debug_cdev_remove() around
> > the cdev->debugfs value check (in case the same cdev is removed at the
> > same time in two different threads) and its reset to NULL.
> >
> > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> > Cc :6.8+ <[email protected]> # 6.8+
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > ---
> >
> > v1 -> v2: Add missing mutex_unlock() (Lukasz).
> >
> > ---
> > drivers/thermal/thermal_debugfs.c | 14 +++++++++++---
> > 1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > Index: linux-pm/drivers/thermal/thermal_debugfs.c
> > ===================================================================
> > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c
> > +++ linux-pm/drivers/thermal/thermal_debugfs.c
> > @@ -505,15 +505,23 @@ void thermal_debug_cdev_add(struct therm
> > */
> > void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
> > {
> > - struct thermal_debugfs *thermal_dbg = cdev->debugfs;
> > + struct thermal_debugfs *thermal_dbg;
> >
> > - if (!thermal_dbg)
> > + mutex_lock(&cdev->lock);
> > +
> > + thermal_dbg = cdev->debugfs;
> > + if (!thermal_dbg) {
> > + mutex_unlock(&cdev->lock);
> > return;
> > + }
> > +
> > + cdev->debugfs = NULL;
> > +
> > + mutex_unlock(&cdev->lock);
> >
> > mutex_lock(&thermal_dbg->lock);
> >
> > thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
> > - cdev->debugfs = NULL;
> >
> > mutex_unlock(&thermal_dbg->lock);
> >
> >
> >
> >
> >
>
> It looks good now
>
> Reviewed-by: Lukasz Luba <[email protected]>
Thanks!