2018-10-09 21:31:31

by Stefan Agner

[permalink] [raw]
Subject: [PATCH 1/2] component: add optional cleanup function

Add optional cleanup function on master level. This allows the
master to call framework level cleanup functions in case binding
of any component failed before the previously successfully bound
components get unbound.

Signed-off-by: Stefan Agner <[email protected]>
---
Hi,

This is an attempt to fix the issue reported in:
"drm/imx: Crash in drm_mode_config_cleanup"

--
Stefan

drivers/base/component.c | 4 ++++
include/linux/component.h | 1 +
2 files changed, 5 insertions(+)

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 8946dfee4768..5350d931a663 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -535,6 +535,10 @@ int component_bind_all(struct device *master_dev, void *data)
break;
}

+ /* Allow the master to call framework cleanup functions */
+ if (master->ops->cleanup)
+ master->ops->cleanup(master->dev);
+
if (ret != 0) {
for (; i--; )
if (!master->match->compare[i].duplicate) {
diff --git a/include/linux/component.h b/include/linux/component.h
index e71fbbbc74e2..800534b52165 100644
--- a/include/linux/component.h
+++ b/include/linux/component.h
@@ -24,6 +24,7 @@ struct master;
struct component_master_ops {
int (*bind)(struct device *master);
void (*unbind)(struct device *master);
+ void (*cleanup)(struct device *master);
};

void component_master_del(struct device *,
--
2.19.1



2018-10-09 21:31:41

by Stefan Agner

[permalink] [raw]
Subject: [PATCH 2/2] drm/imx: make sure to cleanup DRM before unbinding components

In situations where a component fails to bind, a previously
successfully bound component might already registered itself
with the DRM framework (e.g. an encoder). When the master
component then calls drm_mode_config_cleanup, we end up in a
use after free sitution.

Use the cleanup callback to make sure all framework level
cleanup is done by the time we unbind components.

Signed-off-by: Stefan Agner <[email protected]>
---
drivers/gpu/drm/imx/imx-drm-core.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/imx/imx-drm-core.c b/drivers/gpu/drm/imx/imx-drm-core.c
index 5ea0c82f9957..b174a0ca9acb 100644
--- a/drivers/gpu/drm/imx/imx-drm-core.c
+++ b/drivers/gpu/drm/imx/imx-drm-core.c
@@ -288,8 +288,8 @@ static int imx_drm_bind(struct device *dev)
err_unbind:
#endif
component_unbind_all(drm->dev, drm);
-err_kms:
drm_mode_config_cleanup(drm);
+err_kms:
drm_dev_put(drm);

return ret;
@@ -313,9 +313,17 @@ static void imx_drm_unbind(struct device *dev)
drm_dev_put(drm);
}

+static void imx_drm_cleanup(struct device *dev)
+{
+ struct drm_device *drm = dev_get_drvdata(dev);
+
+ drm_mode_config_cleanup(drm);
+}
+
static const struct component_master_ops imx_drm_ops = {
.bind = imx_drm_bind,
.unbind = imx_drm_unbind,
+ .cleanup = imx_drm_cleanup,
};

static int imx_drm_platform_probe(struct platform_device *pdev)
--
2.19.1


2018-10-10 10:40:45

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/imx: make sure to cleanup DRM before unbinding components

On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote:
> In situations where a component fails to bind, a previously
> successfully bound component might already registered itself
> with the DRM framework (e.g. an encoder). When the master
> component then calls drm_mode_config_cleanup, we end up in a
> use after free sitution.
>
> Use the cleanup callback to make sure all framework level
> cleanup is done by the time we unbind components.

I'm not sure about this approach - the idea about the component bind
and unbind callbacks is that unbind undoes _everything_ that bind has
done, so everything is correctly ordered. If bind registers something,
unbind should unregister it.

What seems to be going on is that imx is registering stuff in bind()
but not unregistering it in unbind().

Since imx was one of the drivers that the component helper was
created for, if it's now crashing, that's a regression in the imx
driver. Looking at the commit log, I'd say:

commit 8e3b16e2117409625b89807de3912ff773aea354
Author: Lucas Stach <[email protected]>
Date: Thu Aug 11 11:18:49 2016 +0200

drm/imx: don't destroy mode objects manually on driver unbind

Instead let drm_mode_config_cleanup() do the work when taking down
the master device. This requires all cleanup functions to be
properly hooked up to the mode object .destroy callback.

Signed-off-by: Lucas Stach <[email protected]>
Signed-off-by: Philipp Zabel <[email protected]>

is probably responsible for introducing this problem, since the
explicit calls were added by me when imx was stuck in staging due to
the problems that the component helper solved.

I think what we have here are different opinions on how cleanup
should be handled.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

2018-10-10 11:04:41

by Stefan Agner

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/imx: make sure to cleanup DRM before unbinding components

[adding Lucas]

On 10.10.2018 12:38, Russell King - ARM Linux wrote:
> On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote:
>> In situations where a component fails to bind, a previously
>> successfully bound component might already registered itself
>> with the DRM framework (e.g. an encoder). When the master
>> component then calls drm_mode_config_cleanup, we end up in a
>> use after free sitution.
>>
>> Use the cleanup callback to make sure all framework level
>> cleanup is done by the time we unbind components.
>
> I'm not sure about this approach - the idea about the component bind
> and unbind callbacks is that unbind undoes _everything_ that bind has
> done, so everything is correctly ordered. If bind registers something,
> unbind should unregister it.
>
> What seems to be going on is that imx is registering stuff in bind()
> but not unregistering it in unbind().

Yes indeed, if that can be fixed this seems to be the better approach to
me.

>
> Since imx was one of the drivers that the component helper was
> created for, if it's now crashing, that's a regression in the imx
> driver. Looking at the commit log, I'd say:
>
> commit 8e3b16e2117409625b89807de3912ff773aea354
> Author: Lucas Stach <[email protected]>
> Date: Thu Aug 11 11:18:49 2016 +0200
>
> drm/imx: don't destroy mode objects manually on driver unbind
>
> Instead let drm_mode_config_cleanup() do the work when taking down
> the master device. This requires all cleanup functions to be
> properly hooked up to the mode object .destroy callback.
>
> Signed-off-by: Lucas Stach <[email protected]>
> Signed-off-by: Philipp Zabel <[email protected]>
>
> is probably responsible for introducing this problem, since the
> explicit calls were added by me when imx was stuck in staging due to
> the problems that the component helper solved.

The commit above does not revert cleanly today, but a quick fixing
seemed to resolve the problem I am seeing...

>
> I think what we have here are different opinions on how cleanup
> should be handled.

In the regular case using the framework cleanup function before calling
component_unbind_all() works fine.

Its really only the case where a subcomponent fails to bind where unbind
happens before calling drm_mode_config_cleanup(drm). I guess Lucas was
not aware of that special case...?

I can send a patch which properly reverts the above commit.

--
Stefan

2018-10-10 11:23:16

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH 2/2] drm/imx: make sure to cleanup DRM before unbinding components

On Wed, Oct 10, 2018 at 01:02:16PM +0200, Stefan Agner wrote:
> [adding Lucas]
>
> On 10.10.2018 12:38, Russell King - ARM Linux wrote:
> > On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote:
> >> In situations where a component fails to bind, a previously
> >> successfully bound component might already registered itself
> >> with the DRM framework (e.g. an encoder). When the master
> >> component then calls drm_mode_config_cleanup, we end up in a
> >> use after free sitution.
> >>
> >> Use the cleanup callback to make sure all framework level
> >> cleanup is done by the time we unbind components.
> >
> > I'm not sure about this approach - the idea about the component bind
> > and unbind callbacks is that unbind undoes _everything_ that bind has
> > done, so everything is correctly ordered. If bind registers something,
> > unbind should unregister it.
> >
> > What seems to be going on is that imx is registering stuff in bind()
> > but not unregistering it in unbind().
>
> Yes indeed, if that can be fixed this seems to be the better approach to
> me.
>
> >
> > Since imx was one of the drivers that the component helper was
> > created for, if it's now crashing, that's a regression in the imx
> > driver. Looking at the commit log, I'd say:
> >
> > commit 8e3b16e2117409625b89807de3912ff773aea354
> > Author: Lucas Stach <[email protected]>
> > Date: Thu Aug 11 11:18:49 2016 +0200
> >
> > drm/imx: don't destroy mode objects manually on driver unbind
> >
> > Instead let drm_mode_config_cleanup() do the work when taking down
> > the master device. This requires all cleanup functions to be
> > properly hooked up to the mode object .destroy callback.
> >
> > Signed-off-by: Lucas Stach <[email protected]>
> > Signed-off-by: Philipp Zabel <[email protected]>
> >
> > is probably responsible for introducing this problem, since the
> > explicit calls were added by me when imx was stuck in staging due to
> > the problems that the component helper solved.
>
> The commit above does not revert cleanly today, but a quick fixing
> seemed to resolve the problem I am seeing...
>
> >
> > I think what we have here are different opinions on how cleanup
> > should be handled.
>
> In the regular case using the framework cleanup function before calling
> component_unbind_all() works fine.
>
> Its really only the case where a subcomponent fails to bind where unbind
> happens before calling drm_mode_config_cleanup(drm). I guess Lucas was
> not aware of that special case...?

As long as ->unbind undoes everything that ->bind does, it's not a
problem.

WRT using devres groups in one of your previous mails on this topic,
consider what would happen if you use devm_* APIs in the bind callback
and the component helper did not use devres groups.

Any resources taken in the bind callback would not be freed until the
device was unbound from the driver by the driver model. This means
each time the bind callback gets called, new sets of resources are
attempted to be acquired. If some resources are exclusive, the first
bind attempt would succeed, but the second attempt would fail with
-EBUSY for example. In the case of memory, the allocated memory
would not be freed, but new memory would be allocated each time a
bind was attempted.

The alternative would be that devres must not be used at all in the
bind/unbind callbacks - but then we'll be subject to the same error-
prone cleanup that we see with driver probe functions prior to devres.

Consider what the LDB bind/unbind callbacks would look like if devres
was not permitted, and how the memory for the imx_ldb structure would
be managed - you'd need some sort of ref-counting on the imx_ldb
structure, and management of that in the encoder and optional connector
destroy functions so you know when all embedded encoders and connectors
in that structure have been "destroyed" by drm_mode_config_cleanup(),
and hence it's safe to free the data structure.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up