2013-08-06 09:32:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
> Removing any omap device always resulted in a crash; turns out
> BUS_NOTIFY_DEL_DEVICE is not the last notifier event sent in the
> course of removing the device, the correct event is
> BUS_NOTIFY_UNBOUND_DRIVER, which still is not the right place to
> perform the cleanup. A device callback handles that properly, as
> well as making sure the hwmods of the device are shutdown.
>
> Signed-off-by: Pantelis Antoniou <[email protected]>
> ---
> arch/arm/mach-omap2/omap_device.c | 34 ++++++++++++++++++++++++++++++++--
> 1 file changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
> index f33b40c..6dec521 100644
> --- a/arch/arm/mach-omap2/omap_device.c
> +++ b/arch/arm/mach-omap2/omap_device.c
> @@ -178,6 +178,32 @@ odbfd_exit:
> return ret;
> }
>
> +static void _omap_device_cleanup(struct device *dev)
> +{
> + struct platform_device *pdev = to_platform_device(dev);
> + struct omap_device *od;
> + struct omap_hwmod *oh;
> + int i;
> +
> + od = pdev->archdata.od;
> + if (!od)
> + return;
> +
> + for (i = 0; i < od->hwmods_cnt; i++) {
> +
> + oh = od->hwmods[i];
> +
> + /* shutdown hwmods */
> + omap_hwmod_shutdown(oh);
> +
> + /* we don't remove clocks cause there's no API to do so */
> + /* no harm done, since they will not be created next time */
> + }
> +
> + /* cleanup the structure now */
> + omap_device_delete(od);
> +}
> +
> static int _omap_device_notifier_call(struct notifier_block *nb,
> unsigned long event, void *dev)
> {
> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
> struct omap_device *od;
>
> switch (event) {
> - case BUS_NOTIFY_DEL_DEVICE:
> + case BUS_NOTIFY_UNBOUND_DRIVER:
> + /* NOTIFY_DEL_DEVICE is not the right call...
> + * we use a callback here, to make sure no-one is going to
> + * try to use the omap_device data after they're deleted
> + */
> if (pdev->archdata.od)
> - omap_device_delete(pdev->archdata.od);
> + device_schedule_callback(dev, _omap_device_cleanup);

Really? This is one sign that you are totally using the driver core
incorrectly. You shouldn't have to rely on notifier callbacks to handle
device removals, your bus code should do that for you directly.

I don't like this at all, sorry.

And I was waiting for the day when people started to finally remove
platform devices from the system, I always thought it would never work
properly. Good luck with this, I think you have a lot of work ahead of
yourself...

greg k-h


2013-08-06 09:37:33

by Pantelis Antoniou

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

Hi Greg,

On Aug 6, 2013, at 12:33 PM, Greg Kroah-Hartman wrote:

> On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
>> Removing any omap device always resulted in a crash; turns out
>> BUS_NOTIFY_DEL_DEVICE is not the last notifier event sent in the
>> course of removing the device, the correct event is
>> BUS_NOTIFY_UNBOUND_DRIVER, which still is not the right place to
>> perform the cleanup. A device callback handles that properly, as
>> well as making sure the hwmods of the device are shutdown.
>>
>> Signed-off-by: Pantelis Antoniou <[email protected]>
>> ---
>> arch/arm/mach-omap2/omap_device.c | 34 ++++++++++++++++++++++++++++++++--
>> 1 file changed, 32 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
>> index f33b40c..6dec521 100644
>> --- a/arch/arm/mach-omap2/omap_device.c
>> +++ b/arch/arm/mach-omap2/omap_device.c
>> @@ -178,6 +178,32 @@ odbfd_exit:
>> return ret;
>> }
>>
>> +static void _omap_device_cleanup(struct device *dev)
>> +{
>> + struct platform_device *pdev = to_platform_device(dev);
>> + struct omap_device *od;
>> + struct omap_hwmod *oh;
>> + int i;
>> +
>> + od = pdev->archdata.od;
>> + if (!od)
>> + return;
>> +
>> + for (i = 0; i < od->hwmods_cnt; i++) {
>> +
>> + oh = od->hwmods[i];
>> +
>> + /* shutdown hwmods */
>> + omap_hwmod_shutdown(oh);
>> +
>> + /* we don't remove clocks cause there's no API to do so */
>> + /* no harm done, since they will not be created next time */
>> + }
>> +
>> + /* cleanup the structure now */
>> + omap_device_delete(od);
>> +}
>> +
>> static int _omap_device_notifier_call(struct notifier_block *nb,
>> unsigned long event, void *dev)
>> {
>> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
>> struct omap_device *od;
>>
>> switch (event) {
>> - case BUS_NOTIFY_DEL_DEVICE:
>> + case BUS_NOTIFY_UNBOUND_DRIVER:
>> + /* NOTIFY_DEL_DEVICE is not the right call...
>> + * we use a callback here, to make sure no-one is going to
>> + * try to use the omap_device data after they're deleted
>> + */
>> if (pdev->archdata.od)
>> - omap_device_delete(pdev->archdata.od);
>> + device_schedule_callback(dev, _omap_device_cleanup);
>
> Really? This is one sign that you are totally using the driver core
> incorrectly. You shouldn't have to rely on notifier callbacks to handle
> device removals, your bus code should do that for you directly.
>
> I don't like this at all, sorry.
>

Don't shoot the messenger please...

This is all about fixing a crash without messing too many things.

> And I was waiting for the day when people started to finally remove
> platform devices from the system, I always thought it would never work
> properly. Good luck with this, I think you have a lot of work ahead of
> yourself...
>

I know.

Platform device removal is the wild-wild west...

If I had to make a wish, would be to kill platform_device completely and
integrate all it's functionality in the the vanilla device.

What exactly is a platform device anyway?

> greg k-h

Regards

-- Pantelis

2013-08-06 10:13:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

On Tue, Aug 06, 2013 at 12:37:25PM +0300, Pantelis Antoniou wrote:
> > I don't like this at all, sorry.
> >
>
> Don't shoot the messenger please...
>
> This is all about fixing a crash without messing too many things.

I understand, it's not your fault at all.

> > And I was waiting for the day when people started to finally remove
> > platform devices from the system, I always thought it would never work
> > properly. Good luck with this, I think you have a lot of work ahead of
> > yourself...
> >
>
> I know.
>
> Platform device removal is the wild-wild west...
>
> If I had to make a wish, would be to kill platform_device completely and
> integrate all it's functionality in the the vanilla device.

YES!!!!

> What exactly is a platform device anyway?

Originally it was a "something that wasn't connected to a bus, but just
had memory-mapped i/o." Like the PS2 keyboard controller.

Embedded systems got ahold of this and went to town, and made everything
a platform device because they could, and no one was paying attention.

Then OF came along and used it as well, and you know the rest...

I think we need to get the ACPI and OF people, and me, in a room
together at the kernel summit and not let us out until we have this all
worked out.

thanks,

greg k-h

2013-08-06 13:38:35

by Alexander Holler

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

Am 06.08.2013 12:14, schrieb Greg Kroah-Hartman:

>> What exactly is a platform device anyway?
>
> Originally it was a "something that wasn't connected to a bus, but just
> had memory-mapped i/o." Like the PS2 keyboard controller.
>
> Embedded systems got ahold of this and went to town, and made everything
> a platform device because they could, and no one was paying attention.
>
> Then OF came along and used it as well, and you know the rest...
>
> I think we need to get the ACPI and OF people, and me, in a room
> together at the kernel summit and not let us out until we have this all
> worked out.

MFD uses platform devices too.

Regards,

Alexander Holler

2013-08-07 05:51:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

On Tue, Aug 06, 2013 at 03:37:13PM +0200, Alexander Holler wrote:
> Am 06.08.2013 12:14, schrieb Greg Kroah-Hartman:
>
> >> What exactly is a platform device anyway?
> >
> > Originally it was a "something that wasn't connected to a bus, but just
> > had memory-mapped i/o." Like the PS2 keyboard controller.
> >
> > Embedded systems got ahold of this and went to town, and made everything
> > a platform device because they could, and no one was paying attention.
> >
> > Then OF came along and used it as well, and you know the rest...
> >
> > I think we need to get the ACPI and OF people, and me, in a room
> > together at the kernel summit and not let us out until we have this all
> > worked out.
>
> MFD uses platform devices too.

Ugh, I've been avoiding looking at mfd for a long time now, and really
don't want to start now...

2013-08-07 07:44:30

by Pantelis Antoniou

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

Hi Greg,

On Aug 6, 2013, at 1:14 PM, Greg Kroah-Hartman wrote:

> On Tue, Aug 06, 2013 at 12:37:25PM +0300, Pantelis Antoniou wrote:
>>> I don't like this at all, sorry.
>>>
>>

[snip]

>> Don't shoot the messenger please...
>>
>> This is all about fixing a crash without messing too many things.
>
> I understand, it's not your fault at all.

Let's start talking about the patchset again.

I know all of this code is rather distasteful but it's only purpose
is to fix a bunch of crashes in a code path that was supposed to be working,
namely the removal of a platform device, in the supposedly well supported
ARM OMAP arch.

Can we agree to at least not crashing until we get to a consensus about
fixing the whole mess properly? This will take at least a few minor
kernel revisions while in the meantime users get crashes everyday.

Regards

-- Pantelis

2013-08-07 15:24:25

by Alexander Holler

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

Am 07.08.2013 07:52, schrieb Greg Kroah-Hartman:
> On Tue, Aug 06, 2013 at 03:37:13PM +0200, Alexander Holler wrote:
>> Am 06.08.2013 12:14, schrieb Greg Kroah-Hartman:
>>
>>>> What exactly is a platform device anyway?
>>>
>>> Originally it was a "something that wasn't connected to a bus, but just
>>> had memory-mapped i/o." Like the PS2 keyboard controller.
>>>
>>> Embedded systems got ahold of this and went to town, and made everything
>>> a platform device because they could, and no one was paying attention.
>>>
>>> Then OF came along and used it as well, and you know the rest...
>>>
>>> I think we need to get the ACPI and OF people, and me, in a room
>>> together at the kernel summit and not let us out until we have this all
>>> worked out.
>>
>> MFD uses platform devices too.
>
> Ugh, I've been avoiding looking at mfd for a long time now, and really
> don't want to start now...
>

I've just mentioned it to suggest that platform devices seem to be used
all over the kernel as the generic (minimal) form of a device driver. At
least that is the impression I've got.

Regards,

Alexander Holler

2013-08-07 16:16:08

by Tony Lindgren

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

* Pantelis Antoniou <[email protected]> [130806 02:44]:
> On Aug 6, 2013, at 12:33 PM, Greg Kroah-Hartman wrote:
> > On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
> >> +
> >> static int _omap_device_notifier_call(struct notifier_block *nb,
> >> unsigned long event, void *dev)
> >> {
> >> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
> >> struct omap_device *od;
> >>
> >> switch (event) {
> >> - case BUS_NOTIFY_DEL_DEVICE:
> >> + case BUS_NOTIFY_UNBOUND_DRIVER:
> >> + /* NOTIFY_DEL_DEVICE is not the right call...
> >> + * we use a callback here, to make sure no-one is going to
> >> + * try to use the omap_device data after they're deleted
> >> + */
> >> if (pdev->archdata.od)
> >> - omap_device_delete(pdev->archdata.od);
> >> + device_schedule_callback(dev, _omap_device_cleanup);
> >
> > Really? This is one sign that you are totally using the driver core
> > incorrectly. You shouldn't have to rely on notifier callbacks to handle
> > device removals, your bus code should do that for you directly.
> >
> > I don't like this at all, sorry.
> >
>
> Don't shoot the messenger please...

As you're inititalizing capebus with DT, let's figure out what if
anything you actually need from omap_device. I'd much rather remove
dependencies than add more.

If you need omap_device for the clocks, there are patches pending to
make them DT only for omaps. And we already have DT based solution for
pins, regulators and DMA.

So what else remains? The pieces needed for runtime PM?

> This is all about fixing a crash without messing too many things.

It seems this fix is only needed for supporting out-of-tree code?
These features with omap_device we may not even want to support in
the mainline tree as is being discussed..

Regards,

Tony

2013-08-07 16:24:09

by Pantelis Antoniou

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

Hi Tony,

On Aug 7, 2013, at 7:15 PM, Tony Lindgren wrote:

> * Pantelis Antoniou <[email protected]> [130806 02:44]:
>> On Aug 6, 2013, at 12:33 PM, Greg Kroah-Hartman wrote:
>>> On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
>>>> +
>>>> static int _omap_device_notifier_call(struct notifier_block *nb,
>>>> unsigned long event, void *dev)
>>>> {
>>>> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
>>>> struct omap_device *od;
>>>>
>>>> switch (event) {
>>>> - case BUS_NOTIFY_DEL_DEVICE:
>>>> + case BUS_NOTIFY_UNBOUND_DRIVER:
>>>> + /* NOTIFY_DEL_DEVICE is not the right call...
>>>> + * we use a callback here, to make sure no-one is going to
>>>> + * try to use the omap_device data after they're deleted
>>>> + */
>>>> if (pdev->archdata.od)
>>>> - omap_device_delete(pdev->archdata.od);
>>>> + device_schedule_callback(dev, _omap_device_cleanup);
>>>
>>> Really? This is one sign that you are totally using the driver core
>>> incorrectly. You shouldn't have to rely on notifier callbacks to handle
>>> device removals, your bus code should do that for you directly.
>>>
>>> I don't like this at all, sorry.
>>>
>>
>> Don't shoot the messenger please...
>
> As you're inititalizing capebus with DT, let's figure out what if
> anything you actually need from omap_device. I'd much rather remove
> dependencies than add more.
>

There is no such thing as capebus anymore. This is just the path of
removing a platform device, which happens to also be an omap_device.

> If you need omap_device for the clocks, there are patches pending to
> make them DT only for omaps. And we already have DT based solution for
> pins, regulators and DMA.
>
> So what else remains? The pieces needed for runtime PM?
>

What happens here is that the omap_device data are freed prematurely and then end up
used again during the teardown of the platform device.


>> This is all about fixing a crash without messing too many things.
>
> It seems this fix is only needed for supporting out-of-tree code?
> These features with omap_device we may not even want to support in
> the mainline tree as is being discussed..
>

What out of tree code? The only thing this patch does is make sure we
don't crash when a perfectly valid call to platform_device_unregister() happens.

Drivers that don't use omap_device work just fine.

> Regards,
>
> Tony
>

Regards

-- Pantelis

2013-08-08 07:25:26

by Tony Lindgren

[permalink] [raw]
Subject: Re: [PATCH 5/5] arm: omap: Proper cleanups for omap_device

* Pantelis Antoniou <[email protected]> [130807 09:31]:
> Hi Tony,
>
> On Aug 7, 2013, at 7:15 PM, Tony Lindgren wrote:
>
> > * Pantelis Antoniou <[email protected]> [130806 02:44]:
> >> On Aug 6, 2013, at 12:33 PM, Greg Kroah-Hartman wrote:
> >>> On Tue, Aug 06, 2013 at 10:53:44AM +0300, Pantelis Antoniou wrote:
> >>>> +
> >>>> static int _omap_device_notifier_call(struct notifier_block *nb,
> >>>> unsigned long event, void *dev)
> >>>> {
> >>>> @@ -185,9 +211,13 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
> >>>> struct omap_device *od;
> >>>>
> >>>> switch (event) {
> >>>> - case BUS_NOTIFY_DEL_DEVICE:
> >>>> + case BUS_NOTIFY_UNBOUND_DRIVER:
> >>>> + /* NOTIFY_DEL_DEVICE is not the right call...
> >>>> + * we use a callback here, to make sure no-one is going to
> >>>> + * try to use the omap_device data after they're deleted
> >>>> + */
> >>>> if (pdev->archdata.od)
> >>>> - omap_device_delete(pdev->archdata.od);
> >>>> + device_schedule_callback(dev, _omap_device_cleanup);
> >>>
> >>> Really? This is one sign that you are totally using the driver core
> >>> incorrectly. You shouldn't have to rely on notifier callbacks to handle
> >>> device removals, your bus code should do that for you directly.
> >>>
> >>> I don't like this at all, sorry.
> >>>
> >>
> >> Don't shoot the messenger please...
> >
> > As you're inititalizing capebus with DT, let's figure out what if
> > anything you actually need from omap_device. I'd much rather remove
> > dependencies than add more.
> >
>
> There is no such thing as capebus anymore. This is just the path of
> removing a platform device, which happens to also be an omap_device.

OK, so let's figure out the minimal fixes needed.

> >> This is all about fixing a crash without messing too many things.
> >
> > It seems this fix is only needed for supporting out-of-tree code?
> > These features with omap_device we may not even want to support in
> > the mainline tree as is being discussed..
> >
>
> What out of tree code? The only thing this patch does is make sure we
> don't crash when a perfectly valid call to platform_device_unregister() happens.
>
> Drivers that don't use omap_device work just fine.

So what's the minimal set of fixes then?

Regards,

Tony