2007-02-18 20:30:17

by Jean Delvare

[permalink] [raw]
Subject: [PATCH] platform: reorder platform_device_del

In platform_device_del(), we currently delete the device resources
first, then we delete the device itself. This causes a (minor) bug to
occur when one unregisters a platform device before unregistering its
platform driver, and the driver is requesting (in .probe()) and
releasing (in .remove()) a resource of the device. The device
resources are already gone by the time the driver gets the chance to
release the resources it had been requesting, causing an error like:
Trying to free nonexistent resource <0000000000000295-0000000000000296>

If the platform driver is unregistered first, the problem doesn't
occur, as the driver will have the opportunity to release the
resources it had requested before the device resources themselves are
released. It's a bit odd that unregistering the driver first or the
device first doesn't lead to the same result.

So I believe that we should delete the device first in
platform_device_del(). I've searched the git history and found that it
used to be the case before 2.6.8, but was changed here:

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commitdiff;h=96ef7b3689936ee1e64b711511342026a8ce459c

> 2004/07/14 16:09:44-07:00 dtor_core
> [PATCH] Driver core: Fix OOPS in device_platform_unregister
>
> Driver core: platform_device_unregister should release resources first
> and only then call device_unregister, otherwise if there
> are no more references to the device it will be freed and
> the fucntion will try to access freed memory.

However we now have an explicit call to put_device() at the end of
platform_device_unregister() so I guess the original problem no longer
exists and it is safe to revert that change.

Signed-off-by: Jean Delvare <[email protected]>
---
drivers/base/platform.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.20.orig/drivers/base/platform.c 2007-02-04 19:44:54.000000000 +0100
+++ linux-2.6.20/drivers/base/platform.c 2007-02-18 19:38:09.000000000 +0100
@@ -299,13 +299,13 @@ void platform_device_del(struct platform
int i;

if (pdev) {
+ device_del(&pdev->dev);
+
for (i = 0; i < pdev->num_resources; i++) {
struct resource *r = &pdev->resource[i];
if (r->flags & (IORESOURCE_MEM|IORESOURCE_IO))
release_resource(r);
}
-
- device_del(&pdev->dev);
}
}
EXPORT_SYMBOL_GPL(platform_device_del);


--
Jean Delvare


2007-02-19 06:05:33

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] platform: reorder platform_device_del

Hi Jean,

On Sunday 18 February 2007 15:30, Jean Delvare wrote:
> In platform_device_del(), we currently delete the device resources
> first, then we delete the device itself. This causes a (minor) bug to
> occur when one unregisters a platform device before unregistering its
> platform driver, and the driver is requesting (in .probe()) and
> releasing (in .remove()) a resource of the device. The device
> resources are already gone by the time the driver gets the chance to
> release the resources it had been requesting, causing an error like:
> Trying to free nonexistent resource <0000000000000295-0000000000000296>
>
> If the platform driver is unregistered first, the problem doesn't
> occur, as the driver will have the opportunity to release the
> resources it had requested before the device resources themselves are
> released. It's a bit odd that unregistering the driver first or the
> device first doesn't lead to the same result.

I am not sure that this is correct argument. Why does a driver request
and release device's resources? "Higher power" has already requested
these resources for the device and this is not drivers task to free
them. However you are right that we should not free resources until
device is marked for deletion and driver's ->remove() method completes
but I would not rely on surrounding code for keeping an extra reference
and would just take one explicitely in platform_device_del().

--
Dmitry

2007-02-19 09:23:33

by Jean Delvare

[permalink] [raw]
Subject: Re: [PATCH] platform: reorder platform_device_del

Hi Dmitry,

On Mon, 19 Feb 2007 01:05:30 -0500, Dmitry Torokhov wrote:
> On Sunday 18 February 2007 15:30, Jean Delvare wrote:
> > In platform_device_del(), we currently delete the device resources
> > first, then we delete the device itself. This causes a (minor) bug to
> > occur when one unregisters a platform device before unregistering its
> > platform driver, and the driver is requesting (in .probe()) and
> > releasing (in .remove()) a resource of the device. The device
> > resources are already gone by the time the driver gets the chance to
> > release the resources it had been requesting, causing an error like:
> > Trying to free nonexistent resource <0000000000000295-0000000000000296>
> >
> > If the platform driver is unregistered first, the problem doesn't
> > occur, as the driver will have the opportunity to release the
> > resources it had requested before the device resources themselves are
> > released. It's a bit odd that unregistering the driver first or the
> > device first doesn't lead to the same result.
>
> I am not sure that this is correct argument. Why does a driver request
> and release device's resources? "Higher power" has already requested
> these resources for the device and this is not drivers task to free
> them.

My understanding is that, in the device driver model, devices _declare_
resources, and it's up to the drivers to actually request the ones they
wish to use. Looking at /proc/ioports, this hierarchy is quite visible.
And while not all platform drivers actually request their device's
resource, quite a few do (I counted 95 out of 228), so I'm definitely
not the only one thinking it is the right thing to do.

If the intent was that devices request the resources and the drivers
don't have to care, then a driver requesting a device's resource should
be returned an error. Instead, it works fine, which make me confident
that I got the idea right.

> However you are right that we should not free resources until
> device is marked for deletion and driver's ->remove() method completes
> but I would not rely on surrounding code for keeping an extra reference
> and would just take one explicitely in platform_device_del().

You mean that you wouldn't trust the rest of the platform driver core,
which lives in the same source file, to do the right thing? That's
being too defensive IMHO, and I don't think we usually do that. Greg?

--
Jean Delvare

2007-02-19 14:40:13

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] platform: reorder platform_device_del

Hi Jean,

On 2/19/07, Jean Delvare <[email protected]> wrote:
> Hi Dmitry,
>
> On Mon, 19 Feb 2007 01:05:30 -0500, Dmitry Torokhov wrote:
> > On Sunday 18 February 2007 15:30, Jean Delvare wrote:
> > > In platform_device_del(), we currently delete the device resources
> > > first, then we delete the device itself. This causes a (minor) bug to
> > > occur when one unregisters a platform device before unregistering its
> > > platform driver, and the driver is requesting (in .probe()) and
> > > releasing (in .remove()) a resource of the device. The device
> > > resources are already gone by the time the driver gets the chance to
> > > release the resources it had been requesting, causing an error like:
> > > Trying to free nonexistent resource <0000000000000295-0000000000000296>
> > >
> > > If the platform driver is unregistered first, the problem doesn't
> > > occur, as the driver will have the opportunity to release the
> > > resources it had requested before the device resources themselves are
> > > released. It's a bit odd that unregistering the driver first or the
> > > device first doesn't lead to the same result.
> >
> > I am not sure that this is correct argument. Why does a driver request
> > and release device's resources? "Higher power" has already requested
> > these resources for the device and this is not drivers task to free
> > them.
>
> My understanding is that, in the device driver model, devices _declare_
> resources, and it's up to the drivers to actually request the ones they
> wish to use. Looking at /proc/ioports, this hierarchy is quite visible.
> And while not all platform drivers actually request their device's
> resource, quite a few do (I counted 95 out of 228), so I'm definitely
> not the only one thinking it is the right thing to do.
>
> If the intent was that devices request the resources and the drivers
> don't have to care, then a driver requesting a device's resource should
> be returned an error. Instead, it works fine, which make me confident
> that I got the idea right.

But the thing is that when you add resources to a platofrm device it
is platform_device_add() that marks resources as busy and therefore it
is not driver's task to delete them. If you want driver to request
resources you better pass them through a private data structure
attached to a device.

The attached resources are normally used by arch setup codes that
posess the knowledge about whether a device is present on a box and
what IO port or memory region is used. The device is usualy there and
its resources fixed and not available to anyone else even if there is
no driver to actually control the device.

>
> > However you are right that we should not free resources until
> > device is marked for deletion and driver's ->remove() method completes
> > but I would not rely on surrounding code for keeping an extra reference
> > and would just take one explicitely in platform_device_del().
>
> You mean that you wouldn't trust the rest of the platform driver core,
> which lives in the same source file, to do the right thing? That's
> being too defensive IMHO, and I don't think we usually do that. Greg?
>

platform_device_del() is public so all bets are off...

--
Dmitry

2007-02-20 21:55:25

by Jean Delvare

[permalink] [raw]
Subject: Re: [PATCH] platform: reorder platform_device_del

Hi Dmitry,

On Mon, 19 Feb 2007 09:40:09 -0500, Dmitry Torokhov wrote:
> But the thing is that when you add resources to a platofrm device it
> is platform_device_add() that marks resources as busy and therefore it

No, devices declare the resources, but do not mark them as busy.
Otherwise any further attempt to request them would fail, which isn't
the case.

> is not driver's task to delete them. If you want driver to request
> resources you better pass them through a private data structure
> attached to a device.

The driver isn't deleting the resources, it is marking them as no longer
busy.

We can't seem to agree on this point. Can someone else who is familiar
with the new resource model please shed some light?

> The attached resources are normally used by arch setup codes that
> posess the knowledge about whether a device is present on a box and
> what IO port or memory region is used. The device is usualy there and
> its resources fixed and not available to anyone else even if there is
> no driver to actually control the device.

That we agree on.

> > > However you are right that we should not free resources until
> > > device is marked for deletion and driver's ->remove() method completes
> > > but I would not rely on surrounding code for keeping an extra reference
> > > and would just take one explicitely in platform_device_del().
> >
> > You mean that you wouldn't trust the rest of the platform driver core,
> > which lives in the same source file, to do the right thing? That's
> > being too defensive IMHO, and I don't think we usually do that. Greg?
> >
>
> platform_device_del() is public so all bets are off...

Good point, I had missed that fact. However, I've now checked all the
callers and found that they all call platform_device_put() right after
platform_device_del(), so we're safe. And they have to, otherwise their
code is broken. It's always in an error path, as this appears to the
only legitimate use for platform_device_del(). So I really don't see
the benefit of taking an additional reference in platform_device_del
"just in case". If it ever makes a difference, it means the caller code
is badly broken anyway, so why bother?

platform_device_put has the following comment: "This function must
_only_ be externally called in error cases. All other usage is a bug."
As I understand it, we could/should add this to platform_device_del
too, to discourage the bogus use cases you fear could happen. Hence an
updated patch:

* * * * *

In platform_device_del(), we currently delete the device resources
first, then we delete the device itself. This causes a (minor) bug to
occur when one unregisters a platform device before unregistering its
platform driver, and the driver is requesting (in .probe()) and
releasing (in .remove()) a resource of the device. The device
resources are already gone by the time the driver gets the chance to
release the resources it had been requesting, causing an error like:
Trying to free nonexistent resource <0000000000000295-0000000000000296>

If the platform driver is unregistered first, the problem doesn't
occur, as the driver will have the opportunity to release the
resources it had requested before the device resources themselves are
released. It's a bit odd that unregistering the driver first or the
device first doesn't lead to the same result.

So I believe that we should delete the device first in
platform_device_del(). I've searched the git history and found that it
used to be the case before 2.6.8, but was changed here:

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commitdiff;h=96ef7b3689936ee1e64b711511342026a8ce459c

> 2004/07/14 16:09:44-07:00 dtor_core
> [PATCH] Driver core: Fix OOPS in device_platform_unregister
>
> Driver core: platform_device_unregister should release resources first
> and only then call device_unregister, otherwise if there
> are no more references to the device it will be freed and
> the fucntion will try to access freed memory.

However we now have an explicit call to put_device() at the end of
platform_device_unregister() so I guess the original problem no longer
exists and it is safe to revert that change.

Signed-off-by: Jean Delvare <[email protected]>
---
drivers/base/platform.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

--- linux-2.6.21-pre.orig/drivers/base/platform.c 2007-02-20 11:05:25.000000000 +0100
+++ linux-2.6.21-pre/drivers/base/platform.c 2007-02-20 22:44:35.000000000 +0100
@@ -292,20 +292,22 @@ EXPORT_SYMBOL_GPL(platform_device_add);
* @pdev: platform device we're removing
*
* Note that this function will also release all memory- and port-based
- * resources owned by the device (@dev->resource).
+ * resources owned by the device (@dev->resource). This function
+ * must _only_ be externally called in error cases. All other usage
+ * is a bug.
*/
void platform_device_del(struct platform_device *pdev)
{
int i;

if (pdev) {
+ device_del(&pdev->dev);
+
for (i = 0; i < pdev->num_resources; i++) {
struct resource *r = &pdev->resource[i];
if (r->flags & (IORESOURCE_MEM|IORESOURCE_IO))
release_resource(r);
}
-
- device_del(&pdev->dev);
}
}
EXPORT_SYMBOL_GPL(platform_device_del);


--
Jean Delvare

2007-02-27 00:52:11

by Greg KH

[permalink] [raw]
Subject: patch platform-reorder-platform_device_del.patch added to gregkh-2.6 tree


This is a note to let you know that I've just added the patch titled

Subject: platform: reorder platform_device_del

to my gregkh-2.6 tree. Its filename is

platform-reorder-platform_device_del.patch

This tree can be found at
http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/


>From [email protected] Tue Feb 20 13:55:29 2007
From: Jean Delvare <[email protected]>
Date: Tue, 20 Feb 2007 22:55:06 +0100
Subject: platform: reorder platform_device_del
To: "Dmitry Torokhov" <[email protected]>
Cc: "Greg Kroah-Hartman" <[email protected]>, LKML <[email protected]>
Message-ID: <[email protected]>


In platform_device_del(), we currently delete the device resources
first, then we delete the device itself. This causes a (minor) bug to
occur when one unregisters a platform device before unregistering its
platform driver, and the driver is requesting (in .probe()) and
releasing (in .remove()) a resource of the device. The device
resources are already gone by the time the driver gets the chance to
release the resources it had been requesting, causing an error like:
Trying to free nonexistent resource <0000000000000295-0000000000000296>

If the platform driver is unregistered first, the problem doesn't
occur, as the driver will have the opportunity to release the
resources it had requested before the device resources themselves are
released. It's a bit odd that unregistering the driver first or the
device first doesn't lead to the same result.

So I believe that we should delete the device first in
platform_device_del(). I've searched the git history and found that it
used to be the case before 2.6.8, but was changed here:

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commitdiff;h=96ef7b3689936ee1e64b711511342026a8ce459c

> 2004/07/14 16:09:44-07:00 dtor_core
> [PATCH] Driver core: Fix OOPS in device_platform_unregister
>
> Driver core: platform_device_unregister should release resources first
> and only then call device_unregister, otherwise if there
> are no more references to the device it will be freed and
> the fucntion will try to access freed memory.

However we now have an explicit call to put_device() at the end of
platform_device_unregister() so I guess the original problem no longer
exists and it is safe to revert that change.

Signed-off-by: Jean Delvare <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/base/platform.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

--- gregkh-2.6.orig/drivers/base/platform.c
+++ gregkh-2.6/drivers/base/platform.c
@@ -292,20 +292,22 @@ EXPORT_SYMBOL_GPL(platform_device_add);
* @pdev: platform device we're removing
*
* Note that this function will also release all memory- and port-based
- * resources owned by the device (@dev->resource).
+ * resources owned by the device (@dev->resource). This function
+ * must _only_ be externally called in error cases. All other usage
+ * is a bug.
*/
void platform_device_del(struct platform_device *pdev)
{
int i;

if (pdev) {
+ device_del(&pdev->dev);
+
for (i = 0; i < pdev->num_resources; i++) {
struct resource *r = &pdev->resource[i];
if (r->flags & (IORESOURCE_MEM|IORESOURCE_IO))
release_resource(r);
}
-
- device_del(&pdev->dev);
}
}
EXPORT_SYMBOL_GPL(platform_device_del);


Patches currently in gregkh-2.6 which might be from [email protected] are