From: Junxiao Chang <[email protected]>
When platform device is deleted or there is error in adding
device, platform device resources should be released. Currently
API release_resource is used to release platform device resources.
However, this API releases not only platform resource itself but
also its child resources. It might release resources which are
still in use. Calling remove_resource only releases current
resource itself, not resource tree, it moves its child resources
to up level.
For example, platform device 1 and device 2 are registered, then only
device 1 is unregistered in below code:
...
// Register platform test device 1, resource 0xfed1a000 ~ 0xfed1afff
pdev1 = platform_device_register_full(&pdevinfo1);
// Register platform test device 2, resource 0xfed1a200 ~ 0xfed1a2ff
pdev2 = platform_device_register_full(&pdevinfo2);
// Now platform device 2 resource should be device 1 resource's child
// Unregister device 1 only
platform_device_unregister(pdev1);
...
Platform device 2 resource will be released as well because its
parent resource(device 1's resource) is released, this is not expected.
If using API remove_resource, device 2 resource will not be released.
This change fixed an intel pmc platform device resource issue when
intel pmc ipc kernel module is inserted/removed for twice.
Signed-off-by: Junxiao Chang <[email protected]>
---
drivers/base/platform.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index dab0a5a..5fd1a41 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -461,7 +461,7 @@ int platform_device_add(struct platform_device *pdev)
while (--i >= 0) {
struct resource *r = &pdev->resource[i];
if (r->parent)
- release_resource(r);
+ remove_resource(r);
}
err_out:
@@ -492,7 +492,7 @@ void platform_device_del(struct platform_device *pdev)
for (i = 0; i < pdev->num_resources; i++) {
struct resource *r = &pdev->resource[i];
if (r->parent)
- release_resource(r);
+ remove_resource(r);
}
}
}
--
2.7.4
Any update?
This issue could be reproduced in one intel platform. To simulate the issue, adding following code could reproduce the issue.
Without the fix, device 2's resource will be released but the device is still registered.
With the fix, by cat /proc/iomem, device 2's resource is still there after device 1 is unregistered, this is expected.
Any comment is welcome, thanks,
Junxiao
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 4d17298..6832833 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1500,3 +1500,63 @@ void __init early_platform_cleanup(void)
}
}
+static struct resource resource1[] = {
+ {
+ .start = 0xfed1a000,
+ .end = 0xfed1afff,
+ .flags = IORESOURCE_MEM,
+ }
+};
+
+static struct resource resource2[] = {
+ {
+ .start = 0xfed1a200,
+ .end = 0xfed1a2ff,
+ .flags = IORESOURCE_MEM,
+ }
+};
+
+static int __init simulate_insmod_rmmod(void)
+{
+ struct platform_device *pdev1, *pdev2;
+
+ const struct platform_device_info pdevinfo1 = {
+ .parent = NULL,
+ .name = "device1",
+ .id = -1,
+ .res = resource1,
+ .num_res = 1,
+ };
+
+ const struct platform_device_info pdevinfo2 = {
+ .parent = NULL,
+ .name = "device2",
+ .id = -1,
+ .res = resource2,
+ .num_res = 1,
+ };
+
+ // Register platform test device 1, resource 0xfed1a000 ~ 0xfed1afff
+ pdev1 = platform_device_register_full(&pdevinfo1);
+ if (IS_ERR(pdev1)) {
+ printk("Unable to register device 1\n");
+ return -1;
+ }
+
+ // Register platform test device 2, resource 0xfed1a200 ~ 0xfed1a2ff
+ pdev2 = platform_device_register_full(&pdevinfo2);
+ if (IS_ERR(pdev2)) {
+ printk("Unable to register device 2\n");
+ platform_device_unregister(pdev1);
+ return -1;
+ }
+
+
+ // Now platform device 2 resource should be device 1 resource's child
+
+ // Unregister device 1 only
+ platform_device_unregister(pdev1);
+
+ return 0;
+}
+fs_initcall(simulate_insmod_rmmod);
-----Original Message-----
From: Chang, Junxiao
Sent: Thursday, April 25, 2019 2:24 PM
To: [email protected]
Cc: [email protected]; [email protected]; Chang, Junxiao <[email protected]>; Li, Lili <[email protected]>
Subject: [PATCH] platform: release resource itself instead of resource tree
From: Junxiao Chang <[email protected]>
When platform device is deleted or there is error in adding device, platform device resources should be released. Currently API release_resource is used to release platform device resources.
However, this API releases not only platform resource itself but also its child resources. It might release resources which are still in use. Calling remove_resource only releases current resource itself, not resource tree, it moves its child resources to up level.
For example, platform device 1 and device 2 are registered, then only device 1 is unregistered in below code:
...
// Register platform test device 1, resource 0xfed1a000 ~ 0xfed1afff
pdev1 = platform_device_register_full(&pdevinfo1);
// Register platform test device 2, resource 0xfed1a200 ~ 0xfed1a2ff
pdev2 = platform_device_register_full(&pdevinfo2);
// Now platform device 2 resource should be device 1 resource's child
// Unregister device 1 only
platform_device_unregister(pdev1);
...
Platform device 2 resource will be released as well because its parent resource(device 1's resource) is released, this is not expected.
If using API remove_resource, device 2 resource will not be released.
This change fixed an intel pmc platform device resource issue when intel pmc ipc kernel module is inserted/removed for twice.
Signed-off-by: Junxiao Chang <[email protected]>
---
drivers/base/platform.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c index dab0a5a..5fd1a41 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -461,7 +461,7 @@ int platform_device_add(struct platform_device *pdev)
while (--i >= 0) {
struct resource *r = &pdev->resource[i];
if (r->parent)
- release_resource(r);
+ remove_resource(r);
}
err_out:
@@ -492,7 +492,7 @@ void platform_device_del(struct platform_device *pdev)
for (i = 0; i < pdev->num_resources; i++) {
struct resource *r = &pdev->resource[i];
if (r->parent)
- release_resource(r);
+ remove_resource(r);
}
}
}
--
2.7.4
On Thu, Apr 25, 2019 at 02:24:18PM +0800, [email protected] wrote:
> From: Junxiao Chang <[email protected]>
>
> When platform device is deleted or there is error in adding
> device, platform device resources should be released. Currently
> API release_resource is used to release platform device resources.
> However, this API releases not only platform resource itself but
> also its child resources. It might release resources which are
> still in use. Calling remove_resource only releases current
> resource itself, not resource tree, it moves its child resources
> to up level.
But shouldn't the parent device not get removed until all of the
children are removed? What is causing this "inversion" to happen?
>
> For example, platform device 1 and device 2 are registered, then only
> device 1 is unregistered in below code:
>
> ...
> // Register platform test device 1, resource 0xfed1a000 ~ 0xfed1afff
> pdev1 = platform_device_register_full(&pdevinfo1);
>
> // Register platform test device 2, resource 0xfed1a200 ~ 0xfed1a2ff
> pdev2 = platform_device_register_full(&pdevinfo2);
>
> // Now platform device 2 resource should be device 1 resource's child
>
> // Unregister device 1 only
> platform_device_unregister(pdev1);
> ...
Don't do that. :)
You created this mess of platform devices, so you need to keep track of
them.
> Platform device 2 resource will be released as well because its
> parent resource(device 1's resource) is released, this is not expected.
> If using API remove_resource, device 2 resource will not be released.
>
> This change fixed an intel pmc platform device resource issue when
> intel pmc ipc kernel module is inserted/removed for twice.
Why not fix that kernel module instead? It seems like that is the real
problem here, not a driver core issue.
thanks,
greg k-h
Hi Greg,
Thank you for looking at it.
Below example is simplified description. Our detail problem is:
1. ACPI driver registers a MEM resource during bootup;
2. Our PUNIT(Intel CPU power management module) platform device reads ACPI driver's resource, and registers same MEM resource;
3. According to current resource management logic, if two resources are same, later registered resource will be parent. That is, PUNIT platform device's resource will be ACPI driver resource's parent.
4. PUNIT kernel module is removed, its resource will be removed. If we use original API "release_resource", ACPI driver's resource will be released as well because it is PUNIT device's child;
5. Next time PUNIT platform device kernel module is inserted, it might read wrong ACPI MEM resource because it has been released in step 4.
How should we handle this case? :)
We should not register same MEM resource in step 2? Or, make change in resource management logic, if two resources are same, later registered resource should be child instead of parent?
Thanks,
Junxiao
-----Original Message-----
From: Greg KH [mailto:[email protected]]
Sent: Thursday, July 25, 2019 9:39 PM
To: Chang, Junxiao <[email protected]>
Cc: [email protected]; [email protected]; Li, Lili <[email protected]>
Subject: Re: [PATCH] platform: release resource itself instead of resource tree
On Thu, Apr 25, 2019 at 02:24:18PM +0800, [email protected] wrote:
> From: Junxiao Chang <[email protected]>
>
> When platform device is deleted or there is error in adding device,
> platform device resources should be released. Currently API
> release_resource is used to release platform device resources.
> However, this API releases not only platform resource itself but also
> its child resources. It might release resources which are still in
> use. Calling remove_resource only releases current resource itself,
> not resource tree, it moves its child resources to up level.
But shouldn't the parent device not get removed until all of the children are removed? What is causing this "inversion" to happen?
>
> For example, platform device 1 and device 2 are registered, then only
> device 1 is unregistered in below code:
>
> ...
> // Register platform test device 1, resource 0xfed1a000 ~ 0xfed1afff
> pdev1 = platform_device_register_full(&pdevinfo1);
>
> // Register platform test device 2, resource 0xfed1a200 ~ 0xfed1a2ff
> pdev2 = platform_device_register_full(&pdevinfo2);
>
> // Now platform device 2 resource should be device 1 resource's
> child
>
> // Unregister device 1 only
> platform_device_unregister(pdev1);
> ...
Don't do that. :)
You created this mess of platform devices, so you need to keep track of them.
> Platform device 2 resource will be released as well because its parent
> resource(device 1's resource) is released, this is not expected.
> If using API remove_resource, device 2 resource will not be released.
>
> This change fixed an intel pmc platform device resource issue when
> intel pmc ipc kernel module is inserted/removed for twice.
Why not fix that kernel module instead? It seems like that is the real problem here, not a driver core issue.
thanks,
greg k-h
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?
A: No.
Q: Should I include quotations after my reply?
http://daringfireball.net/2007/07/on_top
On Fri, Jul 26, 2019 at 10:24:22AM +0000, Chang, Junxiao wrote:
> Hi Greg,
>
> Thank you for looking at it.
>
> Below example is simplified description. Our detail problem is:
> 1. ACPI driver registers a MEM resource during bootup;
> 2. Our PUNIT(Intel CPU power management module) platform device reads ACPI driver's resource, and registers same MEM resource;
> 3. According to current resource management logic, if two resources are same, later registered resource will be parent. That is, PUNIT platform device's resource will be ACPI driver resource's parent.
> 4. PUNIT kernel module is removed, its resource will be removed. If we use original API "release_resource", ACPI driver's resource will be released as well because it is PUNIT device's child;
Why did you remove this module? That never happens unless you do it "by
hand", and as root. Just don't do this if it causes problems in your
system, right?
Anyway, if you create a child reference you should always clean up all
of your children before removing yourself from memory. So try fixing
that up first.
> 5. Next time PUNIT platform device kernel module is inserted, it might read wrong ACPI MEM resource because it has been released in step 4.
>
> How should we handle this case? :)
Don't unload kernel modules unless you know what you are doing :)
Seriously, you are creating a dependancy here that you are not
expressing in your module reference count, and you are not properly
cleaning up after yourself when removing your devices. Just delete your
child devices when unloading yourself and you should be fine, right?
> We should not register same MEM resource in step 2? Or, make change in
> resource management logic, if two resources are same, later registered
> resource should be child instead of parent?
I don't know how your resource management logic works, try working with
the developers who wrote that. But as you describe it, it is not
correct at least when you try to clean things up.
thanks,
greg k-h