2019-10-09 01:36:59

by Chris Lew

Subject: [PATCH] rpmsg: glink: Remove channel decouple from rpdev release

If a channel is being rapidly restarted and the kobj release worker is
busy, there is a chance the rpdev_release function will run after
the channel struct itself has been released.

There should not be a need to decouple the channel from rpdev in the
rpdev release since that should only happen from the channel close
commands.

Signed-off-by: Chris Lew <[email protected]>
---
 drivers/rpmsg/qcom_glink_native.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/rpmsg/qcom_glink_native.c b/drivers/rpmsg/qcom_glink_native.c
index 621f1afd4d6b..836a0bd99d11 100644
--- a/drivers/rpmsg/qcom_glink_native.c
+++ b/drivers/rpmsg/qcom_glink_native.c
@@ -1350,9 +1350,7 @@ static const struct rpmsg_endpoint_ops glink_endpoint_ops = {
 static void qcom_glink_rpdev_release(struct device *dev)
 {
 	struct rpmsg_device *rpdev = to_rpmsg_device(dev);
-	struct glink_channel *channel = to_glink_channel(rpdev->ept);

-	channel->rpdev = NULL;
 	kfree(rpdev);
 }

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
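
The ordering described in the commit message can be sketched in user space as follows. This is only an illustration under stated assumptions: the `model_*` names are hypothetical stand-ins rather than the driver's real structures, and an `alive` flag replaces the actual kfree() of the channel so that the stale access can be counted instead of touching freed memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct model_channel {
	bool alive;	/* false once the channel struct has been "released" */
	void *rpdev;	/* back-pointer that the old release callback cleared */
};

struct model_rpdev {
	struct model_channel *channel;
};

/* Counts accesses to a channel that was already released. */
static int stale_accesses;

static void model_channel_release(struct model_channel *ch)
{
	ch->alive = false;	/* stands in for freeing the channel */
}

/* Old behaviour: the device release reaches back into the channel. */
static void model_rpdev_release_old(struct model_rpdev *rpdev)
{
	struct model_channel *ch = rpdev->channel;

	if (!ch->alive)
		stale_accesses++;	/* in the kernel: a use-after-free */
	else
		ch->rpdev = NULL;
	free(rpdev);
}

/* Patched behaviour: only free the device itself. */
static void model_rpdev_release_new(struct model_rpdev *rpdev)
{
	free(rpdev);
}

/*
 * Replays the problematic ordering: the channel is released first and the
 * device release callback runs late. Returns the number of stale accesses,
 * or -1 if the allocation fails.
 */
static int run_delayed_release(bool patched)
{
	struct model_channel ch = { .alive = true, .rpdev = NULL };
	struct model_rpdev *rpdev = malloc(sizeof(*rpdev));

	if (!rpdev)
		return -1;
	rpdev->channel = &ch;
	ch.rpdev = rpdev;
	stale_accesses = 0;

	model_channel_release(&ch);		/* channel goes away first */
	if (patched)				/* release worker runs late */
		model_rpdev_release_new(rpdev);
	else
		model_rpdev_release_old(rpdev);
	return stale_accesses;
}
```

Calling `run_delayed_release(false)` models the pre-patch callback and reports one stale access; `run_delayed_release(true)` models the patched callback, which never looks at the channel. The sketch says nothing about other races in the channel cleanup paths.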


2019-10-10 05:05:34

by Stephen Boyd

Subject: Re: [PATCH] rpmsg: glink: Remove channel decouple from rpdev release

Quoting Chris Lew (2019-10-08 18:33:45)
> If a channel is being rapidly restarted and the kobj release worker is
> busy, there is a chance the rpdev_release function will run after
> the channel struct itself has been released.
>
> There should not be a need to decouple the channel from rpdev in the
> rpdev release since that should only happen from the channel close
> commands.
>
> Signed-off-by: Chris Lew <[email protected]>

Fixes tag? The whole thing sounds broken and probably is still racy in
the face of SMP given that channel->rpdev is tested for "published" or
not. Can you describe the race that you're closing in more detail?

2019-10-11 18:44:03

by Chris Lew

Subject: Re: [PATCH] rpmsg: glink: Remove channel decouple from rpdev release



On 10/9/2019 10:04 PM, Stephen Boyd wrote:
> Quoting Chris Lew (2019-10-08 18:33:45)
>> If a channel is being rapidly restarted and the kobj release worker is
>> busy, there is a chance the rpdev_release function will run after
>> the channel struct itself has been released.
>>
>> There should not be a need to decouple the channel from rpdev in the
>> rpdev release since that should only happen from the channel close
>> commands.
>>
>> Signed-off-by: Chris Lew <[email protected]>
>
> Fixes tag? The whole thing sounds broken and probably is still racy in
> the face of SMP given that channel->rpdev is tested for "published" or
> not. Can you describe the race that you're closing in more detail?
>

Thanks Stephen, I will add a Fixes tag and try to describe the race better.

I agree that the whole thing sounds broken. The glink channel cleanup
code has a couple of bugs that need to be addressed in a more extensive
patch. This patch only addresses the immediate issue: a use-after-free
from one of the races.
