Hello,
My static analysis tool reports a possible deadlock in the ucsi driver
in Linux 5.16:
ucsi_pr_swap()
mutex_lock(&con->lock); --> Line 962 (Lock A)
wait_for_completion_timeout(&con->complete, ...) --> Line 981 (Wait X)
ucsi_handle_connector_change()
mutex_lock(&con->lock); --> Line 763 (Lock A)
complete(&con->complete); --> Line 782 (Wake X)
complete(&con->complete); --> Line 807 (Wake X)
When ucsi_pr_swap() is executed, "Wait X" is performed by holding "Lock
A". If ucsi_handle_connector_change() is executed at this time, "Wake X"
cannot be performed to wake up "Wait X" in
ucsi_handle_connector_change(), because "Lock A" has been already held
by ucsi_handle_connector_change(), causing a possible deadlock.
I find that "Wait X" is performed with a timeout, to relieve the
possible deadlock; but I think this timeout can cause inefficient execution.
I am not quite sure whether this possible problem is real.
Any feedback would be appreciated, thanks :)
Best wishes,
Jia-Ju Bai
On Wed, Feb 09, 2022 at 11:50:57AM +0800, Jia-Ju Bai wrote:
> Hello,
>
> My static analysis tool reports a possible deadlock in the ucsi driver in
> Linux 5.16:
>
> ucsi_pr_swap()
> ? mutex_lock(&con->lock); --> Line 962 (Lock A)
> ? wait_for_completion_timeout(&con->complete, ...) --> Line 981 (Wait X)
>
> ucsi_handle_connector_change()
> ? mutex_lock(&con->lock); --> Line 763 (Lock A)
> ? complete(&con->complete); --> Line 782 (Wake X)
> ? complete(&con->complete); --> Line 807 (Wake X)
>
> When ucsi_pr_swap() is executed, "Wait X" is performed by holding "Lock A".
> If ucsi_handle_connector_change() is executed at this time, "Wake X" cannot
> be performed to wake up "Wait X" in ucsi_handle_connector_change(), because
> "Lock A" has been already held by ucsi_handle_connector_change(), causing a
> possible deadlock.
> I find that "Wait X" is performed with a timeout, to relieve the possible
> deadlock; but I think this timeout can cause inefficient execution.
>
> I am not quite sure whether this possible problem is real.
> Any feedback would be appreciated, thanks :)
This is probable a regression from commit ad74b8649bea ("usb: typec:
ucsi: Preliminary support for alternate modes"). Can you test does
this patch fix the issue (attached)?
thanks,
--
heikki
On 2022/2/9 22:30, Heikki Krogerus wrote:
> On Wed, Feb 09, 2022 at 11:50:57AM +0800, Jia-Ju Bai wrote:
>> Hello,
>>
>> My static analysis tool reports a possible deadlock in the ucsi driver in
>> Linux 5.16:
>>
>> ucsi_pr_swap()
>> ? mutex_lock(&con->lock); --> Line 962 (Lock A)
>> ? wait_for_completion_timeout(&con->complete, ...) --> Line 981 (Wait X)
>>
>> ucsi_handle_connector_change()
>> ? mutex_lock(&con->lock); --> Line 763 (Lock A)
>> ? complete(&con->complete); --> Line 782 (Wake X)
>> ? complete(&con->complete); --> Line 807 (Wake X)
>>
>> When ucsi_pr_swap() is executed, "Wait X" is performed by holding "Lock A".
>> If ucsi_handle_connector_change() is executed at this time, "Wake X" cannot
>> be performed to wake up "Wait X" in ucsi_handle_connector_change(), because
>> "Lock A" has been already held by ucsi_handle_connector_change(), causing a
>> possible deadlock.
>> I find that "Wait X" is performed with a timeout, to relieve the possible
>> deadlock; but I think this timeout can cause inefficient execution.
>>
>> I am not quite sure whether this possible problem is real.
>> Any feedback would be appreciated, thanks :)
> This is probable a regression from commit ad74b8649bea ("usb: typec:
> ucsi: Preliminary support for alternate modes"). Can you test does
> this patch fix the issue (attached)?
Hi Heikki,
Thanks for the reply and patch.
After the patch is used, my tool does not report this deadlock in the
ucsi driver.
Thus, I think this patch should be okay to fix the deadlock :)
Best wishes,
Jia-Ju Bai
On Thu, Feb 10, 2022 at 10:24:42AM +0800, Jia-Ju Bai wrote:
>
>
> On 2022/2/9 22:30, Heikki Krogerus wrote:
> > On Wed, Feb 09, 2022 at 11:50:57AM +0800, Jia-Ju Bai wrote:
> > > Hello,
> > >
> > > My static analysis tool reports a possible deadlock in the ucsi driver in
> > > Linux 5.16:
> > >
> > > ucsi_pr_swap()
> > > ? mutex_lock(&con->lock); --> Line 962 (Lock A)
> > > ? wait_for_completion_timeout(&con->complete, ...) --> Line 981 (Wait X)
> > >
> > > ucsi_handle_connector_change()
> > > ? mutex_lock(&con->lock); --> Line 763 (Lock A)
> > > ? complete(&con->complete); --> Line 782 (Wake X)
> > > ? complete(&con->complete); --> Line 807 (Wake X)
> > >
> > > When ucsi_pr_swap() is executed, "Wait X" is performed by holding "Lock A".
> > > If ucsi_handle_connector_change() is executed at this time, "Wake X" cannot
> > > be performed to wake up "Wait X" in ucsi_handle_connector_change(), because
> > > "Lock A" has been already held by ucsi_handle_connector_change(), causing a
> > > possible deadlock.
> > > I find that "Wait X" is performed with a timeout, to relieve the possible
> > > deadlock; but I think this timeout can cause inefficient execution.
> > >
> > > I am not quite sure whether this possible problem is real.
> > > Any feedback would be appreciated, thanks :)
> > This is probable a regression from commit ad74b8649bea ("usb: typec:
> > ucsi: Preliminary support for alternate modes"). Can you test does
> > this patch fix the issue (attached)?
>
> Hi Heikki,
>
> Thanks for the reply and patch.
> After the patch is used, my tool does not report this deadlock in the ucsi
> driver.
> Thus, I think this patch should be okay to fix the deadlock :)
Thanks for testing it. I'll tag you as the reporter and tester in the
patch.
Br,
--
heikki
Hi Heikki,
On Wed, Feb 09, 2022 at 04:30:31PM +0200, Heikki Krogerus wrote:
> On Wed, Feb 09, 2022 at 11:50:57AM +0800, Jia-Ju Bai wrote:
> > Hello,
> >
> > My static analysis tool reports a possible deadlock in the ucsi driver in
> > Linux 5.16:
> >
> > ucsi_pr_swap()
> > ? mutex_lock(&con->lock); --> Line 962 (Lock A)
> > ? wait_for_completion_timeout(&con->complete, ...) --> Line 981 (Wait X)
> >
> > ucsi_handle_connector_change()
> > ? mutex_lock(&con->lock); --> Line 763 (Lock A)
> > ? complete(&con->complete); --> Line 782 (Wake X)
> > ? complete(&con->complete); --> Line 807 (Wake X)
> >
> > When ucsi_pr_swap() is executed, "Wait X" is performed by holding "Lock A".
> > If ucsi_handle_connector_change() is executed at this time, "Wake X" cannot
> > be performed to wake up "Wait X" in ucsi_handle_connector_change(), because
> > "Lock A" has been already held by ucsi_handle_connector_change(), causing a
> > possible deadlock.
> > I find that "Wait X" is performed with a timeout, to relieve the possible
> > deadlock; but I think this timeout can cause inefficient execution.
> >
> > I am not quite sure whether this possible problem is real.
> > Any feedback would be appreciated, thanks :)
>
> This is probable a regression from commit ad74b8649bea ("usb: typec:
> ucsi: Preliminary support for alternate modes"). Can you test does
> this patch fix the issue (attached)?
We encountered a slightly different twist to this bug. Instead of
deadlocking, we see that the dr_swap() / pr_swap() operations actually
jump out of the wait_for_completion_timeout() immediately, even before
any partner change occurs. This is because the con->complete may
already have its done flag set to true from the first time
ucsi_handle_connector_change() runs, and is never reset after that.
In addition to the unlocking below, I think we need to also add
reinit_completion() calls at the start of ucsi_{pr,dr}_swap().
Thanks,
Jack
> From 2ad06425a3df7be656f8a5b3c202aab45554fd17 Mon Sep 17 00:00:00 2001
> From: Heikki Krogerus <[email protected]>
> Date: Wed, 9 Feb 2022 17:27:19 +0300
> Subject: [PATCH] usb: typec: ucsi: Test fix
>
> Interim.
>
> Not-Signed-off-by: Heikki Krogerus <[email protected]>
> ---
> drivers/usb/typec/ucsi/ucsi.c | 19 +++++++++++++------
> 1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> index f0c2fa19f3e0f..225104beda8be 100644
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -956,14 +956,18 @@ static int ucsi_dr_swap(struct typec_port *port, enum typec_data_role role)
> if (ret < 0)
> goto out_unlock;
>
> + mutex_unlock(&con->lock);
> +
> if (!wait_for_completion_timeout(&con->complete,
> msecs_to_jiffies(UCSI_SWAP_TIMEOUT_MS)))
> - ret = -ETIMEDOUT;
> + return -ETIMEDOUT;
> +
> + return 0;
>
> out_unlock:
> mutex_unlock(&con->lock);
>
> - return ret < 0 ? ret : 0;
> + return ret;
> }
>
> static int ucsi_pr_swap(struct typec_port *port, enum typec_role role)
> @@ -992,11 +996,13 @@ static int ucsi_pr_swap(struct typec_port *port, enum typec_role role)
> if (ret < 0)
> goto out_unlock;
>
> + mutex_unlock(&con->lock);
> +
> if (!wait_for_completion_timeout(&con->complete,
> - msecs_to_jiffies(UCSI_SWAP_TIMEOUT_MS))) {
> - ret = -ETIMEDOUT;
> - goto out_unlock;
> - }
> + msecs_to_jiffies(UCSI_SWAP_TIMEOUT_MS)))
> + return -ETIMEDOUT;
> +
> + mutex_lock(&con->lock);
>
> /* Something has gone wrong while swapping the role */
> if (UCSI_CONSTAT_PWR_OPMODE(con->status.flags) !=
> @@ -1372,6 +1378,7 @@ void ucsi_unregister(struct ucsi *ucsi)
> ucsi->ops->async_write(ucsi, UCSI_CONTROL, &cmd, sizeof(cmd));
>
> for (i = 0; i < ucsi->cap.num_connectors; i++) {
> + complete(&ucsi->connector[i].complete);
> cancel_work_sync(&ucsi->connector[i].work);
> ucsi_unregister_partner(&ucsi->connector[i]);
> ucsi_unregister_altmodes(&ucsi->connector[i],
> --
> 2.34.1
>
On Fri, Mar 25, 2022 at 01:39:59PM -0700, Jack Pham wrote:
> Hi Heikki,
>
> On Wed, Feb 09, 2022 at 04:30:31PM +0200, Heikki Krogerus wrote:
> > On Wed, Feb 09, 2022 at 11:50:57AM +0800, Jia-Ju Bai wrote:
> > > Hello,
> > >
> > > My static analysis tool reports a possible deadlock in the ucsi driver in
> > > Linux 5.16:
> > >
> > > ucsi_pr_swap()
> > > ? mutex_lock(&con->lock); --> Line 962 (Lock A)
> > > ? wait_for_completion_timeout(&con->complete, ...) --> Line 981 (Wait X)
> > >
> > > ucsi_handle_connector_change()
> > > ? mutex_lock(&con->lock); --> Line 763 (Lock A)
> > > ? complete(&con->complete); --> Line 782 (Wake X)
> > > ? complete(&con->complete); --> Line 807 (Wake X)
> > >
> > > When ucsi_pr_swap() is executed, "Wait X" is performed by holding "Lock A".
> > > If ucsi_handle_connector_change() is executed at this time, "Wake X" cannot
> > > be performed to wake up "Wait X" in ucsi_handle_connector_change(), because
> > > "Lock A" has been already held by ucsi_handle_connector_change(), causing a
> > > possible deadlock.
> > > I find that "Wait X" is performed with a timeout, to relieve the possible
> > > deadlock; but I think this timeout can cause inefficient execution.
> > >
> > > I am not quite sure whether this possible problem is real.
> > > Any feedback would be appreciated, thanks :)
> >
> > This is probable a regression from commit ad74b8649bea ("usb: typec:
> > ucsi: Preliminary support for alternate modes"). Can you test does
> > this patch fix the issue (attached)?
>
> We encountered a slightly different twist to this bug. Instead of
> deadlocking, we see that the dr_swap() / pr_swap() operations actually
> jump out of the wait_for_completion_timeout() immediately, even before
> any partner change occurs. This is because the con->complete may
> already have its done flag set to true from the first time
> ucsi_handle_connector_change() runs, and is never reset after that.
>
> In addition to the unlocking below, I think we need to also add
> reinit_completion() calls at the start of ucsi_{pr,dr}_swap().
OK. I'll add that to the patch.
thanks,
--
heikki