Hello,
My static analysis tool reports a possible deadlock in the tty driver in
Linux 5.10:
uart_remove_one_port()
mutex_lock(&port->mutex); --> Line 3017 (Lock A)
wait_event(state->remove_wait, ...); --> Line 3019 (Wait X)
mutex_unlock(&port->mutex); --> Line 3021 (Unlock A)
uart_hangup()
mutex_lock(&port->mutex); --> Line 1667 (Lock A)
uart_flush_buffer()
uart_port_unlock()
uart_port_deref()
wake_up(&uport->state->remove_wait); --> Line 68 (Wake X)
mutex_unlock(&port->mutex); --> Line 1684 (Unlock A)
When uart_remove_one_port() is executed, "Wait X" is performed by
holding "Lock A". If uart_hangup() is executed at this time, "Wake X"
cannot be performed to wake up "Wait X" in uart_remove_one_port(),
because "Lock A" has been already hold by uart_remove_one_port(),
causing a possible deadlock.
I am not quite sure whether this possible problem is real and how to fix
it if it is real.
Maybe we can call wait_event() before mutex_lock() in
uart_remove_one_port().
Any feedback would be appreciated, thanks :)
Best wishes,
Jia-Ju Bai
On Sat, Jan 29, 2022 at 05:34:05PM +0800, Jia-Ju Bai wrote:
> Hello,
>
> My static analysis tool reports a possible deadlock in the tty driver in
> Linux 5.10:
5.10 was released over a year ago and over 100 thousand changes ago.
Please redo your check on 5.16 at the oldest.
thanks,
greg k-h
On 2022/1/29 17:57, Greg KH wrote:
> On Sat, Jan 29, 2022 at 05:34:05PM +0800, Jia-Ju Bai wrote:
>> Hello,
>>
>> My static analysis tool reports a possible deadlock in the tty driver in
>> Linux 5.10:
> 5.10 was released over a year ago and over 100 thousand changes ago.
> Please redo your check on 5.16 at the oldest.
Thanks, Greg.
I will redo my check on 5.16.
Best wishes,
Jia-Ju Bai
On 2022/1/29 17:57, Greg KH wrote:
> On Sat, Jan 29, 2022 at 05:34:05PM +0800, Jia-Ju Bai wrote:
>> Hello,
>>
>> My static analysis tool reports a possible deadlock in the tty driver in
>> Linux 5.10:
> 5.10 was released over a year ago and over 100 thousand changes ago.
> Please redo your check on 5.16 at the oldest.
My static analysis tool checks the tty driver in Linux 5.16, and also
finds this possible deadlock:
uart_remove_one_port()
mutex_lock(&port->mutex); --> Line 3032 (Lock A)
wait_event(state->remove_wait, ...); --> Line 3034 (Wait X)
mutex_unlock(&port->mutex); --> Line 3036 (Unlock A)
uart_hangup()
mutex_lock(&port->mutex); --> Line 1669 (Lock A)
uart_flush_buffer()
uart_port_unlock()
uart_port_deref()
wake_up(&uport->state->remove_wait); --> Line 68 (Wake X)
mutex_unlock(&port->mutex); --> Line 1686 (Unlock A)
When uart_remove_one_port() is executed, "Wait X" is performed by
holding "Lock A". If uart_hangup() is executed at this time, "Wake X"
cannot be performed to wake up "Wait X" in uart_remove_one_port(),
because "Lock A" has been already hold by uart_remove_one_port(),
causing a possible deadlock.
I am not quite sure whether this possible problem is real and how to fix
it if it is real.
Maybe we can call wait_event() before mutex_lock() in
uart_remove_one_port().
Any feedback would be appreciated, thanks :)
Best wishes,
Jia-Ju Bai
On 2022/2/1 17:07, Hillf Danton wrote:
> On Tue, 1 Feb 2022 14:51:09 +0800 Jia-Ju Bai wrote:
>> On 2022/1/29 17:57, Greg KH wrote:
>>> On Sat, Jan 29, 2022 at 05:34:05PM +0800, Jia-Ju Bai wrote:
>>>> Hello,
>>>>
>>>> My static analysis tool reports a possible deadlock in the tty driver in
>>>> Linux 5.10:
>>> 5.10 was released over a year ago and over 100 thousand changes ago.
>>> Please redo your check on 5.16 at the oldest.
>> My static analysis tool checks the tty driver in Linux 5.16, and also
>> finds this possible deadlock:
>>
>> uart_remove_one_port()
>> mutex_lock(&port->mutex); --> Line 3032 (Lock A)
>> wait_event(state->remove_wait, ...); --> Line 3034 (Wait X)
>> mutex_unlock(&port->mutex); --> Line 3036 (Unlock A)
>>
>> uart_hangup()
>> mutex_lock(&port->mutex); --> Line 1669 (Lock A)
>> uart_flush_buffer()
>> uart_port_unlock()
>> uart_port_deref()
>> wake_up(&uport->state->remove_wait); --> Line 68 (Wake X)
>> mutex_unlock(&port->mutex); --> Line 1686 (Unlock A)
>>
>> When uart_remove_one_port() is executed, "Wait X" is performed by
>> holding "Lock A". If uart_hangup() is executed at this time, "Wake X"
>> cannot be performed to wake up "Wait X" in uart_remove_one_port(),
>> because "Lock A" has been already hold by uart_remove_one_port(),
>> causing a possible deadlock.
>>
>> I am not quite sure whether this possible problem is real and how to fix
>> it if it is real.
>> Maybe we can call wait_event() before mutex_lock() in
>> uart_remove_one_port().
>> Any feedback would be appreciated, thanks :)
>>
>>
>> Best wishes,
>> Jia-Ju Bai
> Hey Jia-Ju
>
> Thank you for reporting it.
>
> In uart_flush_buffer(), uart_port_unlock() pairs with uart_port_lock()
> which bumps refcount up. OTOH no wakep is needed without refcount
> incremented, so the wakeup above in the hangup path is not waited for
> in the remove path.
Hi Hillf,
Thanks for the explanation :)
So I wonder which wait_event() can be paired with
wake_up(&uport->state->remove_wait) in uart_port_deref()?
Best wishes,
Jia-Ju Bai