On Sun, 2023-04-23 at 10:24 +0200, Mirsad Goran Todorovac wrote:
> In the function ieee80211_tx_dequeue() there is a locking sequence:
>
> begin:
> spin_lock(&local->queue_stop_reason_lock);
> q_stopped = local->queue_stop_reasons[q];
> spin_unlock(&local->queue_stop_reason_lock);
>
> However small the chance (increased by ftracetest), an asynchronous
> interrupt can occur between spin_lock() and spin_unlock(),
> and the interrupt routine will then attempt to take the same
> &local->queue_stop_reason_lock again.
>
> This is the only remaining spin_lock() on local->queue_stop_reason_lock
> that does not disable interrupts, so it can deadlock
> on the same CPU (core).
>
> This will cause a costly reset of the CPU and wifi device, or a
> complete hang in the single-CPU, single-core scenario.
>
> This is the probable reproduction of the deadlock:
>
> Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: Possible unsafe locking scenario:
> Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: CPU0
> Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: ----
> Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: lock(&local->queue_stop_reason_lock);
> Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: <Interrupt>
> Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel: lock(&local->queue_stop_reason_lock);
> Apr 10 00:58:33 marvin-IdeaPad-3-15ITL6 kernel:
> *** DEADLOCK ***
>
> Fixes: 4444bc2116ae

That fixes tag is wrong, should be

Fixes: 4444bc2116ae ("wifi: mac80211: Proper mark iTXQs for resumption")

Otherwise seems fine to me, submit it properly?

johannes

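For readers following the thread, the scenario in the quoted report can be modelled in plain userspace C. This is only a sketch under stated assumptions: every `model_*` name and the boolean "lock" are hypothetical stand-ins, not mac80211 or kernel APIs, and the interrupt is simulated by a direct function call at the worst possible moment. It illustrates why taking the lock without masking interrupts can self-deadlock on one CPU, and why the variant that disables interrupts first (in the kernel, the spin_lock_irqsave()/spin_unlock_irqrestore() pair) avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: every name below is hypothetical, not a kernel API. */
static bool lock_held;                      /* models queue_stop_reason_lock */
static bool irqs_enabled = true;            /* models the local CPU interrupt flag */
static bool deadlocked;                     /* set when the handler would spin forever */
static unsigned long queue_stop_reasons[4]; /* models local->queue_stop_reasons[] */

/* Models an interrupt handler that takes the same lock. */
static void model_irq_handler(void)
{
	if (lock_held) {
		/* A real spinlock would spin here forever on this CPU. */
		deadlocked = true;
		return;
	}
	lock_held = true;
	queue_stop_reasons[0] |= 1UL;
	lock_held = false;
}

/* Models an interrupt arriving now; it is not delivered while masked.
 * (Real hardware would deliver it after unmasking; omitted for brevity.) */
static void model_raise_irq(void)
{
	if (irqs_enabled)
		model_irq_handler();
}

/* Reads one stop-reason word; returns true if no modelled deadlock occurred. */
static bool model_read_stop_reason(int q, bool disable_irqs, unsigned long *out)
{
	bool saved = irqs_enabled;

	deadlocked = false;
	if (disable_irqs)
		irqs_enabled = false;   /* the part spin_lock_irqsave() adds */

	assert(!lock_held);
	lock_held = true;
	model_raise_irq();              /* worst case: interrupt inside the critical section */
	*out = queue_stop_reasons[q];
	lock_held = false;

	irqs_enabled = saved;           /* the part spin_unlock_irqrestore() adds */
	return !deadlocked;
}
```

In this model, model_read_stop_reason(0, false, &v) returns false because the simulated handler finds the lock already held, while model_read_stop_reason(0, true, &v) returns true because the interrupt is never delivered inside the critical section.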
On 24.4.2023. 19:27, Johannes Berg wrote:
> That fixes tag is wrong, should be
>
> Fixes: 4444bc2116ae ("wifi: mac80211: Proper mark iTXQs for resumption")
>
> Otherwise seems fine to me, submit it properly?
>
> johannes

Will do, Sir. Do I have an Acked-by: ?

Thank you.

Mirsad

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union