2022-11-23 20:26:20

by Thomas Gleixner

Subject: [patch V3 17/17] Bluetooth: hci_qca: Fix the teardown problem for real

While discussing solutions for the teardown problem which results from
circular dependencies between timers and workqueues, where timers schedule
work from their timer callback and workqueues arm the timers from work
items, it was discovered that the recent fix to the QCA code is incorrect.

That commit fixes the obvious problem of using del_timer() instead of
del_timer_sync() and reorders the teardown calls to

    destroy_workqueue(wq);
    del_timer_sync(t);

This makes it less likely to explode, but it's still broken:

    destroy_workqueue(wq);
    /* After this point @wq cannot be touched anymore */

    ---> timer expires
             queue_work(wq) <---- Results in a NULl pointer dereference
                                  deep in the work queue core code.
    del_timer_sync(t);

Use the new timer_shutdown_sync() function to ensure that the timers are
disarmed, no timer callbacks are running and the timers cannot be armed
again. This restores the original teardown sequence:

    timer_shutdown_sync(t);
    destroy_workqueue(wq);

which is now correct because the timer core silently ignores potential
rearming attempts which can happen when destroy_workqueue() drains pending
work before mopping up the workqueue.

Fixes: 72ef98445aca ("Bluetooth: hci_qca: Use del_timer_sync() before freeing")
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Acked-by: Luiz Augusto von Dentz <[email protected]>
Cc: Marcel Holtmann <[email protected]>
Cc: Johan Hedberg <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/all/87iljhsftt.ffs@tglx
---
drivers/bluetooth/hci_qca.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -696,9 +696,15 @@ static int qca_close(struct hci_uart *hu
 	skb_queue_purge(&qca->tx_wait_q);
 	skb_queue_purge(&qca->txq);
 	skb_queue_purge(&qca->rx_memdump_q);
+	/*
+	 * Shut the timers down so they can't be rearmed when
+	 * destroy_workqueue() drains pending work which in turn might try
+	 * to arm a timer. After shutdown rearm attempts are silently
+	 * ignored by the timer core code.
+	 */
+	timer_shutdown_sync(&qca->tx_idle_timer);
+	timer_shutdown_sync(&qca->wake_retrans_timer);
 	destroy_workqueue(qca->workqueue);
-	del_timer_sync(&qca->tx_idle_timer);
-	del_timer_sync(&qca->wake_retrans_timer);
 	qca->hu = NULL;

 	kfree_skb(qca->rx_skb);
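
The ordering argument in the changelog can be illustrated with a small
userspace toy model. This is only a sketch and not kernel code: every name
in it (toy_timer, toy_wq, toy_arm() and so on) is hypothetical, and it
assumes that shutdown can be modelled by clearing the callback pointer so
that later arm attempts are ignored. It counts how often the timer callback
touches the workqueue after it was destroyed, once for the
destroy-then-delete order and once for the shutdown-then-destroy order.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. All names here are hypothetical, none is kernel API. */
struct toy_timer {
	void (*function)(struct toy_timer *t);	/* NULL once shut down */
	bool armed;
};

struct toy_wq {
	bool alive;	/* false after toy_destroy_wq() */
	bool pending;	/* one queued work item */
};

static struct toy_wq wq;
static struct toy_timer timer;
static int use_after_destroy;	/* callback touched a destroyed wq */

/* Arm the timer. A shut down timer silently ignores the attempt. */
static bool toy_arm(struct toy_timer *t)
{
	if (!t->function)
		return false;
	t->armed = true;
	return true;
}

/* Plain disarm: the timer may still be armed again afterwards. */
static void toy_del(struct toy_timer *t)
{
	t->armed = false;
}

/* Disarm and make every later toy_arm() a no-op. */
static void toy_shutdown(struct toy_timer *t)
{
	t->armed = false;
	t->function = NULL;
}

/* Fire the timer if it is armed. */
static void toy_expire(struct toy_timer *t)
{
	if (!t->armed || !t->function)
		return;
	t->armed = false;
	t->function(t);
}

/* Timer callback: queues work, which is a bug once the wq is gone. */
static void toy_timer_fn(struct toy_timer *t)
{
	(void)t;
	if (!wq.alive) {
		use_after_destroy++;
		return;
	}
	wq.pending = true;
}

/* Work item: rearms the timer (the circular dependency). */
static void toy_work_fn(void)
{
	toy_arm(&timer);
}

/* Drain pending work first, then tear the workqueue down. */
static void toy_destroy_wq(void)
{
	if (wq.pending) {
		wq.pending = false;
		toy_work_fn();
	}
	wq.alive = false;
}

static void toy_setup(void)
{
	wq.alive = true;
	wq.pending = true;
	timer.function = toy_timer_fn;
	timer.armed = false;
	use_after_destroy = 0;
}

/* Order of the earlier fix: destroy the workqueue, then delete the timer. */
static int toy_teardown_destroy_first(void)
{
	toy_setup();
	toy_destroy_wq();	/* drain rearms the timer */
	toy_expire(&timer);	/* fires in the window before the delete */
	toy_del(&timer);
	return use_after_destroy;
}

/* Order of this patch: shut the timer down, then destroy the workqueue. */
static int toy_teardown_shutdown_first(void)
{
	toy_setup();
	toy_shutdown(&timer);
	toy_destroy_wq();	/* rearm attempt is ignored */
	toy_expire(&timer);	/* nothing left to fire */
	return use_after_destroy;
}
```

Calling both helpers from a small main() shows the difference: in this
model the destroy-first order records one access to the destroyed
workqueue, the shutdown-first order records none.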


2022-11-24 13:57:51

by Anna-Maria Behnsen

Subject: Re: [patch V3 17/17] Bluetooth: hci_qca: Fix the teardown problem for real

On Wed, 23 Nov 2022, Thomas Gleixner wrote:

> While discussing solutions for the teardown problem which results from
> circular dependencies between timers and workqueues, where timers schedule
> work from their timer callback and workqueues arm the timers from work
> items, it was discovered that the recent fix to the QCA code is incorrect.
>
> That commit fixes the obvious problem of using del_timer() instead of
> del_timer_sync() and reorders the teardown calls to
>
>     destroy_workqueue(wq);
>     del_timer_sync(t);
>
> This makes it less likely to explode, but it's still broken:
>
>     destroy_workqueue(wq);
>     /* After this point @wq cannot be touched anymore */
>
>     ---> timer expires
>              queue_work(wq) <---- Results in a NULl pointer dereference

The last NIT (for now...): s/NULl/NULL

Thanks,

Anna-Maria