LinuxLists.cc - [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister

2022-04-11 13:00:35

Subject: [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev()

There is a potential double bug in nfcmrvl usb driver between
unregister and resume operation.

The race that cause that double free bug can be shown as below:

(FREE) | (USE)
| nfcmrvl_resume
| nfcmrvl_submit_bulk_urb
| nfcmrvl_bulk_complete
| nfcmrvl_nci_recv_frame
| nfcmrvl_fw_dnld_recv_frame
| queue_work
| fw_dnld_rx_work
| fw_dnld_over
| release_firmware
| kfree(fw); //(1)
nfcmrvl_disconnect |
nfcmrvl_nci_unregister_dev |
nfcmrvl_fw_dnld_abort |
fw_dnld_over | ...
if (priv->fw_dnld.fw) |
release_firmware |
kfree(fw); //(2) |
... | fw = NULL;

When nfcmrvl usb driver is resuming, we detach the device.
The release_firmware() will deallocate firmware in position (1),
but firmware will be deallocated again in position (2), which
leads to double free.

This patch reorders nfcmrvl_fw_dnld_deinit() before nfcmrvl_fw_dnld_abort()
in order to prevent double free bug. Because destroy_workqueue() will
not return until all work is finished.

Signed-off-by: Duoming Zhou <[email protected]>
---
drivers/nfc/nfcmrvl/main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c
index 2fcf545012b..d8342271f50 100644
--- a/drivers/nfc/nfcmrvl/main.c
+++ b/drivers/nfc/nfcmrvl/main.c
@@ -183,11 +183,10 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
{
struct nci_dev *ndev = priv->ndev;

+ nfcmrvl_fw_dnld_deinit(priv);
if (priv->ndev->nfc_dev->fw_download_in_progress)
nfcmrvl_fw_dnld_abort(priv);

- nfcmrvl_fw_dnld_deinit(priv);
-
if (gpio_is_valid(priv->config.reset_n_io))
gpio_free(priv->config.reset_n_io);

--
2.17.1

2022-04-11 22:19:13

by Krzysztof Kozlowski

[permalink] [raw]

Subject: Re: [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev()

2022-04-12 13:22:05

by Duoming Zhou

[permalink] [raw]

Subject: Re: Re: [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev()

Hello,

On Sun, 10 Apr 2022 11:27:09 +0200 Krzysztof Kozlowski wrote:

> > There is a potential double bug in nfcmrvl usb driver between
> > unregister and resume operation.
> >
>
> Thank you for your patch. There is something to discuss/improve.
>
> > The race that cause that double free bug can be shown as below:
>
> Your patch solves the most visible race, but because of lack of locking,
> I believe race still might exist:
>
> (FREE) | (USE)
> | nfcmrvl_resume
> | nfcmrvl_submit_bulk_urb
> | nfcmrvl_bulk_complete
> | nfcmrvl_nci_recv_frame
> | nfcmrvl_fw_dnld_recv_frame
> | queue_work
> | fw_dnld_rx_work
> nfcmrvl_disconnect |
> nfcmrvl_nci_unregister_dev |
> nfcmrvl_fw_dnld_deinit |
> wait for the workqueue to finish |
> | fw_dnld_over
> | release_firmware
> | kfree(fw);
> | no synchronization //(1)
> if (fw_download_in_progress)
> - no synchronization, so CPU sees old value
> nfcmrvl_fw_dnld_abort |
> fw_dnld_over | ...
> if (priv->fw_dnld.fw) |
> release_firmware |
> kfree(fw); //(2) |
> ... | fw = NULL;
>
> The kfree() from (2) would still free old value. Even if fw=NULL happens
> earlier, it is not propagated back to the other CPU, unless there are
> some implicit barriers due to workqueue?
>
> Is it safe then to rely on such implicit barriers from workqueue?

I think it is safe to rely on such barriers from destroy_workqueue().
The destroy_workqueue will wait all work items to finish, the code
behind it will not execute until all work items are finished.
The progress is shown below:

destroy_workqueue()-->drain_workqueue()-->flush_workqueue()-->wait_for_completion().

The function drain_workqueue() will wait until the workqueue becomes empty.
It sets wq->flags to __WQ_DRAINING, this could ensure the new coming work items
will not be queued into the draining workqueue. Because __queue_work will check
the __WQ_DRAINING flag.

Then drain_workqueue() calls flush_workqueue() to ensure that any scheduled work
has run to completion. Because wait_for_completion() is called in flush_workqueue(),
it will block current thread and wait work items to completion.

Best regards,
Duoming Zhou.