2022-04-29 10:43:37

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs

On 29/04/2022 03:14, Duoming Zhou wrote:
> There are destructive operations such as nfcmrvl_fw_dnld_abort and
> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware,
> gpio and so on could be destructed while the upper layer functions such as
> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads
> to double-free, use-after-free and null-ptr-deref bugs.
>
> There are three situations that could lead to double-free bugs.
>
> The first situation is shown below:
>
> (Thread 1) | (Thread 2)
> nfcmrvl_fw_dnld_start |
> ... | nfcmrvl_nci_unregister_dev
> release_firmware() | nfcmrvl_fw_dnld_abort
> kfree(fw) //(1) | fw_dnld_over
> | release_firmware
> ... | kfree(fw) //(2)
> | ...
>
> The second situation is shown below:
>
> (Thread 1) | (Thread 2)
> nfcmrvl_fw_dnld_start |
> ... |
> mod_timer |
> (wait a time) |
> fw_dnld_timeout | nfcmrvl_nci_unregister_dev
> fw_dnld_over | nfcmrvl_fw_dnld_abort
> release_firmware | fw_dnld_over
> kfree(fw) //(1) | release_firmware
> ... | kfree(fw) //(2)

How exactly the case here is being prevented?

If nfcmrvl_nci_unregister_dev() happens slightly earlier, before
fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it?


>
> The third situation is shown below:
>
> (Thread 1) | (Thread 2)
> nfcmrvl_nci_recv_frame |
> if(..->fw_download_in_progress)|
> nfcmrvl_fw_dnld_recv_frame |
> queue_work |
> |
> fw_dnld_rx_work | nfcmrvl_nci_unregister_dev
> fw_dnld_over | nfcmrvl_fw_dnld_abort
> release_firmware | fw_dnld_over
> kfree(fw) //(1) | release_firmware
> | kfree(fw) //(2)
>
> The firmware struct is deallocated in position (1) and deallocated
> in position (2) again.
>
> The crash trace triggered by POC is like below:
>
> [ 122.640457] BUG: KASAN: double-free or invalid-free in fw_dnld_over+0x28/0xf0
> [ 122.640457] Call Trace:
> [ 122.640457] <TASK>
> [ 122.640457] kfree+0xb0/0x330
> [ 122.640457] fw_dnld_over+0x28/0xf0
> [ 122.640457] nfcmrvl_nci_unregister_dev+0x61/0x70
> [ 122.640457] nci_uart_tty_close+0x87/0xd0
> [ 122.640457] tty_ldisc_kill+0x3e/0x80
> [ 122.640457] tty_ldisc_hangup+0x1b2/0x2c0
> [ 122.640457] __tty_hangup.part.0+0x316/0x520
> [ 122.640457] tty_release+0x200/0x670
> [ 122.640457] __fput+0x110/0x410
> [ 122.640457] task_work_run+0x86/0xd0
> [ 122.640457] exit_to_user_mode_prepare+0x1aa/0x1b0
> [ 122.640457] syscall_exit_to_user_mode+0x19/0x50
> [ 122.640457] do_syscall_64+0x48/0x90
> [ 122.640457] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 122.640457] RIP: 0033:0x7f68433f6beb

Please always strip unrelated parts of oops, so minimum the timing. The
addresses could be removed as well.

>
> What's more, there are also use-after-free and null-ptr-deref bugs
> in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or
> set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev,
> then, we dereference firmware, gpio or the members of priv->fw_dnld in
> nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen.
>
> This patch reorders destructive operations after nci_unregister_device
> and uses bool variable in nfc_dev to synchronize between cleanup routine
> and firmware download routine. The process is shown below.
>
> (Thread 1) | (Thread 2)
> nfcmrvl_nci_unregister_dev |
> nci_unregister_device |
> nfc_unregister_device | nfc_fw_download
> device_lock() |
> ... |
> dev->dev_register = false; | ...
> device_unlock() |
> ... | device_lock()
> (destructive operations) | if(!dev->dev_register)
> | goto error;
> | error:
> | device_unlock()

How T2 calls appeared here? They were not present in any of your
previous process-examples...

>
> If the device is detaching, the download function will goto error.
> If the download function is executing, nci_unregister_device will
> wait until download function is finished.
>
> Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support")
> Signed-off-by: Duoming Zhou <[email protected]>
> ---
> Changes in v5:
> - Use bool variable added in patch 1/2 to synchronize.
>
> drivers/nfc/nfcmrvl/main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c
> index 2fcf545012b..1a5284de434 100644
> --- a/drivers/nfc/nfcmrvl/main.c
> +++ b/drivers/nfc/nfcmrvl/main.c
> @@ -183,6 +183,7 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
> {
> struct nci_dev *ndev = priv->ndev;
>
> + nci_unregister_device(ndev);
> if (priv->ndev->nfc_dev->fw_download_in_progress)
> nfcmrvl_fw_dnld_abort(priv);
>
> @@ -191,7 +192,6 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
> if (gpio_is_valid(priv->config.reset_n_io))
> gpio_free(priv->config.reset_n_io);
>
> - nci_unregister_device(ndev);
> nci_free_device(ndev);
> kfree(priv);
> }


Best regards,
Krzysztof


2022-04-29 21:05:21

by Duoming Zhou

[permalink] [raw]
Subject: Re: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs

Hello,

On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote:

> > There are destructive operations such as nfcmrvl_fw_dnld_abort and
> > gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware,
> > gpio and so on could be destructed while the upper layer functions such as
> > nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads
> > to double-free, use-after-free and null-ptr-deref bugs.
> >
> > There are three situations that could lead to double-free bugs.
> >
> > The first situation is shown below:
> >
> > (Thread 1) | (Thread 2)
> > nfcmrvl_fw_dnld_start |
> > ... | nfcmrvl_nci_unregister_dev
> > release_firmware() | nfcmrvl_fw_dnld_abort
> > kfree(fw) //(1) | fw_dnld_over
> > | release_firmware
> > ... | kfree(fw) //(2)
> > | ...
> >
> > The second situation is shown below:
> >
> > (Thread 1) | (Thread 2)
> > nfcmrvl_fw_dnld_start |
> > ... |
> > mod_timer |
> > (wait a time) |
> > fw_dnld_timeout | nfcmrvl_nci_unregister_dev
> > fw_dnld_over | nfcmrvl_fw_dnld_abort
> > release_firmware | fw_dnld_over
> > kfree(fw) //(1) | release_firmware
> > ... | kfree(fw) //(2)
>
> How exactly the case here is being prevented?
>
> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before
> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it?

I think it could be prevented. We use nci_unregister_device() to synchronize, if the
firmware download routine is running, the cleanup routine will wait it to finish.
The flag "fw_download_in_progress" will be set to false, if the the firmware download
routine is finished.

Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev()
will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort()
in nfcmrvl_nci_unregister_dev() will not execute.

> >
> > The third situation is shown below:
> >
> > (Thread 1) | (Thread 2)
> > nfcmrvl_nci_recv_frame |
> > if(..->fw_download_in_progress)|
> > nfcmrvl_fw_dnld_recv_frame |
> > queue_work |
> > |
> > fw_dnld_rx_work | nfcmrvl_nci_unregister_dev
> > fw_dnld_over | nfcmrvl_fw_dnld_abort
> > release_firmware | fw_dnld_over
> > kfree(fw) //(1) | release_firmware
> > | kfree(fw) //(2)
> >
> > The firmware struct is deallocated in position (1) and deallocated
> > in position (2) again.
> >
> > The crash trace triggered by POC is like below:
> >
> > [ 122.640457] BUG: KASAN: double-free or invalid-free in fw_dnld_over+0x28/0xf0
> > [ 122.640457] Call Trace:
> > [ 122.640457] <TASK>
> > [ 122.640457] kfree+0xb0/0x330
> > [ 122.640457] fw_dnld_over+0x28/0xf0
> > [ 122.640457] nfcmrvl_nci_unregister_dev+0x61/0x70
> > [ 122.640457] nci_uart_tty_close+0x87/0xd0
> > [ 122.640457] tty_ldisc_kill+0x3e/0x80
> > [ 122.640457] tty_ldisc_hangup+0x1b2/0x2c0
> > [ 122.640457] __tty_hangup.part.0+0x316/0x520
> > [ 122.640457] tty_release+0x200/0x670
> > [ 122.640457] __fput+0x110/0x410
> > [ 122.640457] task_work_run+0x86/0xd0
> > [ 122.640457] exit_to_user_mode_prepare+0x1aa/0x1b0
> > [ 122.640457] syscall_exit_to_user_mode+0x19/0x50
> > [ 122.640457] do_syscall_64+0x48/0x90
> > [ 122.640457] entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 122.640457] RIP: 0033:0x7f68433f6beb
>
> Please always strip unrelated parts of oops, so minimum the timing. The
> addresses could be removed as well.

Thanks, I will strip unrelated parts of oops.

> >
> > What's more, there are also use-after-free and null-ptr-deref bugs
> > in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or
> > set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev,
> > then, we dereference firmware, gpio or the members of priv->fw_dnld in
> > nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen.
> >
> > This patch reorders destructive operations after nci_unregister_device
> > and uses bool variable in nfc_dev to synchronize between cleanup routine
> > and firmware download routine. The process is shown below.
> >
> > (Thread 1) | (Thread 2)
> > nfcmrvl_nci_unregister_dev |
> > nci_unregister_device |
> > nfc_unregister_device | nfc_fw_download
> > device_lock() |
> > ... |
> > dev->dev_register = false; | ...
> > device_unlock() |
> > ... | device_lock()
> > (destructive operations) | if(!dev->dev_register)
> > | goto error;
> > | error:
> > | device_unlock()
>
> How T2 calls appeared here? They were not present in any of your
> previous process-examples...

The firmware download process is shown below:

nfc_genl_fw_download()->nfc_fw_download()->nci_fw_download()->nfcmrvl_nci_fw_download()
->nfcmrvl_fw_dnld_start()

So T2 calls is a part of firmware download routine in nfc core layer.

We use bool variable and device_lock() to synchronize between cleanup routine and
firmware download routine in nfc core layer. The above picture is attempt to show
this synchronized process.

> >
> > If the device is detaching, the download function will goto error.
> > If the download function is executing, nci_unregister_device will
> > wait until download function is finished.
> >
> > Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support")
> > Signed-off-by: Duoming Zhou <[email protected]>
> > ---
> > Changes in v5:
> > - Use bool variable added in patch 1/2 to synchronize.
> >
> > drivers/nfc/nfcmrvl/main.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nfc/nfcmrvl/main.c b/drivers/nfc/nfcmrvl/main.c
> > index 2fcf545012b..1a5284de434 100644
> > --- a/drivers/nfc/nfcmrvl/main.c
> > +++ b/drivers/nfc/nfcmrvl/main.c
> > @@ -183,6 +183,7 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
> > {
> > struct nci_dev *ndev = priv->ndev;
> >
> > + nci_unregister_device(ndev);
> > if (priv->ndev->nfc_dev->fw_download_in_progress)
> > nfcmrvl_fw_dnld_abort(priv);
> >
> > @@ -191,7 +192,6 @@ void nfcmrvl_nci_unregister_dev(struct nfcmrvl_private *priv)
> > if (gpio_is_valid(priv->config.reset_n_io))
> > gpio_free(priv->config.reset_n_io);
> >
> > - nci_unregister_device(ndev);
> > nci_free_device(ndev);
> > kfree(priv);
> > }


Best regards,
Duoming Zhou

2022-05-02 18:20:10

by Duoming Zhou

[permalink] [raw]
Subject: Re: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs




> -----原始邮件-----
> 发件人: "Krzysztof Kozlowski" <[email protected]>
> 发送时间: 2022-05-02 14:34:07 (星期一)
> 收件人: [email protected]
> 抄送: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
> 主题: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs
>
> On 29/04/2022 11:13, [email protected] wrote:
> > Hello,
> >
> > On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote:
> >
> >>> There are destructive operations such as nfcmrvl_fw_dnld_abort and
> >>> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware,
> >>> gpio and so on could be destructed while the upper layer functions such as
> >>> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads
> >>> to double-free, use-after-free and null-ptr-deref bugs.
> >>>
> >>> There are three situations that could lead to double-free bugs.
> >>>
> >>> The first situation is shown below:
> >>>
> >>> (Thread 1) | (Thread 2)
> >>> nfcmrvl_fw_dnld_start |
> >>> ... | nfcmrvl_nci_unregister_dev
> >>> release_firmware() | nfcmrvl_fw_dnld_abort
> >>> kfree(fw) //(1) | fw_dnld_over
> >>> | release_firmware
> >>> ... | kfree(fw) //(2)
> >>> | ...
> >>>
> >>> The second situation is shown below:
> >>>
> >>> (Thread 1) | (Thread 2)
> >>> nfcmrvl_fw_dnld_start |
> >>> ... |
> >>> mod_timer |
> >>> (wait a time) |
> >>> fw_dnld_timeout | nfcmrvl_nci_unregister_dev
> >>> fw_dnld_over | nfcmrvl_fw_dnld_abort
> >>> release_firmware | fw_dnld_over
> >>> kfree(fw) //(1) | release_firmware
> >>> ... | kfree(fw) //(2)
> >>
> >> How exactly the case here is being prevented?
> >>
> >> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before
> >> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it?
> >
> > I think it could be prevented. We use nci_unregister_device() to synchronize, if the
> > firmware download routine is running, the cleanup routine will wait it to finish.
> > The flag "fw_download_in_progress" will be set to false, if the the firmware download
> > routine is finished.
>
> fw_download_in_progress is not synchronized in
> nfcmrvl_nci_unregister_dev(), so even if fw_dnld_timeout() set it to
> true, the nfcmrvl_nci_unregister_dev() happening concurrently will not
> see updated fw_download_in_progress.

The fw_download_in_progress is set to false in nfc_fw_download(). The nfc_fw_download() is
synchronized with nfc_unregister_device(). If nfc_fw_download() is running, nfc_unregister_device()
will wait nfc_fw_download() to finish. So the nfcmrvl_nci_unregister_dev() could see the updated
fw_download_in_progress. The process is shown below:

(Thread 1) | (Thread 2)
nfcmrvl_nci_unregister_dev | nfc_fw_download
nci_unregister_device | ...
| device_lock()
... | dev->fw_download_in_progress = false; //(1)
| device_unlock()
nfc_unregister_device |
if (priv->ndev->nfc_dev->fw_download_in_progress) //(2) |
nfcmrvl_fw_dnld_abort(priv); //not execute |

We set fw_download_in_progress to false in position (1) and the check in position (2) will fail,
the nfcmrvl_fw_dnld_abort() in nfcmrvl_nci_unregister_dev() will not execute. So the double-free
bugs could be prevented.

> > Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev()
> > will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort()
> > in nfcmrvl_nci_unregister_dev() will not execute.
>
> I am sorry, but you cannot move code around hoping it will by itself
> solve synchronization issues.

I think this solution sove synchronization issues. If you still have any questions welcome to ask me.

Best regards,
Duoming Zhou

2022-05-02 23:36:44

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs

On 29/04/2022 11:13, [email protected] wrote:
> Hello,
>
> On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote:
>
>>> There are destructive operations such as nfcmrvl_fw_dnld_abort and
>>> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware,
>>> gpio and so on could be destructed while the upper layer functions such as
>>> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads
>>> to double-free, use-after-free and null-ptr-deref bugs.
>>>
>>> There are three situations that could lead to double-free bugs.
>>>
>>> The first situation is shown below:
>>>
>>> (Thread 1) | (Thread 2)
>>> nfcmrvl_fw_dnld_start |
>>> ... | nfcmrvl_nci_unregister_dev
>>> release_firmware() | nfcmrvl_fw_dnld_abort
>>> kfree(fw) //(1) | fw_dnld_over
>>> | release_firmware
>>> ... | kfree(fw) //(2)
>>> | ...
>>>
>>> The second situation is shown below:
>>>
>>> (Thread 1) | (Thread 2)
>>> nfcmrvl_fw_dnld_start |
>>> ... |
>>> mod_timer |
>>> (wait a time) |
>>> fw_dnld_timeout | nfcmrvl_nci_unregister_dev
>>> fw_dnld_over | nfcmrvl_fw_dnld_abort
>>> release_firmware | fw_dnld_over
>>> kfree(fw) //(1) | release_firmware
>>> ... | kfree(fw) //(2)
>>
>> How exactly the case here is being prevented?
>>
>> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before
>> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it?
>
> I think it could be prevented. We use nci_unregister_device() to synchronize, if the
> firmware download routine is running, the cleanup routine will wait it to finish.
> The flag "fw_download_in_progress" will be set to false, if the the firmware download
> routine is finished.

fw_download_in_progress is not synchronized in
nfcmrvl_nci_unregister_dev(), so even if fw_dnld_timeout() set it to
true, the nfcmrvl_nci_unregister_dev() happening concurrently will not
see updated fw_download_in_progress.

>
> Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev()
> will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort()
> in nfcmrvl_nci_unregister_dev() will not execute.

I am sorry, but you cannot move code around hoping it will by itself
solve synchronization issues.

Best regards,
Krzysztof

2022-05-04 13:56:48

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs

On 02/05/2022 09:25, [email protected] wrote:
>
>
>
>> -----原始邮件-----
>> 发件人: "Krzysztof Kozlowski" <[email protected]>
>> 发送时间: 2022-05-02 14:34:07 (星期一)
>> 收件人: [email protected]
>> 抄送: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
>> 主题: Re: [PATCH net v5 2/2] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs
>>
>> On 29/04/2022 11:13, [email protected] wrote:
>>> Hello,
>>>
>>> On Fri, 29 Apr 2022 09:27:48 +0200 Krzysztof wrote:
>>>
>>>>> There are destructive operations such as nfcmrvl_fw_dnld_abort and
>>>>> gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware,
>>>>> gpio and so on could be destructed while the upper layer functions such as
>>>>> nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads
>>>>> to double-free, use-after-free and null-ptr-deref bugs.
>>>>>
>>>>> There are three situations that could lead to double-free bugs.
>>>>>
>>>>> The first situation is shown below:
>>>>>
>>>>> (Thread 1) | (Thread 2)
>>>>> nfcmrvl_fw_dnld_start |
>>>>> ... | nfcmrvl_nci_unregister_dev
>>>>> release_firmware() | nfcmrvl_fw_dnld_abort
>>>>> kfree(fw) //(1) | fw_dnld_over
>>>>> | release_firmware
>>>>> ... | kfree(fw) //(2)
>>>>> | ...
>>>>>
>>>>> The second situation is shown below:
>>>>>
>>>>> (Thread 1) | (Thread 2)
>>>>> nfcmrvl_fw_dnld_start |
>>>>> ... |
>>>>> mod_timer |
>>>>> (wait a time) |
>>>>> fw_dnld_timeout | nfcmrvl_nci_unregister_dev
>>>>> fw_dnld_over | nfcmrvl_fw_dnld_abort
>>>>> release_firmware | fw_dnld_over
>>>>> kfree(fw) //(1) | release_firmware
>>>>> ... | kfree(fw) //(2)
>>>>
>>>> How exactly the case here is being prevented?
>>>>
>>>> If nfcmrvl_nci_unregister_dev() happens slightly earlier, before
>>>> fw_dnld_timeout() on the left side (T1), the T1 will still hit it, won't it?
>>>
>>> I think it could be prevented. We use nci_unregister_device() to synchronize, if the
>>> firmware download routine is running, the cleanup routine will wait it to finish.
>>> The flag "fw_download_in_progress" will be set to false, if the the firmware download
>>> routine is finished.
>>
>> fw_download_in_progress is not synchronized in
>> nfcmrvl_nci_unregister_dev(), so even if fw_dnld_timeout() set it to
>> true, the nfcmrvl_nci_unregister_dev() happening concurrently will not
>> see updated fw_download_in_progress.
>
> The fw_download_in_progress is set to false in nfc_fw_download(). The nfc_fw_download() is
> synchronized with nfc_unregister_device().

No, it is not. There is no synchronization primitive in
nfc_unregister_device(), at least explicitly.

> If nfc_fw_download() is running, nfc_unregister_device()
> will wait nfc_fw_download() to finish. So the nfcmrvl_nci_unregister_dev() could see the updated
> fw_download_in_progress. The process is shown below:
>
> (Thread 1) | (Thread 2)
> nfcmrvl_nci_unregister_dev | nfc_fw_download
> nci_unregister_device | ...
> | device_lock()
> ... | dev->fw_download_in_progress = false; //(1)
> | device_unlock()
> nfc_unregister_device |
> if (priv->ndev->nfc_dev->fw_download_in_progress) //(2) |
> nfcmrvl_fw_dnld_abort(priv); //not execute |
>
> We set fw_download_in_progress to false in position (1) and the check in position (2) will fail,
> the nfcmrvl_fw_dnld_abort() in nfcmrvl_nci_unregister_dev() will not execute. So the double-free
> bugs could be prevented.

You just repeated the same not answering the question. The
fw_download_in_progress at point (2) can be still true, on that CPU. I
explain it third time so let me rephrase it - the
fw_download_in_progress can be reordered by compiler or CPU to:

T1 | T2
nfcmrvl_nci_unregister_dev()
nci_unregister_device()
var = fw_download_in_progress; (true)
| nfc_fw_download
| device_lock
| dev->fw_download = false;
| device_unlock
if (var) |
nfcmrvl_fw_dnld_abort(priv); |

Every write barrier must be paired with read barrier. Every lock on one
access to variable, must be paired with same lock on other access to
variable .

>
>>> Although the timer handler fw_dnld_timeout() could be running, nfcmrvl_nci_unregister_dev()
>>> will check the flag "fw_download_in_progress" which is already set to false and nfcmrvl_fw_dnld_abort()
>>> in nfcmrvl_nci_unregister_dev() will not execute.
>>
>> I am sorry, but you cannot move code around hoping it will by itself
>> solve synchronization issues.
>
> I think this solution sove synchronization issues. If you still have any questions welcome to ask me.

No, you still do not get the pint. You cannot move code around because
this itself does not solve missing synchronization primitives and
related issues.


Best regards,
Krzysztof