2022-05-24 19:01:50

by Danny van Heumen

[permalink] [raw]
Subject: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Dear all,

I am not a regular C developer nor kernel developer. I don't regularly report issues, so I will probably do things wrong.

I investigated a crash that IIUC occurs with hardened memory allocation enabled and a firmware-crash followed by an early failure during hardware reinitialization/probing. The hardened allocator detects double-free issue.

I have created the patch (see attachment) against linux-5.18. Though, please check carefully, because I have not been able to confirm that it actually works. I am hoping someone familiar with the code-base can either test this easily, or confirm from review/analysis.

The commit message describes it in more detail. In summary:
'brcmf_sdio_bus_reset' cleans up and reinitializes the hardware. It frees memory used by (struct brcmf_sdio_dev)->freezer (struct brcmf_sdiod_freezer). However, it then goes to probe the hardware, and an early failure to probe results in the same freeing, both called through function 'brcmf_sdiod_freezer_detach' called inside 'brcmf_sdiod_remove'. Which results in double freeing.

As mentioned before, I was not able to test this and I do not regularly develop in C. I am not confident that this is the proper way to fix it, but it seemed obvious enough. I hope you can support in fixing this bug.

Kind regards,
Danny

PS: Please let me know if I am doing things wrong. I have included both maintainers and mailing lists from https://docs.kernel.org/process/maintainers.html#broadcom-brcm80211-ieee802-11n-wireless-driver I hope I this is alright.


Attachments:
0001-brcmfmac-prevent-double-free-on-hardware-reset.patch (3.25 kB)

2022-05-31 11:23:27

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi all,

I'd like to follow up with an updated patch. I had another look at the code. I think the
following proposal may correct the control flow to prevent the double-free from happening
in the first place.

Again, I would appreciate any feedback you might have, as I have little experience in this
area. A stacktrace is present in the commit message, in case you are looking for extra data
that demonstrates the issue.

Kind regards,
Danny


------- Original Message -------
On Tuesday, May 24th, 2022 at 18:51, Danny van Heumen <[email protected]> wrote:


>
>
> Dear all,
>
> I am not a regular C developer nor kernel developer. I don't regularly report issues, so I will probably do things wrong.
>
> I investigated a crash that IIUC occurs with hardened memory allocation enabled and a firmware-crash followed by an early failure during hardware reinitialization/probing. The hardened allocator detects double-free issue.
>
> I have created the patch (see attachment) against linux-5.18. Though, please check carefully, because I have not been able to confirm that it actually works. I am hoping someone familiar with the code-base can either test this easily, or confirm from review/analysis.
>
> The commit message describes it in more detail. In summary:
> 'brcmf_sdio_bus_reset' cleans up and reinitializes the hardware. It frees memory used by (struct brcmf_sdio_dev)->freezer (struct brcmf_sdiod_freezer). However, it then goes to probe the hardware, and an early failure to probe results in the same freeing, both called through function 'brcmf_sdiod_freezer_detach' called inside 'brcmf_sdiod_remove'. Which results in double freeing.
>
>
> As mentioned before, I was not able to test this and I do not regularly develop in C. I am not confident that this is the proper way to fix it, but it seemed obvious enough. I hope you can support in fixing this bug.
>
> Kind regards,
> Danny
>
> PS: Please let me know if I am doing things wrong. I have included both maintainers and mailing lists from https://docs.kernel.org/process/maintainers.html#broadcom-brcm80211-ieee802-11n-wireless-driver I hope I this is alright.


Attachments:
0001-brcmfmac-prevent-double-free-on-hardware-reset.patch (3.64 kB)

2022-06-04 10:28:14

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi,

------- Original Message -------
On Monday, May 30th, 2022 at 19:59, Danny van Heumen <[email protected]> wrote:

> Hi all,
>
> I'd like to follow up with an updated patch. I had another look at the code. I think the
> following proposal may correct the control flow to prevent the double-free from happening
> in the first place.
>
> Again, I would appreciate any feedback you might have, as I have little experience in this
> area. A stacktrace is present in the commit message, in case you are looking for extra data
> that demonstrates the issue.

Could someone follow up on this?

I have not received any response, so it is not clear to me if the patch is the issue,
or whether it is something else. I am running these changes on my Pinebook Pro laptop,
so far without issue.

Thanks in advance,
Danny


> [..]
>
> ------- Original Message -------
> On Tuesday, May 24th, 2022 at 18:51, Danny van Heumen [email protected] wrote:
>
>
>
> > Dear all,
> >
> > I am not a regular C developer nor kernel developer. I don't regularly report issues, so I will probably do things wrong.
> >
> > I investigated a crash that IIUC occurs with hardened memory allocation enabled and a firmware-crash followed by an early failure during hardware reinitialization/probing. The hardened allocator detects double-free issue.
> >
> > I have created the patch (see attachment) against linux-5.18. Though, please check carefully, because I have not been able to confirm that it actually works. I am hoping someone familiar with the code-base can either test this easily, or confirm from review/analysis.
> >
> > The commit message describes it in more detail. In summary:
> > 'brcmf_sdio_bus_reset' cleans up and reinitializes the hardware. It frees memory used by (struct brcmf_sdio_dev)->freezer (struct brcmf_sdiod_freezer). However, it then goes to probe the hardware, and an early failure to probe results in the same freeing, both called through function 'brcmf_sdiod_freezer_detach' called inside 'brcmf_sdiod_remove'. Which results in double freeing.
> >
> > As mentioned before, I was not able to test this and I do not regularly develop in C. I am not confident that this is the proper way to fix it, but it seemed obvious enough. I hope you can support in fixing this bug.
> >
> > Kind regards,
> > Danny
> >
> > PS: Please let me know if I am doing things wrong. I have included both maintainers and mailing lists from https://docs.kernel.org/process/maintainers.html#broadcom-brcm80211-ieee802-11n-wireless-driver I hope I this is alright.

2022-06-04 13:06:32

by Franky Lin

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Danny,

>>
>> ------- Original Message -------
>> On Tuesday, May 24th, 2022 at 18:51, Danny van Heumen [email protected] wrote:
>>>
>>> I investigated a crash that IIUC occurs with hardened memory allocation enabled and a firmware-crash followed by an early failure during hardware reinitialization/probing. The hardened allocator detects double-free issue.
>>>
>>> I have created the patch (see attachment) against linux-5.18. Though, please check carefully, because I have not been able to confirm that it actually works. I am hoping someone familiar with the code-base can either test this easily, or confirm from review/analysis.
>>>
>>> The commit message describes it in more detail. In summary:
>>> 'brcmf_sdio_bus_reset' cleans up and reinitializes the hardware. It frees memory used by (struct brcmf_sdio_dev)->freezer (struct brcmf_sdiod_freezer). However, it then goes to probe the hardware, and an early failure to probe results in the same freeing, both called through function 'brcmf_sdiod_freezer_detach' called inside 'brcmf_sdiod_remove'. Which results in double freeing.
>>>

Thanks for reporting and sending out a patch to fix this.

If the problem is double freeing the freezer buffer, it should be addressed from the root by setting pointer to NULL. Same thing might need to be done for sg table as well. Sorry I don’t have any sdio module to reproduce and test. Please see if the below change fixes the problem.

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
index ac02244a6fdf..e9bad7197ba9 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
@@ -802,6 +802,7 @@ static void brcmf_sdiod_freezer_detach(struct brcmf_sdio_dev *sdiodev)
if (sdiodev->freezer) {
WARN_ON(atomic_read(&sdiodev->freezer->freezing));
kfree(sdiodev->freezer);
+ sdiodev->freezer = NULL;
}
}

@@ -885,7 +886,11 @@ int brcmf_sdiod_remove(struct brcmf_sdio_dev *sdiodev)
sdio_disable_func(sdiodev->func1);
sdio_release_host(sdiodev->func1);

- sg_free_table(&sdiodev->sgtable);
+ if (sdiodev->sgtable) {
+ sg_free_table(&sdiodev->sgtable);
+ sdiodev->sgtable = NULL;
+ }
+
sdiodev->sbwad = 0;

pm_runtime_allow(sdiodev->func1->card->host->parent);

As for submitting patch to linux-wireless, please follow the guideline. [1]

Thanks,
- Franky

[1] https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.


Attachments:
smime.p7s (4.10 kB)
S/MIME Cryptographic Signature

2022-06-04 16:33:50

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Franky,

------- Original Message -------
On Saturday, June 4th, 2022 at 00:58, Franky Lin <[email protected]> wrote:

> Hi Danny,
>
> [..]
>
> Thanks for reporting and sending out a patch to fix this.
>
> If the problem is double freeing the freezer buffer, it should be addressed from the root by setting pointer to NULL. Same thing might need to be done for sg table as well. Sorry I don’t have any sdio module to reproduce and test. Please see if the below change fixes the problem.

Your suggestion to set the freeze buffer address to zero was also my first proposal. I have since
revised, because there are a few things I considered, although I am not sure:

- does zero-ing the address prevent future detection of double-frees with the hardened memory
allocator? (If so, I would prefer to avoid doing this.)
- IIUC correctly, 'sdio_set_block_size' does not do any meaningful "activation" or "allocation".
Therefore would not need to be *undone*. (repeated probes would override previous calls)
- Starting with the call 'sdio_enable_func', I guess/suspect more elaborate "cleanup" is necessary
therefore, leaving the 'goto out' from that point on. I would assume (for lack of evidence to the
contrary) that the logic at 'goto out' provides proper clean-up.

So, returning immediately with the errorcode seemed more appropriate. Regardless, I have only
incidental knowledge from checking the code just for this particular problem. In the end my goal
is to have the issues addressed so that I am not forced to reboot my system to get it back in
working order.

As for your remark about sg-table: I had not considered that, but if my notes above check out,
maybe this does not need to be treated conditionally at all.

Kind regards,
Danny

> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> index ac02244a6fdf..e9bad7197ba9 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> @@ -802,6 +802,7 @@ static void brcmf_sdiod_freezer_detach(struct brcmf_sdio_dev *sdiodev)
> if (sdiodev->freezer) {
>
> WARN_ON(atomic_read(&sdiodev->freezer->freezing));
>
> kfree(sdiodev->freezer);
>
> + sdiodev->freezer = NULL;
>
> }
> }
>
> @@ -885,7 +886,11 @@ int brcmf_sdiod_remove(struct brcmf_sdio_dev *sdiodev)
> sdio_disable_func(sdiodev->func1);
>
> sdio_release_host(sdiodev->func1);
>
>
> - sg_free_table(&sdiodev->sgtable);
>
> + if (sdiodev->sgtable) {
>
> + sg_free_table(&sdiodev->sgtable);
>
> + sdiodev->sgtable = NULL;
>
> + }
> +
> sdiodev->sbwad = 0;
>
>
> pm_runtime_allow(sdiodev->func1->card->host->parent);
>
>
> As for submitting patch to linux-wireless, please follow the guideline. [1]
>
> Thanks,
> - Franky
>
> [1] https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
>
>
>
> --
> This electronic communication and the information and any files transmitted
> with it, or attached to it, are confidential and are intended solely for
> the use of the individual or entity to whom it is addressed and may contain
> information that is confidential, legally privileged, protected by privacy
> laws, or otherwise restricted from disclosure to anyone else. If you are
> not the intended recipient or the person responsible for delivering the
> e-mail to the intended recipient, you are hereby notified that any use,
> copying, distributing, dissemination, forwarding, printing, or copying of
> this e-mail is strictly prohibited. If you received this e-mail in error,
> please return the e-mail to the sender, delete it from your computer, and
> destroy any printed copy of it.

2022-06-07 11:34:25

by Franky Lin

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Danny,

My apology. I didn’t read the thread carefully and failed to notice the rev1 to rev 2 change of the patch.

> On Jun 4, 2022, at 7:59 AM, Danny van Heumen <[email protected]> wrote:
>
> Hi Franky,
>
> ------- Original Message -------
> On Saturday, June 4th, 2022 at 00:58, Franky Lin <[email protected]> wrote:
>
>> Hi Danny,
>>
>> [..]
>>
>> Thanks for reporting and sending out a patch to fix this.
>>
>> If the problem is double freeing the freezer buffer, it should be addressed from the root by setting pointer to NULL. Same thing might need to be done for sg table as well. Sorry I don’t have any sdio module to reproduce and test. Please see if the below change fixes the problem.
>
> Your suggestion to set the freeze buffer address to zero was also my first proposal. I have since
> revised, because there are a few things I considered, although I am not sure:
>
> - does zero-ing the address prevent future detection of double-frees with the hardened memory
> allocator? (If so, I would prefer to avoid doing this.)
> - IIUC correctly, 'sdio_set_block_size' does not do any meaningful "activation" or "allocation".
> Therefore would not need to be *undone*. (repeated probes would override previous calls)
> - Starting with the call 'sdio_enable_func', I guess/suspect more elaborate "cleanup" is necessary
> therefore, leaving the 'goto out' from that point on. I would assume (for lack of evidence to the
> contrary) that the logic at 'goto out' provides proper clean-up.

While directly return without invoking clean up process makes perfect sense for the issue described here, it doesn’t address the broader issue that sdiodev might hold on to couple stale pointers that might subsequently be used in somewhere down the path because sdiodev is not freed. Setting these pointer to NULL after freeing them could help us to catch such issue which is more catastrophic than a double-free. The perfect solution of course is to rework the code to free sdiodev whenever brcmf_sdiod_remove() is invoked but that can not be done in short-term unfortunately.

Also I forgot that our IT attached a legal footer to all emails sent to an external party. I am sorry about that to anyone reading my mail but there is nothing I can do at the moment.

Thanks,
- Franky

>
> So, returning immediately with the errorcode seemed more appropriate. Regardless, I have only
> incidental knowledge from checking the code just for this particular problem. In the end my goal
> is to have the issues addressed so that I am not forced to reboot my system to get it back in
> working order.
>
> As for your remark about sg-table: I had not considered that, but if my notes above check out,
> maybe this does not need to be treated conditionally at all.
>
> Kind regards,
> Danny
>
>> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>> index ac02244a6fdf..e9bad7197ba9 100644
>> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>> @@ -802,6 +802,7 @@ static void brcmf_sdiod_freezer_detach(struct brcmf_sdio_dev *sdiodev)
>> if (sdiodev->freezer) {
>>
>> WARN_ON(atomic_read(&sdiodev->freezer->freezing));
>>
>> kfree(sdiodev->freezer);
>>
>> + sdiodev->freezer = NULL;
>>
>> }
>> }
>>
>> @@ -885,7 +886,11 @@ int brcmf_sdiod_remove(struct brcmf_sdio_dev *sdiodev)
>> sdio_disable_func(sdiodev->func1);
>>
>> sdio_release_host(sdiodev->func1);
>>
>>
>> - sg_free_table(&sdiodev->sgtable);
>>
>> + if (sdiodev->sgtable) {
>>
>> + sg_free_table(&sdiodev->sgtable);
>>
>> + sdiodev->sgtable = NULL;
>>
>> + }
>> +
>> sdiodev->sbwad = 0;
>>
>>
>> pm_runtime_allow(sdiodev->func1->card->host->parent);
>>
>>
>> As for submitting patch to linux-wireless, please follow the guideline. [1]
>>
>> Thanks,
>> - Franky
>>
>> [1] https://www.google.com/url?q=https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches&source=gmail-imap&ust=1654959604000000&usg=AOvVaw1Q6aXVZjiKkrq9qmyYVVDa




--
This electronic communication and the information and any files transmitted
with it, or attached to it, are confidential and are intended solely for
the use of the individual or entity to whom it is addressed and may contain
information that is confidential, legally privileged, protected by privacy
laws, or otherwise restricted from disclosure to anyone else. If you are
not the intended recipient or the person responsible for delivering the
e-mail to the intended recipient, you are hereby notified that any use,
copying, distributing, dissemination, forwarding, printing, or copying of
this e-mail is strictly prohibited. If you received this e-mail in error,
please return the e-mail to the sender, delete it from your computer, and
destroy any printed copy of it.


Attachments:
smime.p7s (4.10 kB)
S/MIME Cryptographic Signature

2022-06-07 16:16:11

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Franky,

------- Original Message -------
On Tuesday, June 7th, 2022 at 01:50, Franky Lin <[email protected]> wrote:
>
>
> Hi Danny,
>
> My apology. I didn’t read the thread carefully and failed to notice the rev1 to rev 2 change of the patch.
>
> > On Jun 4, 2022, at 7:59 AM, Danny van Heumen [email protected] wrote:
> >
> > Hi Franky,
> >
> > ------- Original Message -------
> > On Saturday, June 4th, 2022 at 00:58, Franky Lin [email protected] wrote:
> >
> > > Hi Danny,
> > >
> > > [..]
> > >
> > > Thanks for reporting and sending out a patch to fix this.
> > >
> > > If the problem is double freeing the freezer buffer, it should be addressed from the root by setting pointer to NULL. Same thing might need to be done for sg table as well. Sorry I don’t have any sdio module to reproduce and test. Please see if the below change fixes the problem.
> >
> > Your suggestion to set the freeze buffer address to zero was also my first proposal. I have since
> > revised, because there are a few things I considered, although I am not sure:
> >
> > - does zero-ing the address prevent future detection of double-frees with the hardened memory
> > allocator? (If so, I would prefer to avoid doing this.)
> > - IIUC correctly, 'sdio_set_block_size' does not do any meaningful "activation" or "allocation".
> > Therefore would not need to be undone. (repeated probes would override previous calls)
> > - Starting with the call 'sdio_enable_func', I guess/suspect more elaborate "cleanup" is necessary
> > therefore, leaving the 'goto out' from that point on. I would assume (for lack of evidence to the
> > contrary) that the logic at 'goto out' provides proper clean-up.
>
>
> While directly return without invoking clean up process makes perfect sense for the issue described here, it doesn’t address the broader issue that sdiodev might hold on to couple stale pointers that might subsequently be used in somewhere down the path because sdiodev is not freed. Setting these pointer to NULL after freeing them could help us to catch such issue which is more catastrophic than a double-free. The perfect solution of course is to rework the code to free sdiodev whenever brcmf_sdiod_remove() is invoked but that can not be done in short-term unfortunately.

- True.
- If the two early returns are appropriate -- I think they are -- so we can leave those in. (Again, I'm unfamiliar with the code-base.)
- Setting the pointer to NULL at least has the benefit that behavior (even if bugged) is the same irrespective of memory allocation behavior.
- I checked your suggestion for 'sdiodev->sgtable': it is not a pointer, so setting to NULL will not help. If I read this correctly, 'sg_free_table(..)' is already resistant to multiple freeing attempts with a test of '.sgl'.

.. as for the control flow. Sure, rework would be nice, but -- to me at least -- it is not clear if it is really necessary. Maybe I'm mistaken, but there seem to be few entry-points to take into account. The "hardware-reset after firmware-crash"-logic was added later IIUC, so maybe it was an oversight? Regardless, I have updated the patch.

>
> Also I forgot that our IT attached a legal footer to all emails sent to an external party. I am sorry about that to anyone reading my mail but there is nothing I can do at the moment.
>
> Thanks,
> - Franky

I have attached the updated patch. As mentioned before, I will be running the changes myself.

Regards,
Danny



> > So, returning immediately with the errorcode seemed more appropriate. Regardless, I have only
> > incidental knowledge from checking the code just for this particular problem. In the end my goal
> > is to have the issues addressed so that I am not forced to reboot my system to get it back in
> > working order.
> >
> > As for your remark about sg-table: I had not considered that, but if my notes above check out,
> > maybe this does not need to be treated conditionally at all.
> >
> > Kind regards,
> > Danny
> >
> > > diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> > > index ac02244a6fdf..e9bad7197ba9 100644
> > > --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> > > +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> > > @@ -802,6 +802,7 @@ static void brcmf_sdiod_freezer_detach(struct brcmf_sdio_dev *sdiodev)
> > > if (sdiodev->freezer) {
> > >
> > > WARN_ON(atomic_read(&sdiodev->freezer->freezing));
> > >
> > > kfree(sdiodev->freezer);
> > >
> > > + sdiodev->freezer = NULL;
> > >
> > > }
> > > }
> > >
> > > @@ -885,7 +886,11 @@ int brcmf_sdiod_remove(struct brcmf_sdio_dev *sdiodev)
> > > sdio_disable_func(sdiodev->func1);
> > >
> > > sdio_release_host(sdiodev->func1);
> > >
> > > - sg_free_table(&sdiodev->sgtable);
> > >
> > > + if (sdiodev->sgtable) {
> > >
> > > + sg_free_table(&sdiodev->sgtable);
> > >
> > > + sdiodev->sgtable = NULL;
> > >
> > > + }
> > > +
> > > sdiodev->sbwad = 0;
> > >
> > > pm_runtime_allow(sdiodev->func1->card->host->parent);
> > >
> > > As for submitting patch to linux-wireless, please follow the guideline. [1]
> > >
> > > Thanks,
> > > - Franky
> > >
> > > [1] https://www.google.com/url?q=https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches&source=gmail-imap&ust=1654959604000000&usg=AOvVaw1Q6aXVZjiKkrq9qmyYVVDa
>
>
>
>
>
> --
> This electronic communication and the information and any files transmitted
> with it, or attached to it, are confidential and are intended solely for
> the use of the individual or entity to whom it is addressed and may contain
> information that is confidential, legally privileged, protected by privacy
> laws, or otherwise restricted from disclosure to anyone else. If you are
> not the intended recipient or the person responsible for delivering the
> e-mail to the intended recipient, you are hereby notified that any use,
> copying, distributing, dissemination, forwarding, printing, or copying of
> this e-mail is strictly prohibited. If you received this e-mail in error,
> please return the e-mail to the sender, delete it from your computer, and
> destroy any printed copy of it.


Attachments:
0001-brcmfmac-prevent-double-free-on-hardware-reset.patch (4.12 kB)

2022-06-07 21:24:17

by Arend van Spriel

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

On 6/7/2022 3:52 PM, Danny van Heumen wrote:> Hi Franky,
>
> ------- Original Message -------
> On Tuesday, June 7th, 2022 at 01:50, Franky Lin
<[email protected]> wrote:
>>
>>
>> Hi Danny,
>>
>> My apology. I didn’t read the thread carefully and failed to notice
the rev1 to rev 2 change of the patch.
>>
>>> On Jun 4, 2022, at 7:59 AM, Danny van Heumen
[email protected] wrote:
>>>
>>> Hi Franky,
>>>
>>> ------- Original Message -------
>>> On Saturday, June 4th, 2022 at 00:58, Franky Lin
[email protected] wrote:
>>>
>>>> Hi Danny,
>>>>
>>>> [..]
>>>>
>>>> Thanks for reporting and sending out a patch to fix this.
>>>>
>>>> If the problem is double freeing the freezer buffer, it should be
addressed from the root by setting pointer to NULL. Same thing might
need to be done for sg table as well. Sorry I don’t have any sdio module
to reproduce and test. Please see if the below change fixes the problem.
>>>
>>> Your suggestion to set the freeze buffer address to zero was also
my first proposal. I have since
>>> revised, because there are a few things I considered, although I am
not sure:
>>>
>>> - does zero-ing the address prevent future detection of
double-frees with the hardened memory
>>> allocator? (If so, I would prefer to avoid doing this.)
>>> - IIUC correctly, 'sdio_set_block_size' does not do any meaningful
"activation" or "allocation".
>>> Therefore would not need to be undone. (repeated probes would
override previous calls)
>>> - Starting with the call 'sdio_enable_func', I guess/suspect more
elaborate "cleanup" is necessary
>>> therefore, leaving the 'goto out' from that point on. I would
assume (for lack of evidence to the
>>> contrary) that the logic at 'goto out' provides proper clean-up.
>>
>>
>> While directly return without invoking clean up process makes
perfect sense for the issue described here, it doesn’t address the
broader issue that sdiodev might hold on to couple stale pointers that
might subsequently be used in somewhere down the path because sdiodev is
not freed. Setting these pointer to NULL after freeing them could help
us to catch such issue which is more catastrophic than a double-free.
The perfect solution of course is to rework the code to free sdiodev
whenever brcmf_sdiod_remove() is invoked but that can not be done in
short-term unfortunately.
>
> - True.
> - If the two early returns are appropriate -- I think they are -- so
we can leave those in. (Again, I'm unfamiliar with the code-base.)
> - Setting the pointer to NULL at least has the benefit that behavior
(even if bugged) is the same irrespective of memory allocation behavior.
> - I checked your suggestion for 'sdiodev->sgtable': it is not a
pointer, so setting to NULL will not help. If I read this correctly,
'sg_free_table(..)' is already resistant to multiple freeing attempts
with a test of '.sgl'.
>
> .. as for the control flow. Sure, rework would be nice, but -- to me
at least -- it is not clear if it is really necessary. Maybe I'm
mistaken, but there seem to be few entry-points to take into account.
The "hardware-reset after firmware-crash"-logic was added later IIUC, so
maybe it was an oversight? Regardless, I have updated the patch.

The reset-after-fw-crash was indeed added later. I am wondering is we
can leave the remove-reprobe dance to the mmc core. Maybe mmc_hw_reset()
could do the trick, ie. simply call mmc_hw_reset() in
brcmf_sdio_bus_reset() and be done with it. Maybe opening a can of worms
here, but I always felt uncertain about calling .remove and .probe
callbacks directly from the driver itself as it may cause issue like
this double-free, but also the bus is unaware and I don't know that is a
bad thing or not.

Regards,
Arend

>>
>> Also I forgot that our IT attached a legal footer to all emails sent
to an external party. I am sorry about that to anyone reading my mail
but there is nothing I can do at the moment.
>>
>> Thanks,
>> - Franky
>
> I have attached the updated patch. As mentioned before, I will be
running the changes myself.
>
> Regards,
> Danny
>
>
>
>>> So, returning immediately with the errorcode seemed more
appropriate. Regardless, I have only
>>> incidental knowledge from checking the code just for this
particular problem. In the end my goal
>>> is to have the issues addressed so that I am not forced to reboot
my system to get it back in
>>> working order.
>>>
>>> As for your remark about sg-table: I had not considered that, but
if my notes above check out,
>>> maybe this does not need to be treated conditionally at all.
>>>
>>> Kind regards,
>>> Danny
>>>
>>>> diff --git
a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>>>> index ac02244a6fdf..e9bad7197ba9 100644
>>>> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>>>> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>>>> @@ -802,6 +802,7 @@ static void brcmf_sdiod_freezer_detach(struct
brcmf_sdio_dev *sdiodev)
>>>> if (sdiodev->freezer) {
>>>>
>>>> WARN_ON(atomic_read(&sdiodev->freezer->freezing));
>>>>
>>>> kfree(sdiodev->freezer);
>>>>
>>>> + sdiodev->freezer = NULL;
>>>>
>>>> }
>>>> }
>>>>
>>>> @@ -885,7 +886,11 @@ int brcmf_sdiod_remove(struct brcmf_sdio_dev
*sdiodev)
>>>> sdio_disable_func(sdiodev->func1);
>>>>
>>>> sdio_release_host(sdiodev->func1);
>>>>
>>>> - sg_free_table(&sdiodev->sgtable);
>>>>
>>>> + if (sdiodev->sgtable) {
>>>>
>>>> + sg_free_table(&sdiodev->sgtable);
>>>>
>>>> + sdiodev->sgtable = NULL;
>>>>
>>>> + }
>>>> +
>>>> sdiodev->sbwad = 0;
>>>>
>>>> pm_runtime_allow(sdiodev->func1->card->host->parent);
>>>>
>>>> As for submitting patch to linux-wireless, please follow the
guideline. [1]
>>>>
>>>> Thanks,
>>>> - Franky
>>>>
>>>> [1]
https://www.google.com/url?q=https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches&source=gmail-imap&ust=1654959604000000&usg=AOvVaw1Q6aXVZjiKkrq9qmyYVVDa
>>
>>
>>
>>
>>
>> --
>> This electronic communication and the information and any files
transmitted
>> with it, or attached to it, are confidential and are intended solely for
>> the use of the individual or entity to whom it is addressed and may
contain
>> information that is confidential, legally privileged, protected by
privacy
>> laws, or otherwise restricted from disclosure to anyone else. If you are
>> not the intended recipient or the person responsible for delivering the
>> e-mail to the intended recipient, you are hereby notified that any use,
>> copying, distributing, dissemination, forwarding, printing, or
copying of
>> this e-mail is strictly prohibited. If you received this e-mail in
error,
>> please return the e-mail to the sender, delete it from your
computer, and
>> destroy any printed copy of it.


Attachments:
smime.p7s (4.12 kB)
S/MIME Cryptographic Signature

2022-06-08 06:00:19

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Arend,

------- Original Message -------
On Tuesday, June 7th, 2022 at 21:00, Arend van Spriel <[email protected]> wrote:

> [..]
>
> > > While directly return without invoking clean up process makes
> > > perfect sense for the issue described here, it doesn’t address the
> > > broader issue that sdiodev might hold on to couple stale pointers that
> > > might subsequently be used in somewhere down the path because sdiodev is
> > > not freed. Setting these pointer to NULL after freeing them could help
> > > us to catch such issue which is more catastrophic than a double-free.
> > > The perfect solution of course is to rework the code to free sdiodev
> > > whenever brcmf_sdiod_remove() is invoked but that can not be done in
> > > short-term unfortunately.
>
> > - True.
> > - If the two early returns are appropriate -- I think they are -- so
> > we can leave those in. (Again, I'm unfamiliar with the code-base.)
> > - Setting the pointer to NULL at least has the benefit that behavior
> > (even if bugged) is the same irrespective of memory allocation behavior.
> > - I checked your suggestion for 'sdiodev->sgtable': it is not a
>
> pointer, so setting to NULL will not help. If I read this correctly,
> 'sg_free_table(..)' is already resistant to multiple freeing attempts
> with a test of '.sgl'.
>
> > .. as for the control flow. Sure, rework would be nice, but -- to me
> > at least -- it is not clear if it is really necessary. Maybe I'm
> > mistaken, but there seem to be few entry-points to take into account.
> > The "hardware-reset after firmware-crash"-logic was added later IIUC, so
> > maybe it was an oversight? Regardless, I have updated the patch.
>
> The reset-after-fw-crash was indeed added later. I am wondering is we
> can leave the remove-reprobe dance to the mmc core. Maybe mmc_hw_reset()
> could do the trick, ie. simply call mmc_hw_reset() in
> brcmf_sdio_bus_reset() and be done with it. Maybe opening a can of worms
> here, but I always felt uncertain about calling .remove and .probe
> callbacks directly from the driver itself as it may cause issue like
> this double-free, but also the bus is unaware and I don't know that is a
> bad thing or not.

I am pretty sure you found a can of worms indeed :-)

So, a few things to note:

- do you have a reliable way to test this behavior?
- from my PoV: I have encountered various compositions of firmware and uCode
for the BCM4345/9 (43456-sdio). Earlier versions may occasionally
exhibit the crashing-behavior, but not reliably.
- do you want to tackle this as a separate effort, given that there is already
benefit in merging the earlier patch proposal?

Let me try to give my impression of the code, that I get from my cursory scans
through the brcmfmac code: it seems that the device as a whole (the "root")
gets activated. From what I remember, there seems to be a callback that
triggers subsequent initialization. So, it makes somewhat sense to me if a
hardware-reset could flow (back) into the existing initialization flow.
(Again, don't trust info below, as I have very limited knowledge in this domain.
I'm trying to check the extent to which I can make sense of it.)

Kind regards,
Danny



> Regards,
> Arend
>
> > > Also I forgot that our IT attached a legal footer to all emails sent
>
> to an external party. I am sorry about that to anyone reading my mail
> but there is nothing I can do at the moment.
>
> > > Thanks,
>
> > > - Franky
>
> > I have attached the updated patch. As mentioned before, I will be
>
> running the changes myself.
>
> > Regards,
>
> > Danny
>
> > > > So, returning immediately with the errorcode seemed more
>
> appropriate. Regardless, I have only
>
> > > > incidental knowledge from checking the code just for this
>
> particular problem. In the end my goal
>
> > > > is to have the issues addressed so that I am not forced to reboot
>
> my system to get it back in
>
> > > > working order.
>
> > > > As for your remark about sg-table: I had not considered that, but
>
> if my notes above check out,
>
> > > > maybe this does not need to be treated conditionally at all.
>
> > > > Kind regards,
>
> > > > Danny
>
> > > > > diff --git
>
> a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>
> > > > > index ac02244a6fdf..e9bad7197ba9 100644
>
> > > > > --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>
> > > > > +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
>
> > > > > @@ -802,6 +802,7 @@ static void brcmf_sdiod_freezer_detach(struct
>
> brcmf_sdio_dev *sdiodev)
>
> > > > > if (sdiodev->freezer) {
>
> > > > > WARN_ON(atomic_read(&sdiodev->freezer->freezing));
>
> > > > > kfree(sdiodev->freezer);
>
> > > > > + sdiodev->freezer = NULL;
>
> > > > > }
>
> > > > > }
>
> > > > > @@ -885,7 +886,11 @@ int brcmf_sdiod_remove(struct brcmf_sdio_dev
>
> *sdiodev)
>
> > > > > sdio_disable_func(sdiodev->func1);
>
> > > > > sdio_release_host(sdiodev->func1);
>
> > > > > - sg_free_table(&sdiodev->sgtable);
>
> > > > > + if (sdiodev->sgtable) {
>
> > > > > + sg_free_table(&sdiodev->sgtable);
>
> > > > > + sdiodev->sgtable = NULL;
>
> > > > > + }
>
> > > > > +
>
> > > > > sdiodev->sbwad = 0;
>
> > > > > pm_runtime_allow(sdiodev->func1->card->host->parent);
>
> > > > > As for submitting patch to linux-wireless, please follow the
>
> guideline. [1]
>
> > > > > Thanks,
>
> > > > > - Franky
>
> > > > > [1]
>
> https://www.google.com/url?q=https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches&source=gmail-imap&ust=1654959604000000&usg=AOvVaw1Q6aXVZjiKkrq9qmyYVVDa
>
> > > --
>
> > > This electronic communication and the information and any files
>
> transmitted
>
> > > with it, or attached to it, are confidential and are intended solely for
>
> > > the use of the individual or entity to whom it is addressed and may
>
> contain
>
> > > information that is confidential, legally privileged, protected by
>
> privacy
>
> > > laws, or otherwise restricted from disclosure to anyone else. If you are
>
> > > not the intended recipient or the person responsible for delivering the
>
> > > e-mail to the intended recipient, you are hereby notified that any use,
>
> > > copying, distributing, dissemination, forwarding, printing, or
>
> copying of
>
> > > this e-mail is strictly prohibited. If you received this e-mail in
>
> error,
>
> > > please return the e-mail to the sender, delete it from your
>
> computer, and
>
> > > destroy any printed copy of it.

2022-06-13 16:38:04

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Arend, Franky,

------- Original Message -------
On Tuesday, June 7th, 2022 at 21:45, Danny van Heumen <[email protected]> wrote:

> Hi Arend,
>
> ------- Original Message -------
> On Tuesday, June 7th, 2022 at 21:00, Arend van Spriel [email protected] wrote:
>
> > [..]
> >
> > > > While directly return without invoking clean up process makes
> > > > perfect sense for the issue described here, it doesn’t address the
> > > > broader issue that sdiodev might hold on to couple stale pointers that
> > > > might subsequently be used in somewhere down the path because sdiodev is
> > > > not freed. Setting these pointer to NULL after freeing them could help
> > > > us to catch such issue which is more catastrophic than a double-free.
> > > > The perfect solution of course is to rework the code to free sdiodev
> > > > whenever brcmf_sdiod_remove() is invoked but that can not be done in
> > > > short-term unfortunately.
> >
> > > - True.
> > > - If the two early returns are appropriate -- I think they are -- so
> > > we can leave those in. (Again, I'm unfamiliar with the code-base.)
> > > - Setting the pointer to NULL at least has the benefit that behavior
> > > (even if bugged) is the same irrespective of memory allocation behavior.
> > > - I checked your suggestion for 'sdiodev->sgtable': it is not a
> >
> > pointer, so setting to NULL will not help. If I read this correctly,
> > 'sg_free_table(..)' is already resistant to multiple freeing attempts
> > with a test of '.sgl'.
> >
> > > .. as for the control flow. Sure, rework would be nice, but -- to me
> > > at least -- it is not clear if it is really necessary. Maybe I'm
> > > mistaken, but there seem to be few entry-points to take into account.
> > > The "hardware-reset after firmware-crash"-logic was added later IIUC, so
> > > maybe it was an oversight? Regardless, I have updated the patch.
> >
> > The reset-after-fw-crash was indeed added later. I am wondering is we
> > can leave the remove-reprobe dance to the mmc core. Maybe mmc_hw_reset()
> > could do the trick, ie. simply call mmc_hw_reset() in
> > brcmf_sdio_bus_reset() and be done with it. Maybe opening a can of worms
> > here, but I always felt uncertain about calling .remove and .probe
> > callbacks directly from the driver itself as it may cause issue like
> > this double-free, but also the bus is unaware and I don't know that is a
> > bad thing or not.
>
>
> I am pretty sure you found a can of worms indeed :-)
>
> So, a few things to note:
>
> - do you have a reliable way to test this behavior?
> - from my PoV: I have encountered various compositions of firmware and uCode
> for the BCM4345/9 (43456-sdio). Earlier versions may occasionally
> exhibit the crashing-behavior, but not reliably.
> - do you want to tackle this as a separate effort, given that there is already
> benefit in merging the earlier patch proposal?
>
> Let me try to give my impression of the code, that I get from my cursory scans
> through the brcmfmac code: it seems that the device as a whole (the "root")
> gets activated. From what I remember, there seems to be a callback that
> triggers subsequent initialization. So, it makes somewhat sense to me if a
> hardware-reset could flow (back) into the existing initialization flow.
> (Again, don't trust info below, as I have very limited knowledge in this domain.
> I'm trying to check the extent to which I can make sense of it.)

What would be the next steps?

I would assume that you are interested in incorporating the bug fixes in some
form. So, I would like to know how we can proceed with this.

Kind regards,
Danny


> > Regards,
> > Arend
> >
> > > > Also I forgot that our IT attached a legal footer to all emails sent
> >
> > to an external party. I am sorry about that to anyone reading my mail
> > but there is nothing I can do at the moment.
> >
> > > > Thanks,
> >
> > > > - Franky
> >
> > > I have attached the updated patch. As mentioned before, I will be
> >
> > running the changes myself.
> >
> > > Regards,
> >
> > > Danny
> >
> > > > > So, returning immediately with the errorcode seemed more
> >
> > appropriate. Regardless, I have only
> >
> > > > > incidental knowledge from checking the code just for this
> >
> > particular problem. In the end my goal
> >
> > > > > is to have the issues addressed so that I am not forced to reboot
> >
> > my system to get it back in
> >
> > > > > working order.
> >
> > > > > As for your remark about sg-table: I had not considered that, but
> >
> > if my notes above check out,
> >
> > > > > maybe this does not need to be treated conditionally at all.
> >
> > > > > Kind regards,
> >
> > > > > Danny
> >
> > > > > > diff --git
> >
> > a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> > b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> >
> > > > > > index ac02244a6fdf..e9bad7197ba9 100644
> >
> > > > > > --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> >
> > > > > > +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
> >
> > > > > > @@ -802,6 +802,7 @@ static void brcmf_sdiod_freezer_detach(struct
> >
> > brcmf_sdio_dev *sdiodev)
> >
> > > > > > if (sdiodev->freezer) {
> >
> > > > > > WARN_ON(atomic_read(&sdiodev->freezer->freezing));
> >
> > > > > > kfree(sdiodev->freezer);
> >
> > > > > > + sdiodev->freezer = NULL;
> >
> > > > > > }
> >
> > > > > > }
> >
> > > > > > @@ -885,7 +886,11 @@ int brcmf_sdiod_remove(struct brcmf_sdio_dev
> >
> > *sdiodev)
> >
> > > > > > sdio_disable_func(sdiodev->func1);
> >
> > > > > > sdio_release_host(sdiodev->func1);
> >
> > > > > > - sg_free_table(&sdiodev->sgtable);
> >
> > > > > > + if (sdiodev->sgtable) {
> >
> > > > > > + sg_free_table(&sdiodev->sgtable);
> >
> > > > > > + sdiodev->sgtable = NULL;
> >
> > > > > > + }
> >
> > > > > > +
> >
> > > > > > sdiodev->sbwad = 0;
> >
> > > > > > pm_runtime_allow(sdiodev->func1->card->host->parent);
> >
> > > > > > As for submitting patch to linux-wireless, please follow the
> >
> > guideline. [1]
> >
> > > > > > Thanks,
> >
> > > > > > - Franky
> >
> > > > > > [1]
> >
> > https://www.google.com/url?q=https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches&source=gmail-imap&ust=1654959604000000&usg=AOvVaw1Q6aXVZjiKkrq9qmyYVVDa
> >
> > > > --
> >
> > > > This electronic communication and the information and any files
> >
> > transmitted
> >
> > > > with it, or attached to it, are confidential and are intended solely for
> >
> > > > the use of the individual or entity to whom it is addressed and may
> >
> > contain
> >
> > > > information that is confidential, legally privileged, protected by
> >
> > privacy
> >
> > > > laws, or otherwise restricted from disclosure to anyone else. If you are
> >
> > > > not the intended recipient or the person responsible for delivering the
> >
> > > > e-mail to the intended recipient, you are hereby notified that any use,
> >
> > > > copying, distributing, dissemination, forwarding, printing, or
> >
> > copying of
> >
> > > > this e-mail is strictly prohibited. If you received this e-mail in
> >
> > error,
> >
> > > > please return the e-mail to the sender, delete it from your
> >
> > computer, and
> >
> > > > destroy any printed copy of it.

2022-06-13 19:16:40

by Arend van Spriel

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

On June 13, 2022 2:33:51 PM Danny van Heumen <[email protected]> wrote:

> Hi Arend, Franky,
>
> ------- Original Message -------
> On Tuesday, June 7th, 2022 at 21:45, Danny van Heumen
> <[email protected]> wrote:
>
>> Hi Arend,
>>
>> ------- Original Message -------
>> On Tuesday, June 7th, 2022 at 21:00, Arend van Spriel
>> [email protected] wrote:
>>
>>> [..]
>>>
>>>>> While directly return without invoking clean up process makes
>>>>> perfect sense for the issue described here, it doesn’t address the
>>>>> broader issue that sdiodev might hold on to couple stale pointers that
>>>>> might subsequently be used in somewhere down the path because sdiodev is
>>>>> not freed. Setting these pointer to NULL after freeing them could help
>>>>> us to catch such issue which is more catastrophic than a double-free.
>>>>> The perfect solution of course is to rework the code to free sdiodev
>>>>> whenever brcmf_sdiod_remove() is invoked but that can not be done in
>>>>> short-term unfortunately.
>>>
>>>> - True.
>>>> - If the two early returns are appropriate -- I think they are -- so
>>>> we can leave those in. (Again, I'm unfamiliar with the code-base.)
>>>> - Setting the pointer to NULL at least has the benefit that behavior
>>>> (even if bugged) is the same irrespective of memory allocation behavior.
>>>> - I checked your suggestion for 'sdiodev->sgtable': it is not a
>>>
>>> pointer, so setting to NULL will not help. If I read this correctly,
>>> 'sg_free_table(..)' is already resistant to multiple freeing attempts
>>> with a test of '.sgl'.
>>>
>>>> .. as for the control flow. Sure, rework would be nice, but -- to me
>>>> at least -- it is not clear if it is really necessary. Maybe I'm
>>>> mistaken, but there seem to be few entry-points to take into account.
>>>> The "hardware-reset after firmware-crash"-logic was added later IIUC, so
>>>> maybe it was an oversight? Regardless, I have updated the patch.
>>>
>>> The reset-after-fw-crash was indeed added later. I am wondering is we
>>> can leave the remove-reprobe dance to the mmc core. Maybe mmc_hw_reset()
>>> could do the trick, ie. simply call mmc_hw_reset() in
>>> brcmf_sdio_bus_reset() and be done with it. Maybe opening a can of worms
>>> here, but I always felt uncertain about calling .remove and .probe
>>> callbacks directly from the driver itself as it may cause issue like
>>> this double-free, but also the bus is unaware and I don't know that is a
>>> bad thing or not.
>>
>>
>> I am pretty sure you found a can of worms indeed :-)
>>
>> So, a few things to note:
>>
>> - do you have a reliable way to test this behavior?
>> - from my PoV: I have encountered various compositions of firmware and uCode
>> for the BCM4345/9 (43456-sdio). Earlier versions may occasionally
>> exhibit the crashing-behavior, but not reliably.
>> - do you want to tackle this as a separate effort, given that there is already
>> benefit in merging the earlier patch proposal?
>>
>> Let me try to give my impression of the code, that I get from my cursory scans
>> through the brcmfmac code: it seems that the device as a whole (the "root")
>> gets activated. From what I remember, there seems to be a callback that
>> triggers subsequent initialization. So, it makes somewhat sense to me if a
>> hardware-reset could flow (back) into the existing initialization flow.
>> (Again, don't trust info below, as I have very limited knowledge in this
>> domain.
>> I'm trying to check the extent to which I can make sense of it.)
>
> What would be the next steps?

You should send a proper patch to the linux-wireless list, ie. not in an
attachment but the commit message and patch diff in plain text email
message. Please refer to [1] for guidelines.

I reinstated SDIO card in my test setup so I can test your patch.

Regards,
Arend

[1]
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



Attachments:
smime.p7s (4.12 kB)
S/MIME Cryptographic Signature

2022-06-13 20:16:24

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Arend,

------- Original Message -------
On Monday, June 13th, 2022 at 19:32, Arend Van Spriel <[email protected]> wrote:

> [..]
> > What would be the next steps?
>
>
> You should send a proper patch to the linux-wireless list, ie. not in an
> attachment but the commit message and patch diff in plain text email
> message. Please refer to [1] for guidelines.

Thanks. Will do.

> I reinstated SDIO card in my test setup so I can test your patch.

That's great!

I have tried to reduce/remove the probe-logic in `.reset`, but I can simply not reach that logic reliably (or at all atm), so I cannot test even basic simplifications of the hardware-reset logic.


> Regards,
> Arend
>
> [1]
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



Another question:

`BRCMF_FW_DEF(43456, "brcmfmac43456-sdio");`

This line defines IIUC that a firmware-binary exists. However, there is another macro that defines both the firmware-binary and the clm_blob binary. AFAIK, BCM4345/9 (brcmfmac43456-sdio) supports loading a *.clm_blob file. So I suppose the line should be:

`BRCMF_FW_CLM_DEF(43456, "brcmfmac43456-sdio");`

Does this really matter? Should I also submit a patch for this?

Thanks so far,
Danny

2022-06-17 14:42:35

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Arend, others,

------- Original Message -------
On Monday, June 13th, 2022 at 20:50, Danny van Heumen <[email protected]> wrote:

> [..]
> >
> > You should send a proper patch to the linux-wireless list, ie. not in an
> > attachment but the commit message and patch diff in plain text email
> > message. Please refer to [1] for guidelines.

Done. (Sent as separate email not part of this thread.)

> [..]
> I have tried to reduce/remove the probe-logic in `.reset`, but I can simply not reach that logic reliably (or at all atm), so I cannot test even basic simplifications of the hardware-reset logic.

I have not made progress concerning the reduced logic in '.reset'.
I will not attempt this any further as I do not have the right circumstances
to test this properly.

>
> > Regards,
> > Arend
> >
> > [1]
> > https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
>

>
> Another question:
>
> `BRCMF_FW_DEF(43456, "brcmfmac43456-sdio");`
>
> This line defines IIUC that a firmware-binary exists. However, there is another macro that defines both the firmware-binary and the clm_blob binary. AFAIK, BCM4345/9 (brcmfmac43456-sdio) supports loading a *.clm_blob file. So I suppose the line should be:
>
> `BRCMF_FW_CLM_DEF(43456, "brcmfmac43456-sdio");`
>
> Does this really matter? Should I also submit a patch for this?

This is still an open question to me: from what I can tell,
'brcmfmac43456-sdio.clm_blob' loads correctly even though the macro
does not define it. So this may concern certain specific use cases.

Regards,
Danny

2022-06-21 07:52:54

by Arend van Spriel

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

On 6/17/2022 4:31 PM, Danny van Heumen wrote:
> Hi Arend, others,
>
> ------- Original Message -------
> On Monday, June 13th, 2022 at 20:50, Danny van Heumen<[email protected]> wrote:
>
>> [..]
>>> You should send a proper patch to the linux-wireless list, ie. not in an
>>> attachment but the commit message and patch diff in plain text email
>>> message. Please refer to [1] for guidelines.
> Done. (Sent as separate email not part of this thread.)

Hi Danny,

Maybe overlooked, but I have not seen a patch from you on the
linux-wireless list. Do you have reference, ie. URL or Message-ID?

Regards,
Arend


Attachments:
smime.p7s (4.12 kB)
S/MIME Cryptographic Signature

2022-06-21 13:23:05

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Arend,

------- Original Message -------
On Tuesday, June 21st, 2022 at 09:41, Arend van Spriel <[email protected]> wrote:

> [..]
>
> Maybe overlooked, but I have not seen a patch from you on the
> linux-wireless list. Do you have reference, ie. URL or Message-ID?

I'm not sure what's most convenient here, but here's an archive link:
https://marc.info/?l=linux-wireless&m=165547582612979

Message-ID: ThT2jwvySn9tFQm9FxrlPNMQkiGUnx_87cOhmmeexoVOFZqOkpjmAntCWG-
kIBMj2830LHZaOULlJxQKiRXkVtGYrrT8rBaB7R65qjIinYo= () dannyvanheumen ! nl

You haven't commented yet on my question regarding definition macro for the clm_blob.
I intend to send a patch for it, because I believe _at the very least_ some parts of distribution
processes rely on these firmware entries. It would be good if you can confirm.

> Regards,
> Arend

Regards,
Danny

2022-06-22 09:43:22

by Arend van Spriel

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

On 6/21/2022 3:18 PM, Danny van Heumen wrote:
> Hi Arend,
>
> ------- Original Message -------
> On Tuesday, June 21st, 2022 at 09:41, Arend van Spriel <[email protected]> wrote:
>
>> [..]
>>
>> Maybe overlooked, but I have not seen a patch from you on the
>> linux-wireless list. Do you have reference, ie. URL or Message-ID?
>
> I'm not sure what's most convenient here, but here's an archive link:
> https://marc.info/?l=linux-wireless&m=165547582612979
>
> Message-ID: ThT2jwvySn9tFQm9FxrlPNMQkiGUnx_87cOhmmeexoVOFZqOkpjmAntCWG-
> kIBMj2830LHZaOULlJxQKiRXkVtGYrrT8rBaB7R65qjIinYo= () dannyvanheumen ! nl

Found it. Had to check my gmail account instead of my work account.

> You haven't commented yet on my question regarding definition macro for the clm_blob.
> I intend to send a patch for it, because I believe _at the very least_ some parts of distribution
> processes rely on these firmware entries. It would be good if you can confirm.

The clm_blob is optional unless the firmware image was built without any
clm data. If that is the case for 43456 (I think it is not) than it
should be reflected with BRCMF_FW_CLM_DEF(). Otherwise it is fine as it
is right now.

Thanks,
Arend


Attachments:
smime.p7s (4.12 kB)
S/MIME Cryptographic Signature

2022-06-22 13:32:12

by Danny van Heumen

[permalink] [raw]
Subject: Re: [PATCH] work-in-progress: double-free after hardware reset due to firmware-crash

Hi Arend,

------- Original Message -------
On Wednesday, June 22nd, 2022 at 11:36, Arend van Spriel <[email protected]> wrote:

> On 6/21/2022 3:18 PM, Danny van Heumen wrote:
>
> > Hi Arend,
> >
> > ------- Original Message -------
> > On Tuesday, June 21st, 2022 at 09:41, Arend van Spriel [email protected] wrote:
> >
> > > [..]
> > >
> > > Maybe overlooked, but I have not seen a patch from you on the
> > > linux-wireless list. Do you have reference, ie. URL or Message-ID?
> >
> > I'm not sure what's most convenient here, but here's an archive link:
> > https://marc.info/?l=linux-wireless&m=165547582612979
> >
> > Message-ID: ThT2jwvySn9tFQm9FxrlPNMQkiGUnx_87cOhmmeexoVOFZqOkpjmAntCWG-
> > kIBMj2830LHZaOULlJxQKiRXkVtGYrrT8rBaB7R65qjIinYo= () dannyvanheumen ! nl
>
>
> Found it. Had to check my gmail account instead of my work account.
>
> > You haven't commented yet on my question regarding definition macro for the clm_blob.
> > I intend to send a patch for it, because I believe at the very least some parts of distribution
> > processes rely on these firmware entries. It would be good if you can confirm.
>
>
> The clm_blob is optional unless the firmware image was built without any
> clm data. If that is the case for 43456 (I think it is not) than it
> should be reflected with BRCMF_FW_CLM_DEF(). Otherwise it is fine as it
> is right now.

Okay. I checked a few of the binary blobs, some are built with and some are built without (`strings` gives me a line with 'noclm' in it or with 'clm' and some revision/identifier) This is a bit of guessing on my part.

I will leave it as is then.

Regards,
Danny