2021-08-17 06:37:20

by Ahmad Fatoum

Subject: [PATCH] brcmfmac: pcie: fix oops on failure to resume and reprobe

When resuming from suspend, brcmf_pcie_pm_leave_D3 will first attempt a
hot resume and then fall back to removing the PCI device and then
reprobing. If this probe fails, the kernel will oops, because brcmf_err,
which is called to report the failure, will dereference the stale bus
pointer. Open code and use the default bus-less brcmf_err to avoid this.

Signed-off-by: Ahmad Fatoum <[email protected]>
---
To: Arend van Spriel <[email protected]>
To: Franky Lin <[email protected]>
To: Hante Meuleman <[email protected]>
To: Chi-hsien Lin <[email protected]>
To: Wright Feng <[email protected]>
To: Chung-hsien Hsu <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Kalle Valo <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
---
drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
index 9ef94d7a7ca7..d824bea4b79d 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
@@ -2209,7 +2209,7 @@ static int brcmf_pcie_pm_leave_D3(struct device *dev)

 	err = brcmf_pcie_probe(pdev, NULL);
 	if (err)
-		brcmf_err(bus, "probe after resume failed, err=%d\n", err);
+		__brcmf_err(NULL, __func__, "probe after resume failed, err=%d\n", err);

 	return err;
 }
--
2.30.2


2021-08-17 10:02:35

by Ahmad Fatoum

Subject: Re: [PATCH] brcmfmac: pcie: fix oops on failure to resume and reprobe

On 17.08.21 08:35, Ahmad Fatoum wrote:
> When resuming from suspend, brcmf_pcie_pm_leave_D3 will first attempt a
> hot resume and then fall back to removing the PCI device and then
> reprobing. If this probe fails, the kernel will oops, because brcmf_err,
> which is called to report the failure, will dereference the stale bus
> pointer. Open code and use the default bus-less brcmf_err to avoid this.

Should've included a Fixes tag:

Fixes: 8602e62441ab ("brcmfmac: pass bus to the __brcmf_err() in pcie.c")

Please let me know if I should resend with the tag added.

Cheers,
Ahmad

> Signed-off-by: Ahmad Fatoum <[email protected]>
> ---
> To: Arend van Spriel <[email protected]>
> To: Franky Lin <[email protected]>
> To: Hante Meuleman <[email protected]>
> To: Chi-hsien Lin <[email protected]>
> To: Wright Feng <[email protected]>
> To: Chung-hsien Hsu <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: Kalle Valo <[email protected]>
> Cc: Jakub Kicinski <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: [email protected]
> ---
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
> index 9ef94d7a7ca7..d824bea4b79d 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
> @@ -2209,7 +2209,7 @@ static int brcmf_pcie_pm_leave_D3(struct device *dev)
>
> err = brcmf_pcie_probe(pdev, NULL);
> if (err)
> - brcmf_err(bus, "probe after resume failed, err=%d\n", err);
> + __brcmf_err(NULL, __func__, "probe after resume failed, err=%d\n", err);
>
> return err;
> }
>


--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

2021-08-17 11:12:47

by Ahmad Fatoum

Subject: Re: [PATCH] brcmfmac: pcie: fix oops on failure to resume and reprobe

On 17.08.21 13:02, Andy Shevchenko wrote:
> On Tuesday, August 17, 2021, Ahmad Fatoum <[email protected]> wrote:
>
>> When resuming from suspend, brcmf_pcie_pm_leave_D3 will first attempt a
>> hot resume and then fall back to removing the PCI device and then
>> reprobing. If this probe fails, the kernel will oops, because brcmf_err,
>> which is called to report the failure, will dereference the stale bus
>> pointer. Open code and use the default bus-less brcmf_err to avoid this.
>>
>> Signed-off-by: Ahmad Fatoum <[email protected]>
>> ---
>> To: Arend van Spriel <[email protected]>
>> To: Franky Lin <[email protected]>
>> To: Hante Meuleman <[email protected]>
>> To: Chi-hsien Lin <[email protected]>
>> To: Wright Feng <[email protected]>
>> To: Chung-hsien Hsu <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: Kalle Valo <[email protected]>
>> Cc: Jakub Kicinski <[email protected]>
>> Cc: "David S. Miller" <[email protected]>
>> Cc: [email protected]
>> ---
>> drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
>> b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
>> index 9ef94d7a7ca7..d824bea4b79d 100644
>> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
>> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pcie.c
>> @@ -2209,7 +2209,7 @@ static int brcmf_pcie_pm_leave_D3(struct device *dev)
>>
>> err = brcmf_pcie_probe(pdev, NULL);
>> if (err)
>> - brcmf_err(bus, "probe after resume failed, err=%d\n", err);
>> + __brcmf_err(NULL, __func__, "probe after resume failed,
>> err=%d\n",
>
>
> This is a weird-looking line now. Why can’t you simply use dev_err() /
> netdev_err()?

That's what brcmf_err normally expands to, but in this file the macro
is overridden to add the extra first argument.

The brcmf_ logging functions write to the brcmf trace buffers. This is not
done with netdev_err/dev_err (and replacing the existing logging
is out of scope for a regression fix anyway).
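
To illustrate the macro arrangement described above, here is a minimal
userspace sketch in C. It is only a model under stated assumptions, not
the driver's code: model_dev, model_bus, model_err, model_err_impl and
the two report_* helpers are hypothetical names, and the sketch assumes
the low-level helper dereferences the bus only when it is non-NULL.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real driver structures are different. */
struct model_dev { const char *name; };
struct model_bus { struct model_dev *dev; };

/* Last formatted message, kept in a buffer so it can be inspected. */
static char model_log[256];

/*
 * Low-level helper with an optional bus (assumed behaviour). The bus is
 * dereferenced only when it is non-NULL; a NULL bus yields a message
 * without a device name.
 */
static void model_err_impl(const struct model_bus *bus, const char *func,
			   const char *fmt, ...)
{
	char msg[128];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(msg, sizeof(msg), fmt, ap);
	va_end(ap);

	if (bus)
		snprintf(model_log, sizeof(model_log), "%s: %s: %s",
			 bus->dev->name, func, msg);
	else
		snprintf(model_log, sizeof(model_log), "%s: %s", func, msg);
}

/* Default form: no bus argument, always logs without a device. */
#define model_err(...) model_err_impl(NULL, __func__, __VA_ARGS__)

/* File-local override: same name, but the bus is now the first argument. */
#undef model_err
#define model_err(bus, ...) model_err_impl(bus, __func__, __VA_ARGS__)

/* Normal case: the bus is alive, so the overridden macro is safe to use. */
static const char *report_with_bus(const struct model_bus *bus, int err)
{
	model_err(bus, "probe after resume failed, err=%d\n", err);
	return model_log;
}

/* Bus is gone: open-code what the default macro would have expanded to. */
static const char *report_without_bus(int err)
{
	model_err_impl(NULL, __func__,
		       "probe after resume failed, err=%d\n", err);
	return model_log;
}
```

In this model, report_without_bus() never touches a bus pointer, which
mirrors the intent of passing NULL in the patch.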

Cheers,
Ahmad

>
>
>>
>> return err;
>> }
>> --
>> 2.30.2
>>
>>
>



2021-08-17 11:55:28

by Andy Shevchenko

Subject: Re: [PATCH] brcmfmac: pcie: fix oops on failure to resume and reprobe

On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <[email protected]> wrote:
> On 17.08.21 13:02, Andy Shevchenko wrote:
> > On Tuesday, August 17, 2021, Ahmad Fatoum <[email protected]> wrote:

...

> >> err = brcmf_pcie_probe(pdev, NULL);
> >> if (err)
> >> - brcmf_err(bus, "probe after resume failed, err=%d\n", err);
> >> + __brcmf_err(NULL, __func__, "probe after resume failed,
> >> err=%d\n",
> >
> >
> > This is a weird-looking line now. Why can’t you simply use dev_err() /
> > netdev_err()?
>
> That's what brcmf_err normally expands to, but in this file the macro
> is overridden to add the extra first argument.

So then the problem is in the macro here. You need another set of
macros that use the dev pointer directly. When you have a valid
device, use it, and here that seems to be the case.

> The brcmf_ logging functions write to the brcmf trace buffers. This is not
> done with netdev_err/dev_err (and replacing the existing logging
> is out of scope for a regression fix anyway).

I see.

--
With Best Regards,
Andy Shevchenko

2021-08-17 12:08:13

by Ahmad Fatoum

Subject: Re: [PATCH] brcmfmac: pcie: fix oops on failure to resume and reprobe

On 17.08.21 14:03, Ahmad Fatoum wrote:
> On 17.08.21 13:54, Andy Shevchenko wrote:
>> On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <[email protected]> wrote:
>>> On 17.08.21 13:02, Andy Shevchenko wrote:
>>>> On Tuesday, August 17, 2021, Ahmad Fatoum <[email protected]> wrote:
>>
>> ...
>>
>>>>> err = brcmf_pcie_probe(pdev, NULL);
>>>>> if (err)
>>>>> - brcmf_err(bus, "probe after resume failed, err=%d\n", err);
>>>>> + __brcmf_err(NULL, __func__, "probe after resume failed,
>>>>> err=%d\n",
>>>>
>>>>
>>>> This is a weird-looking line now. Why can’t you simply use dev_err() /
>>>> netdev_err()?
>>>
>>> That's what brcmf_err normally expands to, but in this file the macro
>>> is overridden to add the extra first argument.
>>
>> So then the problem is in the macro here. You need another set of
>> macros that use the dev pointer directly. When you have a valid
>> device, use it, and here that seems to be the case.
>
> Ah, you mean using pdev instead of the stale bus. Yes, I could do that.
> Thanks for pointing that out.

Ah, not so easy: __brcmf_err accepts a struct brcmf_bus * as first argument,
but there is none I can pass along. As the whole file uses the brcmf_
logging functions, I'd just leave this one without a device.

>
>>
>>> The brcmf_ logging functions write to the brcmf trace buffers. This is not
>>> done with netdev_err/dev_err (and replacing the existing logging
>>> is out of scope for a regression fix anyway).
>>
>> I see.
>>
>
>



2021-08-17 13:08:38

by Andy Shevchenko

Subject: Re: [PATCH] brcmfmac: pcie: fix oops on failure to resume and reprobe

On Tue, Aug 17, 2021 at 3:07 PM Ahmad Fatoum <[email protected]> wrote:
> On 17.08.21 14:03, Ahmad Fatoum wrote:
> > On 17.08.21 13:54, Andy Shevchenko wrote:
> >> On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <[email protected]> wrote:
> >>> On 17.08.21 13:02, Andy Shevchenko wrote:
> >>>> On Tuesday, August 17, 2021, Ahmad Fatoum <[email protected]> wrote:

...

> >>>>> err = brcmf_pcie_probe(pdev, NULL);
> >>>>> if (err)
> >>>>> - brcmf_err(bus, "probe after resume failed, err=%d\n", err);
> >>>>> + __brcmf_err(NULL, __func__, "probe after resume failed,
> >>>>> err=%d\n",
> >>>>
> >>>>
> >>>> This is a weird-looking line now. Why can’t you simply use dev_err() /
> >>>> netdev_err()?
> >>>
> >>> That's what brcmf_err normally expands to, but in this file the macro
> >>> is overridden to add the extra first argument.
> >>
> >> So then the problem is in the macro here. You need another set of
> >> macros that use the dev pointer directly. When you have a valid
> >> device, use it, and here that seems to be the case.
> >
> > Ah, you mean using pdev instead of the stale bus. Yes, I could do that.
> > Thanks for pointing that out.
>
> Ah, not so easy: __brcmf_err accepts a struct brcmf_bus * as first argument,
> but there is none I can pass along. As the whole file uses the brcmf_
> logging functions, I'd just leave this one without a device.

And what exactly prevents you from splitting that into something like

__brcmf_dev_err() // as current __brcmf_err, but with a dev argument
{
...
}

__brcmf_err(bus, ...) __brcmf_dev_err(bus->dev, ...)

?
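
The split suggested above could look roughly like the following
userspace sketch in C. It is a model under stated assumptions, not
driver code: model_dev, model_bus, model_dev_err, model_bus_err and the
report_* helpers are hypothetical names, and the "request failed"
message is made up for illustration. The sketch assumes the caller in
the failed-reprobe path still holds a valid device even though the bus
is gone.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real driver structures are different. */
struct model_dev { const char *name; };
struct model_bus { struct model_dev *dev; };

/* Last formatted message, kept in a buffer so it can be inspected. */
static char model_log[256];

/* Device-level helper: takes the device itself rather than the bus. */
static void model_dev_err(const struct model_dev *dev, const char *func,
			  const char *fmt, ...)
{
	char msg[128];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(msg, sizeof(msg), fmt, ap);
	va_end(ap);

	snprintf(model_log, sizeof(model_log), "%s: %s: %s",
		 dev ? dev->name : "(no device)", func, msg);
}

/* Bus-level wrapper for callers that hold a live bus. */
#define model_bus_err(bus, ...) \
	model_dev_err((bus)->dev, __func__, __VA_ARGS__)

/* Regular caller: the bus is valid, so going through it is fine. */
static const char *report_via_bus(const struct model_bus *bus, int err)
{
	model_bus_err(bus, "request failed, err=%d\n", err);
	return model_log;
}

/* Failed reprobe: no usable bus, so the device is passed directly. */
static const char *report_reprobe_failure(const struct model_dev *dev,
					  int err)
{
	model_dev_err(dev, __func__,
		      "probe after resume failed, err=%d\n", err);
	return model_log;
}
```

With this shape, the message keeps its device prefix without the stale
bus ever being read, at the cost of a larger change than the one-line
fix.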

--
With Best Regards,
Andy Shevchenko

2021-08-17 13:20:29

by Ahmad Fatoum

Subject: Re: [PATCH] brcmfmac: pcie: fix oops on failure to resume and reprobe

On 17.08.21 15:06, Andy Shevchenko wrote:
> On Tue, Aug 17, 2021 at 3:07 PM Ahmad Fatoum <[email protected]> wrote:
>> On 17.08.21 14:03, Ahmad Fatoum wrote:
>>> On 17.08.21 13:54, Andy Shevchenko wrote:
>>>> On Tue, Aug 17, 2021 at 2:11 PM Ahmad Fatoum <[email protected]> wrote:
>>>>> On 17.08.21 13:02, Andy Shevchenko wrote:
>>>>>> On Tuesday, August 17, 2021, Ahmad Fatoum <[email protected]> wrote:
>
> ...
>
>>>>>>> err = brcmf_pcie_probe(pdev, NULL);
>>>>>>> if (err)
>>>>>>> - brcmf_err(bus, "probe after resume failed, err=%d\n", err);
>>>>>>> + __brcmf_err(NULL, __func__, "probe after resume failed,
>>>>>>> err=%d\n",
>>>>>>
>>>>>>
>>>>>> This is a weird-looking line now. Why can’t you simply use dev_err() /
>>>>>> netdev_err()?
>>>>>
>>>>> That's what brcmf_err normally expands to, but in this file the macro
>>>>> is overridden to add the extra first argument.
>>>>
>>>> So then the problem is in the macro here. You need another set of
>>>> macros that use the dev pointer directly. When you have a valid
>>>> device, use it, and here that seems to be the case.
>>>
>>> Ah, you mean using pdev instead of the stale bus. Yes, I could do that.
>>> Thanks for pointing that out.
>>
>> Ah, not so easy: __brcmf_err accepts a struct brcmf_bus * as first argument,
>> but there is none I can pass along. As the whole file uses the brcmf_
>> logging functions, I'd just leave this one without a device.
>
> And what exactly prevents you from splitting that into something like
>
> __brcmf_dev_err() // as current __brcmf_err, but with a dev argument
> {
> ...
> }
>
> __brcmf_err(bus, ...) __brcmf_dev_err(bus->dev, ...)
>
> ?

I like my regression fixes to be short and to the point.

Cheers,
Ahmad

