2022-05-05 13:44:26

by Holger Hoffstätte

Subject: Re: [PATCH] net: atlantic: always deep reset on pm op, fixing null deref regression

On 2022-05-05 00:06, Manuel Ullmann wrote:
> From a3eccd32c618fe4b4f5c537cd83ba5611149623e Mon Sep 17 00:00:00 2001
> Date: Wed, 4 May 2022 21:30:44 +0200
>
> The impact of this regression is the same for resume that I saw on
> thaw: the kernel hangs and nothing except SysRq rebooting can be done.
>
> The null deref occurs at the same position as on thaw.
> BUG: kernel NULL pointer dereference
> RIP: aq_ring_rx_fill+0xcf/0x210 [atlantic]
>
> Fixes regression in cbe6c3a8f8f4 ("net: atlantic: invert deep par in
> pm functions, preventing null derefs"), where I disabled deep pm
> resets in suspend and resume, trying to make sense of the
> atl_resume_common deep parameter in the first place.
>
> It turns out that atlantic always has to deep reset on pm operations
> and the parameter is useless. Even though I expected that and tested
> resume, I screwed up by kexec-rebooting into an unpatched kernel, thus
> missing the breakage.
>
> This fixup obsoletes the deep parameter of atl_resume_common, but I
> leave the cleanup for the maintainers to post to mainline.
>
> PS: I'm very sorry for this regression.
>
> Fixes: cbe6c3a8f8f4315b96e46e1a1c70393c06d95a4c
> Link: https://lore.kernel.org/regressions/9-Ehc_xXSwdXcvZqKD5aSqsqeNj5Izco4MYEwnx5cySXVEc9-x_WC4C3kAoCqNTi-H38frroUK17iobNVnkLtW36V6VWGSQEOHXhmVMm5iQ=@protonmail.com/
> Reported-by: Jordan Leppert <[email protected]>
> Reported-by: Holger Hoffstätte <[email protected]>
> CC: <[email protected]> # 5.17.5
> CC: <[email protected]> # 5.15.36
> CC: <[email protected]> # 5.10.113
> Signed-off-by: Manuel Ullmann <[email protected]>
> ---
> drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
> index 3a529ee8c834..831833911a52 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
> @@ -449,7 +449,7 @@ static int aq_pm_freeze(struct device *dev)
>
> static int aq_pm_suspend_poweroff(struct device *dev)
> {
> - return aq_suspend_common(dev, false);
> + return aq_suspend_common(dev, true);
> }
>
> static int aq_pm_thaw(struct device *dev)
> @@ -459,7 +459,7 @@ static int aq_pm_thaw(struct device *dev)
>
> static int aq_pm_resume_restore(struct device *dev)
> {
> - return atl_resume_common(dev, false);
> + return atl_resume_common(dev, true);
> }
>
> static const struct dev_pm_ops aq_pm_ops = {
>
> base-commit: 672c0c5173427e6b3e2a9bbb7be51ceeec78093a
>

As mentioned in the discussion thread, this reliably restores resume
for me, so:

Tested-by: Holger Hoffstätte <[email protected]>

Thanks!
Holger