2023-01-17 19:49:01

by Janne Grunau

Subject: [PATCH v2 0/2] nvme-apple: Fix suspend-resume regression

c76b8308e4c9 removed an NVMe controller reset in the shutdown path. This
broke suspend since it triggered a reset on resume. This reset hangs
since the co-processor is not up.
In addition, the reset is needed on suspend to shut down the co-processor
cleanly.

This series contains a functional revert of c76b8308e4c9 (a simple revert
is not possible due to other changes) and issues the NVMe reset only when
the co-processor is running.

Changes since Hector's v1:
- keep the fix locally in nvme-apple
- disable on shutdown for clean co-processor shutdown
- disable on reset only while the co-processor is running

---
Janne Grunau (2):
nvme-apple: Reset controller during shutdown
nvme-apple: Only reset the controller when RTKit is running

drivers/nvme/host/apple.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
---
base-commit: 5dc4c995db9eb45f6373a956eb1f69460e69e6d4
change-id: 20230114-apple-nvme-suspend-fixes-v6.2-rc4

Best regards,
--
Janne Grunau <[email protected]>


2023-01-17 20:37:41

by Janne Grunau

Subject: [PATCH v2 1/2] nvme-apple: Reset controller during shutdown

This is a functional revert of c76b8308e4c9 ("nvme-apple: fix controller
shutdown in apple_nvme_disable").

The commit broke suspend/resume since apple_nvme_reset_work() tries to
disable the controller on resume. This does not work for the apple NVMe
controller since register access only works while the co-processor
firmware is running.

Disabling the NVMe controller in the shutdown path is also required
for shutting the co-processor down. The original code was appropriate
for this hardware. Add a comment to prevent similar breaking changes
in the future.

Fixes: c76b8308e4c9 ("nvme-apple: fix controller shutdown in apple_nvme_disable")
Reported-by: Janne Grunau <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Janne Grunau <[email protected]>
---
drivers/nvme/host/apple.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index bf1c60edb7f9..2a1f11b30615 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -829,7 +829,13 @@ static void apple_nvme_disable(struct apple_nvme *anv, bool shutdown)
 			apple_nvme_remove_cq(anv);
 		}

-		nvme_disable_ctrl(&anv->ctrl, shutdown);
+		/*
+		 * Always reset the NVMe controller on shutdown. The reset is
+		 * required to shutdown the co-processor cleanly.
+		 */
+		if (shutdown)
+			nvme_disable_ctrl(&anv->ctrl, shutdown);
+		nvme_disable_ctrl(&anv->ctrl, false);
 	}

 	WRITE_ONCE(anv->ioq.enabled, false);

--
2.38.2
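
The ordering introduced by the hunk above can be sketched in user space as follows. This is a minimal model, not driver code: `fake_ctrl`, `fake_disable_ctrl()`, the two `fake_disable_path_*()` helpers and the `FAKE_CC_*` bit values are hypothetical stand-ins. It assumes the disable helper either requests a normal shutdown or clears the enable bit, which is an assumption about how nvme_disable_ctrl() behaves at this base commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register bits for the model; not taken from kernel headers. */
#define FAKE_CC_ENABLE     (1u << 0)
#define FAKE_CC_SHN_NORMAL (1u << 14)
#define FAKE_CC_SHN_MASK   (3u << 14)

/* Hypothetical stand-in for the controller configuration register. */
struct fake_ctrl {
	uint32_t cc;
};

/*
 * Assumed behaviour of the disable helper: with shutdown set it only
 * requests a normal shutdown, otherwise it clears the enable bit.
 */
static void fake_disable_ctrl(struct fake_ctrl *c, bool shutdown)
{
	c->cc &= ~FAKE_CC_SHN_MASK;
	if (shutdown)
		c->cc |= FAKE_CC_SHN_NORMAL;
	else
		c->cc &= ~FAKE_CC_ENABLE;
}

/* Ordering before the patch: a single call, so shutdown leaves EN set. */
static void fake_disable_path_old(struct fake_ctrl *c, bool shutdown)
{
	fake_disable_ctrl(c, shutdown);
}

/* Ordering after the patch: optional shutdown, then always a reset. */
static void fake_disable_path_new(struct fake_ctrl *c, bool shutdown)
{
	if (shutdown)
		fake_disable_ctrl(c, shutdown);
	fake_disable_ctrl(c, false);
}
```

In this model the old path leaves the enable bit set after a shutdown, so the next reset would still have to touch the controller, while the new path always hands the controller off disabled.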

2023-01-17 20:46:04

by Janne Grunau

Subject: [PATCH v2 2/2] nvme-apple: Only reset the controller when RTKit is running

NVMe controller register access hangs indefinitely when the co-processor
is not running. A missed reset is preferable over a hanging thread since
it could be recoverable.

Signed-off-by: Janne Grunau <[email protected]>
---
drivers/nvme/host/apple.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index 2a1f11b30615..3258fd7efaf9 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -991,11 +991,11 @@ static void apple_nvme_reset_work(struct work_struct *work)
 		goto out;
 	}

-	if (anv->ctrl.ctrl_config & NVME_CC_ENABLE)
-		apple_nvme_disable(anv, false);
-
 	/* RTKit must be shut down cleanly for the (soft)-reset to work */
 	if (apple_rtkit_is_running(anv->rtk)) {
+		/* reset the controller if it is enabled */
+		if (anv->ctrl.ctrl_config & NVME_CC_ENABLE)
+			apple_nvme_disable(anv, false);
 		dev_dbg(anv->dev, "Trying to shut down RTKit before reset.");
 		ret = apple_rtkit_shutdown(anv->rtk);
 		if (ret)

--
2.38.2
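
The effect of moving the disable under the RTKit check can be illustrated with a small user-space model. It is only a sketch: `fake_anv`, `fake_disable()` and the two `fake_reset_work_*()` functions are hypothetical names, and a register access with the firmware down is counted instead of hanging.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real struct apple_nvme. */
struct fake_anv {
	bool rtkit_running;  /* co-processor firmware is up */
	bool cc_enabled;     /* models ctrl_config & NVME_CC_ENABLE */
	int hung_accesses;   /* register accesses tried while firmware was down */
};

/* Models a controller disable, which needs register access. */
static void fake_disable(struct fake_anv *anv)
{
	if (!anv->rtkit_running) {
		/* On the hardware this access would hang; the model only counts it. */
		anv->hung_accesses++;
		return;
	}
	anv->cc_enabled = false;
}

/* Ordering before the patch: disable regardless of the firmware state. */
static void fake_reset_work_old(struct fake_anv *anv)
{
	if (anv->cc_enabled)
		fake_disable(anv);
	if (anv->rtkit_running)
		anv->rtkit_running = false; /* models the RTKit shutdown */
}

/* Ordering after the patch: disable only while the firmware is running. */
static void fake_reset_work_new(struct fake_anv *anv)
{
	if (anv->rtkit_running) {
		if (anv->cc_enabled)
			fake_disable(anv);
		anv->rtkit_running = false; /* models the RTKit shutdown */
	}
}
```

With the firmware down, the old ordering attempts the access and the new ordering skips it, leaving the controller marked enabled. That is the "missed reset" the commit message prefers over a hanging thread.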

2023-01-18 05:37:55

by Christoph Hellwig

Subject: Re: [PATCH v2 1/2] nvme-apple: Reset controller during shutdown

On Tue, Jan 17, 2023 at 07:25:00PM +0100, Janne Grunau wrote:
> + /*
> + * Always reset the NVMe controller on shutdown. The reset is
> + * required to shutdown the co-processor cleanly.
> + */

Hmm. This comment doesn't seem to match the discussion we had last
week. Which would be:

/*
* NVMe requires a reset before setting up a controller to
* ensure it is in a clean state. For NVMe PCIe this is
* done in the setup path to be able to deal with controllers
* in any kind of state. For Apple devices, the firmware
* will not be available at that time and the reset will
* time out. Thus reset after shutting the NVMe controller
* down and before shutting the firmware down.
*/

2023-01-18 05:41:08

by Christoph Hellwig

Subject: Re: [PATCH v2 0/2] nvme-apple: Fix suspend-resume regression

I'll queue this up in nvme-6.2 as an urgent fix. But I'd love to hear
clarification on what should be in the comment based on the discussion
last week.

2023-01-19 06:37:06

by Christoph Hellwig

Subject: Re: [PATCH v2 1/2] nvme-apple: Reset controller during shutdown

Folks, can you chime in if this comment makes sense? I'd really
like to send the patches off to Jens before rc5.

On Wed, Jan 18, 2023 at 06:24:50AM +0100, Christoph Hellwig wrote:
> On Tue, Jan 17, 2023 at 07:25:00PM +0100, Janne Grunau wrote:
> > + /*
> > + * Always reset the NVMe controller on shutdown. The reset is
> > + * required to shutdown the co-processor cleanly.
> > + */
>
> Hmm. This comment doesn't seem to match the discussion we had last
> week. Which would be:
>
> /*
> * NVMe requires a reset before setting up a controller to
> * ensure it is in a clean state. For NVMe PCIe this is
> * done in the setup path to be able to deal with controllers
> * in any kind of state. For Apple devices, the firmware
> * will not be available at that time and the reset will
> * time out. Thus reset after shutting the NVMe controller
> * down and before shutting the firmware down.
> */
---end quoted text---

2023-01-19 08:26:32

by Janne Grunau

Subject: Re: [PATCH v2 1/2] nvme-apple: Reset controller during shutdown

On 2023-01-19 09:08:39 +0100, Christoph Hellwig wrote:
> Thanks, this looks good. Updated commit here:
>
> http://git.infradead.org/nvme.git/commitdiff/c06ba7b892a50b48522ad441a40053f483dfee9e

looks good to me as well.

thanks

Janne

2023-01-19 08:37:13

by Janne Grunau

Subject: Re: [PATCH v2 1/2] nvme-apple: Reset controller during shutdown

Hej,

On 2023-01-18 06:24:50 +0100, Christoph Hellwig wrote:
> On Tue, Jan 17, 2023 at 07:25:00PM +0100, Janne Grunau wrote:
> > + /*
> > + * Always reset the NVMe controller on shutdown. The reset is
> > + * required to shutdown the co-processor cleanly.
> > + */
>
> Hmm. This comment doesn't seem to match the discussion we had last
> week. Which would be:
>
> /*
> * NVMe requires a reset before setting up a controller to
> * ensure it is in a clean state. For NVMe PCIe this is
> * done in the setup path to be able to deal with controllers
> * in any kind of state. For Apple devices, the firmware
> * will not be available at that time and the reset will
> * time out. Thus reset after shutting the NVMe controller
> * down and before shutting the firmware down.
> */

yes, it differs from the discussion last week. I tried to issue the
reset later in the setup path after the firmware was brought back up.
That fixes the hang but the device is still not usable. So it appears
we need to reset the controller before the firmware is shut down.

Janne

2023-01-19 08:37:59

by Hector Martin

Subject: Re: [PATCH v2 1/2] nvme-apple: Reset controller during shutdown

(Replying from mobile, please excuse formatting)

I'm actually not sure exactly how this works any more. The previous series I sent (which had slightly different logic) worked for me on a t8103 Mac Mini in smoke tests and I'd assumed it fixed the issue, but it turned out to fail (in a different way) on other machines and under other circumstances. This one seems to work everywhere, but I can't explain exactly why. Maybe we do in fact need to issue an NVMe disable before shutting down the firmware to reliably come up properly on firmware restart.

Maybe something like this?

/*
* Always disable the NVMe controller after shutdown.
* We need to do this to bring it back up later anyway,
* and we can't do it while the firmware is not running
* (e.g. in the resume reset path before RTKit is
* initialized), so for Apple controllers it makes sense to
* unconditionally do it here. Additionally, this sequence
* of events is reliable, while others (like disabling after
* bringing back the firmware on resume) seem to run
* into trouble under some circumstances.
*
* Both U-Boot and m1n1 also use this convention
* (i.e. an ANS NVMe controller is handed off with
* firmware shut down, in an NVMe disabled state,
* after a clean shutdown).
*/

On January 19, 2023 15:14:52 JST, Christoph Hellwig <[email protected]> wrote:
>Folks, can you chime in if this comment makes sense? I'd really
>like to send the patches off to Jens before rc5.
>
>On Wed, Jan 18, 2023 at 06:24:50AM +0100, Christoph Hellwig wrote:
>> On Tue, Jan 17, 2023 at 07:25:00PM +0100, Janne Grunau wrote:
>> > + /*
>> > + * Always reset the NVMe controller on shutdown. The reset is
>> > + * required to shutdown the co-processor cleanly.
>> > + */
>>
>> Hmm. This comment doesn't seem to match the discussion we had last
>> week. Which would be:
>>
>> /*
>> * NVMe requires a reset before setting up a controller to
>> * ensure it is in a clean state. For NVMe PCIe this is
>> * done in the setup path to be able to deal with controllers
>> * in any kind of state. For Apple devices, the firmware
>> * will not be available at that time and the reset will
>> * time out. Thus reset after shutting the NVMe controller
>> * down and before shutting the firmware down.
>> */
>---end quoted text---
>

--
Hector Martin "marcan" ([email protected])
Public key: https://mrcn.st/pub