LinuxLists.cc - Report long suspend times of NVMe devices (mostly firmware/device issues)

2018-01-22 21:03:22

Subject: Report long suspend times of NVMe devices (mostly firmware/device issues)

Dear Linux folks,

Benchmarking the ACPI S3 suspend and resume times with `sleepgraph.py
-config config/suspend-callgraph.cfg` [1], shows that the NVMe disk
SAMSUNG MZVKW512HMJP-00000 in the TUXEDO Book BU1406 takes between 0.3
and 1.4 seconds, holding up the suspend cycle.

The time is spent in `nvme_shutdown_ctrl()`.

### Linux 4.14.1-041401-generic

> nvme @ 0000:04:00.0 {nvme} async_device (Total Suspend: 1439.299 ms Total Resume: 19.865 ms)

### Linux 4.15-rc9

> nvme @ 0000:04:00.0 {nvme} async_device (Total Suspend: 362.239 ms Total Resume: 19.897 m
It’d be useful, if the Linux kernel logged such issues visibly to the
user, so that the hardware manufacturer can be contacted to fix the
device (probably the firmware).

In my opinion anything longer than 200 ms should be reported similar to
[2], and maybe worded like below.

> NVMe took more than 200 ms to do suspend routine

What do you think?

Kind regards,

Paul

[1] https://github.com/01org/pm-graph
[2] https://patchwork.kernel.org/patch/10135925/

Attachments:

=?utf-8?Q?20180122=E2=80=93tuxedo=5Fbook=5Fbu1406=E2=80=93linux=5Fmessages=2Etxt?= (198.96 kB)

2018-01-22 21:27:40

by Keith Busch

[permalink] [raw]

Subject: Re: Report long suspend times of NVMe devices (mostly firmware/device issues)

On Mon, Jan 22, 2018 at 10:02:12PM +0100, Paul Menzel wrote:
> Dear Linux folks,
>
>
> Benchmarking the ACPI S3 suspend and resume times with `sleepgraph.py
> -config config/suspend-callgraph.cfg` [1], shows that the NVMe disk SAMSUNG
> MZVKW512HMJP-00000 in the TUXEDO Book BU1406 takes between 0.3 and 1.4
> seconds, holding up the suspend cycle.
>
> The time is spent in `nvme_shutdown_ctrl()`.
>
> ### Linux 4.14.1-041401-generic
>
> > nvme @ 0000:04:00.0 {nvme} async_device (Total Suspend: 1439.299 ms Total Resume: 19.865 ms)
>
> ### Linux 4.15-rc9
>
> > nvme @ 0000:04:00.0 {nvme} async_device (Total Suspend: 362.239 ms Total Resume: 19.897 m
> It’d be useful, if the Linux kernel logged such issues visibly to the user,
> so that the hardware manufacturer can be contacted to fix the device
> (probably the firmware).
>
> In my opinion anything longer than 200 ms should be reported similar to [2],
> and maybe worded like below.
>
> > NVMe took more than 200 ms to do suspend routine
>
> What do you think?

The nvme spec guides toward longer times than that. I don't see the
point of warning users about things operating within spec.

2018-01-22 21:32:37

by Martin K. Petersen

[permalink] [raw]

Subject: Re: Report long suspend times of NVMe devices (mostly firmware/device issues)

Paul,

> In my opinion anything longer than 200 ms should be reported similar to
> [2], and maybe worded like below.

There are many controllers out there that take double-digit seconds to
shut down.

--
Martin K. Petersen Oracle Linux Engineering

2018-01-24 22:30:20

by Paul Menzel

[permalink] [raw]

Subject: Re: Report long suspend times of NVMe devices (mostly firmware/device issues)

Dear Keith,

Thank you for your reply.

Am 22.01.2018 um 22:30 schrieb Keith Busch:
> On Mon, Jan 22, 2018 at 10:02:12PM +0100, Paul Menzel wrote:

>> Benchmarking the ACPI S3 suspend and resume times with `sleepgraph.py
>> -config config/suspend-callgraph.cfg` [1], shows that the NVMe disk SAMSUNG
>> MZVKW512HMJP-00000 in the TUXEDO Book BU1406 takes between 0.3 and 1.4
>> seconds, holding up the suspend cycle.
>>
>> The time is spent in `nvme_shutdown_ctrl()`.
>>
>> ### Linux 4.14.1-041401-generic
>>
>>> nvme @ 0000:04:00.0 {nvme} async_device (Total Suspend: 1439.299 ms Total Resume: 19.865 ms)
>>
>> ### Linux 4.15-rc9
>>
>>> nvme @ 0000:04:00.0 {nvme} async_device (Total Suspend: 362.239 ms Total Resume: 19.897 m
>> It’d be useful, if the Linux kernel logged such issues visibly to the user,
>> so that the hardware manufacturer can be contacted to fix the device
>> (probably the firmware).
>>
>> In my opinion anything longer than 200 ms should be reported similar to [2],
>> and maybe worded like below.
>>
>>> NVMe took more than 200 ms to do suspend routine
>>
>> What do you think?
>
> The nvme spec guides toward longer times than that. I don't see the
> point of warning users about things operating within spec.

I quickly glanced over NVM Express revision 1.3 specification [1] but
searching for *second *, I could not find something about this. Could
you please point me to the section?

In my opinion, it’s a good thing to point users to devices holding up
suspend.

Kind regards,

Paul

[1] https://nvmexpress.org/wp-content/uploads/NVM_Express_Revision_1.3.pdf

2018-01-24 22:44:21

by Keith Busch

[permalink] [raw]

Subject: Re: Report long suspend times of NVMe devices (mostly firmware/device issues)

On Wed, Jan 24, 2018 at 11:29:12PM +0100, Paul Menzel wrote:
> Am 22.01.2018 um 22:30 schrieb Keith Busch:
> > The nvme spec guides toward longer times than that. I don't see the
> > point of warning users about things operating within spec.
>
> I quickly glanced over NVM Express revision 1.3 specification [1] but
> searching for *second *, I could not find something about this. Could you
> please point me to the section?

Section 7.6.2:

It is recommended that the host wait a minimum of the RTD3 Entry
Latency reported in the Identify Controller data structure for the
shutdown operations to complete; if the value reported in RTD3 Entry
Latency is 0h, then the host should wait for a minimum of one second.

So if the controller is new enough, it will report it's RTD3 Entry.
The maximum allowed by spec is something like 4000 seconds.

For controllers that pre-date this field, we're supposed to wait a
"minimum" of one second.

The spec does not recommend a maximum time in either case.

> In my opinion, it’s a good thing to point users to devices holding up
> suspend.

If a device shutdown exceeds its reported constraints, then absolutely,
and we do log a warning for that already.

Picking an arbitrary time below spec recommendations is just guaranteed
to create alarmed people demanding answers to a warning about behavior
that is entirely normal.