Hi,
Today, for the first time, my new NVMe disk went down.
In the kernel logs, I found the following sequence of messages:
[ 3005.869069] [drm] free PSP TMR buffer
[ 4626.562712] nvme nvme0: controller is down; will reset:
CSTS=0xffffffff, PCI_STATUS=0x10
[ 4626.584716] nvme 0000:06:00.0: enabling device (0000 -> 0002)
[ 4626.585006] nvme nvme0: Removing after probe failure status: -19
[ 4626.590776] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590784] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590797] nvme0n1: detected capacity change from 7814037168 to 0
[ 4626.590814] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590816] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 5, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590816] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590832] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590835] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590838] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 9, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590847] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
[ 4626.590847] BTRFS error (device nvme0n1p3): bdev /dev/nvme0n1p3
errs: wr 11, rd 0, flush 0, corrupt 0, gen 0
[ 4626.593059] BTRFS: error (device nvme0n1p3) in
btrfs_commit_transaction:2418: errno=-5 IO failure (Error while
writing out transaction)
[ 4626.593075] BTRFS info (device nvme0n1p3: state E): forced readonly
[ 4626.593099] BTRFS warning (device nvme0n1p3: state E): Skipping
commit of aborted transaction.
[ 4626.593107] BTRFS: error (device nvme0n1p3: state EA) in
cleanup_transaction:1982: errno=-5 IO failure
[ 4626.593137] BTRFS: error (device nvme0n1p3: state EA) in
btrfs_sync_log:3331: errno=-5 IO failure
Googling turned up a lot of links to various old reports (4.xx kernels)
and to APST issue reports.
In a bug report on kernel.org [6], the unfortunate users talk with
each other with no hope of a solution being found.
The most clarifying article turned out to be [1].
So I analyzed the output of the commands "nvme id-ctrl /dev/nvme0"
and "cat /sys/module/nvme_core/parameters/default_ps_max_latency_us":
# nvme id-ctrl /dev/nvme0
NVME Identify Controller:
vid : 0x1bb1
ssvid : 0x1bb1
sn : 7VS00CLE
mn : Seagate FireCuda 530 ZP4000GM30013
fr : SU6SM001
[...]
ps 0 : mp:8.80W operational enlat:0 exlat:0 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
ps 1 : mp:7.10W operational enlat:0 exlat:0 rrt:1 rrl:1
rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:5.20W operational enlat:0 exlat:0 rrt:2 rrl:2
rwt:2 rwl:2 idle_power:- active_power:-
ps 3 : mp:0.0620W non-operational enlat:2500 exlat:7500 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0440W non-operational enlat:10500 exlat:65000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
# cat /sys/module/nvme_core/parameters/default_ps_max_latency_us
100000
I concluded that my problem is not related to APST, because 2500 + 7500
+ 10500 + 65000 = 85500 < 100000, i.e. 100000 is greater than the total
latency (enlat + exlat) of any state.
Or am I misinterpreting the results?
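(For reference, the per-state totals can be pulled straight out of the
id-ctrl output above; this assumes nvme-cli's plain-text field layout
shown earlier, and for the values above it prints 0, 0, 0, 10000 and
75500 us:)
# nvme id-ctrl /dev/nvme0 | grep -o 'enlat:[0-9]* exlat:[0-9]*' | \
    awk -F'[: ]' '{ printf "ps %d: enlat+exlat = %d us\n", NR-1, $2+$4 }'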
Therefore, I would like to ask whether there are any other ideas why an
NVMe drive can stop working with the message "controller is down; will
reset: CSTS=0xffffffff, PCI_STATUS=0x10", which by itself says nothing
about why this happened.
My kernel is 5.18-rc5.
Thanks in advance for any answer that clears things up, and for any
pointers on where to dig for a solution.
[1] https://wiki.archlinux.org/title/Solid_state_drive/NVMe
[2] [# smartctl -a /dev/nvme0] - https://pastebin.com/JwSXwu6c
[3] [# nvme get-feature /dev/nvme0 -f 0x0c -H] - https://pastebin.com/KZ6FjhGt
[4] [# nvme id-ctrl /dev/nvme0] - https://pastebin.com/seEkPfF7
[5] [full dmesg] - https://pastebin.com/aNEaqtCV
[6] [bug report about Samsung PM951 NVMe] -
https://bugzilla.kernel.org/show_bug.cgi?id=195039
--
Best Regards,
Mike Gavrilov.
On Thu, May 05, 2022 at 06:58:11AM +0500, Mikhail Gavrilov wrote:
> ps 1 : mp:7.10W operational enlat:0 exlat:0 rrt:1 rrl:1
> rwt:1 rwl:1 idle_power:- active_power:-
> ps 2 : mp:5.20W operational enlat:0 exlat:0 rrt:2 rrl:2
> rwt:2 rwl:2 idle_power:- active_power:-
> ps 3 : mp:0.0620W non-operational enlat:2500 exlat:7500 rrt:3 rrl:3
> rwt:3 rwl:3 idle_power:- active_power:-
> ps 4 : mp:0.0440W non-operational enlat:10500 exlat:65000 rrt:4 rrl:4
> rwt:4 rwl:4 idle_power:- active_power:-
>
> # cat /sys/module/nvme_core/parameters/default_ps_max_latency_us
> 100000
>
> I concluded that my problem is not related to APST, because 2500 + 7500
> + 10500 + 65000 = 85500 < 100000, i.e. 100000 is greater than the total
> latency (enlat + exlat) of any state.
>
> Or am I misinterpreting the results?
I think you did misinterpret the results. The max latency just sets the deepest
power state APST is allowed to request, and your controller's reported
values allow the deepest low-power state your controller supports, which
is known to cause problems with some platform/controller combinations.
The troubleshooting steps for your observation are:
1. Turn off APST (nvme_core.default_ps_max_latency_us=0)
2. Turn off ASPM (pcie_aspm=off)
3. Turn off both
Typically one of those resolves the issue.
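(For example, on a Fedora/RHEL-style setup the parameters can be added
to the kernel command line with grubby; other distributions edit
/etc/default/grub and regenerate grub.cfg instead:)
# grubby --update-kernel=ALL --args="nvme_core.default_ps_max_latency_us=0 pcie_aspm=off"
# reboot
# cat /proc/cmdline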
On Thu, May 05, 2022 at 01:02:12PM +0500, Mikhail Gavrilov wrote:
> On Thu, May 5, 2022 at 10:19 AM Keith Busch <[email protected]> wrote:
> > I think you did misinterpret the results. The max latency just sets the deepest
> > power state APST is allowed to request, and your controller's reported
> > values allow the deepest low-power state your controller supports, which
> > is known to cause problems with some platform/controller combinations.
> >
> > The troubleshooting steps for your observation are:
> >
> > 1. Turn off APST (nvme_core.default_ps_max_latency_us=0)
> > 2. Turn off ASPM (pcie_aspm=off)
> > 3. Turn off both
> >
> > Typically one of those resolves the issue.
>
> Thanks.
> To make it easier for everyone to diagnose such problems, it would be
> great if every switch between power-save modes were written to the
> kernel log (when console_loglevel is KERN_DEBUG).
> If APST were the culprit, we would then have seen the power-state change
> in the kernel log before the message "nvme nvme0: controller is
> down;".
The "A" in "APST" stands for "Autonomous", as in the kernel doesn't participate
in the power state transitions, so we don't have an opportunity to log such
things. We could perhaps add a kernel message like the classic "Dazed and
confused" power mode strangeness since this spec compliant feature problem
seems to be bizarrely common.
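(The APST transition table that the driver did program can still be
dumped from user space, as already done in [3]; it only shows the
configured entries, not the transitions themselves:)
# nvme get-feature /dev/nvme0 -f 0x0c -H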
On Thu, May 5, 2022 at 10:19 AM Keith Busch <[email protected]> wrote:
> I think you did misinterpret the results. The max latency just sets the deepest
> power state APST is allowed to request, and your controller's reported
> values allow the deepest low-power state your controller supports, which
> is known to cause problems with some platform/controller combinations.
>
> The troubleshooting steps for your observation are:
>
> 1. Turn off APST (nvme_core.default_ps_max_latency_us=0)
> 2. Turn off ASPM (pcie_aspm=off)
> 3. Turn off both
>
> Typically one of those resolves the issue.
Thanks.
To make it easier for everyone to diagnose such problems, it would be
great if every switch between power-save modes were written to the
kernel log (when console_loglevel is KERN_DEBUG).
If APST were the culprit, we would then have seen the power-state change
in the kernel log before the message "nvme nvme0: controller is
down;".
--
Best Regards,
Mike Gavrilov.
On Thu, May 5, 2022 at 10:19 AM Keith Busch <[email protected]> wrote:
> The troubleshooting steps for your observation are:
>
> 1. Turn off APST (nvme_core.default_ps_max_latency_us=0)
> 2. Turn off ASPM (pcie_aspm=off)
> 3. Turn off both
>
> Typically one of those resolves the issue.
What should I do if none of these steps helped? I have attached a log
which proves that I am using both parameters,
nvme_core.default_ps_max_latency_us=0 and pcie_aspm=off.
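(For reference, a quick way to confirm on the running kernel that both
parameters actually took effect; the PCI address is the one from the
earlier log, and the last command shows the link's current ASPM state:)
# cat /proc/cmdline
# cat /sys/module/nvme_core/parameters/default_ps_max_latency_us
# lspci -vv -s 06:00.0 | grep -i aspm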
--
Best Regards,
Mike Gavrilov.
On Wed, Feb 22, 2023 at 06:59:59PM +0500, Mikhail Gavrilov wrote:
> On Thu, May 5, 2022 at 10:19 AM Keith Busch <[email protected]> wrote:
>
> > The troubleshooting steps for your observation are:
> >
> > 1. Turn off APST (nvme_core.default_ps_max_latency_us=0)
> > 2. Turn off ASPM (pcie_aspm=off)
> > 3. Turn off both
> >
> > Typically one of those resolves the issue.
>
> What should I do if none of these steps helped? I have attached a log
> which proves that I am using both parameters,
> nvme_core.default_ps_max_latency_us=0 and pcie_aspm=off.
Those are just the most readily available things we can tune at
this level that have helped on *some* platform/device combinations.
They are certainly not going to solve every problem.
You are showing that the driver can't read from the device's memory,
and there's nothing the driver can do about that. This is usually
some platform BIOS breakage well below the visibility of the nvme
driver.
Perhaps your platform's bridge windows are screwed up. One other
thing you can try is adding the parameter "pci=nocrs" to have the
kernel ignore ACPI when setting these up.
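(If it helps, the kernel log can also be searched for resource
assignment complaints around the device; the exact message wording
varies by kernel version, so this pattern is only approximate:)
# dmesg | grep -iE "no compatible bridge window|can't claim|nocrs"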
On Wed, Feb 22, 2023 at 8:37 PM Keith Busch <[email protected]> wrote:
>
> Those are just the most readily available things we can tune at
> this level that have helped on *some* platform/device combinations.
> They are certainly not going to solve every problem.
>
> You are showing that the driver can't read from the device's memory,
> and there's nothing the driver can do about that. This is usually
> some platform BIOS breakage well below the visibility of the nvme
> driver.
>
> Perhaps your platform's bridge windows are screwed up. One other
> thing you can try is adding the parameter "pci=nocrs" to have the
> kernel ignore ACPI when setting these up.
Hi,
with the parameter "pci=nocrs", WiFi (mt7921e) stops working, and after
some time the NVMe went down again.
So I must say with regret that this did not help. I have attached the
kernel log and lspci output, both captured with "pci=nocrs".
Are there any more ideas?
--
Best Regards,
Mike Gavrilov.