2021-11-06 09:23:23

by Enzo Matsumiya

Subject: [RFC PATCH] nvme: add NO APST quirk for Kioxia device

This particular Kioxia device times out and aborts I/O under any load,
but the problem is most easily reproduced with discards (fstrim).

The device is left in a state where it is not even possible to use
"nvme set-feature" to disable APST. Booting with
nvme_core.default_ps_max_latency_us=0 solves the issue.

We have seen a dozen or so of these devices behave the same way in a
customer environment.

Signed-off-by: Enzo Matsumiya <[email protected]>
---
drivers/nvme/host/core.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 838b5e2058be..a698c099164c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2469,7 +2469,19 @@ static const struct nvme_core_quirk_entry core_quirks[] = {
.vid = 0x14a4,
.fr = "22301111",
.quirks = NVME_QUIRK_SIMPLE_SUSPEND,
- }
+ },
+ {
+ /*
+ * This Kioxia device times out and aborts I/O during any load,
+ * but more easily reproducible with discards (fstrim).
+ *
+ * Device is left in a state that is also not possible to use "nvme set-feature"
+ * to disable APST, but booting with nvme_core.default_ps_max_latency_us=0 works.
+ */
+ .vid = 0x1e0f,
+ .mn = "KCD6XVUL6T40",
+ .quirks = NVME_QUIRK_NO_APST,
+ }
};

/* match is null-terminated but idstr is space-padded. */
--
2.33.0


2021-11-09 16:55:27

by Christoph Hellwig

Subject: Re: [RFC PATCH] nvme: add NO APST quirk for Kioxia device

On Fri, Nov 05, 2021 at 11:08:57PM -0300, Enzo Matsumiya wrote:
> + },
> + {
> + /*
> + * This Kioxia device times out and aborts I/O during any load,
> + * but more easily reproducible with discards (fstrim).
> + *
> + * Device is left in a state that is also not possible to use "nvme set-feature"
> + * to disable APST, but booting with nvme_core.default_ps_max_latency=0 works.
> + */

Overly long lines here, but I can fix that up. Do you have a product
name for this device, or is it just a nameless OEM controller?

2021-11-09 23:58:43

by Enzo Matsumiya

Subject: Re: [RFC PATCH] nvme: add NO APST quirk for Kioxia device

On 11/09, Christoph Hellwig wrote:
>On Fri, Nov 05, 2021 at 11:08:57PM -0300, Enzo Matsumiya wrote:
>> + },
>> + {
>> + /*
>> + * This Kioxia device times out and aborts I/O during any load,
>> + * but more easily reproducible with discards (fstrim).
>> + *
>> + * Device is left in a state that is also not possible to use "nvme set-feature"
>> + * to disable APST, but booting with nvme_core.default_ps_max_latency=0 works.
>> + */
>
>Overly long lines here, but I can fix that up.

Missed that, sorry. Thanks.

> Do you have a product
>name for this device or is just a nameless OEM controller?

I didn't; I had to google it. I see it listed on Kioxia's website as
CD6-V Series (6.4T model), but since the customer was running on a
DL380 Gen10, I suspect it could also be an HPE PE8030 (PN P19837-B21).


Cheers,

Enzo

2021-11-10 06:12:32

by Chaitanya Kulkarni

Subject: Re: [RFC PATCH] nvme: add NO APST quirk for Kioxia device

Enzo,

On 11/9/2021 7:04 AM, Enzo Matsumiya wrote:
> On 11/09, Christoph Hellwig wrote:
>> On Fri, Nov 05, 2021 at 11:08:57PM -0300, Enzo Matsumiya wrote:
>>> +    },
>>> +    {
>>> +        /*
>>> +         * This Kioxia device times out and aborts I/O during any load,
>>> +         * but more easily reproducible with discards (fstrim).
>>> +         *
>>> +         * Device is left in a state that is also not possible to use "nvme set-feature"
>>> +         * to disable APST, but booting with nvme_core.default_ps_max_latency=0 works.
>>> +         */
>>
>> Overly long lines here, but I can fix that up.
>
> Missed that, sorry. Thanks.
>
>>  Do you have a product
>> name for this device or is just a nameless OEM controller?
>
> I didn't, had to google. I see it listed on Kioxia's website as
> CD6-V Series (6.4T model), but since customer was running on an
> DL380 Gen10, I suspect it could also be HPE PE8030 (PN P19837-B21).
>

You can get more information about this controller using the nvme-cli
id-ctrl command.
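
For readers unfamiliar with what id-ctrl reports, the following is a minimal
userspace sketch of how the leading identification fields could be decoded
from a raw 4096-byte Identify Controller buffer. The byte offsets follow the
NVMe Identify Controller layout; the names `id_summary`, `copy_trimmed` and
`summarize_id_ctrl` are hypothetical and are not part of nvme-cli or the
kernel. It only decodes a buffer that something else has already filled in.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets within the Identify Controller data structure. */
enum {
	ID_CTRL_VID   = 0,   /* PCI vendor ID, 16-bit little endian */
	ID_CTRL_SSVID = 2,   /* PCI subsystem vendor ID, 16-bit little endian */
	ID_CTRL_SN    = 4,   /* serial number, 20 bytes, space-padded ASCII */
	ID_CTRL_MN    = 24,  /* model number, 40 bytes, space-padded ASCII */
	ID_CTRL_FR    = 64,  /* firmware revision, 8 bytes, space-padded ASCII */
};

/* Hypothetical summary type, for illustration only. */
struct id_summary {
	uint16_t vid;
	uint16_t ssvid;
	char mn[41];         /* 40 bytes plus terminating NUL */
	char fr[9];          /* 8 bytes plus terminating NUL */
};

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Copy a space-padded field and strip the trailing padding. */
static void copy_trimmed(char *dst, const uint8_t *src, size_t len)
{
	memcpy(dst, src, len);
	while (len > 0 && dst[len - 1] == ' ')
		len--;
	dst[len] = '\0';
}

static void summarize_id_ctrl(const uint8_t *buf, struct id_summary *out)
{
	out->vid = get_le16(buf + ID_CTRL_VID);
	out->ssvid = get_le16(buf + ID_CTRL_SSVID);
	copy_trimmed(out->mn, buf + ID_CTRL_MN, 40);
	copy_trimmed(out->fr, buf + ID_CTRL_FR, 8);
}
```

The vid, mn and fr values decoded this way are the fields a core quirk entry
can be keyed on.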


>
> Cheers,
>
> Enzo
>

2021-11-10 15:25:51

by Enzo Matsumiya

Subject: Re: [RFC PATCH] nvme: add NO APST quirk for Kioxia device

Hi Chaitanya,

On 11/10, Chaitanya Kulkarni wrote:
>Enzo,
>
>> I didn't, had to google. I see it listed on Kioxia's website as
>> CD6-V Series (6.4T model), but since customer was running on an
>> DL380 Gen10, I suspect it could also be HPE PE8030 (PN P19837-B21).
>>
>
>You can get more information about this controller using nvme-cli
>id-ctrl command.

I'm aware.

I don't have access to the system, but the id-ctrl output was shared
with us.

AFAICS these are the only identifying fields, and they all point to
Kioxia:

NVME Identify Controller:
vid : 0x1e0f
ssvid : 0x1e0f
sn : [redacted]
mn : KCD6XVUL6T40
fr : GPK1
rab : 2
ieee : 8ce38e
...

Now, something I'm not sure about: would these values change if another
brand were using this Kioxia chipset?

Is it crucial to have the product information (vs. only chipset information)
for quirks? I'm already asking the customer for this anyway.
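
To make the question concrete, here is a simplified userspace sketch of how
a quirk table keyed on vid and mn could be matched against identify data. It
is modeled on the idea in the patch (a NULL-terminated match string compared
against a space-padded identify string), not copied from core.c; the names
`quirk_entry`, `padded_matches`, `lookup_quirks` and the `QUIRK_NO_APST`
value are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUIRK_NO_APST (1u << 0)  /* illustrative value, not the kernel's */

/* Hypothetical stand-in for a core quirk table entry. */
struct quirk_entry {
	uint16_t vid;
	const char *mn;          /* NULL means "match any model number" */
	const char *fr;          /* NULL means "match any firmware revision" */
	unsigned int quirks;
};

/* match is NUL-terminated; idstr is a fixed-width, space-padded field. */
static bool padded_matches(const char *idstr, const char *match, size_t len)
{
	size_t n;

	if (!match)
		return true;

	n = strlen(match);
	if (n > len || memcmp(idstr, match, n) != 0)
		return false;

	/* Everything after the matched prefix must be padding. */
	for (; n < len; n++)
		if (idstr[n] != ' ')
			return false;

	return true;
}

/* OR together the quirks of every entry that matches vid, mn and fr. */
static unsigned int lookup_quirks(const struct quirk_entry *tbl, size_t count,
				  uint16_t vid, const char *mn, const char *fr)
{
	unsigned int quirks = 0;
	size_t i;

	for (i = 0; i < count; i++)
		if (tbl[i].vid == vid &&
		    padded_matches(mn, tbl[i].mn, 40) &&
		    padded_matches(fr, tbl[i].fr, 8))
			quirks |= tbl[i].quirks;

	return quirks;
}
```

With an entry of { 0x1e0f, "KCD6XVUL6T40", NULL, QUIRK_NO_APST }, a
controller reporting the vid and mn listed above would match regardless of
its firmware revision, while one reporting a different vid or mn would not.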


Cheers,

Enzo

2021-11-11 10:01:17

by Christoph Hellwig

Subject: Re: [RFC PATCH] nvme: add NO APST quirk for Kioxia device

Thanks, applied to nvme-5.16 with a few cosmetic fixups.