2020-04-17 08:38:18

by Kai-Heng Feng

Subject: [PATCH] nvme/pci: Use Discard instead of Write Zeroes on SK hynix SC300

After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
command"), SK hynix SC300 becomes very slow with the following error
message:
[ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]

Use quirk NVME_QUIRK_DEALLOCATE_ZEROES to work around this issue.

BugLink: https://bugs.launchpad.net/bugs/1872383
Cc: linux-stable <[email protected]> # >= 5.1
Signed-off-by: Kai-Heng Feng <[email protected]>
---
drivers/nvme/host/pci.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 4e79e412b276..e3f4dac823d8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3068,6 +3068,8 @@ static const struct pci_device_id nvme_id_table[] = {
.driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, },
{ PCI_DEVICE(0x1c58, 0x0023), /* WDC SN200 adapter */
.driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, },
+ { PCI_DEVICE(0x1c5c, 0x1504), /* SK hynix SC300 */
+ .driver_data = NVME_QUIRK_DEALLOCATE_ZEROES, },
{ PCI_DEVICE(0x1c5f, 0x0540), /* Memblaze Pblaze4 adapter */
.driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, },
{ PCI_DEVICE(0x144d, 0xa821), /* Samsung PM1725 */
--
2.17.1


2020-04-17 13:04:54

by Christoph Hellwig

Subject: Re: [PATCH] nvme/pci: Use Discard instead of Write Zeroes on SK hynix SC300

On Fri, Apr 17, 2020 at 04:36:41PM +0800, Kai-Heng Feng wrote:
> After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
> command"), SK hynix SC300 becomes very slow with the following error
> message:
> [ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]
>
> Use quirk NVME_QUIRK_DEALLOCATE_ZEROES to work around this issue.

Do you have a written guarantee from SK Hynix that it will always zero
all blocks discarded?

2020-04-17 19:26:44

by Chaitanya Kulkarni

Subject: Re: [PATCH] nvme/pci: Use Discard instead of Write Zeroes on SK hynix SC300

On 04/17/2020 01:37 AM, Kai-Heng Feng wrote:
> After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
> command"), SK hynix SC300 becomes very slow with the following error
> message:
> [ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]
>
> Use quirk NVME_QUIRK_DEALLOCATE_ZEROES to work around this issue.
Can you share
nvme id-ctrl -H /dev/nvme0 | grep oncs -A 8
output?

2020-04-21 04:53:06

by Kai-Heng Feng

Subject: Re: [PATCH] nvme/pci: Use Discard instead of Write Zeroes on SK hynix SC300



> On Apr 17, 2020, at 20:59, Christoph Hellwig <[email protected]> wrote:
>
> On Fri, Apr 17, 2020 at 04:36:41PM +0800, Kai-Heng Feng wrote:
>> After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
>> command"), SK hynix SC300 becomes very slow with the following error
>> message:
>> [ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]
>>
>> Use quirk NVME_QUIRK_DEALLOCATE_ZEROES to work around this issue.
>
> Do you have a written guarantee from SK Hynix that it will always zero
> all blocks discarded?

I've raised the issue with SK Hynix and am waiting for their reply.

Kai-Heng

2020-04-21 04:53:35

by Kai-Heng Feng

Subject: Re: [PATCH] nvme/pci: Use Discard instead of Write Zeroes on SK hynix SC300



> On Apr 18, 2020, at 03:25, Chaitanya Kulkarni <[email protected]> wrote:
>
> On 04/17/2020 01:37 AM, Kai-Heng Feng wrote:
>> After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
>> command"), SK hynix SC300 becomes very slow with the following error
>> message:
>> [ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]
>>
>> Use quirk NVME_QUIRK_DEALLOCATE_ZEROES to work around this issue.
> Can you share
> nvme id-ctrl -H /dev/nvme0 | grep oncs -A 8
> output?
>

$ sudo nvme id-ctrl -H /dev/nvme0 | grep oncs -A 8
oncs : 0x1f
[7:7] : 0 Verify Not Supported
[6:6] : 0 Timestamp Not Supported
[5:5] : 0 Reservations Not Supported
[4:4] : 0x1 Save and Select Supported
[3:3] : 0x1 Write Zeroes Supported
[2:2] : 0x1 Data Set Management Supported
[1:1] : 0x1 Write Uncorrectable Supported
[0:0] : 0x1 Compare Supported
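
The ONCS value above can be decoded by hand. The following is a minimal standalone sketch, not driver code: the enum and `oncs_bit_set` are hypothetical names, and the bit positions are taken from the nvme-cli labels in the output above. It shows that 0x1f advertises Write Zeroes (bit 3) even though the drive fails the command, which suggests the problem cannot be detected from ONCS alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit indices within ONCS, as labelled in the nvme-cli output above. */
enum oncs_bit_index {
	ONCS_BIT_COMPARE      = 0,
	ONCS_BIT_WRITE_UNCORR = 1,
	ONCS_BIT_DSM          = 2,
	ONCS_BIT_WRITE_ZEROES = 3,
	ONCS_BIT_SAVE_SELECT  = 4,
	ONCS_BIT_RESERVATIONS = 5,
	ONCS_BIT_TIMESTAMP    = 6,
	ONCS_BIT_VERIFY       = 7,
};

/* Return true if the given ONCS bit is set. */
static bool oncs_bit_set(uint16_t oncs, unsigned int bit)
{
	assert(bit < 16);	/* ONCS is a 16-bit field */
	return ((oncs >> bit) & 1u) != 0;
}
```

With the reported value, `oncs_bit_set(0x1f, ONCS_BIT_WRITE_ZEROES)` is true and `oncs_bit_set(0x1f, ONCS_BIT_VERIFY)` is false, matching the table.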

2020-07-23 16:02:24

by Kai-Heng Feng

Subject: [PATCH v2] nvme/pci: Add new quirk for SK hynix PC400 NLB off-by-one bug

After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
command"), SK hynix PC400 becomes very slow with the following error
message:
[ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]

SK Hynix PC400 has buggy firmware that treats NLB as a max value instead
of a range, so the NLB passed isn't a valid value to the firmware.

According to SK hynix, three commands are affected:
- Write Zeroes
- Compare
- Write Uncorrectable

Write Uncorrectable isn't implemented yet, so add a new quirk to
work around the former two commands.

BugLink: https://bugs.launchpad.net/bugs/1872383
Cc: kyounghwan sohn <[email protected]>
Signed-off-by: Kai-Heng Feng <[email protected]>
---
v2:
- SK hynix found the root cause so change the approach accordingly.
- lspci is wrong, the device is PC400 instead of SC300.

drivers/nvme/host/core.c | 11 ++++++++++-
drivers/nvme/host/nvme.h | 5 +++++
drivers/nvme/host/pci.c | 2 ++
3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index add040168e67..1b51b6f5e2dd 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -659,15 +659,21 @@ static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
static inline blk_status_t nvme_setup_write_zeroes(struct nvme_ns *ns,
struct request *req, struct nvme_command *cmnd)
{
+ u16 length;
+
if (ns->ctrl->quirks & NVME_QUIRK_DEALLOCATE_ZEROES)
return nvme_setup_discard(ns, req, cmnd);

+ length = (blk_rq_bytes(req) >> ns->lba_shift) - 1;
+ if (ns->ctrl->quirks & NVME_QUIRK_SKHYNIX_NLB_BUG)
+ length--;
+
cmnd->write_zeroes.opcode = nvme_cmd_write_zeroes;
cmnd->write_zeroes.nsid = cpu_to_le32(ns->head->ns_id);
cmnd->write_zeroes.slba =
cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req)));
cmnd->write_zeroes.length =
- cpu_to_le16((blk_rq_bytes(req) >> ns->lba_shift) - 1);
+ cpu_to_le16(length);
cmnd->write_zeroes.control = 0;
return BLK_STS_OK;
}
@@ -1302,6 +1308,9 @@ static int nvme_submit_io(struct nvme_ns *ns, struct nvme_user_io __user *uio)
}

length = (io.nblocks + 1) << ns->lba_shift;
+ if (ns->ctrl->quirks & NVME_QUIRK_SKHYNIX_NLB_BUG && io.opcode == nvme_cmd_compare)
+ length--;
+
meta_len = (io.nblocks + 1) * ns->ms;
metadata = nvme_to_user_ptr(io.metadata);

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 1de3f9b827aa..4cbced5062a8 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -129,6 +129,11 @@ enum nvme_quirks {
* Don't change the value of the temperature threshold feature
*/
NVME_QUIRK_NO_TEMP_THRESH_CHANGE = (1 << 14),
+
+ /*
+ * SK Hynix PC400 NLB off-by-one bug
+ */
+ NVME_QUIRK_SKHYNIX_NLB_BUG = (1 << 15),
};

/*
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b1d18f0633c7..b5a54e6726e4 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3122,6 +3122,8 @@ static const struct pci_device_id nvme_id_table[] = {
{ PCI_DEVICE(0x1cc1, 0x8201), /* ADATA SX8200PNP 512GB */
.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+ { PCI_DEVICE(0x1c5c, 0x1504), /* SK Hynix PC400 */
+ .driver_data = NVME_QUIRK_SKHYNIX_NLB_BUG, },
{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
.driver_data = NVME_QUIRK_SINGLE_VECTOR },
--
2.17.1
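
For readers following the length arithmetic in the v2 diff above: NLB in NVMe I/O commands is a 0's-based count, so a request covering N logical blocks is encoded as N - 1. Below is a minimal standalone sketch of that encoding plus the extra decrement the v2 quirk applied. `nlb_encode` and its parameters are hypothetical names, not driver functions, and the guard against decrementing below zero is an assumption added here; the v2 patch decrements unconditionally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Encode a request size in bytes as a 0's-based NLB value.
 * lba_shift is log2 of the logical block size (9 for 512 B, 12 for 4 KiB).
 * If nlb_quirk is set, subtract one more, mirroring the v2 patch.
 */
static uint16_t nlb_encode(uint32_t bytes, unsigned int lba_shift, bool nlb_quirk)
{
	uint32_t blocks = bytes >> lba_shift;

	/* A 16-bit 0's-based field can describe 1..65536 blocks. */
	assert(blocks >= 1 && blocks <= 65536);

	uint16_t nlb = (uint16_t)(blocks - 1);

	/* Assumption: never wrap below zero (not present in the v2 patch). */
	if (nlb_quirk && nlb > 0)
		nlb--;

	return nlb;
}
```

For example, eight 4 KiB blocks (32768 bytes, `lba_shift` of 12) encode as 7 without the quirk and 6 with it.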

2020-07-23 16:04:54

by Christoph Hellwig

Subject: Re: [PATCH v2] nvme/pci: Add new quirk for SK hynix PC400 NLB off-by-one bug

On Thu, Jul 23, 2020 at 11:57:31PM +0800, Kai-Heng Feng wrote:
> After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
> command"), SK hynix PC400 becomes very slow with the following error
> message:
> [ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]
>
> SK Hynix PC400 has buggy firmware that treats NLB as a max value instead
> of a range, so the NLB passed isn't a valid value to the firmware.
>
> According to SK hynix, three commands are affected:
> - Write Zeroes
> - Compare
> - Write Uncorrectable
>
> Write Uncorrectable isn't implemented yet, so add a new quirk to
> work around the former two commands.

compare isn't implemented either in the kernel, and we certainly
aren't going to do these quirks for passthrough. So I think we really
want a "write zeroes is buggy" quirk and just disable issuing that
command from the driver.

>
> BugLink: https://bugs.launchpad.net/bugs/1872383
> Cc: kyounghwan sohn <[email protected]>
> Signed-off-by: Kai-Heng Feng <[email protected]>
> ---
> v2:
> - SK hynix found the root cause so change the approach accordingly.
> - lspci is wrong, the device is PC400 instead of SC300.

I don't remember seeing a v1.

2020-07-23 16:14:01

by Kai-Heng Feng

Subject: Re: [PATCH v2] nvme/pci: Add new quirk for SK hynix PC400 NLB off-by-one bug


> On Jul 24, 2020, at 00:03, Christoph Hellwig <[email protected]> wrote:
>
> On Thu, Jul 23, 2020 at 11:57:31PM +0800, Kai-Heng Feng wrote:
>> After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
>> command"), SK hynix PC400 becomes very slow with the following error
>> message:
>> [ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]
>>
>> SK Hynix PC400 has buggy firmware that treats NLB as a max value instead
>> of a range, so the NLB passed isn't a valid value to the firmware.
>>
>> According to SK hynix, three commands are affected:
>> - Write Zeroes
>> - Compare
>> - Write Uncorrectable
>>
>> Write Uncorrectable isn't implemented yet, so add a new quirk to
>> work around the former two commands.
>
> compare isn't implemented either in the kernel, and we certainly
> aren't going to do these quirks for passthrough. So I think we really
> want a "write zeroes is buggy" quirk and just disable issuing that
> command from the driver.

Ok, will send a new version based on your suggestion.

>
>>
>> BugLink: https://bugs.launchpad.net/bugs/1872383
>> Cc: kyounghwan sohn <[email protected]>
>> Signed-off-by: Kai-Heng Feng <[email protected]>
>> ---
>> v2:
>> - SK hynix found the root cause so change the approach accordingly.
>> - lspci is wrong, the device is PC400 instead of SC300.
>
> I don't remember seeing a v1.

Well, because it took some time for the vendor to find the root cause...
Here's the v1:
https://lkml.org/lkml/2020/4/17/169

Kai-Heng

2020-07-23 17:31:07

by Kai-Heng Feng

Subject: [PATCH v3] nvme/pci: Prevent SK hynix PC400 from using Write Zeroes command

After commit 6e02318eaea5 ("nvme: add support for the Write Zeroes
command"), SK hynix PC400 becomes very slow with the following error
message:
[ 224.567695] blk_update_request: operation not supported error, dev nvme1n1, sector 499384320 op 0x9:(WRITE_ZEROES) flags 0x1000000 phys_seg 0 prio class 0]

SK Hynix PC400 has buggy firmware that treats NLB as a max value instead
of a range, so the NLB passed isn't a valid value to the firmware.

According to SK hynix, three commands are affected:
- Write Zeroes
- Compare
- Write Uncorrectable

Right now only Write Zeroes is implemented, so disable it completely on
SK hynix PC400.

BugLink: https://bugs.launchpad.net/bugs/1872383
Cc: kyounghwan sohn <[email protected]>
Signed-off-by: Kai-Heng Feng <[email protected]>
---
v3:
- Simply disable Write Zeroes command.

v2:
- SK hynix found the root cause so change the approach accordingly.
- lspci is wrong, the device is PC400 instead of SC300.

drivers/nvme/host/pci.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b1d18f0633c7..25a187e43dbe 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3122,6 +3122,8 @@ static const struct pci_device_id nvme_id_table[] = {
{ PCI_DEVICE(0x1cc1, 0x8201), /* ADATA SX8200PNP 512GB */
.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+ { PCI_DEVICE(0x1c5c, 0x1504), /* SK Hynix PC400 */
+ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
.driver_data = NVME_QUIRK_SINGLE_VECTOR },
--
2.17.1
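
The effect of the table entry in the v3 patch above can be modelled outside the kernel. This is a sketch under stated assumptions, not the driver's implementation: `struct id_entry`, `lookup_quirks`, `may_issue_write_zeroes` and the flag value chosen for `QUIRK_DISABLE_WRITE_ZEROES` are all hypothetical; only the PCI IDs and the ONCS bit position come from this thread. The idea is that a device matched by vendor/device ID carries a quirk flag, and the host refrains from issuing Write Zeroes when that flag is set, even if ONCS advertises the command.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag value only; the real driver defines its own quirk bits. */
#define QUIRK_DISABLE_WRITE_ZEROES (1ul << 0)
/* ONCS bit 3: Write Zeroes supported (see the id-ctrl output earlier). */
#define ONCS_WRITE_ZEROES          (1u << 3)

struct id_entry {
	uint16_t vendor;
	uint16_t device;
	unsigned long quirks;
};

/* Hypothetical stand-in for a PCI ID table with per-device quirks. */
static const struct id_entry id_table[] = {
	{ 0x1c5c, 0x1504, QUIRK_DISABLE_WRITE_ZEROES },	/* SK Hynix PC400 */
};

/* Return the quirk flags for a device, or 0 if it is not listed. */
static unsigned long lookup_quirks(uint16_t vendor, uint16_t device)
{
	for (size_t i = 0; i < sizeof(id_table) / sizeof(id_table[0]); i++) {
		if (id_table[i].vendor == vendor && id_table[i].device == device)
			return id_table[i].quirks;
	}
	return 0;
}

/* Write Zeroes is used only if advertised and not disabled by a quirk. */
static bool may_issue_write_zeroes(uint16_t oncs, unsigned long quirks)
{
	return (oncs & ONCS_WRITE_ZEROES) != 0 &&
	       (quirks & QUIRK_DISABLE_WRITE_ZEROES) == 0;
}
```

With the ONCS value 0x1f reported earlier, an unlisted device would be allowed to use Write Zeroes, while the device with ID 0x1c5c:0x1504 would not. When the command is unavailable, the block layer is expected to fall back to writing zero-filled pages, which is slower per request but avoids the failing command.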

2020-10-13 12:00:13

by Kai-Heng Feng

Subject: [PATCH] nvme-pci: Disable Write Zeroes on Sandisk Skyhawk

Like commit 5611ec2b9814 ("nvme-pci: prevent SK hynix PC400 from using
Write Zeroes command"), Sandisk Skyhawk has the same issue:
[ 6305.633887] blk_update_request: operation not supported error, dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0

So also disable Write Zeroes command on Sandisk Skyhawk.

BugLink: https://bugs.launchpad.net/bugs/1899503
Signed-off-by: Kai-Heng Feng <[email protected]>
---
drivers/nvme/host/pci.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8984796db0c8..d310d7317e2a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3187,6 +3187,8 @@ static const struct pci_device_id nvme_id_table[] = {
NVME_QUIRK_IGNORE_DEV_SUBNQN, },
{ PCI_DEVICE(0x1c5c, 0x1504), /* SK Hynix PC400 */
.driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
+ { PCI_DEVICE(0x15b7, 0x2001), /* Sandisk Skyhawk */
+ .driver_data = NVME_QUIRK_DISABLE_WRITE_ZEROES, },
{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
.driver_data = NVME_QUIRK_SINGLE_VECTOR },
--
2.17.1

2020-10-14 08:04:13

by Chaitanya Kulkarni

Subject: Re: [PATCH] nvme-pci: Disable Write Zeroes on Sandisk Skyhawk

On 10/13/20 01:45, Kai-Heng Feng wrote:
> Like commit 5611ec2b9814 ("nvme-pci: prevent SK hynix PC400 from using
> Write Zeroes command"), Sandisk Skyhawk has the same issue:
> [ 6305.633887] blk_update_request: operation not supported error, dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>
> So also disable Write Zeroes command on Sandisk Skyhawk.
>
> BugLink: https://bugs.launchpad.net/bugs/1899503
> Signed-off-by: Kai-Heng Feng <[email protected]>

Are you sure this happens on all devices of the same model
and is not a firmware bug on a specific device?

If yes then looks good.

Reviewed-by: Chaitanya Kulkarni <[email protected]>



2020-10-15 10:47:19

by Kai-Heng Feng

Subject: Re: [PATCH] nvme-pci: Disable Write Zeroes on Sandisk Skyhawk



> On Oct 14, 2020, at 08:20, Chaitanya Kulkarni <[email protected]> wrote:
>
> On 10/13/20 01:45, Kai-Heng Feng wrote:
>> Like commit 5611ec2b9814 ("nvme-pci: prevent SK hynix PC400 from using
>> Write Zeroes command"), Sandisk Skyhawk has the same issue:
>> [ 6305.633887] blk_update_request: operation not supported error, dev nvme0n1, sector 340812032 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
>>
>> So also disable Write Zeroes command on Sandisk Skyhawk.
>>
>> BugLink: https://bugs.launchpad.net/bugs/1899503
>> Signed-off-by: Kai-Heng Feng <[email protected]>
>
> Are you sure this happens on all devices of the same model
> and is not a firmware bug on a specific device?
>
> If yes then looks good.

Yes, and this is the reply from WD, requested by the user:
"These are newest WD branded drives and WD already confirmed they are
the newest firmware / no upgrades are available."

Kai-Heng

>
> Reviewed-by: Chaitanya Kulkarni <[email protected]>
>
>
>

2020-10-16 18:43:59

by Christoph Hellwig

Subject: Re: [PATCH] nvme-pci: Disable Write Zeroes on Sandisk Skyhawk

Thanks,

applied to nvme-5.10.