2018-11-06 07:12:59

by AceLan Kao

Subject: [PATCH v2 1/2] pci: prevent sk hynix nvme from entering D3

Putting the SK hynix NVMe device into D3 and then entering s2idle raises
its power consumption to 2.2W during s2idle, whereas it consumes less
than 1W during long idle.
According to the SK hynix FE, MS Windows doesn't put the NVMe device
into D3 and instead relies on the device's own APST feature for power
management.
To leverage APST during s2idle, we also must not disable the NVMe
device while suspending.

BTW, preventing the device from entering D3 increases power consumption
by around 0.13W ~ 0.15W during short/long idle, and the power
consumption during s2idle becomes 0.77W.

Signed-off-by: AceLan Kao <[email protected]>
---
drivers/pci/quirks.c | 1 +
include/linux/pci_ids.h | 2 ++
2 files changed, 3 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4700d24e5d55..b7e6492e8311 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1332,6 +1332,7 @@ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID,
occur when mode detecting */
DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID,
PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SK_HYNIX, 0x1527, quirk_no_ata_d3);

/*
* This was originally an Alpha-specific thing, but it really fits here.
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 69f0abe1ba1a..5f5adda07de0 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -3090,4 +3090,6 @@

#define PCI_VENDOR_ID_NCUBE 0x10ff

+#define PCI_VENDOR_ID_SK_HYNIX 0x1c5c
+
#endif /* _LINUX_PCI_IDS_H */
--
2.17.1
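
For readers unfamiliar with the fixup mechanism, the following is a rough userspace sketch of what the one-line addition to quirks.c amounts to: a table entry keyed on vendor/device ID whose hook sets a "no D3" flag on every matching device, unconditionally. In the kernel, quirk_no_ata_d3() sets PCI_DEV_FLAGS_NO_D3 in pdev->dev_flags; everything named fake_* or FAKE_* below, as well as apply_early_fixups(), is invented for illustration and is not a kernel API.

```c
#include <stddef.h>
#include <stdint.h>

#define VENDOR_SK_HYNIX 0x1c5c     /* PCI_VENDOR_ID_SK_HYNIX from the patch */
#define FAKE_FLAG_NO_D3 (1u << 1)  /* hypothetical stand-in for PCI_DEV_FLAGS_NO_D3 */

/* Hypothetical, heavily reduced model of struct pci_dev. */
struct fake_pci_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int dev_flags;
};

/* Hypothetical model of one fixup table entry. */
struct fake_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct fake_pci_dev *);
};

/* Models the effect of quirk_no_ata_d3(): forbid D3 for this device. */
static void fake_quirk_no_d3(struct fake_pci_dev *pdev)
{
	pdev->dev_flags |= FAKE_FLAG_NO_D3;
}

static const struct fake_fixup early_fixups[] = {
	{ VENDOR_SK_HYNIX, 0x1527, fake_quirk_no_d3 },
};

/* Run every hook whose vendor/device pair matches the device. */
static void apply_early_fixups(struct fake_pci_dev *pdev)
{
	size_t n = sizeof(early_fixups) / sizeof(early_fixups[0]);

	for (size_t i = 0; i < n; i++) {
		const struct fake_fixup *f = &early_fixups[i];

		if (f->vendor == pdev->vendor && f->device == pdev->device)
			f->hook(pdev);
	}
}
```

Note that the match depends only on the IDs. Nothing in this path looks at whether APST is in use or which firmware revision the device runs.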



2018-11-06 07:13:02

by AceLan Kao

Subject: [PATCH v2 2/2] nvme: add quirk to not call disable function when suspending

Calling nvme_dev_disable() makes the power consumption go up to
2.2 Watt during suspend-to-idle, and the SK hynix FE suggests that we
use the device's own APST feature for power management during
s2idle.
After D3 is disabled and nvme_dev_disable() is not called while
suspending, the power consumption drops to 0.77 Watt during s2idle.

V2:
- replace PCI_DEVICE with PCI_VDEVICE
- replace 0x1c5c with SK_HYNIX

Signed-off-by: AceLan Kao <[email protected]>
---
drivers/nvme/host/nvme.h | 5 +++++
drivers/nvme/host/pci.c | 8 +++++++-
2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index cee79cb388af..35d260a4cf46 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -90,6 +90,11 @@ enum nvme_quirks {
 	 * Set MEDIUM priority on SQ creation
 	 */
 	NVME_QUIRK_MEDIUM_PRIO_SQ = (1 << 7),
+
+	/*
+	 * Do not disable nvme when suspending(s2idle)
+	 */
+	NVME_QUIRK_NO_DISABLE = (1 << 8),
 };

/*
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c33bb201b884..13a2d6b2d047 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -31,6 +31,7 @@
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/sed-opal.h>
#include <linux/pci-p2pdma.h>
+#include <linux/suspend.h>

#include "nvme.h"

@@ -2612,8 +2613,11 @@ static int nvme_suspend(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct nvme_dev *ndev = pci_get_drvdata(pdev);
+	struct nvme_ctrl *ctrl = &ndev->ctrl;
+
+	if (!(pm_suspend_via_s2idle() && (ctrl->quirks & NVME_QUIRK_NO_DISABLE)))
+		nvme_dev_disable(ndev, true);

-	nvme_dev_disable(ndev, true);
 	return 0;
 }

@@ -2716,6 +2720,8 @@ static const struct pci_device_id nvme_id_table[] = {
 		.driver_data = NVME_QUIRK_LIGHTNVM, },
 	{ PCI_DEVICE(0x1d1d, 0x2601), /* CNEX Granby */
 		.driver_data = NVME_QUIRK_LIGHTNVM, },
+	{ PCI_VDEVICE(SK_HYNIX, 0x1527), /* Sk Hynix */
+		.driver_data = NVME_QUIRK_NO_DISABLE, },
 	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
--
2.17.1
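
The condition added to nvme_suspend() is easy to misread because of the outer negation, so here is a small standalone sketch of the same boolean logic. It assumes nothing beyond what the hunk shows; sketch_should_disable() and SKETCH_QUIRK_NO_DISABLE are made-up names that mirror the patch rather than kernel symbols, and the via_s2idle argument stands in for the result of pm_suspend_via_s2idle().

```c
#include <stdbool.h>

#define SKETCH_QUIRK_NO_DISABLE (1UL << 8) /* mirrors NVME_QUIRK_NO_DISABLE */

/*
 * Returns true when the patched nvme_suspend() would still call
 * nvme_dev_disable(): that is, in every case except "suspending via
 * s2idle AND the controller carries the no-disable quirk".
 */
static bool sketch_should_disable(bool via_s2idle, unsigned long quirks)
{
	return !(via_s2idle && (quirks & SKETCH_QUIRK_NO_DISABLE));
}
```

In other words, a quirked device is still disabled for other suspend types, and an unquirked device is disabled on every suspend exactly as before the patch.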


2018-11-09 00:22:41

by Bjorn Helgaas

Subject: Re: [PATCH v2 1/2] pci: prevent sk hynix nvme from entering D3

On Tue, Nov 06, 2018 at 03:12:13PM +0800, AceLan Kao wrote:
> It leads to the power consumption raises to 2.2W during s2idle, while
> it consumes less than 1W during long idle if put SK hynix nvme to D3
> and then enter s2idle.
> From SK hynix FE, MS Windows doesn't put nvme to D3, and uses its own
> APST feature to do the power management.
> To leverage its APST feature during s2idle, we can't disable nvme
> device while suspending, too.

I don't know how APST works, but it sounds like you want to disable D3
if you're using APST. But that's not what this patch does; this
disables it always.

I'm not sure we want a quirk for this at all, since as Christoph
points out, it doesn't fix a functional issue as the other uses of
quirk_no_ata_d3() do.

From your emails with Christoph, it sounds like this quirk is a
workaround for a firmware defect. If we *do* end up wanting a quirk,
the changelog should at least mention the firmware defect and maybe
check whether it has been fixed.

> BTW, prevent it from entering D3 will increase the power consumption around
> 0.13W ~ 0.15W during short/long idle, and the power consumption during
> s2idle becomes 0.77W.
>
> Signed-off-by: AceLan Kao <[email protected]>
> ---
> drivers/pci/quirks.c | 1 +
> include/linux/pci_ids.h | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 4700d24e5d55..b7e6492e8311 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -1332,6 +1332,7 @@ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID,
> occur when mode detecting */
> DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID,
> PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SK_HYNIX, 0x1527, quirk_no_ata_d3);
>
> /*
> * This was originally an Alpha-specific thing, but it really fits here.
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 69f0abe1ba1a..5f5adda07de0 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -3090,4 +3090,6 @@
>
> #define PCI_VENDOR_ID_NCUBE 0x10ff
>
> +#define PCI_VENDOR_ID_SK_HYNIX 0x1c5c
> +
> #endif /* _LINUX_PCI_IDS_H */
> --
> 2.17.1
>
>

2018-11-15 07:18:48

by Kai-Heng Feng

Subject: Re: [PATCH v2 1/2] pci: prevent sk hynix nvme from entering D3

Hi,

> On Nov 9, 2018, at 08:21, Bjorn Helgaas <[email protected]> wrote:
>
> On Tue, Nov 06, 2018 at 03:12:13PM +0800, AceLan Kao wrote:
>> It leads to the power consumption raises to 2.2W during s2idle, while
>> it consumes less than 1W during long idle if put SK hynix nvme to D3
>> and then enter s2idle.
>> From SK hynix FE, MS Windows doesn't put nvme to D3, and uses its own
>> APST feature to do the power management.
>> To leverage its APST feature during s2idle, we can't disable nvme
>> device while suspending, too.

We have a new Intel NVMe [8086:f1a6] that has this “new” behavior.

>
> I don't know how APST works, but it sounds like you want to disable D3
> if you're using APST. But that's not what this patch does; this
> disables it always.

Ok, will work on a new patch that only disables D3 when APST is enabled.

>
> I'm not sure we want a quirk for this at all, since as Christoph
> points out, it doesn't fix a functional issue as the other uses of
> quirk_no_ata_d3() do.
>
> From your emails with Christoph, it sounds like this quirk is a
> workaround for a firmware defect. If we *do* end up wanting a quirk,
> the changelog should at least mention the firmware defect and maybe
> check whether it has been fixed.

According to SK Hynix folks and new evidence on the new Intel NVMe
we have, this is something we are going to see more often.

Kai-Heng

>
>> BTW, prevent it from entering D3 will increase the power consumption around
>> 0.13W ~ 0.15W during short/long idle, and the power consumption during
>> s2idle becomes 0.77W.
>>
>> Signed-off-by: AceLan Kao <[email protected]>
>> ---
>> drivers/pci/quirks.c | 1 +
>> include/linux/pci_ids.h | 2 ++
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 4700d24e5d55..b7e6492e8311 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -1332,6 +1332,7 @@ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID,
>> occur when mode detecting */
>> DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID,
>> PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
>> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SK_HYNIX, 0x1527, quirk_no_ata_d3);
>>
>> /*
>> * This was originally an Alpha-specific thing, but it really fits here.
>> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
>> index 69f0abe1ba1a..5f5adda07de0 100644
>> --- a/include/linux/pci_ids.h
>> +++ b/include/linux/pci_ids.h
>> @@ -3090,4 +3090,6 @@
>>
>> #define PCI_VENDOR_ID_NCUBE 0x10ff
>>
>> +#define PCI_VENDOR_ID_SK_HYNIX 0x1c5c
>> +
>> #endif /* _LINUX_PCI_IDS_H */
>> --
>> 2.17.1
>>
>>


2018-11-15 14:59:19

by Bjorn Helgaas

Subject: Re: [PATCH v2 1/2] pci: prevent sk hynix nvme from entering D3

On Thu, Nov 15, 2018 at 03:16:29PM +0800, Kai Heng Feng wrote:
> > On Nov 9, 2018, at 08:21, Bjorn Helgaas <[email protected]> wrote:
> > On Tue, Nov 06, 2018 at 03:12:13PM +0800, AceLan Kao wrote:
> >> It leads to the power consumption raises to 2.2W during s2idle, while
> >> it consumes less than 1W during long idle if put SK hynix nvme to D3
> >> and then enter s2idle.
> >> From SK hynix FE, MS Windows doesn't put nvme to D3, and uses its own
> >> APST feature to do the power management.
> >> To leverage its APST feature during s2idle, we can't disable nvme
> >> device while suspending, too.
>
> We have a new Intel NVMe [8086:f1a6] that has this “new” behavior.
>
> > I don't know how APST works, but it sounds like you want to disable D3
> > if you're using APST. But that's not what this patch does; this
> > disables it always.
>
> Ok, will work on a new patch that only disables D3 when APST is enabled.

My comment was that the changelog didn't match the code. I don't know
which one is wrong, so I wasn't trying to suggest that you change the
code. If the code is right and the changelog is wrong, just change
the changelog.

> > I'm not sure we want a quirk for this at all, since as Christoph
> > points out, it doesn't fix a functional issue as the other uses of
> > quirk_no_ata_d3() do.
> >
> > From your emails with Christoph, it sounds like this quirk is a
> > workaround for a firmware defect. If we *do* end up wanting a quirk,
> > the changelog should at least mention the firmware defect and maybe
> > check whether it has been fixed.
>
> According to SK Hynix folks and new evidence on the new Intel NVMe
> we have, this is something we are going to see more often.

Hmmm, are you suggesting that if we went this quirk route, we'd be
updating the quirk frequently to add new devices?

I'm opposed to that as a strategy because it makes needless work. You
have to update the quirk, backport it to older kernels, re-release
distro kernels, etc.

If this situation is going to happen frequently, it would be better to
(a) fix the firmware defect (if that's what this is) or (b) pursue
some APST or other spec change so there's a generic documented way to
handle this without requiring device-specific quirks.

Bjorn

2018-11-15 17:31:43

by Bjorn Helgaas

Subject: Re: [PATCH v2 1/2] pci: prevent sk hynix nvme from entering D3

On Thu, Nov 15, 2018 at 08:58:09AM -0600, Bjorn Helgaas wrote:
> On Thu, Nov 15, 2018 at 03:16:29PM +0800, Kai Heng Feng wrote:
> > On Nov 9, 2018, at 08:21, Bjorn Helgaas <[email protected]> wrote:

> > > I'm not sure we want a quirk for this at all, since as Christoph
> > > points out, it doesn't fix a functional issue as the other uses of
> > > quirk_no_ata_d3() do.
> > >
> > > From your emails with Christoph, it sounds like this quirk is a
> > > workaround for a firmware defect. If we *do* end up wanting a quirk,
> > > the changelog should at least mention the firmware defect and maybe
> > > check whether it has been fixed.
> >
> > According to SK Hynix folks and new evidence on the new Intel NVMe
> > we have, this is something we are going to see more often.
>
> Hmmm, are you suggesting that if we went this quirk route, we'd be
> updating the quirk frequently to add new devices?
>
> I'm opposed to that as a strategy because it makes needless work. You
> have to update the quirk, backport it to older kernels, re-release
> distro kernels, etc.

But I guess you have to do this anyway just to add the vendor/device
ID to the driver, so maybe this isn't a big deal to you. If you can
do a quirk like this in the driver, it would be invisible to me and I
wouldn't care. I just don't want to deal with ongoing tweaks like
this in the PCI core :)

Bjorn

2018-11-16 07:50:03

by Christoph Hellwig

Subject: Re: [PATCH v2 1/2] pci: prevent sk hynix nvme from entering D3

On Thu, Nov 15, 2018 at 11:30:15AM -0600, Bjorn Helgaas wrote:
>
> But I guess you have to do this anyway just to add the vendor/device
> ID to the driver, so maybe this isn't a big deal to you. If you can
> do a quirk like this in the driver, it would be invisible to me and I
> wouldn't care. I just don't want to deal with ongoing tweaks like
> this in the PCI core :)

No, NVMe is a spec with a class code, and a specification that is
vendor independent. NVMe devices declare individual features based
on common fields.

APST is an optional feature with all kinds of parameters, but there
is absolutely no language that a host should not put the device into
D3 mode if APST is supported anywhere in the NVMe spec, and such
behavior is also rather counterintuitive. If SK Hynix thinks this
is sensible behavior they should bring it up in the NVMe technical
working group. I've pinged a contact there to see what this whole
story is about.
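
To illustrate what "features based on common fields" means for APST specifically: a controller advertises APST support in the APSTA byte of the Identify Controller data structure (byte 265, with bit 0 set when autonomous power state transitions are supported). The sketch below assumes the 4096-byte Identify Controller buffer has already been fetched by some other means; sketch_ctrl_supports_apst() and the SKETCH_* constants are hypothetical names, not part of the nvme driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ID_CTRL_SIZE  4096 /* size of the Identify Controller data */
#define SKETCH_ID_CTRL_APSTA 265  /* byte offset of the APSTA field */

/*
 * Returns true if the Identify Controller data reports APST support.
 * This only checks the capability bit; it says nothing about whether
 * APST is currently enabled on the controller.
 */
static bool sketch_ctrl_supports_apst(const uint8_t *id_ctrl, size_t len)
{
	if (id_ctrl == NULL || len < SKETCH_ID_CTRL_SIZE)
		return false;

	return (id_ctrl[SKETCH_ID_CTRL_APSTA] & 0x1) != 0;
}
```

A check like this is vendor independent, which is the contrast with a vendor/device ID match: it works for any controller that fills in the field, with no table to extend per device.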