2024-02-28 07:53:39

by Daniel Drake

Subject: [PATCH v3 1/2] PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge

The Asus B1400 with original shipped firmware versions and VMD disabled
cannot resume from suspend: the NVMe device becomes unresponsive and
inaccessible.

This appears to be an untested D3cold transition by the vendor; Intel
socwatch shows that Windows leaves the NVMe device and parent bridge in D0
during suspend, even though these firmware versions have StorageD3Enable=1.

The NVMe device and parent PCI bridge both share the same "PXP" ACPI power
resource, which gets turned off as both devices are put into D3cold
during suspend. The _OFF() method calls DL23(), which sets an L23E
bit at offset 0xe2 in the PCI configuration space for this root port.
This is the specific write that the _ON() routine is unable to recover
from. This register is not documented in the public chipset datasheet.

Disallow D3cold on the PCI bridge to enable successful suspend/resume.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215742
Acked-by: Jian-Hong Pan <[email protected]>
Signed-off-by: Daniel Drake <[email protected]>
---
arch/x86/pci/fixup.c | 48 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)

v3:
Adjust comment and commit message based on feedback, and more detailed
investigation (on bugzilla) which indicates the problem may be more
attributable to the (lack of?) power management on the NVMe device port
rather than the parent bridge. There's no difference practically though
- both ACPI devices share the same power resource which is the one powered
down in D3cold...

v2:
Match only specific BIOS versions where this quirk is required.
Add subsequent patch to this series to revert the original S3 workaround
now that s2idle is usable again.

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index f347c20247d30..859a32fba8a96 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -907,6 +907,54 @@ static void chromeos_fixup_apl_pci_l1ss_capability(struct pci_dev *dev)
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_save_apl_pci_l1ss_capability);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_fixup_apl_pci_l1ss_capability);

+/*
+ * Disable D3cold on Asus B1400 PCI-NVMe bridge
+ *
+ * On this platform with VMD off, the NVMe device cannot successfully power
+ * back on from D3cold. This appears to be an untested transition by the
+ * vendor: Windows leaves the NVMe and parent bridge in D0 during suspend.
+ *
+ * We disable D3cold on the parent bridge for simplicity, and because both
+ * the parent bridge and the NVMe device share the same power resource.
+ *
+ * This is only needed on BIOS versions before 308; the newer versions flip
+ * StorageD3Enable from 1 to 0.
+ */
+static const struct dmi_system_id asus_nvme_broken_d3cold_table[] = {
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_MATCH(DMI_BIOS_VERSION, "B1400CEAE.304"),
+		},
+	},
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_MATCH(DMI_BIOS_VERSION, "B1400CEAE.305"),
+		},
+	},
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_MATCH(DMI_BIOS_VERSION, "B1400CEAE.306"),
+		},
+	},
+	{
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_MATCH(DMI_BIOS_VERSION, "B1400CEAE.307"),
+		},
+	},
+	{}
+};
+
+static void asus_disable_nvme_d3cold(struct pci_dev *pdev)
+{
+	if (dmi_check_system(asus_nvme_broken_d3cold_table) > 0)
+		pci_d3cold_disable(pdev);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x9a09, asus_disable_nvme_d3cold);
+
#ifdef CONFIG_SUSPEND
/*
* Root Ports on some AMD SoCs advertise PME_Support for D3hot and D3cold, but
--
2.39.2
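
The table-driven match in the patch above can be sketched in userspace C as follows. This is only an illustration under stated assumptions: struct quirk_entry, needs_d3cold_quirk() and quirk_self_check() are hypothetical names that do not exist in the kernel, and the strstr() comparison stands in for the substring semantics of DMI_MATCH. The vendor and BIOS version strings are the ones listed in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct dmi_system_id. */
struct quirk_entry {
	const char *sys_vendor;   /* NULL marks the end of the table */
	const char *bios_version;
};

/* Same vendor/BIOS pairs as the table in the patch. */
static const struct quirk_entry broken_d3cold_table[] = {
	{ "ASUSTeK COMPUTER INC.", "B1400CEAE.304" },
	{ "ASUSTeK COMPUTER INC.", "B1400CEAE.305" },
	{ "ASUSTeK COMPUTER INC.", "B1400CEAE.306" },
	{ "ASUSTeK COMPUTER INC.", "B1400CEAE.307" },
	{ NULL, NULL }
};

/*
 * Return true when both fields of any entry occur as substrings of the
 * strings reported by the machine. Every field of an entry must match.
 */
bool needs_d3cold_quirk(const char *vendor, const char *bios)
{
	const struct quirk_entry *e;

	for (e = broken_d3cold_table; e->sys_vendor; e++) {
		if (strstr(vendor, e->sys_vendor) &&
		    strstr(bios, e->bios_version))
			return true;
	}
	return false;
}

/* Usage example: an affected and a fixed BIOS version. */
void quirk_self_check(void)
{
	assert(needs_d3cold_quirk("ASUSTeK COMPUTER INC.", "B1400CEAE.304"));
	assert(!needs_d3cold_quirk("ASUSTeK COMPUTER INC.", "B1400CEAE.308"));
}
```

Listing each affected BIOS version explicitly, rather than comparing version numbers, keeps the quirk from applying to version 308 and later, where the patch comment says StorageD3Enable is already 0.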



2024-02-28 07:53:41

by Daniel Drake

Subject: [PATCH v3 2/2] Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default"

This reverts commit d52848620de00cde4a3a5df908e231b8c8868250, which
was originally put in place to work around a s2idle failure on this
platform where the NVMe device was inaccessible upon resume.

After extended testing, we found that the firmware's implementation of
S3 is buggy and intermittently fails to wake up the system. We need
to revert to s2idle mode.

The NVMe issue has now been solved more precisely in the commit titled
"PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge"

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215742
Acked-by: Jian-Hong Pan <[email protected]>
Signed-off-by: Daniel Drake <[email protected]>
---
drivers/acpi/sleep.c | 12 ------------
1 file changed, 12 deletions(-)

diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 808484d112097..728acfeb774d8 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -385,18 +385,6 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "20GGA00L00"),
 		},
 	},
-	/*
-	 * ASUS B1400CEAE hangs on resume from suspend (see
-	 * https://bugzilla.kernel.org/show_bug.cgi?id=215742).
-	 */
-	{
-		.callback = init_default_s3,
-		.ident = "ASUS B1400CEAE",
-		.matches = {
-			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
-			DMI_MATCH(DMI_PRODUCT_NAME, "ASUS EXPERTBOOK B1400CEAE"),
-		},
-	},
 	{},
 };

--
2.39.2
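
To confirm which suspend variant is the default once the revert is applied, one can read /sys/power/mem_sleep, where the kernel encloses the selected variant in square brackets (for example "s2idle [deep]"). A minimal sketch of parsing such a line follows. The function names selected_mem_sleep() and mem_sleep_self_check() are hypothetical, and the sample strings are illustrative inputs, not output captured from this machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed token of a mem_sleep-style line into out.
 * Returns false if no bracketed token exists or it does not fit.
 */
bool selected_mem_sleep(const char *line, char *out, size_t outlen)
{
	const char *lb = strchr(line, '[');
	const char *rb = lb ? strchr(lb, ']') : NULL;
	size_t len;

	if (!rb)
		return false;

	len = (size_t)(rb - lb - 1);
	if (len + 1 > outlen)
		return false;

	memcpy(out, lb + 1, len);
	out[len] = '\0';
	return true;
}

/* Usage example with an illustrative input line. */
void mem_sleep_self_check(void)
{
	char buf[16];

	assert(selected_mem_sleep("[s2idle] deep", buf, sizeof(buf)));
	assert(strcmp(buf, "s2idle") == 0);
}
```

In a real check, the line would be read from /sys/power/mem_sleep with fopen() and fgets() before being passed to the parser.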


2024-02-28 22:26:14

by Bjorn Helgaas

Subject: Re: [PATCH v3 2/2] Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default"

[+to Rafael]

On Wed, Feb 28, 2024 at 08:53:16AM +0100, Daniel Drake wrote:
> This reverts commit d52848620de00cde4a3a5df908e231b8c8868250, which
> was originally put in place to work around a s2idle failure on this
> platform where the NVMe device was inaccessible upon resume.
>
> After extended testing, we found that the firmware's implementation of
> S3 is buggy and intermittently fails to wake up the system. We need
> to revert to s2idle mode.
>
> The NVMe issue has now been solved more precisely in the commit titled
> "PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge"
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215742
> Acked-by: Jian-Hong Pan <[email protected]>
> Signed-off-by: Daniel Drake <[email protected]>

Rafael, if you're OK with this, I can queue both patches for v6.9.

> ---
> drivers/acpi/sleep.c | 12 ------------
> 1 file changed, 12 deletions(-)
>
> diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
> index 808484d112097..728acfeb774d8 100644
> --- a/drivers/acpi/sleep.c
> +++ b/drivers/acpi/sleep.c
> @@ -385,18 +385,6 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
>  			DMI_MATCH(DMI_PRODUCT_NAME, "20GGA00L00"),
>  		},
>  	},
> -	/*
> -	 * ASUS B1400CEAE hangs on resume from suspend (see
> -	 * https://bugzilla.kernel.org/show_bug.cgi?id=215742).
> -	 */
> -	{
> -		.callback = init_default_s3,
> -		.ident = "ASUS B1400CEAE",
> -		.matches = {
> -			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> -			DMI_MATCH(DMI_PRODUCT_NAME, "ASUS EXPERTBOOK B1400CEAE"),
> -		},
> -	},
>  	{},
>  };
>
> --
> 2.39.2
>

2024-02-29 17:46:47

by Rafael J. Wysocki

Subject: Re: [PATCH v3 2/2] Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default"

On Wed, Feb 28, 2024 at 11:26 PM Bjorn Helgaas <[email protected]> wrote:
>
> [+to Rafael]
>
> On Wed, Feb 28, 2024 at 08:53:16AM +0100, Daniel Drake wrote:
> > This reverts commit d52848620de00cde4a3a5df908e231b8c8868250, which
> > was originally put in place to work around a s2idle failure on this
> > platform where the NVMe device was inaccessible upon resume.
> >
> > After extended testing, we found that the firmware's implementation of
> > S3 is buggy and intermittently fails to wake up the system. We need
> > to revert to s2idle mode.
> >
> > The NVMe issue has now been solved more precisely in the commit titled
> > "PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge"
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=215742
> > Acked-by: Jian-Hong Pan <[email protected]>
> > Signed-off-by: Daniel Drake <[email protected]>
>
> Rafael, if you're OK with this, I can queue both patches for v6.9.

Yes, please!

Feel free to add

Acked-by: Rafael J. Wysocki <[email protected]>

to both.

Thanks!

2024-02-29 19:26:56

by Bjorn Helgaas

Subject: Re: [PATCH v3 2/2] Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default"

On Thu, Feb 29, 2024 at 06:38:01PM +0100, Rafael J. Wysocki wrote:
> On Wed, Feb 28, 2024 at 11:26 PM Bjorn Helgaas <[email protected]> wrote:
> >
> > [+to Rafael]
> >
> > On Wed, Feb 28, 2024 at 08:53:16AM +0100, Daniel Drake wrote:
> > > This reverts commit d52848620de00cde4a3a5df908e231b8c8868250, which
> > > was originally put in place to work around a s2idle failure on this
> > > platform where the NVMe device was inaccessible upon resume.
> > >
> > > After extended testing, we found that the firmware's implementation of
> > > S3 is buggy and intermittently fails to wake up the system. We need
> > > to revert to s2idle mode.
> > >
> > > The NVMe issue has now been solved more precisely in the commit titled
> > > "PCI: Disable D3cold on Asus B1400 PCI-NVMe bridge"
> > >
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=215742
> > > Acked-by: Jian-Hong Pan <[email protected]>
> > > Signed-off-by: Daniel Drake <[email protected]>
> >
> > Rafael, if you're OK with this, I can queue both patches for v6.9.
>
> Yes, please!
>
> Feel free to add
>
> Acked-by: Rafael J. Wysocki <[email protected]>
>
> to both.

Both patches applied with Rafael's ack to pci/pm for v6.9, thanks!

Bjorn