A regression occurred in 6.4 due to a missing ID in `amd_nb`
and the `amd_pmc` driver now being dependent on amd_smn_read().
This series adds the missing ID which fixes s2idle on this
system. The ID also enables k10temp to work.
Mario Limonciello (2):
amd_nb: Add PCI ID for family 19h model 78h
k10temp: Add pci ID for family 19, model 78h
arch/x86/kernel/amd_nb.c | 2 ++
drivers/hwmon/k10temp.c | 1 +
include/linux/pci_ids.h | 1 +
3 files changed, 4 insertions(+)
--
2.34.1
s2idle previously worked on this system, but it regressed in kernel
6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
index 0 for driver probe").
The reason for the regression is that before this commit the SMN
communication was hardcoded, but after amd_smn_read() is used which
relies upon the misc PCI ID used by DF function 3 being included in
a table. The ID was missing for model 78h, so this meant that the
amd_smn_read() wouldn't work.
Add the missing ID into amd_nb, restoring s2idle on this system.
Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
Signed-off-by: Mario Limonciello <[email protected]>
---
arch/x86/kernel/amd_nb.c | 2 ++
include/linux/pci_ids.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 4266b64631a4..7e331e8f3692 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -36,6 +36,7 @@
#define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
#define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
#define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
+#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
/* Protect the PCI config register pairs used for SMN. */
static DEFINE_MUTEX(smn_mutex);
@@ -79,6 +80,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
+ { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
{}
};
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 45c3d62e616d..95f33dadb2be 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -567,6 +567,7 @@
#define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
#define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
#define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
+#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
#define PCI_DEVICE_ID_AMD_CNB17H_F3 0x1703
#define PCI_DEVICE_ID_AMD_LANCE 0x2000
#define PCI_DEVICE_ID_AMD_LANCE_HOME 0x2001
--
2.34.1
This enables k10temp to work on this system.
Signed-off-by: Mario Limonciello <[email protected]>
---
drivers/hwmon/k10temp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c
index ba2f6a4f8c16..7b177b9fbb09 100644
--- a/drivers/hwmon/k10temp.c
+++ b/drivers/hwmon/k10temp.c
@@ -507,6 +507,7 @@ static const struct pci_device_id k10temp_id_table[] = {
{ PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
{ PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
{ PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
+ { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
{ PCI_VDEVICE(HYGON, PCI_DEVICE_ID_AMD_17H_DF_F3) },
{}
};
--
2.34.1
On 4/26/23 22:33, Mario Limonciello wrote:
> This enables k10temp to work on this system.
>
> Signed-off-by: Mario Limonciello <[email protected]>
> ---
Acked-by: Guenter Roeck <[email protected]>
in case someone wants to apply this patch together with the first patch
of the series. Otherwise I'll wait with applying it until after that
patch is available upstream.
Guenter
> drivers/hwmon/k10temp.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c
> index ba2f6a4f8c16..7b177b9fbb09 100644
> --- a/drivers/hwmon/k10temp.c
> +++ b/drivers/hwmon/k10temp.c
> @@ -507,6 +507,7 @@ static const struct pci_device_id k10temp_id_table[] = {
> { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
> { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
> { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
> + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
> { PCI_VDEVICE(HYGON, PCI_DEVICE_ID_AMD_17H_DF_F3) },
> {}
> };
On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
> s2idle previously worked on this system, but it regressed in kernel
> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
> index 0 for driver probe").
>
> The reason for the regression is that before this commit the SMN
> communication was hardcoded, but after amd_smn_read() is used which
> relies upon the misc PCI ID used by DF function 3 being included in
> a table. The ID was missing for model 78h, so this meant that the
> amd_smn_read() wouldn't work.
>
> Add the missing ID into amd_nb, restoring s2idle on this system.
Is there a long-term solution for this that will not require adding
new IDs every time new hardware comes out?
drivers/platform/x86/amd/pmc.c already matches ACPI IDs; maybe there's
some way for the platform to provide the information you need via
ACPI or something?
Bjorn
On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
> s2idle previously worked on this system, but it regressed in kernel
> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
> index 0 for driver probe").
>
> The reason for the regression is that before this commit the SMN
> communication was hardcoded, but after amd_smn_read() is used which
> relies upon the misc PCI ID used by DF function 3 being included in
> a table. The ID was missing for model 78h, so this meant that the
> amd_smn_read() wouldn't work.
>
> Add the missing ID into amd_nb, restoring s2idle on this system.
>
> Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
> Signed-off-by: Mario Limonciello <[email protected]>
Grudgingly, because this really isn't a maintainable strategy:
Acked-by: Bjorn Helgaas <[email protected]> # pci_ids.h
> ---
> arch/x86/kernel/amd_nb.c | 2 ++
> include/linux/pci_ids.h | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
> index 4266b64631a4..7e331e8f3692 100644
> --- a/arch/x86/kernel/amd_nb.c
> +++ b/arch/x86/kernel/amd_nb.c
> @@ -36,6 +36,7 @@
> #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
> #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
> #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
>
> /* Protect the PCI config register pairs used for SMN. */
> static DEFINE_MUTEX(smn_mutex);
> @@ -79,6 +80,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
> {}
> };
>
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 45c3d62e616d..95f33dadb2be 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -567,6 +567,7 @@
> #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
> #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
> #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
> #define PCI_DEVICE_ID_AMD_CNB17H_F3 0x1703
> #define PCI_DEVICE_ID_AMD_LANCE 0x2000
> #define PCI_DEVICE_ID_AMD_LANCE_HOME 0x2001
> --
> 2.34.1
>
[Public]
> On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
> > s2idle previously worked on this system, but it regressed in kernel
> > 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
> > index 0 for driver probe").
> >
> > The reason for the regression is that before this commit the SMN
> > communication was hardcoded, but after amd_smn_read() is used which
> > relies upon the misc PCI ID used by DF function 3 being included in
> > a table. The ID was missing for model 78h, so this meant that the
> > amd_smn_read() wouldn't work.
> >
> > Add the missing ID into amd_nb, restoring s2idle on this system.
>
> Is there a long-term solution for this that will not require adding
> new IDs every time new hardware comes out?
>
> drivers/platform/x86/amd/pmc.c already matches ACPI IDs; maybe there's
> some way for the platform to provide the information you need via
> ACPI or something?
>
> Bjorn
I had the same question when I came up with this and found out the
ACPI tables don't encode the information needed right now.
But, yes there are discussions ongoing to make this more scalable in
a future generation.
On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
> s2idle previously worked on this system, but it regressed in kernel
> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
> index 0 for driver probe").
>
> The reason for the regression is that before this commit the SMN
> communication was hardcoded, but after amd_smn_read() is used which
> relies upon the misc PCI ID used by DF function 3 being included in
> a table. The ID was missing for model 78h, so this meant that the
> amd_smn_read() wouldn't work.
>
> Add the missing ID into amd_nb, restoring s2idle on this system.
>
> Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
> Signed-off-by: Mario Limonciello <[email protected]>
FWIW:
Acked-by: Guenter Roeck <[email protected]>
Note that this patch is not upstream, meaning the second patch
in the series can not be applied either. I am not sure if that is
because of "regressed in kernel 6.4" - after all, that kernel does not
exist yet. The offending patch _is_ in the upstream kernel, though.
It might make sense to inform the regression bot if the problem is
not fixed when v6.4-rc1 is made available.
Guenter
> ---
> arch/x86/kernel/amd_nb.c | 2 ++
> include/linux/pci_ids.h | 1 +
> 2 files changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
> index 4266b64631a4..7e331e8f3692 100644
> --- a/arch/x86/kernel/amd_nb.c
> +++ b/arch/x86/kernel/amd_nb.c
> @@ -36,6 +36,7 @@
> #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
> #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
> #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
>
> /* Protect the PCI config register pairs used for SMN. */
> static DEFINE_MUTEX(smn_mutex);
> @@ -79,6 +80,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
> {}
> };
>
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 45c3d62e616d..95f33dadb2be 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -567,6 +567,7 @@
> #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
> #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
> #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
> #define PCI_DEVICE_ID_AMD_CNB17H_F3 0x1703
> #define PCI_DEVICE_ID_AMD_LANCE 0x2000
> #define PCI_DEVICE_ID_AMD_LANCE_HOME 0x2001
> --
> 2.34.1
>
On 5/6/23 09:05, Guenter Roeck wrote:
> On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
>> s2idle previously worked on this system, but it regressed in kernel
>> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
>> index 0 for driver probe").
>>
>> The reason for the regression is that before this commit the SMN
>> communication was hardcoded, but after amd_smn_read() is used which
>> relies upon the misc PCI ID used by DF function 3 being included in
>> a table. The ID was missing for model 78h, so this meant that the
>> amd_smn_read() wouldn't work.
>>
>> Add the missing ID into amd_nb, restoring s2idle on this system.
>>
>> Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
>> Signed-off-by: Mario Limonciello <[email protected]>
> FWIW:
>
> Acked-by: Guenter Roeck <[email protected]>
>
> Note that this patch is not upstream, meaning the second patch
> in the series can not be applied either. I am not sure if that is
> because of "regressed in kernel 6.4" - after all, that kernel does not
> exist yet. The offending patch _is_ in the upstream kernel, though.
> It might make sense to inform the regression bot if the problem is
> not fixed when v6.4-rc1 is made available.
You're right; the commit message should:
s,but it regressed in kernel 6.4 due,but it regressed in,
Boris told me that he's waiting for 6.4-rc1 to pick this series up.
#regzbot ^introduced 310e782a99c7
>
> Guenter
>
>> ---
>> arch/x86/kernel/amd_nb.c | 2 ++
>> include/linux/pci_ids.h | 1 +
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
>> index 4266b64631a4..7e331e8f3692 100644
>> --- a/arch/x86/kernel/amd_nb.c
>> +++ b/arch/x86/kernel/amd_nb.c
>> @@ -36,6 +36,7 @@
>> #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
>> #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
>> #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
>> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
>>
>> /* Protect the PCI config register pairs used for SMN. */
>> static DEFINE_MUTEX(smn_mutex);
>> @@ -79,6 +80,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
>> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
>> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
>> { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
>> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
>> {}
>> };
>>
>> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
>> index 45c3d62e616d..95f33dadb2be 100644
>> --- a/include/linux/pci_ids.h
>> +++ b/include/linux/pci_ids.h
>> @@ -567,6 +567,7 @@
>> #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
>> #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
>> #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
>> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
>> #define PCI_DEVICE_ID_AMD_CNB17H_F3 0x1703
>> #define PCI_DEVICE_ID_AMD_LANCE 0x2000
>> #define PCI_DEVICE_ID_AMD_LANCE_HOME 0x2001
>> --
>> 2.34.1
>>
On Thu, Apr 27, 2023 at 11:48:16AM -0500, Bjorn Helgaas wrote:
> Grudgingly, because this really isn't a maintainable strategy:
>
> Acked-by: Bjorn Helgaas <[email protected]> # pci_ids.h
I was wondering whether that define should even go into pci_ids.h but
there's a patch 2 which uses it and which is *not* in my mbox.
Mario, please CC everybody on all patches in the future.
I'll take the k10temp one too so that there's no cross-tree deps.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 7d8accfaa0ab65e4282c8e58950f7d688342cd86
Gitweb: https://git.kernel.org/tip/7d8accfaa0ab65e4282c8e58950f7d688342cd86
Author: Mario Limonciello <[email protected]>
AuthorDate: Thu, 27 Apr 2023 00:33:37 -05:00
Committer: Borislav Petkov (AMD) <[email protected]>
CommitterDate: Mon, 08 May 2023 11:36:19 +02:00
hwmon: (k10temp) Add PCI ID for family 19, model 78h
Enable k10temp on this system.
[ bp: Massage. ]
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Acked-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/hwmon/k10temp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c
index ba2f6a4..7b177b9 100644
--- a/drivers/hwmon/k10temp.c
+++ b/drivers/hwmon/k10temp.c
@@ -507,6 +507,7 @@ static const struct pci_device_id k10temp_id_table[] = {
{ PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
{ PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
{ PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
+ { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
{ PCI_VDEVICE(HYGON, PCI_DEVICE_ID_AMD_17H_DF_F3) },
{}
};
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
Gitweb: https://git.kernel.org/tip/23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
Author: Mario Limonciello <[email protected]>
AuthorDate: Thu, 27 Apr 2023 00:33:36 -05:00
Committer: Borislav Petkov (AMD) <[email protected]>
CommitterDate: Mon, 08 May 2023 11:25:19 +02:00
x86/amd_nb: Add PCI ID for family 19h model 78h
Commit
310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
switched to using amd_smn_read() which relies upon the misc PCI ID used
by DF function 3 being included in a table. The ID for model 78h is
missing in that table, so amd_smn_read() doesn't work.
Add the missing ID into amd_nb, restoring s2idle on this system.
[ bp: Simplify commit message. ]
Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]> # pci_ids.h
Acked-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/kernel/amd_nb.c | 2 ++
include/linux/pci_ids.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 4266b64..7e331e8 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -36,6 +36,7 @@
#define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
#define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
#define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
+#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
/* Protect the PCI config register pairs used for SMN. */
static DEFINE_MUTEX(smn_mutex);
@@ -79,6 +80,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
+ { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
{}
};
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 45c3d62..95f33da 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -567,6 +567,7 @@
#define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
#define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
#define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
+#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
#define PCI_DEVICE_ID_AMD_CNB17H_F3 0x1703
#define PCI_DEVICE_ID_AMD_LANCE 0x2000
#define PCI_DEVICE_ID_AMD_LANCE_HOME 0x2001
On Mon, May 08, 2023 at 01:13:14PM +0200, Thorsten Leemhuis wrote:
> Which afaics means that users of -rc1 are now affected by this and might
> waste time bisecting a known issue that could easily have been fixed
> already. :-/ That doesn't feel right. Or am I missing something?
-rc1 is pretty much the most broken tree there is. And it is not an
officially released tree but a, well, the first release candidate. So
fixes are trickling in, there's lag between what gets found, when the
maintainers pick it up and when it ends up upstream and so on and so on.
Some fixes need longer testing because there have been cases where a fix
breaks something else.
And yes, maintainers can always expedite a fix or Linus will pick it up
directly if it breaks a lot of boxes in a nasty way.
So, long story short, I don't think you should track -rcs. You are
tracking the reports already and you're tracking where their fixes land
so I guess that's good enough.
> /me wonders I he should start tracking regressions more closely during
> the merge window to catch and prevent situations like this...
I don't see a "situation" here. rcs can be broken for some use cases and
that is fine as long as that breakage doesn't get released.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 07.05.23 14:51, Mario Limonciello wrote:
> On 5/6/23 09:05, Guenter Roeck wrote:
>> On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
>>> s2idle previously worked on this system, but it regressed in kernel
>>> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
>>> index 0 for driver probe").
>>>
>>> The reason for the regression is that before this commit the SMN
>>> communication was hardcoded, but after amd_smn_read() is used which
>>> relies upon the misc PCI ID used by DF function 3 being included in
>>> a table. The ID was missing for model 78h, so this meant that the
>>> amd_smn_read() wouldn't work.
>>>
>>> Add the missing ID into amd_nb, restoring s2idle on this system.
>>>
>>> Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for
>>> driver probe")
>>> Signed-off-by: Mario Limonciello <[email protected]>
>> FWIW:
>>
>> Acked-by: Guenter Roeck <[email protected]>
>>
>> Note that this patch is not upstream, meaning the second patch
>> in the series can not be applied either. I am not sure if that is
>> because of "regressed in kernel 6.4" - after all, that kernel does not
>> exist yet. The offending patch _is_ in the upstream kernel, though.
>> It might make sense to inform the regression bot if the problem is
>> not fixed when v6.4-rc1 is made available.
>
> You're right; the commit message should:
>
> s,but it regressed in kernel 6.4 due,but it regressed in,
>
> Boris told me that he's waiting for 6.4-rc1 to pick this series up.
Which afaics means that users of -rc1 are now affected by this and might
waste time bisecting a known issue that could easily have been fixed
already. :-/ That doesn't feel right. Or am I missing something?
/me wonders I he should start tracking regressions more closely during
the merge window to catch and prevent situations like this...
> #regzbot ^introduced 310e782a99c7
Thx for adding it.
#regzbot fix: 7d8accfaa0ab65e42
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
On Mon, May 08, 2023 at 01:25:43PM +0200, Borislav Petkov wrote:
> On Mon, May 08, 2023 at 01:13:14PM +0200, Thorsten Leemhuis wrote:
> > Which afaics means that users of -rc1 are now affected by this and might
> > waste time bisecting a known issue that could easily have been fixed
> > already. :-/ That doesn't feel right. Or am I missing something?
>
> -rc1 is pretty much the most broken tree there is. And it is not an
> officially released tree but a, well, the first release candidate. So
> fixes are trickling in, there's lag between what gets found, when the
> maintainers pick it up and when it ends up upstream and so on and so on.
>
> Some fixes need longer testing because there have been cases where a fix
> breaks something else.
>
> And yes, maintainers can always expedite a fix or Linus will pick it up
> directly if it breaks a lot of boxes in a nasty way.
>
> So, long story short, I don't think you should track -rcs. You are
> tracking the reports already and you're tracking where their fixes land
> so I guess that's good enough.
>
I absolutely disagree. Without Thorsten's tracking, Linus would have no
idea what the status of the kernel is.
> > /me wonders I he should start tracking regressions more closely during
> > the merge window to catch and prevent situations like this...
>
> I don't see a "situation" here. rcs can be broken for some use cases and
> that is fine as long as that breakage doesn't get released.
>
Again, I disagree. The whole point of testing release candidates is to get
problems fixed. If issues are not fixed timely, they just pile up on top
of each other and make it difficult to identify new issues (and, in many
cases, to test the kernel in the first place).
I find it quite annoying that problems are identfied, often even in -next,
the patch intoducing them is applied to mainline anyway, and then
it sometimes takes until -rc5 or even later to get the fix applied (even if
the fix has been known for weeks or even months). It sometimes even takes
Linus' intervention to get fixes applied to the upstream kernel. That
really should not be necessary. Telling those who track regressions
to stay away from release candidates is absolutely the wrong thing to do
and would only serve to make release candidates quite pointless.
v6.4-rc1 is a good example. Fixes for all build breakages were published
before the commit window opened, yet at least one of them did not make
it into -rc1. Together with this patch there are now at least two
regressions if -rc1 whch could have been avoided and may impact testability
on affected systems.
Guenter
On Mon, May 08, 2023 at 06:28:03AM -0700, Guenter Roeck wrote:
> Together with this patch there are now at least two regressions if
> -rc1 whch could have been avoided and may impact testability on
> affected systems.
Are you saying that this patch which fixes s2idle on some random box
should've gone to Linus *immediately*?
And read my mail again:
"Some fixes need longer testing because there have been cases where
a fix breaks something else."
So yes, I disagree with rushing fixes immediately. If they're obvious
- whatever that means - then sure but not all of them are such.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
[AMD Official Use Only - General]
+stable, Sasha
> > Together with this patch there are now at least two regressions if
> > -rc1 whch could have been avoided and may impact testability on
> > affected systems.
>
> Are you saying that this patch which fixes s2idle on some random box
> should've gone to Linus *immediately*?
>
> And read my mail again:
>
> "Some fixes need longer testing because there have been cases where
> a fix breaks something else."
>
> So yes, I disagree with rushing fixes immediately. If they're obvious
> - whatever that means - then sure but not all of them are such.
>
> --
Unfortunately, it looks like the broken commit got backported into 6.1.28,
but the fix still isn't in Linus' tree.
Sasha,
Can you please pick up
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
for 6.1.29 to fix the regression?
Thanks,
On 11.05.23 21:51, Limonciello, Mario wrote:
> [AMD Official Use Only - General]
>
> +stable, Sasha
>
>>> Together with this patch there are now at least two regressions if
>>> -rc1 whch could have been avoided and may impact testability on
>>> affected systems.
>>
>> Are you saying that this patch which fixes s2idle on some random box
>> should've gone to Linus *immediately*?
>>
>> And read my mail again:
>>
>> "Some fixes need longer testing because there have been cases where
>> a fix breaks something else."
>>
>> So yes, I disagree with rushing fixes immediately. If they're obvious
>> - whatever that means - then sure but not all of them are such.
>>
>> --
>
> Unfortunately, it looks like the broken commit got backported into 6.1.28,
> but the fix still isn't in Linus' tree.
>
> Sasha,
>
> Can you please pick up
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
> for 6.1.29 to fix the regression?
FWIW, the stable team afaics usually does not fix anything in stable
trees before it's fixed in mainline. IOW: that fix now quickly needs to
go to Linus to get it quickly fixed in 6.1.y.
Side note: I'll soon post a rewritten section of 'Prioritize work on
fixing regressions' which is part of
Documentation/process/handling-regressions.rst. It among others will
cover this case:
```
* Whenever you want to swiftly resolve a regression that recently also
made it into a proper mainline, stable, or longterm release, fix it
quickly in mainline; when appropriate thus involve Linus to fast-track
the fix. That's because the stable team normally does neither revert nor
fix any changes that cause a regression in mainline, too.
```
Ciao, Thorsten
On Thu, May 11, 2023 at 07:51:42PM +0000, Limonciello, Mario wrote:
>[AMD Official Use Only - General]
>
>+stable, Sasha
>
>> > Together with this patch there are now at least two regressions if
>> > -rc1 whch could have been avoided and may impact testability on
>> > affected systems.
>>
>> Are you saying that this patch which fixes s2idle on some random box
>> should've gone to Linus *immediately*?
>>
>> And read my mail again:
>>
>> "Some fixes need longer testing because there have been cases where
>> a fix breaks something else."
>>
>> So yes, I disagree with rushing fixes immediately. If they're obvious
>> - whatever that means - then sure but not all of them are such.
>>
>> --
>
>Unfortunately, it looks like the broken commit got backported into 6.1.28,
>but the fix still isn't in Linus' tree.
>
>Sasha,
>
>Can you please pick up
>https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
>for 6.1.29 to fix the regression?
Happily, once it lands upstream :)
--
Thanks,
Sasha
On Thu, May 11, 2023 at 07:51:42PM +0000, Limonciello, Mario wrote:
> [AMD Official Use Only - General]
>
> +stable, Sasha
>
> > > Together with this patch there are now at least two regressions if
> > > -rc1 whch could have been avoided and may impact testability on
> > > affected systems.
> >
> > Are you saying that this patch which fixes s2idle on some random box
> > should've gone to Linus *immediately*?
> >
> > And read my mail again:
> >
> > "Some fixes need longer testing because there have been cases where
> > a fix breaks something else."
> >
> > So yes, I disagree with rushing fixes immediately. If they're obvious
> > - whatever that means - then sure but not all of them are such.
> >
> > --
>
> Unfortunately, it looks like the broken commit got backported into 6.1.28,
> but the fix still isn't in Linus' tree.
>
> Sasha,
>
> Can you please pick up
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
> for 6.1.29 to fix the regression?
Now that this is in Linus's tree, it's queued up, thanks.
greg k-h