2023-05-19 20:28:46

by Leo Li

[permalink] [raw]
Subject: [PATCH v2] apei/ghes: correctly return NULL for ghes_get_devices()

Since 315bada690e0 ("EDAC: Check for GHES preference in the
chipset-specific EDAC drivers"), vendor specific EDAC driver will not
probe correctly when CONFIG_ACPI_APEI_GHES is enabled but no GHES device
is present. Make ghes_get_devices() return NULL when the GHES device
list is empty to fix the problem.

Fixes: 9057a3f7ac36 ("EDAC/ghes: Prepare to make ghes_edac a proper module")
Signed-off-by: Li Yang <[email protected]>
Cc: Jia He <[email protected]>
---

V2: fix the fallthrough case in x86 path

drivers/acpi/apei/ghes.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 34ad071a64e9..4382fe13ee3e 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1544,6 +1544,8 @@ struct list_head *ghes_get_devices(void)

pr_warn_once("Force-loading ghes_edac on an unsupported platform. You're on your own!\n");
}
+ } else if (list_empty(&ghes_devs)) {
+ return NULL;
}

return &ghes_devs;
--
2.38.0



2023-05-25 15:27:57

by Justin He

[permalink] [raw]
Subject: RE: [PATCH v2] apei/ghes: correctly return NULL for ghes_get_devices()

Hi Li,

> -----Original Message-----
> From: Li Yang <[email protected]>
> Sent: Saturday, May 20, 2023 4:13 AM
> To: Rafael J. Wysocki <[email protected]>; Len Brown <[email protected]>;
> James Morse <[email protected]>; Tony Luck <[email protected]>;
> Borislav Petkov <[email protected]>; Justin He <[email protected]>
> Cc: Li Yang <[email protected]>; [email protected];
> [email protected]
> Subject: [PATCH v2] apei/ghes: correctly return NULL for ghes_get_devices()
>
> Since 315bada690e0 ("EDAC: Check for GHES preference in the chipset-specific
> EDAC drivers"), vendor specific EDAC driver will not probe correctly when
> CONFIG_ACPI_APEI_GHES is enabled but no GHES device is present. Make
> ghes_get_devices() return NULL when the GHES device list is empty to fix the
> problem.
>
> Fixes: 9057a3f7ac36 ("EDAC/ghes: Prepare to make ghes_edac a proper
> module")
> Signed-off-by: Li Yang <[email protected]>
> Cc: Jia He <[email protected]>
> ---
>
> V2: fix the fallthrough case in x86 path
>
> drivers/acpi/apei/ghes.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index
> 34ad071a64e9..4382fe13ee3e 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -1544,6 +1544,8 @@ struct list_head *ghes_get_devices(void)
>
> pr_warn_once("Force-loading ghes_edac on an unsupported
> platform. You're on your own!\n");
> }
> + } else if (list_empty(&ghes_devs)) {
> + return NULL;
> }
I have no major objections to the fix. Just curious about the edac driver in you platform.
IIUC, in your case, CONFIG_ACPI_APEI_GHES is enabled and edac_ghes driver is either not
loaded or fails to load. Is my understanding correct?
I wonder whether ghes_get_devices() can unblock such check condition and let other
chipset-specific edac drivers continue with the initialization. @Toshi, What do u think of it?


--
Cheers,
Justin (Jia He)


IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

2023-05-25 21:01:51

by Leo Li

[permalink] [raw]
Subject: RE: [PATCH v2] apei/ghes: correctly return NULL for ghes_get_devices()



> -----Original Message-----
> From: Justin He <[email protected]>
> Sent: Thursday, May 25, 2023 10:18 AM
> To: Leo Li <[email protected]>; Toshi Kani <[email protected]>; Rafael J.
> Wysocki <[email protected]>; Len Brown <[email protected]>; James Morse
> <[email protected]>; Tony Luck <[email protected]>; Borislav
> Petkov <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: RE: [PATCH v2] apei/ghes: correctly return NULL for
> ghes_get_devices()
>
> Hi Li,
>
> > -----Original Message-----
> > From: Li Yang <[email protected]>
> > Sent: Saturday, May 20, 2023 4:13 AM
> > To: Rafael J. Wysocki <[email protected]>; Len Brown
> > <[email protected]>; James Morse <[email protected]>; Tony Luck
> > <[email protected]>; Borislav Petkov <[email protected]>; Justin He
> > <[email protected]>
> > Cc: Li Yang <[email protected]>; [email protected];
> > [email protected]
> > Subject: [PATCH v2] apei/ghes: correctly return NULL for
> > ghes_get_devices()
> >
> > Since 315bada690e0 ("EDAC: Check for GHES preference in the
> > chipset-specific EDAC drivers"), vendor specific EDAC driver will not
> > probe correctly when CONFIG_ACPI_APEI_GHES is enabled but no GHES
> > device is present. Make
> > ghes_get_devices() return NULL when the GHES device list is empty to
> > fix the problem.
> >
> > Fixes: 9057a3f7ac36 ("EDAC/ghes: Prepare to make ghes_edac a proper
> > module")
> > Signed-off-by: Li Yang <[email protected]>
> > Cc: Jia He <[email protected]>
> > ---
> >
> > V2: fix the fallthrough case in x86 path
> >
> > drivers/acpi/apei/ghes.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index
> > 34ad071a64e9..4382fe13ee3e 100644
> > --- a/drivers/acpi/apei/ghes.c
> > +++ b/drivers/acpi/apei/ghes.c
> > @@ -1544,6 +1544,8 @@ struct list_head *ghes_get_devices(void)
> >
> > pr_warn_once("Force-loading ghes_edac on an
> > unsupported platform. You're on your own!\n");
> > }
> > + } else if (list_empty(&ghes_devs)) {
> > + return NULL;
> > }
> I have no major objections to the fix. Just curious about the edac driver in
> you platform.
> IIUC, in your case, CONFIG_ACPI_APEI_GHES is enabled and edac_ghes
> driver is either not loaded or fails to load. Is my understanding correct?

Right. ghes_edac is loaded but since ghes_devs is empty due to this platform not using ACPI, it bails out with -ENODEV very quickly. I would expect the original platform EDAC driver should continue to work in this scenario.

> I wonder whether ghes_get_devices() can unblock such check condition and
> let other chipset-specific edac drivers continue with the initialization. @Toshi,
> What do u think of it?
>
>
> --
> Cheers,
> Justin (Jia He)
>
>
> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended recipient,
> please notify the sender immediately and do not disclose the contents to any
> other person, use it for any purpose, or store or copy the information in any
> medium. Thank you.

2023-06-05 18:00:43

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH v2] apei/ghes: correctly return NULL for ghes_get_devices()

> > Since 315bada690e0 ("EDAC: Check for GHES preference in the
> > chipset-specific EDAC drivers"), vendor specific EDAC driver will not
> > probe correctly when CONFIG_ACPI_APEI_GHES is enabled but no GHES device
> > is present. Make ghes_get_devices() return NULL when the GHES device
> > list is empty to fix the problem.
> >
> > Fixes: 9057a3f7ac36 ("EDAC/ghes: Prepare to make ghes_edac a proper module")
> > Signed-off-by: Li Yang <[email protected]>
> > Cc: Jia He <[email protected]>
>
> Boris, Tony, any comments?

All of the callers are expecting NULL for a failure, not an empty list. So this looks OK to me.

Reviewed-by: Tony Luck <[email protected]>

-Tony

2023-06-05 18:00:51

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2] apei/ghes: correctly return NULL for ghes_get_devices()

On Fri, May 19, 2023 at 10:13 PM Li Yang <[email protected]> wrote:
>
> Since 315bada690e0 ("EDAC: Check for GHES preference in the
> chipset-specific EDAC drivers"), vendor specific EDAC driver will not
> probe correctly when CONFIG_ACPI_APEI_GHES is enabled but no GHES device
> is present. Make ghes_get_devices() return NULL when the GHES device
> list is empty to fix the problem.
>
> Fixes: 9057a3f7ac36 ("EDAC/ghes: Prepare to make ghes_edac a proper module")
> Signed-off-by: Li Yang <[email protected]>
> Cc: Jia He <[email protected]>

Boris, Tony, any comments?

> ---
>
> V2: fix the fallthrough case in x86 path
>
> drivers/acpi/apei/ghes.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 34ad071a64e9..4382fe13ee3e 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -1544,6 +1544,8 @@ struct list_head *ghes_get_devices(void)
>
> pr_warn_once("Force-loading ghes_edac on an unsupported platform. You're on your own!\n");
> }
> + } else if (list_empty(&ghes_devs)) {
> + return NULL;
> }
>
> return &ghes_devs;
> --
> 2.38.0
>

2023-06-12 18:00:12

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v2] apei/ghes: correctly return NULL for ghes_get_devices()

On Mon, Jun 5, 2023 at 7:26 PM Luck, Tony <[email protected]> wrote:
>
> > > Since 315bada690e0 ("EDAC: Check for GHES preference in the
> > > chipset-specific EDAC drivers"), vendor specific EDAC driver will not
> > > probe correctly when CONFIG_ACPI_APEI_GHES is enabled but no GHES device
> > > is present. Make ghes_get_devices() return NULL when the GHES device
> > > list is empty to fix the problem.
> > >
> > > Fixes: 9057a3f7ac36 ("EDAC/ghes: Prepare to make ghes_edac a proper module")
> > > Signed-off-by: Li Yang <[email protected]>
> > > Cc: Jia He <[email protected]>
> >
> > Boris, Tony, any comments?
>
> All of the callers are expecting NULL for a failure, not an empty list. So this looks OK to me.
>
> Reviewed-by: Tony Luck <[email protected]>

Applied as 6.5 material, thanks!