2022-11-16 00:44:37

by Jason Baron

Subject: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()

Currently, ghes_edac_register() is called via ghes_init() from acpi_init()
at the subsys_initcall() level. However, edac_init() is also registered at
the subsys_initcall() level, leaving the ordering between the two ambiguous.

If ghes_edac_register() is called first, then 'mc0' ends up at:
/sys/devices/mc0/, instead of the expected:
/sys/devices/system/edac/mc/mc0.

Everything else appears to work despite the unexpected sysfs location, but
edac_init() should still run before any drivers start registering. So call
edac_init() earlier, via arch_initcall().

However, this moves edac_pci_clear_parity_errors() up as well. It looks like
this needs to run after the PCI bus scan, so keep
edac_pci_clear_parity_errors() at the subsys_initcall() level. That said, the
PCI bus scan itself appears to happen at the subsys_initcall() level, so the
parity clearing should really be moved later still. That can be left for a
separate patch.

Fixes: dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()")
Signed-off-by: Jason Baron <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: James Morse <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Shuai Xue <[email protected]>
Cc: [email protected]
---
drivers/edac/edac_module.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/edac/edac_module.c b/drivers/edac/edac_module.c
index 32a931d0cb71..407d4a5fce7a 100644
--- a/drivers/edac/edac_module.c
+++ b/drivers/edac/edac_module.c
@@ -109,15 +109,6 @@ static int __init edac_init(void)
if (err)
return err;

- /*
- * Harvest and clear any boot/initialization PCI parity errors
- *
- * FIXME: This only clears errors logged by devices present at time of
- * module initialization. We should also do an initial clear
- * of each newly hotplugged device.
- */
- edac_pci_clear_parity_errors();
-
err = edac_mc_sysfs_init();
if (err)
goto err_sysfs;
@@ -157,12 +148,34 @@ static void __exit edac_exit(void)
edac_subsys_exit();
}

+static void __init edac_init_clear_parity_errors(void)
+{
+ /*
+ * Harvest and clear any boot/initialization PCI parity errors
+ *
+ * FIXME: This only clears errors logged by devices present at time of
+ * module initialization. We should also do an initial clear
+ * of each newly hotplugged device.
+ */
+ edac_pci_clear_parity_errors();
+
+ return 0;
+}
+
/*
* Inform the kernel of our entry and exit points
+ *
+ * ghes_edac_register() is called via acpi_init() -> ghes_init()
+ * at the subsys_initcall level so edac_init() must come first
*/
-subsys_initcall(edac_init);
+arch_initcall(edac_init);
module_exit(edac_exit);

+/*
+ * Clear parity errors after PCI subsys is initialized
+ */
+subsys_initcall(edac_init_clear_parity_errors);
+
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Doug Thompson http://www.softwarebitmaker.com, et al");
MODULE_DESCRIPTION("Core library routines for EDAC reporting");
--
2.17.1
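
The ordering argument in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: every model_* name, the table layout, and the helper functions are hypothetical. It assumes that initcalls run in ascending level order (arch_initcall at level 3, subsys_initcall at level 4) and that entries at the same level run in table order, which stands in for link order here. The two sysfs paths are the ones quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of initcall levels; not kernel code. */
typedef int (*initcall_t)(void);

struct initcall_entry {
    int level;      /* 3 models arch_initcall, 4 models subsys_initcall */
    initcall_t fn;
};

static int edac_core_ready;     /* set once the modelled edac_init() ran */
static const char *mc0_path;    /* where the modelled 'mc0' device lands */

static int model_edac_init(void)
{
    edac_core_ready = 1;
    return 0;
}

/* Stands in for acpi_init() -> ghes_init() -> ghes_edac_register(). */
static int model_acpi_init(void)
{
    mc0_path = edac_core_ready ? "/sys/devices/system/edac/mc/mc0"
                               : "/sys/devices/mc0";
    return 0;
}

/* Run entries level by level; same-level entries keep their table order. */
static void run_initcalls(const struct initcall_entry *tbl, size_t n)
{
    for (int level = 0; level <= 7; level++) {
        for (size_t i = 0; i < n; i++) {
            assert(tbl[i].fn != NULL);
            if (tbl[i].level == level)
                tbl[i].fn();
        }
    }
}

/* Boot the model with edac_init() at the given level; ACPI is listed first. */
static int mc0_at_expected_path(int edac_level)
{
    const struct initcall_entry tbl[] = {
        { 4, model_acpi_init },
        { edac_level, model_edac_init },
    };

    edac_core_ready = 0;
    mc0_path = NULL;
    run_initcalls(tbl, sizeof(tbl) / sizeof(tbl[0]));
    return strcmp(mc0_path, "/sys/devices/system/edac/mc/mc0") == 0;
}
```

In this model, mc0_at_expected_path(4) reports the misplaced device because the ACPI entry happens to come first at the shared level, while mc0_at_expected_path(3) reports the expected location regardless of table order.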



2022-11-16 11:26:15

by kernel test robot

Subject: Re: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()

Hi Jason,

I love your patch! Yet something to improve:

[auto build test ERROR on ras/edac-for-next]
[also build test ERROR on linus/master v6.1-rc5 next-20221115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Jason-Baron/EDAC-edac_module-order-edac_init-before-ghes_edac_register/20221116-084046
base: https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git edac-for-next
patch link: https://lore.kernel.org/r/20221116003729.194802-1-jbaron%40akamai.com
patch subject: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()
config: powerpc-allyesconfig
compiler: powerpc-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/a970ee7e983345d07bd1f3e455688ef753f32a45
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Jason-Baron/EDAC-edac_module-order-edac_init-before-ghes_edac_register/20221116-084046
git checkout a970ee7e983345d07bd1f3e455688ef753f32a45
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=powerpc SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

drivers/edac/edac_module.c: In function 'edac_init_clear_parity_errors':
drivers/edac/edac_module.c:162:16: error: 'return' with a value, in function returning void [-Werror=return-type]
162 | return 0;
| ^
drivers/edac/edac_module.c:151:20: note: declared here
151 | static void __init edac_init_clear_parity_errors(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/printk.h:6,
from include/asm-generic/bug.h:22,
from arch/powerpc/include/asm/bug.h:158,
from include/linux/bug.h:5,
from arch/powerpc/include/asm/cmpxchg.h:8,
from arch/powerpc/include/asm/atomic.h:11,
from include/linux/atomic.h:7,
from include/linux/edac.h:15,
from drivers/edac/edac_module.c:13:
drivers/edac/edac_module.c: At top level:
>> drivers/edac/edac_module.c:177:17: error: initialization of 'initcall_t' {aka 'int (*)(void)'} from incompatible pointer type 'void (*)(void)' [-Werror=incompatible-pointer-types]
177 | subsys_initcall(edac_init_clear_parity_errors);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/init.h:250:55: note: in definition of macro '____define_initcall'
250 | __attribute__((__section__(__sec))) = fn;
| ^~
include/linux/init.h:260:9: note: in expansion of macro '__unique_initcall'
260 | __unique_initcall(fn, id, __sec, __initcall_id(fn))
| ^~~~~~~~~~~~~~~~~
include/linux/init.h:262:35: note: in expansion of macro '___define_initcall'
262 | #define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id)
| ^~~~~~~~~~~~~~~~~~
include/linux/init.h:286:41: note: in expansion of macro '__define_initcall'
286 | #define subsys_initcall(fn) __define_initcall(fn, 4)
| ^~~~~~~~~~~~~~~~~
drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
177 | subsys_initcall(edac_init_clear_parity_errors);
| ^~~~~~~~~~~~~~~
cc1: some warnings being treated as errors


vim +177 drivers/edac/edac_module.c

173
174 /*
175 * Clear parity errors after PCI subsys is initialized
176 */
> 177 subsys_initcall(edac_init_clear_parity_errors);
178

--
0-DAY CI Kernel Test Service
https://01.org/lkp
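
Both errors in this report come from one mismatch: the new helper is declared void, yet it returns a value and is registered where an initcall_t, that is 'int (*)(void)', is required. A minimal userspace sketch of a type-correct helper follows. The model_* names and the counter are hypothetical stand-ins, and this is not presented as the fix the author went on to post.

```c
/* Hypothetical userspace model; mirrors the initcall_t type from the log. */
typedef int (*initcall_t)(void);

/* Stand-in state for parity errors logged during boot. */
static int parity_errors_pending = 3;

/* Stand-in for edac_pci_clear_parity_errors(), which returns nothing. */
static void model_clear_parity_errors(void)
{
    parity_errors_pending = 0;
}

/* Declared int, not void, so it is assignable to initcall_t. */
static int model_init_clear_parity_errors(void)
{
    model_clear_parity_errors();
    return 0;
}

/* Models what subsys_initcall(fn) ultimately does: store fn as initcall_t. */
static initcall_t registered_initcall = model_init_clear_parity_errors;

/* Invoke the stored initcall and hand back its status. */
static int run_registered_initcall(void)
{
    return registered_initcall();
}
```

With the return type changed to int, the 'return 0;' statement and the function-pointer assignment are both well-typed, which covers the two diagnostics listed above.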



2022-11-16 11:32:35

by Borislav Petkov

Subject: Re: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()

On Tue, Nov 15, 2022 at 07:37:29PM -0500, Jason Baron wrote:
> Currently, ghes_edac_register() is called via ghes_init() from acpi_init()

https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-ghes

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-11-16 13:28:33

by kernel test robot

Subject: Re: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()

Hi Jason,

I love your patch! Yet something to improve:

[auto build test ERROR on ras/edac-for-next]
[also build test ERROR on linus/master v6.1-rc5 next-20221115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Jason-Baron/EDAC-edac_module-order-edac_init-before-ghes_edac_register/20221116-084046
base: https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git edac-for-next
patch link: https://lore.kernel.org/r/20221116003729.194802-1-jbaron%40akamai.com
patch subject: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()
config: powerpc-allmodconfig
compiler: powerpc-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/a970ee7e983345d07bd1f3e455688ef753f32a45
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Jason-Baron/EDAC-edac_module-order-edac_init-before-ghes_edac_register/20221116-084046
git checkout a970ee7e983345d07bd1f3e455688ef753f32a45
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=powerpc SHELL=/bin/bash drivers/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All error/warnings (new ones prefixed by >>):

drivers/edac/edac_module.c: In function 'edac_init_clear_parity_errors':
drivers/edac/edac_module.c:162:16: error: 'return' with a value, in function returning void [-Werror=return-type]
162 | return 0;
| ^
drivers/edac/edac_module.c:151:20: note: declared here
151 | static void __init edac_init_clear_parity_errors(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/device/driver.h:21,
from include/linux/device.h:32,
from include/linux/edac.h:16,
from drivers/edac/edac_module.c:13:
drivers/edac/edac_module.c: At top level:
include/linux/module.h:130:49: error: redefinition of '__inittest'
130 | static inline initcall_t __maybe_unused __inittest(void) \
| ^~~~~~~~~~
include/linux/module.h:116:41: note: in expansion of macro 'module_init'
116 | #define subsys_initcall(fn) module_init(fn)
| ^~~~~~~~~~~
drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
177 | subsys_initcall(edac_init_clear_parity_errors);
| ^~~~~~~~~~~~~~~
include/linux/module.h:130:49: note: previous definition of '__inittest' with type 'int (*(void))(void)'
130 | static inline initcall_t __maybe_unused __inittest(void) \
| ^~~~~~~~~~
include/linux/module.h:115:41: note: in expansion of macro 'module_init'
115 | #define arch_initcall(fn) module_init(fn)
| ^~~~~~~~~~~
drivers/edac/edac_module.c:171:1: note: in expansion of macro 'arch_initcall'
171 | arch_initcall(edac_init);
| ^~~~~~~~~~~~~
drivers/edac/edac_module.c: In function '__inittest':
>> drivers/edac/edac_module.c:177:17: error: returning 'void (*)(void)' from a function with incompatible return type 'initcall_t' {aka 'int (*)(void)'} [-Werror=incompatible-pointer-types]
177 | subsys_initcall(edac_init_clear_parity_errors);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/module.h:131:18: note: in definition of macro 'module_init'
131 | { return initfn; } \
| ^~~~~~
drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
177 | subsys_initcall(edac_init_clear_parity_errors);
| ^~~~~~~~~~~~~~~
drivers/edac/edac_module.c: At top level:
include/linux/module.h:132:13: error: redefinition of 'init_module'
132 | int init_module(void) __copy(initfn) \
| ^~~~~~~~~~~
include/linux/module.h:116:41: note: in expansion of macro 'module_init'
116 | #define subsys_initcall(fn) module_init(fn)
| ^~~~~~~~~~~
drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
177 | subsys_initcall(edac_init_clear_parity_errors);
| ^~~~~~~~~~~~~~~
include/linux/module.h:132:13: note: previous definition of 'init_module' with type 'int(void)'
132 | int init_module(void) __copy(initfn) \
| ^~~~~~~~~~~
include/linux/module.h:115:41: note: in expansion of macro 'module_init'
115 | #define arch_initcall(fn) module_init(fn)
| ^~~~~~~~~~~
drivers/edac/edac_module.c:171:1: note: in expansion of macro 'arch_initcall'
171 | arch_initcall(edac_init);
| ^~~~~~~~~~~~~
>> include/linux/module.h:132:13: warning: 'init_module' alias between functions of incompatible types 'int(void)' and 'void(void)' [-Wattribute-alias=]
132 | int init_module(void) __copy(initfn) \
| ^~~~~~~~~~~
include/linux/module.h:116:41: note: in expansion of macro 'module_init'
116 | #define subsys_initcall(fn) module_init(fn)
| ^~~~~~~~~~~
drivers/edac/edac_module.c:177:1: note: in expansion of macro 'subsys_initcall'
177 | subsys_initcall(edac_init_clear_parity_errors);
| ^~~~~~~~~~~~~~~
drivers/edac/edac_module.c:151:20: note: aliased declaration here
151 | static void __init edac_init_clear_parity_errors(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors


vim +177 drivers/edac/edac_module.c

173
174 /*
175 * Clear parity errors after PCI subsys is initialized
176 */
> 177 subsys_initcall(edac_init_clear_parity_errors);
178

--
0-DAY CI Kernel Test Service
https://01.org/lkp
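
The additional errors in this allmodconfig report follow from the macro expansions shown in the log: in a modular build, include/linux/module.h maps both arch_initcall(fn) and subsys_initcall(fn) to module_init(fn), so one file ends up defining init_module twice. One way around that, sketched here in userspace with hypothetical model_* names and not taken from any posted patch, is a single entry point that runs both steps in order when the code is built as a module.

```c
/* Hypothetical userspace model of a single module entry point. */
static int steps_run;   /* bit 0: core init ran, bit 1: parity clear ran */

/* Stand-in for edac_init(). */
static int model_edac_init(void)
{
    steps_run |= 1;
    return 0;
}

/* Stand-in for the parity-clearing initcall. */
static int model_init_clear_parity_errors(void)
{
    steps_run |= 2;
    return 0;
}

/*
 * A loadable module can have only one init function, so this calls both
 * steps in sequence and stops at the first failure.
 */
static int model_module_init(void)
{
    int err = model_edac_init();

    if (err)
        return err;
    return model_init_clear_parity_errors();
}
```

Built-in ordering against acpi_init() does not arise in this model; it only shows how two separate registrations could collapse into one entry point for the modular case.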



2022-11-16 14:43:16

by Jason Baron

Subject: Re: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()


On 11/16/22 06:14, Borislav Petkov wrote:
> On Tue, Nov 15, 2022 at 07:37:29PM -0500, Jason Baron wrote:
>> Currently, ghes_edac_register() is called via ghes_init() from acpi_init()
> https://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git/log/?h=edac-ghes
>
Hi Boris,

Thanks, yes this looks like it will address the regression. Is this
planned for 6.1?

Or 5.15 stable, which is where we hit this regression?

Thanks,

-Jason


2022-11-16 18:52:27

by Borislav Petkov

Subject: Re: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()

Hi,

On Wed, Nov 16, 2022 at 09:32:41AM -0500, Jason Baron wrote:
> Thanks, yes this looks like it will address the regression. Is this
> planned for 6.1?

6.2.

> Or 5.15 stable, which is where we hit this regression?

No, I don't think it is stable material.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-11-16 19:10:23

by Jason Baron

Subject: Re: [PATCH] EDAC/edac_module: order edac_init() before ghes_edac_register()


On 11/16/22 13:37, Borislav Petkov wrote:
> Hi,
>
> On Wed, Nov 16, 2022 at 09:32:41AM -0500, Jason Baron wrote:
>> Thanks, yes this looks like it will address the regression. Is this
>> planned for 6.1?
> 6.2.
>
>> Or 5.15 stable, which is where we hit this regression?
> No, I don't think it is stable material.
>
> Thx.
>
OK, thanks. Is there any plan to address this in 5.15 stable or 6.1?

Either with a revert, a fixup as I proposed, or something else?

Thanks,

-Jason