2022-10-17 13:07:55

by Justin He

[permalink] [raw]
Subject: [PATCH v9 0/7] Make ghes_edac a proper module

Commit dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in
apci_init()") introduced a bug that ghes_edac_register() would be invoked
before edac_init(). Because at that time, the bus "edac" hasn't been even
registered, this created sysfs /devices/mc0 instead of
/sys/devices/system/edac/mc/mc0 on an Ampere eMag server.

The solution is to make ghes_edac a proper module.

Changelog:
v9:
- drop the unrcu_pointer patch 06 of v8
- add Ard's xchg_release patch to use a better memory barrier
v8:https://lore.kernel.org/lkml/[email protected]/
- merge v7 two force_enable and ghes_get_devices() patches into one
- make force_enable static
v7:https://lore.kernel.org/lkml/[email protected]/
- remove the ghes_edac_preferred and ghes_present (suggested by Borislav)
- adjust the patch splitting, no major functional changes
- remove the r-b tag in those changed patches
v6:https://www.spinics.net/lists/kernel/msg4511453.html
- no code changes from v5 patches
- add the reviewed and acked by from Toshi
- describe the removal of ghes_edac_force_enable checking in Patch 05
v5: https://www.spinics.net/lists/kernel/msg4502787.html
- add the review-by from Toshi for patch 04 and 06
- refine the commit msg
- remove the unconditional set of ghes_edac_force_enable on Arm
v4: https://lore.kernel.org/lkml/[email protected]/
- move the kernel boot option to ghes module parameter
- collapse th ghes_present and ghes_edac_preferred into one patch
v3: https://lore.kernel.org/lkml/[email protected]/
- refine the commit logs
- introduce ghes preferred and present flag (by Toshi)
- move force_load to setup parameter
- add the ghes_edac_preferred() check for x86/Arm edac drivers
v2: https://lore.kernel.org/lkml/[email protected]/
- add acked-by tag of Patch 1 from Ard
- split the notifier patch
- add 2 patch to get regular drivers selected when ghes edac is not loaded
- fix an errno in igen6 driver
- add a patch to fix the sparse warning of ghes
- refine the commit logs
v1: https://lore.kernel.org/lkml/[email protected]/

Ard Biesheuvel (1):
apei/ghes: Use xchg_release() for updating new cache slot instead of
cmpxchg()

Jia He (6):
efi/cper: export several helpers for ghes_edac to use
EDAC/ghes: Add a notifier for reporting memory errors
EDAC/ghes: Prepare to make ghes_edac a proper module
EDAC/ghes: Make ghes_edac a proper module to remove the dependency on
ghes
EDAC: Add the ghes_get_devices() check for chipset-specific edac
drivers
EDAC/igen6: Return consistent errno when another edac driver is
enabled

drivers/acpi/apei/ghes.c | 109 +++++++++++++++++++++++++--------
drivers/edac/Kconfig | 4 +-
drivers/edac/amd64_edac.c | 3 +
drivers/edac/armada_xp_edac.c | 3 +
drivers/edac/edac_module.h | 1 +
drivers/edac/ghes_edac.c | 90 ++++++++++++++++-----------
drivers/edac/i10nm_base.c | 3 +
drivers/edac/igen6_edac.c | 5 +-
drivers/edac/layerscape_edac.c | 3 +
drivers/edac/pnd2_edac.c | 3 +
drivers/edac/sb_edac.c | 3 +
drivers/edac/skx_base.c | 3 +
drivers/edac/thunderx_edac.c | 3 +
drivers/edac/xgene_edac.c | 3 +
drivers/firmware/efi/cper.c | 3 +
include/acpi/ghes.h | 34 +++-------
16 files changed, 185 insertions(+), 88 deletions(-)

--
2.25.1


2022-10-17 13:08:04

by Justin He

[permalink] [raw]
Subject: [PATCH v9 1/7] efi/cper: export several helpers for ghes_edac to use

Before ghes_edac can be turned back into a proper module again, export
several helpers which are going to be used by it.

Signed-off-by: Jia He <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
---
drivers/firmware/efi/cper.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index e4e5ea7ce910..053eae13f409 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -290,6 +290,7 @@ int cper_mem_err_location(struct cper_mem_err_compact *mem, char *msg)

return n;
}
+EXPORT_SYMBOL_GPL(cper_mem_err_location);

int cper_dimm_err_location(struct cper_mem_err_compact *mem, char *msg)
{
@@ -310,6 +311,7 @@ int cper_dimm_err_location(struct cper_mem_err_compact *mem, char *msg)

return n;
}
+EXPORT_SYMBOL_GPL(cper_dimm_err_location);

void cper_mem_err_pack(const struct cper_sec_mem_err *mem,
struct cper_mem_err_compact *cmem)
@@ -331,6 +333,7 @@ void cper_mem_err_pack(const struct cper_sec_mem_err *mem,
cmem->mem_array_handle = mem->mem_array_handle;
cmem->mem_dev_handle = mem->mem_dev_handle;
}
+EXPORT_SYMBOL_GPL(cper_mem_err_pack);

const char *cper_mem_err_unpack(struct trace_seq *p,
struct cper_mem_err_compact *cmem)
--
2.25.1

2022-10-17 13:08:45

by Justin He

[permalink] [raw]
Subject: [PATCH v9 3/7] EDAC/ghes: Prepare to make ghes_edac a proper module

To make ghes_edac a proper module, prepare to decouple its dependencies
on ghes.

ghes_edac_register() is too late to set the module flag ghes_edac.force_load.
Also, other edac drivers should not be able to control this flag. Move this
flag to the module parameter in ghes and make it static.

Introduce a helper ghes_get_devices(), which returns the dev list GHES
probed when the platform-check passes on the system.

The previous force_load check is not needed in ghes_edac_unregister()
since it will be checked in module_init of ghes_edac later.

Suggested-by: Toshi Kani <[email protected]>
Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Jia He <[email protected]>
---
drivers/acpi/apei/ghes.c | 46 ++++++++++++++++++++++++++++++++++++++++
drivers/edac/ghes_edac.c | 35 ++----------------------------
include/acpi/ghes.h | 6 ++++++
3 files changed, 54 insertions(+), 33 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 8cb65f757d06..6a0dcb8c0901 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -109,6 +109,13 @@ static inline bool is_hest_type_generic_v2(struct ghes *ghes)
bool ghes_disable;
module_param_named(disable, ghes_disable, bool, 0);

+/*
+ * "ghes.edac_force_enable" forcibly enables ghes_edac and skips the platform
+ * check.
+ */
+static bool ghes_edac_force_enable;
+module_param_named(edac_force_enable, ghes_edac_force_enable, bool, 0);
+
/*
* All error sources notified with HED (Hardware Error Device) share a
* single notifier callback, so they need to be linked and checked one
@@ -120,6 +127,9 @@ module_param_named(disable, ghes_disable, bool, 0);
static LIST_HEAD(ghes_hed);
static DEFINE_MUTEX(ghes_list_mutex);

+static LIST_HEAD(ghes_devs);
+static DEFINE_MUTEX(ghes_devs_mutex);
+
/*
* Because the memory area used to transfer hardware error information
* from BIOS to Linux can be determined only in NMI, IRQ or timer
@@ -1380,6 +1390,12 @@ static int ghes_probe(struct platform_device *ghes_dev)

ghes_edac_register(ghes, &ghes_dev->dev);

+ ghes->dev = &ghes_dev->dev;
+
+ mutex_lock(&ghes_devs_mutex);
+ list_add_tail(&ghes->elist, &ghes_devs);
+ mutex_unlock(&ghes_devs_mutex);
+
/* Handle any pending errors right away */
spin_lock_irqsave(&ghes_notify_lock_irq, flags);
ghes_proc(ghes);
@@ -1444,6 +1460,10 @@ static int ghes_remove(struct platform_device *ghes_dev)

ghes_edac_unregister(ghes);

+ mutex_lock(&ghes_devs_mutex);
+ list_del(&ghes->elist);
+ mutex_unlock(&ghes_devs_mutex);
+
kfree(ghes);

platform_set_drvdata(ghes_dev, NULL);
@@ -1500,6 +1520,32 @@ void __init acpi_ghes_init(void)
pr_info(GHES_PFX "Failed to enable APEI firmware first mode.\n");
}

+/*
+ * Known x86 systems that prefer GHES error reporting:
+ */
+static struct acpi_platform_list plat_list[] = {
+ {"HPE ", "Server ", 0, ACPI_SIG_FADT, all_versions},
+ { } /* End */
+};
+
+struct list_head *ghes_get_devices(void)
+{
+ int idx = -1;
+
+ if (IS_ENABLED(CONFIG_X86)) {
+ idx = acpi_match_platform_list(plat_list);
+ if (idx < 0) {
+ if (!ghes_edac_force_enable)
+ return NULL;
+
+ pr_warn_once("Force-loading ghes_edac on an unsupported platform. You're on your own!\n");
+ }
+ }
+
+ return &ghes_devs;
+}
+EXPORT_SYMBOL_GPL(ghes_get_devices);
+
void ghes_register_report_chain(struct notifier_block *nb)
{
atomic_notifier_chain_register(&ghes_report_chain, nb);
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 7b8d56a769f6..b85a545d1cb0 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -54,10 +54,6 @@ static DEFINE_MUTEX(ghes_reg_mutex);
*/
static DEFINE_SPINLOCK(ghes_lock);

-/* "ghes_edac.force_load=1" skips the platform check */
-static bool __read_mostly force_load;
-module_param(force_load, bool, 0);
-
static bool system_scanned;

/* Memory Device - Type 17 of SMBIOS spec */
@@ -387,14 +383,6 @@ static struct notifier_block ghes_edac_mem_err_nb = {
.priority = 0,
};

-/*
- * Known systems that are safe to enable this module.
- */
-static struct acpi_platform_list plat_list[] = {
- {"HPE ", "Server ", 0, ACPI_SIG_FADT, all_versions},
- { } /* End */
-};
-
int ghes_edac_register(struct ghes *ghes, struct device *dev)
{
bool fake = false;
@@ -402,19 +390,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
struct ghes_pvt *pvt;
struct edac_mc_layer layers[1];
unsigned long flags;
- int idx = -1;
int rc = 0;

- if (IS_ENABLED(CONFIG_X86)) {
- /* Check if safe to enable on this system */
- idx = acpi_match_platform_list(plat_list);
- if (!force_load && idx < 0)
- return -ENODEV;
- } else {
- force_load = true;
- idx = 0;
- }
-
/* finish another registration/unregistration instance first */
mutex_lock(&ghes_reg_mutex);

@@ -458,15 +435,10 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
pr_info("work on such system. Use this driver with caution\n");
- } else if (idx < 0) {
- pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
- pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
- pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
- pr_info("If you find incorrect reports, please contact your hardware vendor\n");
- pr_info("to correct its BIOS.\n");
- pr_info("This system has %d DIMM sockets.\n", ghes_hw.num_dimms);
}

+ pr_info("This system has %d DIMM sockets.\n", ghes_hw.num_dimms);
+
if (!fake) {
struct dimm_info *src, *dst;
int i = 0;
@@ -535,9 +507,6 @@ void ghes_edac_unregister(struct ghes *ghes)
struct mem_ctl_info *mci;
unsigned long flags;

- if (!force_load)
- return;
-
mutex_lock(&ghes_reg_mutex);

system_scanned = false;
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 5cbd38b6e4e1..ce693e9f07a0 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -27,6 +27,8 @@ struct ghes {
struct timer_list timer;
unsigned int irq;
};
+ struct device *dev;
+ struct list_head elist;
};

struct ghes_estatus_node {
@@ -80,6 +82,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev);

void ghes_edac_unregister(struct ghes *ghes);

+struct list_head *ghes_get_devices(void);
+
#else
static inline int ghes_edac_register(struct ghes *ghes, struct device *dev)
{
@@ -89,6 +93,8 @@ static inline int ghes_edac_register(struct ghes *ghes, struct device *dev)
static inline void ghes_edac_unregister(struct ghes *ghes)
{
}
+
+static inline struct list_head *ghes_get_devices(void) { return NULL; }
#endif

static inline int acpi_hest_get_version(struct acpi_hest_generic_data *gdata)
--
2.25.1

2022-10-17 13:09:00

by Justin He

[permalink] [raw]
Subject: [PATCH v9 5/7] EDAC: Add the ghes_get_devices() check for chipset-specific edac drivers

Add ghes_get_devices() check for chipset-specific edac drivers to ensure
that ghes_edac is used on the platform where ghes_edac is preferred over
chipset-specific edac driver.

Unlike the existing edac_get_owner() check, the ghes_get_devices()
check works independent to the module_init ordering.

Suggested-by: Toshi Kani <[email protected]>
Signed-off-by: Jia He <[email protected]>
---
drivers/edac/amd64_edac.c | 3 +++
drivers/edac/armada_xp_edac.c | 3 +++
drivers/edac/edac_module.h | 1 +
drivers/edac/i10nm_base.c | 3 +++
drivers/edac/igen6_edac.c | 3 +++
drivers/edac/layerscape_edac.c | 3 +++
drivers/edac/pnd2_edac.c | 3 +++
drivers/edac/sb_edac.c | 3 +++
drivers/edac/skx_base.c | 3 +++
drivers/edac/thunderx_edac.c | 3 +++
drivers/edac/xgene_edac.c | 3 +++
11 files changed, 31 insertions(+)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 2f854feeeb23..e3318e5575a3 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -4329,6 +4329,9 @@ static int __init amd64_edac_init(void)
int err = -ENODEV;
int i;

+ if (ghes_get_devices())
+ return -EBUSY;
+
owner = edac_get_owner();
if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
return -EBUSY;
diff --git a/drivers/edac/armada_xp_edac.c b/drivers/edac/armada_xp_edac.c
index 038abbb83f4b..c4bd2fb9c46b 100644
--- a/drivers/edac/armada_xp_edac.c
+++ b/drivers/edac/armada_xp_edac.c
@@ -599,6 +599,9 @@ static int __init armada_xp_edac_init(void)
{
int res;

+ if (ghes_get_devices())
+ return -EBUSY;
+
/* only polling is supported */
edac_op_state = EDAC_OPSTATE_POLL;

diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 96f6de0c8ff6..3826f82de487 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -11,6 +11,7 @@
#ifndef __EDAC_MODULE_H__
#define __EDAC_MODULE_H__

+#include <acpi/ghes.h>
#include "edac_mc.h"
#include "edac_pci.h"
#include "edac_device.h"
diff --git a/drivers/edac/i10nm_base.c b/drivers/edac/i10nm_base.c
index 6cf50ee0b77c..75211ee4cd12 100644
--- a/drivers/edac/i10nm_base.c
+++ b/drivers/edac/i10nm_base.c
@@ -548,6 +548,9 @@ static int __init i10nm_init(void)

edac_dbg(2, "\n");

+ if (ghes_get_devices())
+ return -EBUSY;
+
owner = edac_get_owner();
if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
return -EBUSY;
diff --git a/drivers/edac/igen6_edac.c b/drivers/edac/igen6_edac.c
index a07bbfd075d0..d33c666221f9 100644
--- a/drivers/edac/igen6_edac.c
+++ b/drivers/edac/igen6_edac.c
@@ -1271,6 +1271,9 @@ static int __init igen6_init(void)

edac_dbg(2, "\n");

+ if (ghes_get_devices())
+ return -EBUSY;
+
owner = edac_get_owner();
if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
return -ENODEV;
diff --git a/drivers/edac/layerscape_edac.c b/drivers/edac/layerscape_edac.c
index 94cac7686a56..35ceaca578e1 100644
--- a/drivers/edac/layerscape_edac.c
+++ b/drivers/edac/layerscape_edac.c
@@ -38,6 +38,9 @@ static int __init fsl_ddr_mc_init(void)
{
int res;

+ if (ghes_get_devices())
+ return -EBUSY;
+
/* make sure error reporting method is sane */
switch (edac_op_state) {
case EDAC_OPSTATE_POLL:
diff --git a/drivers/edac/pnd2_edac.c b/drivers/edac/pnd2_edac.c
index a20b299f1202..2b306f2cc605 100644
--- a/drivers/edac/pnd2_edac.c
+++ b/drivers/edac/pnd2_edac.c
@@ -1528,6 +1528,9 @@ static int __init pnd2_init(void)

edac_dbg(2, "\n");

+ if (ghes_get_devices())
+ return -EBUSY;
+
owner = edac_get_owner();
if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
return -EBUSY;
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 9678ab97c7ac..2c860adf54a7 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -3506,6 +3506,9 @@ static int __init sbridge_init(void)

edac_dbg(2, "\n");

+ if (ghes_get_devices())
+ return -EBUSY;
+
owner = edac_get_owner();
if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
return -EBUSY;
diff --git a/drivers/edac/skx_base.c b/drivers/edac/skx_base.c
index 1abc020d49ab..80a7334111b1 100644
--- a/drivers/edac/skx_base.c
+++ b/drivers/edac/skx_base.c
@@ -653,6 +653,9 @@ static int __init skx_init(void)

edac_dbg(2, "\n");

+ if (ghes_get_devices())
+ return -EBUSY;
+
owner = edac_get_owner();
if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
return -EBUSY;
diff --git a/drivers/edac/thunderx_edac.c b/drivers/edac/thunderx_edac.c
index f13674081cb6..0bcd9f02c84a 100644
--- a/drivers/edac/thunderx_edac.c
+++ b/drivers/edac/thunderx_edac.c
@@ -2114,6 +2114,9 @@ static int __init thunderx_edac_init(void)
{
int rc = 0;

+ if (ghes_get_devices())
+ return -EBUSY;
+
rc = pci_register_driver(&thunderx_lmc_driver);
if (rc)
return rc;
diff --git a/drivers/edac/xgene_edac.c b/drivers/edac/xgene_edac.c
index 54081403db4f..c52b9dd9154c 100644
--- a/drivers/edac/xgene_edac.c
+++ b/drivers/edac/xgene_edac.c
@@ -2004,6 +2004,9 @@ static int __init xgene_edac_init(void)
{
int rc;

+ if (ghes_get_devices())
+ return -EBUSY;
+
/* Make sure error reporting method is sane */
switch (edac_op_state) {
case EDAC_OPSTATE_POLL:
--
2.25.1

2022-10-17 13:09:07

by Justin He

[permalink] [raw]
Subject: [PATCH v9 6/7] apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg()

From: Ard Biesheuvel <[email protected]>

From: Ard Biesheuvel <[email protected]>

ghes_estatus_cache_add() selects a slot, and either succeeds in
replacing its contents with a pointer to a new cached item, or it just
gives up and frees the new item again, without attempting to select
another slot even if one might be available.

Since only inserting new items is needed, the race can only cause a failure
if the selected slot was updated with another new item concurrently,
which means that it is arbitrary which of those two items gets
dropped. This means the cmpxchg() and the special case are not necessary,
and hence just drop the existing item unconditionally. Note that this
does not result in loss of error events, it simply means we might
cause a false cache miss, and report the same event one additional
time in quick succession even if the cache should have prevented that.

Move the xchg_release() and call_rcu out of rcu_read_lock/unlock section
since there is no actually dereferencing the pointer at all.

Co-developed-by: Jia He <[email protected]>
Signed-off-by: Jia He <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
---
drivers/acpi/apei/ghes.c | 47 +++++++++++++++++++++-------------------
1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 27c72b175e4b..5d7754053ca0 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -150,7 +150,7 @@ struct ghes_vendor_record_entry {
static struct gen_pool *ghes_estatus_pool;
static unsigned long ghes_estatus_pool_size_request;

-static struct ghes_estatus_cache *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
+static struct ghes_estatus_cache __rcu *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
static atomic_t ghes_estatus_cache_alloced;

static int ghes_panic_timeout __read_mostly = 30;
@@ -785,31 +785,26 @@ static struct ghes_estatus_cache *ghes_estatus_cache_alloc(
return cache;
}

-static void ghes_estatus_cache_free(struct ghes_estatus_cache *cache)
+static void ghes_estatus_cache_rcu_free(struct rcu_head *head)
{
+ struct ghes_estatus_cache *cache;
u32 len;

+ cache = container_of(head, struct ghes_estatus_cache, rcu);
len = cper_estatus_len(GHES_ESTATUS_FROM_CACHE(cache));
len = GHES_ESTATUS_CACHE_LEN(len);
gen_pool_free(ghes_estatus_pool, (unsigned long)cache, len);
atomic_dec(&ghes_estatus_cache_alloced);
}

-static void ghes_estatus_cache_rcu_free(struct rcu_head *head)
-{
- struct ghes_estatus_cache *cache;
-
- cache = container_of(head, struct ghes_estatus_cache, rcu);
- ghes_estatus_cache_free(cache);
-}
-
static void ghes_estatus_cache_add(
struct acpi_hest_generic *generic,
struct acpi_hest_generic_status *estatus)
{
int i, slot = -1, count;
unsigned long long now, duration, period, max_period = 0;
- struct ghes_estatus_cache *cache, *slot_cache = NULL, *new_cache;
+ struct ghes_estatus_cache *cache, *new_cache;
+ struct ghes_estatus_cache __rcu *victim;

new_cache = ghes_estatus_cache_alloc(generic, estatus);
if (new_cache == NULL)
@@ -820,13 +815,11 @@ static void ghes_estatus_cache_add(
cache = rcu_dereference(ghes_estatus_caches[i]);
if (cache == NULL) {
slot = i;
- slot_cache = NULL;
break;
}
duration = now - cache->time_in;
if (duration >= GHES_ESTATUS_IN_CACHE_MAX_NSEC) {
slot = i;
- slot_cache = cache;
break;
}
count = atomic_read(&cache->count);
@@ -835,18 +828,28 @@ static void ghes_estatus_cache_add(
if (period > max_period) {
max_period = period;
slot = i;
- slot_cache = cache;
}
}
- /* new_cache must be put into array after its contents are written */
- smp_wmb();
- if (slot != -1 && cmpxchg(ghes_estatus_caches + slot,
- slot_cache, new_cache) == slot_cache) {
- if (slot_cache)
- call_rcu(&slot_cache->rcu, ghes_estatus_cache_rcu_free);
- } else
- ghes_estatus_cache_free(new_cache);
rcu_read_unlock();
+
+ if (slot != -1) {
+ /*
+ * Use release semantics to ensure that ghes_estatus_cached()
+ * running on another CPU will see the updated cache fields if
+ * it can see the new value of the pointer.
+ */
+ victim = xchg_release(&ghes_estatus_caches[slot], new_cache);
+
+ /*
+ * At this point, victim may point to a cached item different
+ * from the one based on which we selected the slot. Instead of
+ * going to the loop again to pick another slot, let's just
+ * drop the other item anyway: this may cause a false cache
+ * miss later on, but that won't cause any problems.
+ */
+ if (victim)
+ call_rcu(unrcu_pointer(&victim->rcu), ghes_estatus_cache_rcu_free);
+ }
}

static void __ghes_panic(struct ghes *ghes,
--
2.25.1

2022-10-17 13:09:21

by Justin He

[permalink] [raw]
Subject: [PATCH v9 2/7] EDAC/ghes: Add a notifier for reporting memory errors

To make a proper module, add a notifier for reporting memory errors. Use
an atomic notifier because calls sites like ghes_proc_in_irq() run in
interrupt context.

Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Jia He <[email protected]>
---
drivers/acpi/apei/ghes.c | 16 +++++++++++++++-
drivers/edac/ghes_edac.c | 19 +++++++++++++++++--
include/acpi/ghes.h | 10 +++-------
3 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d91ad378c00d..8cb65f757d06 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -94,6 +94,8 @@
#define FIX_APEI_GHES_SDEI_CRITICAL __end_of_fixed_addresses
#endif

+static ATOMIC_NOTIFIER_HEAD(ghes_report_chain);
+
static inline bool is_hest_type_generic_v2(struct ghes *ghes)
{
return ghes->generic->header.type == ACPI_HEST_TYPE_GENERIC_ERROR_V2;
@@ -645,7 +647,7 @@ static bool ghes_do_proc(struct ghes *ghes,
if (guid_equal(sec_type, &CPER_SEC_PLATFORM_MEM)) {
struct cper_sec_mem_err *mem_err = acpi_hest_get_payload(gdata);

- ghes_edac_report_mem_error(sev, mem_err);
+ atomic_notifier_call_chain(&ghes_report_chain, sev, mem_err);

arch_apei_report_mem_error(sev, mem_err);
queued = ghes_handle_memory_failure(gdata, sev);
@@ -1497,3 +1499,15 @@ void __init acpi_ghes_init(void)
else
pr_info(GHES_PFX "Failed to enable APEI firmware first mode.\n");
}
+
+void ghes_register_report_chain(struct notifier_block *nb)
+{
+ atomic_notifier_chain_register(&ghes_report_chain, nb);
+}
+EXPORT_SYMBOL_GPL(ghes_register_report_chain);
+
+void ghes_unregister_report_chain(struct notifier_block *nb)
+{
+ atomic_notifier_chain_unregister(&ghes_report_chain, nb);
+}
+EXPORT_SYMBOL_GPL(ghes_unregister_report_chain);
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index c8fa7dcfdbd0..7b8d56a769f6 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -14,6 +14,7 @@
#include <linux/dmi.h>
#include "edac_module.h"
#include <ras/ras_event.h>
+#include <linux/notifier.h>

#define OTHER_DETAIL_LEN 400

@@ -267,11 +268,14 @@ static int print_mem_error_other_detail(const struct cper_sec_mem_err *mem, char
return n;
}

-void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
+static int ghes_edac_report_mem_error(struct notifier_block *nb,
+ unsigned long val, void *data)
{
+ struct cper_sec_mem_err *mem_err = (struct cper_sec_mem_err *)data;
struct cper_mem_err_compact cmem;
struct edac_raw_error_desc *e;
struct mem_ctl_info *mci;
+ unsigned long sev = val;
struct ghes_pvt *pvt;
unsigned long flags;
char *p;
@@ -282,7 +286,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
* know.
*/
if (WARN_ON_ONCE(in_nmi()))
- return;
+ return NOTIFY_OK;

spin_lock_irqsave(&ghes_lock, flags);

@@ -374,8 +378,15 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)

unlock:
spin_unlock_irqrestore(&ghes_lock, flags);
+
+ return NOTIFY_OK;
}

+static struct notifier_block ghes_edac_mem_err_nb = {
+ .notifier_call = ghes_edac_report_mem_error,
+ .priority = 0,
+};
+
/*
* Known systems that are safe to enable this module.
*/
@@ -503,6 +514,8 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
ghes_pvt = pvt;
spin_unlock_irqrestore(&ghes_lock, flags);

+ ghes_register_report_chain(&ghes_edac_mem_err_nb);
+
/* only set on success */
refcount_set(&ghes_refcount, 1);

@@ -548,6 +561,8 @@ void ghes_edac_unregister(struct ghes *ghes)
if (mci)
edac_mc_free(mci);

+ ghes_unregister_report_chain(&ghes_edac_mem_err_nb);
+
unlock:
mutex_unlock(&ghes_reg_mutex);
}
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index 34fb3431a8f3..5cbd38b6e4e1 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -76,18 +76,11 @@ int ghes_estatus_pool_init(int num_ghes);
/* From drivers/edac/ghes_edac.c */

#ifdef CONFIG_EDAC_GHES
-void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err);
-
int ghes_edac_register(struct ghes *ghes, struct device *dev);

void ghes_edac_unregister(struct ghes *ghes);

#else
-static inline void ghes_edac_report_mem_error(int sev,
- struct cper_sec_mem_err *mem_err)
-{
-}
-
static inline int ghes_edac_register(struct ghes *ghes, struct device *dev)
{
return -ENODEV;
@@ -145,4 +138,7 @@ int ghes_notify_sea(void);
static inline int ghes_notify_sea(void) { return -ENOENT; }
#endif

+struct notifier_block;
+extern void ghes_register_report_chain(struct notifier_block *nb);
+extern void ghes_unregister_report_chain(struct notifier_block *nb);
#endif /* GHES_H */
--
2.25.1

2022-10-17 13:10:39

by Justin He

[permalink] [raw]
Subject: [PATCH v9 4/7] EDAC/ghes: Make ghes_edac a proper module to remove the dependency on ghes

Commit dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in
apci_init()") introduced a bug that ghes_edac_register() would be invoked
before edac_init(). Because at that time, the bus "edac" hadn't been even
registered, this created sysfs /devices/mc0 instead of
/sys/devices/system/edac/mc/mc0 on an Ampere eMag server.

To remove the dependency of ghes_edac on ghes, make it a proper module. Use
a list to save the probing devices in ghes_probe(), and defer the
ghes_edac_register() to module_init() of the new ghes_edac module by
iterating over the devices list.

Co-developed-by: Borislav Petkov <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Jia He <[email protected]>
Fixes: dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()")
Cc: Shuai Xue <[email protected]>
---
drivers/acpi/apei/ghes.c | 4 ----
drivers/edac/Kconfig | 4 ++--
drivers/edac/ghes_edac.c | 40 ++++++++++++++++++++++++++++++++++++++--
include/acpi/ghes.h | 22 ++--------------------
4 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 6a0dcb8c0901..27c72b175e4b 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1388,8 +1388,6 @@ static int ghes_probe(struct platform_device *ghes_dev)

platform_set_drvdata(ghes_dev, ghes);

- ghes_edac_register(ghes, &ghes_dev->dev);
-
ghes->dev = &ghes_dev->dev;

mutex_lock(&ghes_devs_mutex);
@@ -1458,8 +1456,6 @@ static int ghes_remove(struct platform_device *ghes_dev)

ghes_fini(ghes);

- ghes_edac_unregister(ghes);
-
mutex_lock(&ghes_devs_mutex);
list_del(&ghes->elist);
mutex_unlock(&ghes_devs_mutex);
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 17562cf1fe97..df45db81858b 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -53,8 +53,8 @@ config EDAC_DECODE_MCE
has been initialized.

config EDAC_GHES
- bool "Output ACPI APEI/GHES BIOS detected errors via EDAC"
- depends on ACPI_APEI_GHES && (EDAC=y)
+ tristate "Output ACPI APEI/GHES BIOS detected errors via EDAC"
+ depends on ACPI_APEI_GHES
select UEFI_CPER
help
Not all machines support hardware-driven error report. Some of those
diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index b85a545d1cb0..dec1bb96e4b3 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -56,6 +56,8 @@ static DEFINE_SPINLOCK(ghes_lock);

static bool system_scanned;

+static struct list_head *ghes_devs;
+
/* Memory Device - Type 17 of SMBIOS spec */
struct memdev_dmi_entry {
u8 type;
@@ -383,7 +385,7 @@ static struct notifier_block ghes_edac_mem_err_nb = {
.priority = 0,
};

-int ghes_edac_register(struct ghes *ghes, struct device *dev)
+static int ghes_edac_register(struct device *dev)
{
bool fake = false;
struct mem_ctl_info *mci;
@@ -502,7 +504,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
return rc;
}

-void ghes_edac_unregister(struct ghes *ghes)
+static void ghes_edac_unregister(struct ghes *ghes)
{
struct mem_ctl_info *mci;
unsigned long flags;
@@ -535,3 +537,37 @@ void ghes_edac_unregister(struct ghes *ghes)
unlock:
mutex_unlock(&ghes_reg_mutex);
}
+
+static int __init ghes_edac_init(void)
+{
+ struct ghes *g, *g_tmp;
+
+ ghes_devs = ghes_get_devices();
+ if (!ghes_devs)
+ return -ENODEV;
+
+ if (list_empty(ghes_devs)) {
+ pr_info("GHES probing device list is empty");
+ return -ENODEV;
+ }
+
+ list_for_each_entry_safe(g, g_tmp, ghes_devs, elist) {
+ ghes_edac_register(g->dev);
+ }
+
+ return 0;
+}
+module_init(ghes_edac_init);
+
+static void __exit ghes_edac_exit(void)
+{
+ struct ghes *g, *g_tmp;
+
+ list_for_each_entry_safe(g, g_tmp, ghes_devs, elist) {
+ ghes_edac_unregister(g);
+ }
+}
+module_exit(ghes_edac_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Output ACPI APEI/GHES BIOS detected errors module via EDAC");
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index ce693e9f07a0..2e785d3554d8 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -71,32 +71,14 @@ int ghes_register_vendor_record_notifier(struct notifier_block *nb);
* @nb: pointer to the notifier_block structure of the vendor record handler.
*/
void ghes_unregister_vendor_record_notifier(struct notifier_block *nb);
-#endif
-
-int ghes_estatus_pool_init(int num_ghes);
-
-/* From drivers/edac/ghes_edac.c */
-
-#ifdef CONFIG_EDAC_GHES
-int ghes_edac_register(struct ghes *ghes, struct device *dev);
-
-void ghes_edac_unregister(struct ghes *ghes);

struct list_head *ghes_get_devices(void);
-
#else
-static inline int ghes_edac_register(struct ghes *ghes, struct device *dev)
-{
- return -ENODEV;
-}
-
-static inline void ghes_edac_unregister(struct ghes *ghes)
-{
-}
-
static inline struct list_head *ghes_get_devices(void) { return NULL; }
#endif

+int ghes_estatus_pool_init(int num_ghes);
+
static inline int acpi_hest_get_version(struct acpi_hest_generic_data *gdata)
{
return gdata->revision >> 8;
--
2.25.1

2022-10-17 13:11:16

by Justin He

[permalink] [raw]
Subject: [PATCH v9 7/7] EDAC/igen6: Return consistent errno when another edac driver is enabled

Only a single edac driver can be enabled for EDAC MC. The igen6_init()
should be returned with EBUSY instead of ENODEV, which is consistent with
other edac drivers.

Signed-off-by: Jia He <[email protected]>
---
drivers/edac/igen6_edac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/igen6_edac.c b/drivers/edac/igen6_edac.c
index d33c666221f9..544dd19072ea 100644
--- a/drivers/edac/igen6_edac.c
+++ b/drivers/edac/igen6_edac.c
@@ -1276,7 +1276,7 @@ static int __init igen6_init(void)

owner = edac_get_owner();
if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
- return -ENODEV;
+ return -EBUSY;

edac_op_state = EDAC_OPSTATE_NMI;

--
2.25.1

2022-10-17 15:07:15

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [PATCH v9 6/7] apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg()

On Mon, 17 Oct 2022 at 15:02, Jia He <[email protected]> wrote:
>
> From: Ard Biesheuvel <[email protected]>
>
> From: Ard Biesheuvel <[email protected]>
>
> ghes_estatus_cache_add() selects a slot, and either succeeds in
> replacing its contents with a pointer to a new cached item, or it just
> gives up and frees the new item again, without attempting to select
> another slot even if one might be available.
>
> Since only inserting new items is needed, the race can only cause a failure
> if the selected slot was updated with another new item concurrently,
> which means that it is arbitrary which of those two items gets
> dropped. This means the cmpxchg() and the special case are not necessary,
> and hence just drop the existing item unconditionally. Note that this
> does not result in loss of error events, it simply means we might
> cause a false cache miss, and report the same event one additional
> time in quick succession even if the cache should have prevented that.
>
> Move the xchg_release() and call_rcu out of rcu_read_lock/unlock section
> since there is no actually dereferencing the pointer at all.
>
> Co-developed-by: Jia He <[email protected]>
> Signed-off-by: Jia He <[email protected]>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> drivers/acpi/apei/ghes.c | 47 +++++++++++++++++++++-------------------
> 1 file changed, 25 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 27c72b175e4b..5d7754053ca0 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -150,7 +150,7 @@ struct ghes_vendor_record_entry {
> static struct gen_pool *ghes_estatus_pool;
> static unsigned long ghes_estatus_pool_size_request;
>
> -static struct ghes_estatus_cache *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
> +static struct ghes_estatus_cache __rcu *ghes_estatus_caches[GHES_ESTATUS_CACHES_SIZE];
> static atomic_t ghes_estatus_cache_alloced;
>
> static int ghes_panic_timeout __read_mostly = 30;
> @@ -785,31 +785,26 @@ static struct ghes_estatus_cache *ghes_estatus_cache_alloc(
> return cache;
> }
>
> -static void ghes_estatus_cache_free(struct ghes_estatus_cache *cache)
> +static void ghes_estatus_cache_rcu_free(struct rcu_head *head)
> {
> + struct ghes_estatus_cache *cache;
> u32 len;
>
> + cache = container_of(head, struct ghes_estatus_cache, rcu);
> len = cper_estatus_len(GHES_ESTATUS_FROM_CACHE(cache));
> len = GHES_ESTATUS_CACHE_LEN(len);
> gen_pool_free(ghes_estatus_pool, (unsigned long)cache, len);
> atomic_dec(&ghes_estatus_cache_alloced);
> }
>
> -static void ghes_estatus_cache_rcu_free(struct rcu_head *head)
> -{
> - struct ghes_estatus_cache *cache;
> -
> - cache = container_of(head, struct ghes_estatus_cache, rcu);
> - ghes_estatus_cache_free(cache);
> -}
> -
> static void ghes_estatus_cache_add(
> struct acpi_hest_generic *generic,
> struct acpi_hest_generic_status *estatus)
> {
> int i, slot = -1, count;
> unsigned long long now, duration, period, max_period = 0;
> - struct ghes_estatus_cache *cache, *slot_cache = NULL, *new_cache;
> + struct ghes_estatus_cache *cache, *new_cache;
> + struct ghes_estatus_cache __rcu *victim;
>
> new_cache = ghes_estatus_cache_alloc(generic, estatus);
> if (new_cache == NULL)
> @@ -820,13 +815,11 @@ static void ghes_estatus_cache_add(
> cache = rcu_dereference(ghes_estatus_caches[i]);
> if (cache == NULL) {
> slot = i;
> - slot_cache = NULL;
> break;
> }
> duration = now - cache->time_in;
> if (duration >= GHES_ESTATUS_IN_CACHE_MAX_NSEC) {
> slot = i;
> - slot_cache = cache;
> break;
> }
> count = atomic_read(&cache->count);
> @@ -835,18 +828,28 @@ static void ghes_estatus_cache_add(
> if (period > max_period) {
> max_period = period;
> slot = i;
> - slot_cache = cache;
> }
> }
> - /* new_cache must be put into array after its contents are written */
> - smp_wmb();
> - if (slot != -1 && cmpxchg(ghes_estatus_caches + slot,
> - slot_cache, new_cache) == slot_cache) {
> - if (slot_cache)
> - call_rcu(&slot_cache->rcu, ghes_estatus_cache_rcu_free);
> - } else
> - ghes_estatus_cache_free(new_cache);
> rcu_read_unlock();
> +
> + if (slot != -1) {
> + /*
> + * Use release semantics to ensure that ghes_estatus_cached()
> + * running on another CPU will see the updated cache fields if
> + * it can see the new value of the pointer.
> + */
> + victim = xchg_release(&ghes_estatus_caches[slot], new_cache);
> +

This still lacks the RCU_INITIALIZER()

> + /*
> + * At this point, victim may point to a cached item different
> + * from the one based on which we selected the slot. Instead of
> + * going to the loop again to pick another slot, let's just
> + * drop the other item anyway: this may cause a false cache
> + * miss later on, but that won't cause any problems.
> + */
> + if (victim)
> + call_rcu(unrcu_pointer(&victim->rcu), ghes_estatus_cache_rcu_free);

Please use &unrcu_pointer(victim)->rcu here.

> + }
> }
>
> static void __ghes_panic(struct ghes *ghes,
> --
> 2.25.1
>