2017-08-23 23:04:46

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH v4 0/5] enable ghes_edac on selected platforms

The ghes_edac driver was introduced in 2013 [1], but it has not been
enabled by any distro yet. This is because the driver obtains error
info from firmware interfaces, which are not properly implemented on
many platforms.

To get out from this situation, add a platform check to selectively
enable the driver on the platforms that are known to have proper
firmware implementation. Platform vendors can add their platforms to
the list when they support ghes_edac.

Patch 1-2 introduces a common function for platform check.
Patch 3 introduces platform check to ghes_edac.
Patch 4-5 optimizes regular edac driver's init code when ghes_edac is used.

Patch-set is based on bp.git ghes branch.

v4:
- Increase the size of oem_id[] and oem_table_id[] by 1 (patch 1)
- Change code style to a single line (patch 1)
- Rebase to top of bp.git ghes branch.

v3:
- Change struct & func names to "platform" from "oem" (patch 1)
- Drop a patch that checks OSC APEI bit (remove v2 patch 3)
- Drop a patch that avoids multiple calls to dmi_walk() (remove v2 patch 4)
- Change parameter name to ghes_edac.force_load (patch 3)
- Change function to edac_get_owner() (patch 4)
- Change edac_mc_owner to const char * (patch 4)
- Change to call edac_get_owner() at the beginning (patch 5)
- Remove ".c" from mod_name (patch 5)

v2:
- Address review comments (patch 1)
- Add OSC APEI check (patch 3)
- Avoid multiple dmi_walk (patch 4)
- Add EDAC MC owner check (patch 6,7)

---
Toshi Kani (5):
1/5 ACPI / blacklist: add acpi_match_platform_list()
2/5 intel_pstate: convert to use acpi_match_platform_list()
3/5 ghes_edac: add platform check to enable ghes_edac
4/5 EDAC: add edac_get_owner() to check MC owner
5/5 edac drivers: add MC owner check in init

---
drivers/acpi/blacklist.c | 83 +++++++-----------------------------------
drivers/acpi/utils.c | 36 ++++++++++++++++++
drivers/cpufreq/intel_pstate.c | 64 +++++++++++++-------------------
drivers/edac/amd64_edac.c | 5 +++
drivers/edac/edac_mc.c | 7 +++-
drivers/edac/edac_mc.h | 8 ++++
drivers/edac/ghes_edac.c | 29 ++++++++++++---
drivers/edac/pnd2_edac.c | 9 ++++-
drivers/edac/sb_edac.c | 9 ++++-
drivers/edac/skx_edac.c | 9 ++++-
include/linux/acpi.h | 19 ++++++++++
11 files changed, 160 insertions(+), 118 deletions(-)


2017-08-23 23:04:49

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH v4 3/5] ghes_edac: add platform check to enable ghes_edac

The ghes_edac driver was introduced in 2013 [1], but it has not
been enabled by any distro yet. This driver obtains error info
from firmware interfaces, which are not properly implemented on
many platforms, as the driver always emits the messages below:

This EDAC driver relies on BIOS to enumerate memory and get error reports.
Unfortunately, not all BIOSes reflect the memory layout correctly
So, the end result of using this driver varies from vendor to vendor
If you find incorrect reports, please contact your hardware vendor
to correct its BIOS.

To get out from this situation, add a platform check to selectively
enable the driver on the platforms that are known to have proper
firmware implementation. Platform vendors can add their platforms
to the list when they support ghes_edac.

"ghes_edac.force_load=1" skips this platform check.

[1]: https://lwn.net/Articles/538438/
Signed-off-by: Toshi Kani <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Tony Luck <[email protected]>
---
drivers/edac/ghes_edac.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 8d904df..0030a09 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -38,6 +38,10 @@ static struct ghes_edac_pvt *ghes_pvt;
*/
static DEFINE_SPINLOCK(ghes_lock);

+/* Set 1 to skip the platform check */
+static bool __read_mostly ghes_edac_force_load;
+module_param_named(force_load, ghes_edac_force_load, bool, 0);
+
/* Memory Device - Type 17 of SMBIOS spec */
struct memdev_dmi_entry {
u8 type;
@@ -415,6 +419,15 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
spin_unlock_irqrestore(&ghes_lock, flags);
}

+/*
+ * Known systems that are safe to enable this module.
+ * "ghes_edac.force_load=1" skips this check if necessary.
+ */
+static struct acpi_platform_list plat_list[] = {
+ {"HPE ", "Server ", 0, ACPI_SIG_FADT, all_versions},
+ { } /* End */
+};
+
int ghes_edac_register(struct ghes *ghes, struct device *dev)
{
bool fake = false;
@@ -422,6 +435,12 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
struct mem_ctl_info *mci;
struct edac_mc_layer layers[1];
struct ghes_edac_dimm_fill dimm_fill;
+ int idx;
+
+ /* Check if safe to enable on this system */
+ idx = acpi_match_platform_list(plat_list);
+ if (!ghes_edac_force_load && idx < 0)
+ return 0;

/*
* We have only one logical memory controller to which all DIMMs belong.
@@ -460,17 +479,17 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
mci->ctl_name = "ghes_edac";
mci->dev_name = "ghes";

- if (!fake) {
+ if (fake) {
+ pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
+ pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
+ pr_info("work on such system. Use this driver with caution\n");
+ } else if (idx < 0) {
pr_info("This EDAC driver relies on BIOS to enumerate memory and get error reports.\n");
pr_info("Unfortunately, not all BIOSes reflect the memory layout correctly.\n");
pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
pr_info("If you find incorrect reports, please contact your hardware vendor\n");
pr_info("to correct its BIOS.\n");
pr_info("This system has %d DIMM sockets.\n", num_dimm);
- } else {
- pr_info("This system has a very crappy BIOS: It doesn't even list the DIMMS.\n");
- pr_info("Its SMBIOS info is wrong. It is doubtful that the error report would\n");
- pr_info("work on such system. Use this driver with caution\n");
}

if (!fake) {

2017-08-23 23:04:58

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH v4 5/5] edac drivers: add MC owner check in init

Change generic x86 edac drivers, which probe CPU type with
x86_match_cpu(), to verify the module owner at the beginning
of their module init functions. This allows them to fail
their init immediately when ghes_edac is enabled. Similar
change can be made to other edac drivers as necessary.

Also, remove ".c" from module names of pnp2_edac, sb_edac,
and skx_edac.

Signed-off-by: Toshi Kani <[email protected]>
Suggested-by: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Tony Luck <[email protected]>
---
drivers/edac/amd64_edac.c | 5 +++++
drivers/edac/pnd2_edac.c | 9 ++++++++-
drivers/edac/sb_edac.c | 9 +++++++--
drivers/edac/skx_edac.c | 9 ++++++++-
4 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index ac2f302..8b16ec5 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3434,9 +3434,14 @@ MODULE_DEVICE_TABLE(x86cpu, amd64_cpuids);

static int __init amd64_edac_init(void)
{
+ const char *owner;
int err = -ENODEV;
int i;

+ owner = edac_get_owner();
+ if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
+ return -EBUSY;
+
if (!x86_match_cpu(amd64_cpuids))
return -ENODEV;

diff --git a/drivers/edac/pnd2_edac.c b/drivers/edac/pnd2_edac.c
index a3180a8..7ca02f8 100644
--- a/drivers/edac/pnd2_edac.c
+++ b/drivers/edac/pnd2_edac.c
@@ -45,6 +45,8 @@
#include "edac_module.h"
#include "pnd2_edac.h"

+#define EDAC_MOD_STR "pnd2_edac"
+
#define APL_NUM_CHANNELS 4
#define DNV_NUM_CHANNELS 2
#define DNV_MAX_DIMMS 2 /* Max DIMMs per channel */
@@ -1337,7 +1339,7 @@ static int pnd2_register_mci(struct mem_ctl_info **ppmci)
pvt = mci->pvt_info;
memset(pvt, 0, sizeof(*pvt));

- mci->mod_name = "pnd2_edac.c";
+ mci->mod_name = EDAC_MOD_STR;
mci->dev_name = ops->name;
mci->ctl_name = "Pondicherry2";

@@ -1529,10 +1531,15 @@ MODULE_DEVICE_TABLE(x86cpu, pnd2_cpuids);
static int __init pnd2_init(void)
{
const struct x86_cpu_id *id;
+ const char *owner;
int rc;

edac_dbg(2, "\n");

+ owner = edac_get_owner();
+ if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
+ return -EBUSY;
+
id = x86_match_cpu(pnd2_cpuids);
if (!id)
return -ENODEV;
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index dc05916..581fdb7 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -36,7 +36,7 @@ static LIST_HEAD(sbridge_edac_list);
* Alter this version for the module when modifications are made
*/
#define SBRIDGE_REVISION " Ver: 1.1.2 "
-#define EDAC_MOD_STR "sbridge_edac"
+#define EDAC_MOD_STR "sb_edac"

/*
* Debug macros
@@ -3155,7 +3155,7 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev, enum type type)
MEM_FLAG_DDR4 : MEM_FLAG_DDR3;
mci->edac_ctl_cap = EDAC_FLAG_NONE;
mci->edac_cap = EDAC_FLAG_NONE;
- mci->mod_name = "sb_edac.c";
+ mci->mod_name = EDAC_MOD_STR;
mci->dev_name = pci_name(pdev);
mci->ctl_page_to_phys = NULL;

@@ -3402,10 +3402,15 @@ static void sbridge_remove(void)
static int __init sbridge_init(void)
{
const struct x86_cpu_id *id;
+ const char *owner;
int rc;

edac_dbg(2, "\n");

+ owner = edac_get_owner();
+ if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
+ return -EBUSY;
+
id = x86_match_cpu(sbridge_cpuids);
if (!id)
return -ENODEV;
diff --git a/drivers/edac/skx_edac.c b/drivers/edac/skx_edac.c
index 16dea97..85d5f0b 100644
--- a/drivers/edac/skx_edac.c
+++ b/drivers/edac/skx_edac.c
@@ -31,6 +31,8 @@

#include "edac_module.h"

+#define EDAC_MOD_STR "skx_edac"
+
/*
* Debug macros
*/
@@ -469,7 +471,7 @@ static int skx_register_mci(struct skx_imc *imc)
mci->mtype_cap = MEM_FLAG_DDR4;
mci->edac_ctl_cap = EDAC_FLAG_NONE;
mci->edac_cap = EDAC_FLAG_NONE;
- mci->mod_name = "skx_edac.c";
+ mci->mod_name = EDAC_MOD_STR;
mci->dev_name = pci_name(imc->chan[0].cdev);
mci->ctl_page_to_phys = NULL;

@@ -1039,12 +1041,17 @@ static int __init skx_init(void)
{
const struct x86_cpu_id *id;
const struct munit *m;
+ const char *owner;
int rc = 0, i;
u8 mc = 0, src_id, node_id;
struct skx_dev *d;

edac_dbg(2, "\n");

+ owner = edac_get_owner();
+ if (owner && strncmp(owner, EDAC_MOD_STR, sizeof(EDAC_MOD_STR)))
+ return -EBUSY;
+
id = x86_match_cpu(skx_cpuids);
if (!id)
return -ENODEV;

2017-08-23 23:05:19

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH v4 4/5] EDAC: add edac_get_owner() to check MC owner

Only a single edac driver can be enabled for EDAC MC. When ghes_edac
is enabled, a regular edac driver for the CPU type / platform still
attempts to register itself and fails in edac_mc_add_mc().

Add edac_get_owner() so that regular edac drivers can check the owner
of EDAC MC without calling edac_mc_add_mc().

Signed-off-by: Toshi Kani <[email protected]>
Suggested-by: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Tony Luck <[email protected]>
---
drivers/edac/edac_mc.c | 7 ++++++-
drivers/edac/edac_mc.h | 8 ++++++++
2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 4800721..48193f5 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -53,7 +53,7 @@ static LIST_HEAD(mc_devices);
* Used to lock EDAC MC to just one module, avoiding two drivers e. g.
* apei/ghes and i7core_edac to be used at the same time.
*/
-static void const *edac_mc_owner;
+static const char *edac_mc_owner;

static struct bus_type mc_bus[EDAC_MAX_MCS];

@@ -701,6 +701,11 @@ struct mem_ctl_info *edac_mc_find(int idx)
}
EXPORT_SYMBOL(edac_mc_find);

+const char *edac_get_owner(void)
+{
+ return edac_mc_owner;
+}
+EXPORT_SYMBOL_GPL(edac_get_owner);

/* FIXME - should a warning be printed if no error detection? correction? */
int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
diff --git a/drivers/edac/edac_mc.h b/drivers/edac/edac_mc.h
index 5357800..4165e15 100644
--- a/drivers/edac/edac_mc.h
+++ b/drivers/edac/edac_mc.h
@@ -128,6 +128,14 @@ struct mem_ctl_info *edac_mc_alloc(unsigned mc_num,
unsigned sz_pvt);

/**
+ * edac_get_owner - Return the owner's mod_name of EDAC MC
+ *
+ * Returns:
+ * Pointer to mod_name string when EDAC MC is owned. NULL otherwise.
+ */
+extern const char *edac_get_owner(void);
+
+/*
* edac_mc_add_mc_with_groups() - Insert the @mci structure into the mci
* global list and create sysfs entries associated with @mci structure.
*

2017-08-23 23:05:40

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH v4 2/5] intel_pstate: convert to use acpi_match_platform_list()

Convert to use acpi_match_platform_list() for the platform check.
There is no change in functionality.

Signed-off-by: Toshi Kani <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Srinivas Pandruvada <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Borislav Petkov <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 64 ++++++++++++++++------------------------
1 file changed, 25 insertions(+), 39 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index b7fb8b7..ef22a20 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2458,39 +2458,31 @@ enum {
PPC,
};

-struct hw_vendor_info {
- u16 valid;
- char oem_id[ACPI_OEM_ID_SIZE];
- char oem_table_id[ACPI_OEM_TABLE_ID_SIZE];
- int oem_pwr_table;
-};
-
/* Hardware vendor-specific info that has its own power management modes */
-static struct hw_vendor_info vendor_info[] __initdata = {
- {1, "HP ", "ProLiant", PSS},
- {1, "ORACLE", "X4-2 ", PPC},
- {1, "ORACLE", "X4-2L ", PPC},
- {1, "ORACLE", "X4-2B ", PPC},
- {1, "ORACLE", "X3-2 ", PPC},
- {1, "ORACLE", "X3-2L ", PPC},
- {1, "ORACLE", "X3-2B ", PPC},
- {1, "ORACLE", "X4470M2 ", PPC},
- {1, "ORACLE", "X4270M3 ", PPC},
- {1, "ORACLE", "X4270M2 ", PPC},
- {1, "ORACLE", "X4170M2 ", PPC},
- {1, "ORACLE", "X4170 M3", PPC},
- {1, "ORACLE", "X4275 M3", PPC},
- {1, "ORACLE", "X6-2 ", PPC},
- {1, "ORACLE", "Sudbury ", PPC},
- {0, "", ""},
+static struct acpi_platform_list plat_info[] __initdata = {
+ {"HP ", "ProLiant", 0, ACPI_SIG_FADT, all_versions, 0, PSS},
+ {"ORACLE", "X4-2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X4-2L ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X4-2B ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X3-2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X3-2L ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X3-2B ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X4470M2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X4270M3 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X4270M2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X4170M2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X4170 M3", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X4275 M3", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "X6-2 ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ {"ORACLE", "Sudbury ", 0, ACPI_SIG_FADT, all_versions, 0, PPC},
+ { } /* End */
};

static bool __init intel_pstate_platform_pwr_mgmt_exists(void)
{
- struct acpi_table_header hdr;
- struct hw_vendor_info *v_info;
const struct x86_cpu_id *id;
u64 misc_pwr;
+ int idx;

id = x86_match_cpu(intel_pstate_cpu_oob_ids);
if (id) {
@@ -2499,21 +2491,15 @@ static bool __init intel_pstate_platform_pwr_mgmt_exists(void)
return true;
}

- if (acpi_disabled ||
- ACPI_FAILURE(acpi_get_table_header(ACPI_SIG_FADT, 0, &hdr)))
+ idx = acpi_match_platform_list(plat_info);
+ if (idx < 0)
return false;

- for (v_info = vendor_info; v_info->valid; v_info++) {
- if (!strncmp(hdr.oem_id, v_info->oem_id, ACPI_OEM_ID_SIZE) &&
- !strncmp(hdr.oem_table_id, v_info->oem_table_id,
- ACPI_OEM_TABLE_ID_SIZE))
- switch (v_info->oem_pwr_table) {
- case PSS:
- return intel_pstate_no_acpi_pss();
- case PPC:
- return intel_pstate_has_acpi_ppc() &&
- (!force_load);
- }
+ switch (plat_info[idx].data) {
+ case PSS:
+ return intel_pstate_no_acpi_pss();
+ case PPC:
+ return intel_pstate_has_acpi_ppc() && !force_load;
}

return false;

2017-08-23 23:05:56

by Kani, Toshimitsu

[permalink] [raw]
Subject: [PATCH v4 1/5] ACPI / blacklist: add acpi_match_platform_list()

ACPI OEM ID / OEM Table ID / Revision can be used to identify
a platform based on ACPI firmware info. acpi_blacklisted(),
intel_pstate_platform_pwr_mgmt_exists(), and some other funcs,
have been using similar check to detect a list of platforms
that require special handlings.

Move the platform check in acpi_blacklisted() to a new common
utility function, acpi_match_platform_list(), so that other
drivers do not have to implement their own version.

There is no change in functionality.

Signed-off-by: Toshi Kani <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
---
drivers/acpi/blacklist.c | 83 ++++++++--------------------------------------
drivers/acpi/utils.c | 36 ++++++++++++++++++++
include/linux/acpi.h | 19 +++++++++++
3 files changed, 69 insertions(+), 69 deletions(-)

diff --git a/drivers/acpi/blacklist.c b/drivers/acpi/blacklist.c
index bb542ac..037fd53 100644
--- a/drivers/acpi/blacklist.c
+++ b/drivers/acpi/blacklist.c
@@ -30,30 +30,13 @@

#include "internal.h"

-enum acpi_blacklist_predicates {
- all_versions,
- less_than_or_equal,
- equal,
- greater_than_or_equal,
-};
-
-struct acpi_blacklist_item {
- char oem_id[7];
- char oem_table_id[9];
- u32 oem_revision;
- char *table;
- enum acpi_blacklist_predicates oem_revision_predicate;
- char *reason;
- u32 is_critical_error;
-};
-
static struct dmi_system_id acpi_rev_dmi_table[] __initdata;

/*
* POLICY: If *anything* doesn't work, put it on the blacklist.
* If they are critical errors, mark it critical, and abort driver load.
*/
-static struct acpi_blacklist_item acpi_blacklist[] __initdata = {
+static struct acpi_platform_list acpi_blacklist[] __initdata = {
/* Compaq Presario 1700 */
{"PTLTD ", " DSDT ", 0x06040000, ACPI_SIG_DSDT, less_than_or_equal,
"Multiple problems", 1},
@@ -67,65 +50,27 @@ static struct acpi_blacklist_item acpi_blacklist[] __initdata = {
{"IBM ", "TP600E ", 0x00000105, ACPI_SIG_DSDT, less_than_or_equal,
"Incorrect _ADR", 1},

- {""}
+ { }
};

int __init acpi_blacklisted(void)
{
- int i = 0;
+ int i;
int blacklisted = 0;
- struct acpi_table_header table_header;
-
- while (acpi_blacklist[i].oem_id[0] != '\0') {
- if (acpi_get_table_header(acpi_blacklist[i].table, 0, &table_header)) {
- i++;
- continue;
- }
-
- if (strncmp(acpi_blacklist[i].oem_id, table_header.oem_id, 6)) {
- i++;
- continue;
- }
-
- if (strncmp
- (acpi_blacklist[i].oem_table_id, table_header.oem_table_id,
- 8)) {
- i++;
- continue;
- }
-
- if ((acpi_blacklist[i].oem_revision_predicate == all_versions)
- || (acpi_blacklist[i].oem_revision_predicate ==
- less_than_or_equal
- && table_header.oem_revision <=
- acpi_blacklist[i].oem_revision)
- || (acpi_blacklist[i].oem_revision_predicate ==
- greater_than_or_equal
- && table_header.oem_revision >=
- acpi_blacklist[i].oem_revision)
- || (acpi_blacklist[i].oem_revision_predicate == equal
- && table_header.oem_revision ==
- acpi_blacklist[i].oem_revision)) {

- printk(KERN_ERR PREFIX
- "Vendor \"%6.6s\" System \"%8.8s\" "
- "Revision 0x%x has a known ACPI BIOS problem.\n",
- acpi_blacklist[i].oem_id,
- acpi_blacklist[i].oem_table_id,
- acpi_blacklist[i].oem_revision);
+ i = acpi_match_platform_list(acpi_blacklist);
+ if (i >= 0) {
+ pr_err(PREFIX "Vendor \"%6.6s\" System \"%8.8s\" Revision 0x%x has a known ACPI BIOS problem.\n",
+ acpi_blacklist[i].oem_id,
+ acpi_blacklist[i].oem_table_id,
+ acpi_blacklist[i].oem_revision);

- printk(KERN_ERR PREFIX
- "Reason: %s. This is a %s error\n",
- acpi_blacklist[i].reason,
- (acpi_blacklist[i].
- is_critical_error ? "non-recoverable" :
- "recoverable"));
+ pr_err(PREFIX "Reason: %s. This is a %s error\n",
+ acpi_blacklist[i].reason,
+ (acpi_blacklist[i].data ?
+ "non-recoverable" : "recoverable"));

- blacklisted = acpi_blacklist[i].is_critical_error;
- break;
- } else {
- i++;
- }
+ blacklisted = acpi_blacklist[i].data;
}

(void)early_acpi_osi_init();
diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
index b9d956c..0a9e597 100644
--- a/drivers/acpi/utils.c
+++ b/drivers/acpi/utils.c
@@ -816,3 +816,39 @@ static int __init acpi_backlight(char *str)
return 1;
}
__setup("acpi_backlight=", acpi_backlight);
+
+/**
+ * acpi_match_platform_list - Check if the system matches with a given list
+ * @plat: pointer to acpi_platform_list table terminated by a NULL entry
+ *
+ * Return the matched index if the system is found in the platform list.
+ * Otherwise, return a negative error code.
+ */
+int acpi_match_platform_list(const struct acpi_platform_list *plat)
+{
+ struct acpi_table_header hdr;
+ int idx = 0;
+
+ if (acpi_disabled)
+ return -ENODEV;
+
+ for (; plat->oem_id[0]; plat++, idx++) {
+ if (ACPI_FAILURE(acpi_get_table_header(plat->table, 0, &hdr)))
+ continue;
+
+ if (strncmp(plat->oem_id, hdr.oem_id, ACPI_OEM_ID_SIZE))
+ continue;
+
+ if (strncmp(plat->oem_table_id, hdr.oem_table_id, ACPI_OEM_TABLE_ID_SIZE))
+ continue;
+
+ if ((plat->pred == all_versions) ||
+ (plat->pred == less_than_or_equal && hdr.oem_revision <= plat->oem_revision) ||
+ (plat->pred == greater_than_or_equal && hdr.oem_revision >= plat->oem_revision) ||
+ (plat->pred == equal && hdr.oem_revision == plat->oem_revision))
+ return idx;
+ }
+
+ return -ENODEV;
+}
+EXPORT_SYMBOL(acpi_match_platform_list);
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index c749eef..1ae7abf 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -556,6 +556,25 @@ extern acpi_status acpi_pci_osc_control_set(acpi_handle handle,
#define ACPI_OST_SC_DRIVER_LOAD_FAILURE 0x81
#define ACPI_OST_SC_INSERT_NOT_SUPPORTED 0x82

+enum acpi_predicate {
+ all_versions,
+ less_than_or_equal,
+ equal,
+ greater_than_or_equal,
+};
+
+/* Table must be terminted by a NULL entry */
+struct acpi_platform_list {
+ char oem_id[ACPI_OEM_ID_SIZE+1];
+ char oem_table_id[ACPI_OEM_TABLE_ID_SIZE+1];
+ u32 oem_revision;
+ char *table;
+ enum acpi_predicate pred;
+ char *reason;
+ u32 data;
+};
+int acpi_match_platform_list(const struct acpi_platform_list *plat);
+
extern void acpi_early_init(void);
extern void acpi_subsystem_init(void);


2017-08-24 07:53:58

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 1/5] ACPI / blacklist: add acpi_match_platform_list()

On Wed, Aug 23, 2017 at 04:54:43PM -0600, Toshi Kani wrote:
> ACPI OEM ID / OEM Table ID / Revision can be used to identify
> a platform based on ACPI firmware info. acpi_blacklisted(),
> intel_pstate_platform_pwr_mgmt_exists(), and some other funcs,
> have been using similar check to detect a list of platforms
> that require special handlings.
>
> Move the platform check in acpi_blacklisted() to a new common
> utility function, acpi_match_platform_list(), so that other
> drivers do not have to implement their own version.
>
> There is no change in functionality.
>
> Signed-off-by: Toshi Kani <[email protected]>
> Cc: "Rafael J. Wysocki" <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> ---
> drivers/acpi/blacklist.c | 83 ++++++++--------------------------------------
> drivers/acpi/utils.c | 36 ++++++++++++++++++++
> include/linux/acpi.h | 19 +++++++++++
> 3 files changed, 69 insertions(+), 69 deletions(-)

...

> diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
> index b9d956c..0a9e597 100644
> --- a/drivers/acpi/utils.c
> +++ b/drivers/acpi/utils.c
> @@ -816,3 +816,39 @@ static int __init acpi_backlight(char *str)
> return 1;
> }
> __setup("acpi_backlight=", acpi_backlight);
> +
> +/**
> + * acpi_match_platform_list - Check if the system matches with a given list
> + * @plat: pointer to acpi_platform_list table terminated by a NULL entry
> + *
> + * Return the matched index if the system is found in the platform list.
> + * Otherwise, return a negative error code.
> + */
> +int acpi_match_platform_list(const struct acpi_platform_list *plat)
> +{
> + struct acpi_table_header hdr;
> + int idx = 0;
> +
> + if (acpi_disabled)
> + return -ENODEV;
> +
> + for (; plat->oem_id[0]; plat++, idx++) {
> + if (ACPI_FAILURE(acpi_get_table_header(plat->table, 0, &hdr)))
> + continue;
> +
> + if (strncmp(plat->oem_id, hdr.oem_id, ACPI_OEM_ID_SIZE))
> + continue;
> +
> + if (strncmp(plat->oem_table_id, hdr.oem_table_id, ACPI_OEM_TABLE_ID_SIZE))
> + continue;
> +
> + if ((plat->pred == all_versions) ||
> + (plat->pred == less_than_or_equal && hdr.oem_revision <= plat->oem_revision) ||
> + (plat->pred == greater_than_or_equal && hdr.oem_revision >= plat->oem_revision) ||
> + (plat->pred == equal && hdr.oem_revision == plat->oem_revision))

If you align the second part of the test like this:

if ((plat->pred == all_versions) ||
(plat->pred == less_than_or_equal && hdr.oem_revision <= plat->oem_revision) ||
(plat->pred == greater_than_or_equal && hdr.oem_revision >= plat->oem_revision) ||
(plat->pred == equal && hdr.oem_revision == plat->oem_revision))

it gets maximally readable. But I'd leave that up to Rafael when committing
- no need to send another version.

Other than that:

Reviewed-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2017-08-28 21:03:48

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v4 1/5] ACPI / blacklist: add acpi_match_platform_list()

On Thursday, August 24, 2017 9:53:48 AM CEST Borislav Petkov wrote:
> On Wed, Aug 23, 2017 at 04:54:43PM -0600, Toshi Kani wrote:
> > ACPI OEM ID / OEM Table ID / Revision can be used to identify
> > a platform based on ACPI firmware info. acpi_blacklisted(),
> > intel_pstate_platform_pwr_mgmt_exists(), and some other funcs,
> > have been using similar check to detect a list of platforms
> > that require special handlings.
> >
> > Move the platform check in acpi_blacklisted() to a new common
> > utility function, acpi_match_platform_list(), so that other
> > drivers do not have to implement their own version.
> >
> > There is no change in functionality.
> >
> > Signed-off-by: Toshi Kani <[email protected]>
> > Cc: "Rafael J. Wysocki" <[email protected]>
> > Cc: Borislav Petkov <[email protected]>
> > ---
> > drivers/acpi/blacklist.c | 83 ++++++++--------------------------------------
> > drivers/acpi/utils.c | 36 ++++++++++++++++++++
> > include/linux/acpi.h | 19 +++++++++++
> > 3 files changed, 69 insertions(+), 69 deletions(-)
>
> ...
>
> > diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
> > index b9d956c..0a9e597 100644
> > --- a/drivers/acpi/utils.c
> > +++ b/drivers/acpi/utils.c
> > @@ -816,3 +816,39 @@ static int __init acpi_backlight(char *str)
> > return 1;
> > }
> > __setup("acpi_backlight=", acpi_backlight);
> > +
> > +/**
> > + * acpi_match_platform_list - Check if the system matches with a given list
> > + * @plat: pointer to acpi_platform_list table terminated by a NULL entry
> > + *
> > + * Return the matched index if the system is found in the platform list.
> > + * Otherwise, return a negative error code.
> > + */
> > +int acpi_match_platform_list(const struct acpi_platform_list *plat)
> > +{
> > + struct acpi_table_header hdr;
> > + int idx = 0;
> > +
> > + if (acpi_disabled)
> > + return -ENODEV;
> > +
> > + for (; plat->oem_id[0]; plat++, idx++) {
> > + if (ACPI_FAILURE(acpi_get_table_header(plat->table, 0, &hdr)))
> > + continue;
> > +
> > + if (strncmp(plat->oem_id, hdr.oem_id, ACPI_OEM_ID_SIZE))
> > + continue;
> > +
> > + if (strncmp(plat->oem_table_id, hdr.oem_table_id, ACPI_OEM_TABLE_ID_SIZE))
> > + continue;
> > +
> > + if ((plat->pred == all_versions) ||
> > + (plat->pred == less_than_or_equal && hdr.oem_revision <= plat->oem_revision) ||
> > + (plat->pred == greater_than_or_equal && hdr.oem_revision >= plat->oem_revision) ||
> > + (plat->pred == equal && hdr.oem_revision == plat->oem_revision))
>
> If you align the second part of the test like this:
>
> if ((plat->pred == all_versions) ||
> (plat->pred == less_than_or_equal && hdr.oem_revision <= plat->oem_revision) ||
> (plat->pred == greater_than_or_equal && hdr.oem_revision >= plat->oem_revision) ||
> (plat->pred == equal && hdr.oem_revision == plat->oem_revision))
>
> it gets maximally readable. But I'd leave that up to Rafael when committing
> - no need to send another version.
>
> Other than that:
>
> Reviewed-by: Borislav Petkov <[email protected]>

OK

So what about the [3-5/5] in this series?

My current plan is to apply them too and expose a branch with them, can I
go ahead with that?

Thanks,
Rafael

2017-08-28 21:22:10

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 1/5] ACPI / blacklist: add acpi_match_platform_list()

On Mon, Aug 28, 2017 at 10:55:07PM +0200, Rafael J. Wysocki wrote:
> So what about the [3-5/5] in this series?
>
> My current plan is to apply them too and expose a branch with them, can I
> go ahead with that?

No, please expose a branch with only the ACPI patches, i.e., 1 and 2 and
I can merge it and apply the EDAC patches ontop.

Then you and I need to sync up so that you send your pull requests first
and I follow you so that Linus applies them in the proper order.

Makes sense?

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2017-08-28 21:27:48

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v4 1/5] ACPI / blacklist: add acpi_match_platform_list()

On Monday, August 28, 2017 11:21:52 PM CEST Borislav Petkov wrote:
> On Mon, Aug 28, 2017 at 10:55:07PM +0200, Rafael J. Wysocki wrote:
> > So what about the [3-5/5] in this series?
> >
> > My current plan is to apply them too and expose a branch with them, can I
> > go ahead with that?
>
> No, please expose a branch with only the ACPI patches, i.e., 1 and 2 and
> I can merge it and apply the EDAC patches ontop.
>
> Then you and I need to sync up so that you send your pull requests first
> and I follow you so that Linus applies them in the proper order.
>
> Makes sense?

Yup, works for me!

2017-08-31 10:57:08

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 3/5] ghes_edac: add platform check to enable ghes_edac

On Wed, Aug 23, 2017 at 04:54:45PM -0600, Toshi Kani wrote:
> The ghes_edac driver was introduced in 2013 [1], but it has not
> been enabled by any distro yet. This driver obtains error info
> from firmware interfaces, which are not properly implemented on
> many platforms, as the driver always emits the messages below:
>
> This EDAC driver relies on BIOS to enumerate memory and get error reports.
> Unfortunately, not all BIOSes reflect the memory layout correctly
> So, the end result of using this driver varies from vendor to vendor
> If you find incorrect reports, please contact your hardware vendor
> to correct its BIOS.
>
> To get out from this situation, add a platform check to selectively
> enable the driver on the platforms that are known to have proper
> firmware implementation. Platform vendors can add their platforms
> to the list when they support ghes_edac.
>
> "ghes_edac.force_load=1" skips this platform check.
>
> [1]: https://lwn.net/Articles/538438/
> Signed-off-by: Toshi Kani <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Mauro Carvalho Chehab <[email protected]>
> Cc: Tony Luck <[email protected]>
> ---
> drivers/edac/ghes_edac.c | 29 ++++++++++++++++++++++++-----
> 1 file changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> index 8d904df..0030a09 100644
> --- a/drivers/edac/ghes_edac.c
> +++ b/drivers/edac/ghes_edac.c
> @@ -38,6 +38,10 @@ static struct ghes_edac_pvt *ghes_pvt;
> */
> static DEFINE_SPINLOCK(ghes_lock);
>
> +/* Set 1 to skip the platform check */
> +static bool __read_mostly ghes_edac_force_load;

It is static - "force_load" as a bool name is enough.

> +module_param_named(force_load, ghes_edac_force_load, bool, 0);

ERROR: Use 4 digit octal (0777) not decimal permissions
#53: FILE: drivers/edac/ghes_edac.c:43:
+module_param_named(force_load, ghes_edac_force_load, bool, 0);

This last param is @perm: visibility in sysfs. Why not visible in sysfs?

> +
> /* Memory Device - Type 17 of SMBIOS spec */
> struct memdev_dmi_entry {
> u8 type;
> @@ -415,6 +419,15 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
> spin_unlock_irqrestore(&ghes_lock, flags);
> }
>
> +/*
> + * Known systems that are safe to enable this module.
> + * "ghes_edac.force_load=1" skips this check if necessary.

Put this second sentence over the parameter definition.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2017-08-31 16:17:27

by Kani, Toshimitsu

[permalink] [raw]
Subject: Re: [PATCH v4 3/5] ghes_edac: add platform check to enable ghes_edac

On Thu, 2017-08-31 at 12:56 +0200, Borislav Petkov wrote:
> On Wed, Aug 23, 2017 at 04:54:45PM -0600, Toshi Kani wrote:
:
> > ---
> >  drivers/edac/ghes_edac.c |   29 ++++++++++++++++++++++++-----
> >  1 file changed, 24 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> > index 8d904df..0030a09 100644
> > --- a/drivers/edac/ghes_edac.c
> > +++ b/drivers/edac/ghes_edac.c
> > @@ -38,6 +38,10 @@ static struct ghes_edac_pvt *ghes_pvt;
> >   */
> >  static DEFINE_SPINLOCK(ghes_lock);
> >  
> > +/* Set 1 to skip the platform check */
> > +static bool __read_mostly ghes_edac_force_load;
>
> It is static - "force_load" as a bool name is enough.

Will do.

> > +module_param_named(force_load, ghes_edac_force_load, bool, 0);
>
> ERROR: Use 4 digit octal (0777) not decimal permissions
> #53: FILE: drivers/edac/ghes_edac.c:43:
> +module_param_named(force_load, ghes_edac_force_load, bool, 0);
>
> This last param is @perm: visibility in sysfs. Why not visible in
> sysfs?

I followed in the footsteps of 'ghes_disable', which is also a kernel
boot option and uses 0.

> > +
> >  /* Memory Device - Type 17 of SMBIOS spec */
> >  struct memdev_dmi_entry {
> >   u8 type;
> > @@ -415,6 +419,15 @@ void ghes_edac_report_mem_error(struct ghes
> > *ghes, int sev,
> >   spin_unlock_irqrestore(&ghes_lock, flags);
> >  }
> >  
> > +/*
> > + * Known systems that are safe to enable this module.
> > + * "ghes_edac.force_load=1" skips this check if necessary.
>
> Put this second sentence over the parameter definition.

Will do.

Thanks,
-Toshi

2017-08-31 16:52:21

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 3/5] ghes_edac: add platform check to enable ghes_edac

On Thu, Aug 31, 2017 at 04:17:07PM +0000, Kani, Toshimitsu wrote:
> I followed in the footsteps of 'ghes_disable', which is also a kernel
> boot option and uses 0.

Ok, ghes_disable comment says that using module_param() is easier.
ghes_edac is not a module but then __setup() is for "really core code".

Documentation/admin-guide/kernel-parameters.rst also talks about
core_param() but that's #ifndef MODULE. So module_param() it is.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2017-09-01 09:31:57

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v4 5/5] edac drivers: add MC owner check in init

On Wed, Aug 23, 2017 at 04:54:47PM -0600, Toshi Kani wrote:
> Change generic x86 edac drivers, which probe CPU type with
> x86_match_cpu(), to verify the module owner at the beginning
> of their module init functions. This allows them to fail
> their init immediately when ghes_edac is enabled. Similar
> change can be made to other edac drivers as necessary.
>
> Also, remove ".c" from module names of pnp2_edac, sb_edac,
> and skx_edac.
>
> Signed-off-by: Toshi Kani <[email protected]>
> Suggested-by: Borislav Petkov <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Mauro Carvalho Chehab <[email protected]>
> Cc: Tony Luck <[email protected]>
> ---
> drivers/edac/amd64_edac.c | 5 +++++
> drivers/edac/pnd2_edac.c | 9 ++++++++-
> drivers/edac/sb_edac.c | 9 +++++++--
> drivers/edac/skx_edac.c | 9 ++++++++-
> 4 files changed, 28 insertions(+), 4 deletions(-)

Last 2 applied, thanks.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.