2016-03-03 19:06:09

by Josef Bacik

[permalink] [raw]
Subject: [PATCH] mce-apei: do not rely on ACPI_ERST_GET_RECORD_ID for record id

We have a bunch of boxes where ACPI_ERST_GET_RECORD_ID doesn't advance after
reading an error record, it only advances on reboot. So we get a bunch of MCE's
from a bad CPU, swap out the CPU, and then /var/log/mcelog has a new entry on
every reboot, which makes our monitoring send the box back to be fixed when it
has already been fixed.

So instead of using ACPI_ERST_GET_RECORD_ID to walk through the records on the
box, simply pass 0 into erst_read() which will read the first entry every time,
and then store the record id from the cpe header so that we clear out the right
record. With this patch these broken boxes now spit out all of records in the
ERST in one go instead of one per reboot. Thanks,

Signed-off-by: Josef Bacik <[email protected]>
---
arch/x86/kernel/cpu/mcheck/mce-apei.c | 43 +++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
index 34c89a3..6de662f 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
@@ -113,27 +113,50 @@ ssize_t apei_read_mce(struct mce *m, u64 *record_id)
{
struct cper_mce_record rcd;
int rc, pos;
+ bool lookup_record = false;

rc = erst_get_record_id_begin(&pos);
if (rc)
return rc;
retry:
- rc = erst_get_record_id_next(&pos, record_id);
- if (rc)
- goto out;
+ /*
+ * Some hardware is broken and doesn't actually advance the record id
+ * returned by ACPI_ERST_GET_RECORD_ID when we read a record like the
+ * spec says it is supposed to. So instead use record_id == 0 to just
+ * grab the first record in the erst, and fall back only if we trip over
+ * a record that isn't a MCE record.
+ */
+ if (lookup_record) {
+ rc = erst_get_record_id_next(&pos, record_id);
+ if (rc)
+ goto out;
+ } else {
+ *record_id = 0;
+ }
/* no more record */
if (*record_id == APEI_ERST_INVALID_RECORD_ID)
goto out;
rc = erst_read(*record_id, &rcd.hdr, sizeof(rcd));
- /* someone else has cleared the record, try next one */
- if (rc == -ENOENT)
- goto retry;
- else if (rc < 0)
+ /*
+ * someone else has cleared the record, try next one if we are looking
+ * up records. If we aren't looking up the record id's then just bail
+ * since this means we have an empty table.
+ */
+ if (rc == -ENOENT) {
+ if (lookup_record)
+ goto retry;
+ rc = 0;
+ goto out;
+ } else if (rc < 0) {
goto out;
- /* try to skip other type records in storage */
- else if (rc != sizeof(rcd) ||
- uuid_le_cmp(rcd.hdr.creator_id, CPER_CREATOR_MCE))
+ } else if (rc != sizeof(rcd) ||
+ uuid_le_cmp(rcd.hdr.creator_id, CPER_CREATOR_MCE)) {
+ /* try to skip other type records in storage */
+ lookup_record = true;
goto retry;
+ }
+ /* Use the record header as the source of truth for the record id. */
+ *record_id = rcd.hdr.record_id;
memcpy(m, &rcd.mce, sizeof(*m));
rc = sizeof(*m);
out:
--
2.5.0


2016-03-03 21:50:00

by Tony Luck

[permalink] [raw]
Subject: Re: [PATCH] mce-apei: do not rely on ACPI_ERST_GET_RECORD_ID for record id

> retry:
> - rc = erst_get_record_id_next(&pos, record_id);
> - if (rc)
> - goto out;
> + /*
> + * Some hardware is broken and doesn't actually advance the record id

I'd blame this on firmware rather than hardware.

> + * returned by ACPI_ERST_GET_RECORD_ID when we read a record like the
> + * spec says it is supposed to. So instead use record_id == 0 to just
> + * grab the first record in the erst, and fall back only if we trip over
> + * a record that isn't a MCE record.
> + */
> + if (lookup_record) {
> + rc = erst_get_record_id_next(&pos, record_id);
> + if (rc)
> + goto out;
> + } else {
> + *record_id = 0;
> + }
> /* no more record */
> if (*record_id == APEI_ERST_INVALID_RECORD_ID)
> goto out;
> rc = erst_read(*record_id, &rcd.hdr, sizeof(rcd));
> - /* someone else has cleared the record, try next one */
> - if (rc == -ENOENT)
> - goto retry;
> - else if (rc < 0)
> + /*
> + * someone else has cleared the record, try next one if we are looking
> + * up records. If we aren't looking up the record id's then just bail
> + * since this means we have an empty table.
> + */
> + if (rc == -ENOENT) {
> + if (lookup_record)
> + goto retry;
> + rc = 0;
> + goto out;
> + } else if (rc < 0) {
> goto out;
> - /* try to skip other type records in storage */
> - else if (rc != sizeof(rcd) ||
> - uuid_le_cmp(rcd.hdr.creator_id, CPER_CREATOR_MCE))
> + } else if (rc != sizeof(rcd) ||
> + uuid_le_cmp(rcd.hdr.creator_id, CPER_CREATOR_MCE)) {
> + /* try to skip other type records in storage */
> + lookup_record = true;
> goto retry;

Are you still doomed by the buggy firmware if we take this "goto"?
You be back at the top of the loop excpecting erst_get_record_id_next()
to move on. Does this just not happen in practice (finding non MCE
records in amognst the MCE ones)?

> + }
> + /* Use the record header as the source of truth for the record id. */
> + *record_id = rcd.hdr.record_id;
> memcpy(m, &rcd.mce, sizeof(*m));
> rc = sizeof(*m);

-Tony

2016-03-03 21:54:09

by Josef Bacik

[permalink] [raw]
Subject: Re: [PATCH] mce-apei: do not rely on ACPI_ERST_GET_RECORD_ID for record id

On 03/03/2016 04:49 PM, Luck, Tony wrote:
>> retry:
>> - rc = erst_get_record_id_next(&pos, record_id);
>> - if (rc)
>> - goto out;
>> + /*
>> + * Some hardware is broken and doesn't actually advance the record id
>
> I'd blame this on firmware rather than hardware.

Yup sorry misspoke.

>
>> + * returned by ACPI_ERST_GET_RECORD_ID when we read a record like the
>> + * spec says it is supposed to. So instead use record_id == 0 to just
>> + * grab the first record in the erst, and fall back only if we trip over
>> + * a record that isn't a MCE record.
>> + */
>> + if (lookup_record) {
>> + rc = erst_get_record_id_next(&pos, record_id);
>> + if (rc)
>> + goto out;
>> + } else {
>> + *record_id = 0;
>> + }
>> /* no more record */
>> if (*record_id == APEI_ERST_INVALID_RECORD_ID)
>> goto out;
>> rc = erst_read(*record_id, &rcd.hdr, sizeof(rcd));
>> - /* someone else has cleared the record, try next one */
>> - if (rc == -ENOENT)
>> - goto retry;
>> - else if (rc < 0)
>> + /*
>> + * someone else has cleared the record, try next one if we are looking
>> + * up records. If we aren't looking up the record id's then just bail
>> + * since this means we have an empty table.
>> + */
>> + if (rc == -ENOENT) {
>> + if (lookup_record)
>> + goto retry;
>> + rc = 0;
>> + goto out;
>> + } else if (rc < 0) {
>> goto out;
>> - /* try to skip other type records in storage */
>> - else if (rc != sizeof(rcd) ||
>> - uuid_le_cmp(rcd.hdr.creator_id, CPER_CREATOR_MCE))
>> + } else if (rc != sizeof(rcd) ||
>> + uuid_le_cmp(rcd.hdr.creator_id, CPER_CREATOR_MCE)) {
>> + /* try to skip other type records in storage */
>> + lookup_record = true;
>> goto retry;
>
> Are you still doomed by the buggy firmware if we take this "goto"?
> You be back at the top of the loop excpecting erst_get_record_id_next()
> to move on. Does this just not happen in practice (finding non MCE
> records in amognst the MCE ones)?
>

So one of the boxes had a non MCE record with MCE records, but yeah it
was on a box that was also broken in this way. I'm not super worried
about this case for us, I just want to have a fallback for any firmware
that may not be broken in this way to be able to skip things. Thanks,

Josef