2024-04-01 17:15:15

by Avadhut Naik

Subject: [PATCH v6 0/2] Update mce_record tracepoint

This patchset updates the mce_record tracepoint so that the recently added
fields of struct mce are exported through it to userspace.

The first patch adds the PPIN (Protected Processor Inventory Number) field
to the tracepoint.

The second patch adds the microcode field (Microcode Revision) to the
tracepoint.

Changes in v2:
- Export microcode field (Microcode Revision) through the tracepoint in
addition to PPIN.

Changes in v3:
- Change format specifier for microcode revision from %u to %x
- Fix tab alignments
- Add Reviewed-by: Sohil Mehta <[email protected]>

Changes in v4:
- Update commit messages to reflect the reason for the fields being
added to the tracepoint.
- Add comment to explicitly state the type of information that should
be added to the tracepoint.
- Add Reviewed-by: Steven Rostedt (Google) <[email protected]>

Changes in v5:
- Change "MICROCODE REVISION" to just "MICROCODE".
- Change words which are not acronyms from ALL CAPS to no caps.
- Add Reviewed-by: Tony Luck <[email protected]>

Changes in v6:
- Rebased on top of Ingo's changes to the MCE tracepoint
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/include/trace/events/mce.h?id=ac5e80e94f5c67d7053f50fc3faddab931707f0f

[NOTE:
- Since the changes in this version are very minor, the tags below,
received for previous versions, have been retained:
Reviewed-by: Sohil Mehta <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Reviewed-by: Tony Luck <[email protected]>]

Avadhut Naik (2):
tracing: Include PPIN in mce_record tracepoint
tracing: Include Microcode Revision in mce_record tracepoint

include/trace/events/mce.h | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)


base-commit: 65d1240b6728b38e4d2068d6738a17e4ee4351f5
--
2.34.1



2024-04-01 17:15:28

by Avadhut Naik

Subject: [PATCH v6 1/2] tracing: Include PPIN in mce_record tracepoint

Machine Check Error information from struct mce is exported to userspace
through the mce_record tracepoint.

Currently, however, the PPIN (Protected Processor Inventory Number) field
of struct mce is not exported through the tracepoint.

Export PPIN through the tracepoint as it provides a unique identifier for
the system (or socket in case of multi-socket systems) on which the MCE
has been received.

Also, add a comment explaining the kind of information that can be and
should be added to the tracepoint.

Signed-off-by: Avadhut Naik <[email protected]>
Reviewed-by: Sohil Mehta <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
include/trace/events/mce.h | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 9c4e12163996..294fccc329c1 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -9,6 +9,14 @@
#include <linux/tracepoint.h>
#include <asm/mce.h>

+/*
+ * MCE Event Record.
+ *
+ * Only very relevant and transient information which cannot be
+ * gathered from a system by any other means or which can only be
+ * acquired arduously should be added to this record.
+ */
+
TRACE_EVENT(mce_record,

TP_PROTO(struct mce *m),
@@ -25,6 +33,7 @@ TRACE_EVENT(mce_record,
__field( u64, ipid )
__field( u64, ip )
__field( u64, tsc )
+ __field( u64, ppin )
__field( u64, walltime )
__field( u32, cpu )
__field( u32, cpuid )
@@ -45,6 +54,7 @@ TRACE_EVENT(mce_record,
__entry->ipid = m->ipid;
__entry->ip = m->ip;
__entry->tsc = m->tsc;
+ __entry->ppin = m->ppin;
__entry->walltime = m->time;
__entry->cpu = m->extcpu;
__entry->cpuid = m->cpuid;
@@ -55,7 +65,7 @@ TRACE_EVENT(mce_record,
__entry->cpuvendor = m->cpuvendor;
),

- TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x",
+ TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x",
__entry->cpu,
__entry->mcgcap, __entry->mcgstatus,
__entry->bank, __entry->status,
@@ -65,6 +75,7 @@ TRACE_EVENT(mce_record,
__entry->synd,
__entry->cs, __entry->ip,
__entry->tsc,
+ __entry->ppin,
__entry->cpuvendor,
__entry->cpuid,
__entry->walltime,
--
2.34.1


2024-04-01 17:15:48

by Avadhut Naik

Subject: [PATCH v6 2/2] tracing: Include Microcode Revision in mce_record tracepoint

Currently, the microcode field (Microcode Revision) of struct mce is not
exported to userspace through the mce_record tracepoint.

Knowing the microcode version on which the MCE was received is critical
information for debugging. If the version is not recorded, later attempts
to acquire the version might result in discrepancies since it can be
changed at runtime.

Export microcode version through the tracepoint to prevent ambiguity over
the active version on the system when the MCE was received.

Signed-off-by: Avadhut Naik <[email protected]>
Reviewed-by: Sohil Mehta <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
include/trace/events/mce.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 294fccc329c1..f0f7b3cb2041 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -42,6 +42,7 @@ TRACE_EVENT(mce_record,
__field( u8, cs )
__field( u8, bank )
__field( u8, cpuvendor )
+ __field( u32, microcode )
),

TP_fast_assign(
@@ -63,9 +64,10 @@ TRACE_EVENT(mce_record,
__entry->cs = m->cs;
__entry->bank = m->bank;
__entry->cpuvendor = m->cpuvendor;
+ __entry->microcode = m->microcode;
),

- TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x",
+ TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x, microcode: %x",
__entry->cpu,
__entry->mcgcap, __entry->mcgstatus,
__entry->bank, __entry->status,
@@ -80,7 +82,8 @@ TRACE_EVENT(mce_record,
__entry->cpuid,
__entry->walltime,
__entry->socketid,
- __entry->apicid)
+ __entry->apicid,
+ __entry->microcode)
);

#endif /* _TRACE_MCE_H */
--
2.34.1


Subject: [tip: x86/cpu] tracing: Add the ::microcode field to the mce_record tracepoint

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID: 186d7ef52c1f0c41450dedbdf6d6325d0a84e4c5
Gitweb: https://git.kernel.org/tip/186d7ef52c1f0c41450dedbdf6d6325d0a84e4c5
Author: Avadhut Naik <[email protected]>
AuthorDate: Mon, 01 Apr 2024 12:14:55 -05:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 03 Apr 2024 09:39:29 +02:00

tracing: Add the ::microcode field to the mce_record tracepoint

Currently, the microcode field (Microcode Revision) of 'struct mce' is not
exposed to userspace through the mce_record tracepoint.

Knowing the microcode version on which the MCE was received is critical
information for debugging. If the version is not recorded, later attempts
to acquire the version might result in discrepancies since it can be
changed at runtime.

Add microcode version to the tracepoint to prevent ambiguity over
the active version on the system when the MCE was received.

Signed-off-by: Avadhut Naik <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Sohil Mehta <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
include/trace/events/mce.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 294fccc..f0f7b3c 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -42,6 +42,7 @@ TRACE_EVENT(mce_record,
__field( u8, cs )
__field( u8, bank )
__field( u8, cpuvendor )
+ __field( u32, microcode )
),

TP_fast_assign(
@@ -63,9 +64,10 @@ TRACE_EVENT(mce_record,
__entry->cs = m->cs;
__entry->bank = m->bank;
__entry->cpuvendor = m->cpuvendor;
+ __entry->microcode = m->microcode;
),

- TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x",
+ TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x, microcode: %x",
__entry->cpu,
__entry->mcgcap, __entry->mcgstatus,
__entry->bank, __entry->status,
@@ -80,7 +82,8 @@ TRACE_EVENT(mce_record,
__entry->cpuid,
__entry->walltime,
__entry->socketid,
- __entry->apicid)
+ __entry->apicid,
+ __entry->microcode)
);

#endif /* _TRACE_MCE_H */

Subject: [tip: x86/cpu] tracing: Add the ::ppin field to the mce_record tracepoint

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID: 98430645e383404e5f6f784cabbb08ebb4ac5499
Gitweb: https://git.kernel.org/tip/98430645e383404e5f6f784cabbb08ebb4ac5499
Author: Avadhut Naik <[email protected]>
AuthorDate: Mon, 01 Apr 2024 12:14:54 -05:00
Committer: Ingo Molnar <[email protected]>
CommitterDate: Wed, 03 Apr 2024 09:39:29 +02:00

tracing: Add the ::ppin field to the mce_record tracepoint

Machine Check Error information from 'struct mce' is exposed to userspace
through the mce_record tracepoint.

Currently, however, the PPIN (Protected Processor Inventory Number) field
of 'struct mce' is not exposed.

Add a PPIN field to the tracepoint as it provides a unique identifier for
the system (or socket in case of multi-socket systems) on which the MCE
has been received.

Also, add a comment explaining the kind of information that can be and
should be added to the tracepoint.

Signed-off-by: Avadhut Naik <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Sohil Mehta <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
include/trace/events/mce.h | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 9c4e121..294fccc 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -9,6 +9,14 @@
#include <linux/tracepoint.h>
#include <asm/mce.h>

+/*
+ * MCE Event Record.
+ *
+ * Only very relevant and transient information which cannot be
+ * gathered from a system by any other means or which can only be
+ * acquired arduously should be added to this record.
+ */
+
TRACE_EVENT(mce_record,

TP_PROTO(struct mce *m),
@@ -25,6 +33,7 @@ TRACE_EVENT(mce_record,
__field( u64, ipid )
__field( u64, ip )
__field( u64, tsc )
+ __field( u64, ppin )
__field( u64, walltime )
__field( u32, cpu )
__field( u32, cpuid )
@@ -45,6 +54,7 @@ TRACE_EVENT(mce_record,
__entry->ipid = m->ipid;
__entry->ip = m->ip;
__entry->tsc = m->tsc;
+ __entry->ppin = m->ppin;
__entry->walltime = m->time;
__entry->cpu = m->extcpu;
__entry->cpuid = m->cpuid;
@@ -55,7 +65,7 @@ TRACE_EVENT(mce_record,
__entry->cpuvendor = m->cpuvendor;
),

- TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x",
+ TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR: %016Lx, MISC: %016Lx, SYND: %016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, vendor: %u, CPUID: %x, time: %llu, socket: %u, APIC: %x",
__entry->cpu,
__entry->mcgcap, __entry->mcgstatus,
__entry->bank, __entry->status,
@@ -65,6 +75,7 @@ TRACE_EVENT(mce_record,
__entry->synd,
__entry->cs, __entry->ip,
__entry->tsc,
+ __entry->ppin,
__entry->cpuvendor,
__entry->cpuid,
__entry->walltime,