On Tue, Mar 08, 2022 at 12:41:34PM -0600, Carlos Bilbao wrote:
> AMD's severity grading covers very few machine errors. In the graded cases
> there are no user-readable messages, complicating debugging of critical
> hardware errors. Furthermore, with the current implementation AMD MCEs have
> no support for the severities-coverage file. Adding new severities for AMD
> with the current logic would be too convoluted.
>
> Fix the above issues by adding AMD severities to the severity table, in
> combination with Intel MCEs. Unify the severity grading logic of both
> vendors. Label the vendor-specific cases (e.g. cases with different
> registers) where checks cannot be implicit with the available features.
>
> Signed-off-by: Carlos Bilbao <[email protected]>
> ---
> arch/x86/include/asm/mce.h | 7 ++
> arch/x86/kernel/cpu/mce/severity.c | 188 +++++++++++++++--------------
> 2 files changed, 103 insertions(+), 92 deletions(-)

Sorry, maybe you're too new to this and you probably haven't read the
old discussions we have had about the severity grading turd. In order to
save you some time: adding more to that macro insanity is not going to
happen.

The AMD severity grading functions are *actually* readable vs this
abomination which I hate with passion.

If you want to add more logic, you should add to mce_severity_amd(),
perhaps call other helper functions which grade based on a certain
aspect of the error type, split the logic, use comments, etc, but
*definitely* not this.
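
To make the suggestion concrete, here is a rough user-space sketch of the
"one top-level grader plus small helpers" shape. It is not the kernel's
mce_severity_amd(): every sk_* name, the struct layouts, and the message
strings are made up for illustration, and the status bit positions are my
reading of the MCA status layout, so check them against the real headers.
The only point is the structure: each helper grades one aspect of the
error, carries a comment, and hands back a user-readable message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status bits for this sketch; positions are assumptions, verify before use. */
#define SK_STATUS_OVER     (1ULL << 62) /* an earlier error was overwritten */
#define SK_STATUS_UC       (1ULL << 61) /* uncorrected error */
#define SK_STATUS_PCC      (1ULL << 57) /* processor context corrupt */
#define SK_STATUS_DEFERRED (1ULL << 44) /* deferred error */

/* Hypothetical severity levels, ordered from least to most severe. */
enum sk_severity { SK_KEEP, SK_DEFERRED, SK_UC, SK_AR, SK_PANIC };
enum sk_context  { SK_IN_USER, SK_IN_KERNEL };

struct sk_mce      { uint64_t status; };
struct sk_features { bool overflow_recov; };

/* Helper: grade an uncorrected error by context and overflow handling. */
static enum sk_severity sk_grade_uncorrected(const struct sk_mce *m,
					     enum sk_context ctx,
					     const struct sk_features *f,
					     const char **msg)
{
	if (ctx == SK_IN_KERNEL) {
		*msg = "Uncorrected error in kernel context";
		return SK_PANIC;
	}

	/* With overflow recovery, killing the affected task is enough. */
	if (f->overflow_recov) {
		*msg = "Uncorrected error in user context, action required";
		return SK_AR;
	}

	/* Without it, an overflow means at least one error went unlogged. */
	if (m->status & SK_STATUS_OVER) {
		*msg = "Uncorrected error with overflow, cannot recover";
		return SK_PANIC;
	}

	*msg = "Uncorrected error, log and continue";
	return SK_UC;
}

/* Helper: grade everything that is not uncorrected. */
static enum sk_severity sk_grade_other(const struct sk_mce *m,
				       const char **msg)
{
	if (m->status & SK_STATUS_DEFERRED) {
		*msg = "Deferred error";
		return SK_DEFERRED;
	}

	*msg = "Corrected error";
	return SK_KEEP;
}

/* Top-level grader: plain if/else dispatch to the helpers above. */
enum sk_severity sk_mce_severity(const struct sk_mce *m, enum sk_context ctx,
				 const struct sk_features *f, const char **msg)
{
	const char *unused;

	assert(m && f);
	if (!msg)
		msg = &unused; /* caller does not want the message */

	/* Corrupt processor context: nothing to recover. */
	if (m->status & SK_STATUS_PCC) {
		*msg = "Processor context corrupt";
		return SK_PANIC;
	}

	if (m->status & SK_STATUS_UC)
		return sk_grade_uncorrected(m, ctx, f, msg);

	return sk_grade_other(m, msg);
}
```

A new error class would then become one more small helper and one more
branch in sk_mce_severity(), instead of another row in a macro table.
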

Thx.

--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette

On 3/8/2022 1:32 PM, Borislav Petkov wrote:
> On Tue, Mar 08, 2022 at 12:41:34PM -0600, Carlos Bilbao wrote:
>> AMD's severity grading covers very few machine errors. In the graded cases
>> there are no user-readable messages, complicating debugging of critical
>> hardware errors. Furthermore, with the current implementation AMD MCEs have
>> no support for the severities-coverage file. Adding new severities for AMD
>> with the current logic would be too convoluted.
>>
>> Fix the above issues by adding AMD severities to the severity table, in
>> combination with Intel MCEs. Unify the severity grading logic of both
>> vendors. Label the vendor-specific cases (e.g. cases with different
>> registers) where checks cannot be implicit with the available features.
>>
>> Signed-off-by: Carlos Bilbao <[email protected]>
>> ---
>> arch/x86/include/asm/mce.h | 7 ++
>> arch/x86/kernel/cpu/mce/severity.c | 188 +++++++++++++++--------------
>> 2 files changed, 103 insertions(+), 92 deletions(-)
>
> Sorry, maybe you're too new to this and you probably haven't read the
> old discussions we have had about the severity grading turd. In order to
> save you some time: adding more to that macro insanity is not going to
> happen.
>
> The AMD severity grading functions are *actually* readable vs this
> abomination which I hate with passion.
>
> If you want to add more logic, you should add to mce_severity_amd(),
> perhaps call other helper functions which grade based on a certain
> aspect of the error type, split the logic, use comments, etc, but
> *definitely* not this.
>
> Thx.
>

Understood, sending a new patch in that direction.

Thanks,
Carlos