2021-05-05 22:37:13

by Tyler Hicks

Subject: Re: [EXTERNAL] Re: [PATCH] EDAC: update edac printk wrappers to use printk_ratelimited.

On 2021-05-06 00:02:44, Borislav Petkov wrote:
> On Wed, May 05, 2021 at 04:48:46PM -0500, Tyler Hicks wrote:
> > The thought was that the full stream of log messages isn't necessary to
> > notice that there's a problem when they are being emitted at such a high
> > rate (500 per second). They're just filling up disk space and/or wasting
> > networking bandwidth at that point.
>
> I already asked about this but lemme point it out again: have you guys
> looked at drivers/ras/cec.c ?

We'll have a closer look. Thanks for the pointer!

Tyler

>
> With that there won't be *any* error reports in dmesg and it will even
> poison and offline pages which generate excessive errors so that ...
>
> > Of course, the best course of action here is to service the machine
> > but there's still a period of time between the CE errors popping up
> > and the machine being serviced.
>
> ... you'll have ample time to service the machine.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
>


2021-05-05 23:37:11

by Tyler Hicks

Subject: Re: [EXTERNAL] Re: [PATCH] EDAC: update edac printk wrappers to use printk_ratelimited.

On 2021-05-05 17:16:11, Tyler Hicks wrote:
> On 2021-05-06 00:02:44, Borislav Petkov wrote:
> > On Wed, May 05, 2021 at 04:48:46PM -0500, Tyler Hicks wrote:
> > > The thought was that the full stream of log messages isn't necessary to
> > > notice that there's a problem when they are being emitted at such a high
> > > rate (500 per second). They're just filling up disk space and/or wasting
> > > networking bandwidth at that point.
> >
> > I already asked about this but lemme point it out again: have you guys
> > looked at drivers/ras/cec.c ?
>
> We'll have a closer look. Thanks for the pointer!

This is x86-specific and not applicable in our situation.

Tyler

>
> Tyler
>
> >
> > With that there won't be *any* error reports in dmesg and it will even
> > poison and offline pages which generate excessive errors so that ...
> >
> > > Of course, the best course of action here is to service the machine
> > > but there's still a period of time between the CE errors popping up
> > > and the machine being serviced.
> >
> > ... you'll have ample time to service the machine.
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > https://people.kernel.org/tglx/notes-about-netiquette
> >

2021-05-06 00:26:02

by Borislav Petkov

Subject: Re: [EXTERNAL] Re: [PATCH] EDAC: update edac printk wrappers to use printk_ratelimited.

On Wed, May 05, 2021 at 05:43:57PM -0500, Tyler Hicks wrote:
> This is x86-specific

That's because it is used by x86 currently. It shouldn't be hard to use
it on another arch though as the machinery is pretty generic.

> and not applicable in our situation.

What is your situation? ARM?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette