2023-03-29 03:52:10

by Zhang, Rui

[permalink] [raw]
Subject: Re: [PATCH 0/4] coretemp: Fix spamming of ring buffer

Hi, Peter,

CC the list.

On Tue, 2023-03-28 at 22:37 +0200, Peter Ganzhorn wrote:
> Dear Mr. Rui,
> Dear Mr. Roeck,
>
> please consider accepting the attached patches or
> modifying the coretemp code to stop spamming my syslog.
> I would appreciate it very much if you can accept the patches.
>
> coretemp: Improve dynamic changes of TjMax
> After introduction of dynamic TjMax changes in commit
> c0c67f8761cec1fe36c21d85b1a5400ea7ac30cd
> my syslog gets spammed with "TjMax is ... degrees C"
> messages.
> If TjMax is subject to change at any time, it won't be
> set in tdata anymore and re-read every time from MSR.
> This causes quite a lot of dev_dbg() messages to be issued.
>
> The following patches change the code to read TjMax
> from the MSRs into tdata->tjmax (again) but allow for a
> dynamic update at any time as well. (Patches 1 and 2)
> This way a message will only be issued after actual changes.
> Also I replaced the dev_dbg() with dev_notice (Patch 3) and
> added a additional dev_notice for the case where TjMax is
> set based on assumptions. (Patch 4)
>
>
> If you do not want to accept my patches, removing the
> dev_dbg() in get_tjmax() would be the most simple
> solution I guess.
>
Please check if below patch solves your problem or not.

From 9370ee5163a85f65230b5222f1f4dece59ce078a Mon Sep 17 00:00:00 2001
From: Zhang Rui <[email protected]>
Date: Wed, 29 Mar 2023 11:35:18 +0800
Subject: [PATCH] hwmon: (coretemp) Avoid duplicate debug messages

Avoid duplicate dev_dbg messages when tjmax value retrieved from MSR
does not change.

Signed-off-by: Zhang Rui <[email protected]>
---
drivers/hwmon/coretemp.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index 30d77f451937..809456967b50 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -267,6 +267,7 @@ static int get_tjmax(struct temp_data *tdata, struct device *dev)
int err;
u32 eax, edx;
u32 val;
+ static u32 tjmax;

/* use static tjmax once it is set */
if (tdata->tjmax)
@@ -287,7 +288,10 @@ static int get_tjmax(struct temp_data *tdata, struct device *dev)
* will be used
*/
if (val) {
- dev_dbg(dev, "TjMax is %d degrees C\n", val);
+ if (tjmax != val) {
+ dev_dbg(dev, "TjMax is %d degrees C\n", val);
+ tjmax = val;
+ }
return val * 1000;
}
}
--
2.25.1


2023-03-29 12:48:32

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 0/4] coretemp: Fix spamming of ring buffer

On Wed, Mar 29, 2023 at 03:43:58AM +0000, Zhang, Rui wrote:
> Hi, Peter,
>
> CC the list.
>
> On Tue, 2023-03-28 at 22:37 +0200, Peter Ganzhorn wrote:
> > Dear Mr. Rui,
> > Dear Mr. Roeck,
> >
> > please consider accepting the attached patches or
> > modifying the coretemp code to stop spamming my syslog.
> > I would appreciate it very much if you can accept the patches.
> >
> > coretemp: Improve dynamic changes of TjMax
> > After introduction of dynamic TjMax changes in commit
> > c0c67f8761cec1fe36c21d85b1a5400ea7ac30cd
> > my syslog gets spammed with "TjMax is ... degrees C"
> > messages.
> > If TjMax is subject to change at any time, it won't be
> > set in tdata anymore and re-read every time from MSR.
> > This causes quite a lot of dev_dbg() messages to be issued.
> >
> > The following patches change the code to read TjMax
> > from the MSRs into tdata->tjmax (again) but allow for a
> > dynamic update at any time as well. (Patches 1 and 2)
> > This way a message will only be issued after actual changes.
> > Also I replaced the dev_dbg() with dev_notice (Patch 3) and
> > added a additional dev_notice for the case where TjMax is
> > set based on assumptions. (Patch 4)
> >
> >
> > If you do not want to accept my patches, removing the
> > dev_dbg() in get_tjmax() would be the most simple
> > solution I guess.
> >
> Please check if below patch solves your problem or not.
>
> From 9370ee5163a85f65230b5222f1f4dece59ce078a Mon Sep 17 00:00:00 2001
> From: Zhang Rui <[email protected]>
> Date: Wed, 29 Mar 2023 11:35:18 +0800
> Subject: [PATCH] hwmon: (coretemp) Avoid duplicate debug messages
>
> Avoid duplicate dev_dbg messages when tjmax value retrieved from MSR
> does not change.
>
> Signed-off-by: Zhang Rui <[email protected]>
> ---
> drivers/hwmon/coretemp.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> index 30d77f451937..809456967b50 100644
> --- a/drivers/hwmon/coretemp.c
> +++ b/drivers/hwmon/coretemp.c
> @@ -267,6 +267,7 @@ static int get_tjmax(struct temp_data *tdata, struct device *dev)
> int err;
> u32 eax, edx;
> u32 val;
> + static u32 tjmax;

That would apply to every instance of this driver, meaning to every
CPU core. Is that really appropriate ?

Guenter

>
> /* use static tjmax once it is set */
> if (tdata->tjmax)
> @@ -287,7 +288,10 @@ static int get_tjmax(struct temp_data *tdata, struct device *dev)
> * will be used
> */
> if (val) {
> - dev_dbg(dev, "TjMax is %d degrees C\n", val);
> + if (tjmax != val) {
> + dev_dbg(dev, "TjMax is %d degrees C\n", val);
> + tjmax = val;
> + }
> return val * 1000;
> }
> }
> --
> 2.25.1
>

2023-03-29 14:17:32

by Zhang, Rui

[permalink] [raw]
Subject: Re: [PATCH 0/4] coretemp: Fix spamming of ring buffer

On Wed, 2023-03-29 at 05:47 -0700, Guenter Roeck wrote:
> On Wed, Mar 29, 2023 at 03:43:58AM +0000, Zhang, Rui wrote:
> > Hi, Peter,
> >
> > CC the list.
> >
> > On Tue, 2023-03-28 at 22:37 +0200, Peter Ganzhorn wrote:
> > > Dear Mr. Rui,
> > > Dear Mr. Roeck,
> > >
> > > please consider accepting the attached patches or
> > > modifying the coretemp code to stop spamming my syslog.
> > > I would appreciate it very much if you can accept the patches.
> > >
> > > coretemp: Improve dynamic changes of TjMax
> > > After introduction of dynamic TjMax changes in commit
> > > c0c67f8761cec1fe36c21d85b1a5400ea7ac30cd
> > > my syslog gets spammed with "TjMax is ... degrees C"
> > > messages.
> > > If TjMax is subject to change at any time, it won't be
> > > set in tdata anymore and re-read every time from MSR.
> > > This causes quite a lot of dev_dbg() messages to be issued.
> > >
> > > The following patches change the code to read TjMax
> > > from the MSRs into tdata->tjmax (again) but allow for a
> > > dynamic update at any time as well. (Patches 1 and 2)
> > > This way a message will only be issued after actual changes.
> > > Also I replaced the dev_dbg() with dev_notice (Patch 3) and
> > > added a additional dev_notice for the case where TjMax is
> > > set based on assumptions. (Patch 4)
> > >
> > >
> > > If you do not want to accept my patches, removing the
> > > dev_dbg() in get_tjmax() would be the most simple
> > > solution I guess.
> > >
> > Please check if below patch solves your problem or not.
> >
> > From 9370ee5163a85f65230b5222f1f4dece59ce078a Mon Sep 17 00:00:00
> > 2001
> > From: Zhang Rui <[email protected]>
> > Date: Wed, 29 Mar 2023 11:35:18 +0800
> > Subject: [PATCH] hwmon: (coretemp) Avoid duplicate debug messages
> >
> > Avoid duplicate dev_dbg messages when tjmax value retrieved from
> > MSR
> > does not change.
> >
> > Signed-off-by: Zhang Rui <[email protected]>
> > ---
> > drivers/hwmon/coretemp.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > index 30d77f451937..809456967b50 100644
> > --- a/drivers/hwmon/coretemp.c
> > +++ b/drivers/hwmon/coretemp.c
> > @@ -267,6 +267,7 @@ static int get_tjmax(struct temp_data *tdata,
> > struct device *dev)
> > int err;
> > u32 eax, edx;
> > u32 val;
> > + static u32 tjmax;
>
> That would apply to every instance of this driver, meaning to every
> CPU core. Is that really appropriate ?
>
> Guenter
>
Good point.

MSR_IA32_TEMPERATURE_TARGET is package scope, and the cached tjmax
should also be package scope, or else this message is shown for each
cpu when tjmax changes in one package.

Previously, the message is printed only once during driver probing time
thus I'm wondering how useful this is.

Maybe we can just delete it?

thanks,
rui

2023-03-29 19:01:11

by Peter Ganzhorn

[permalink] [raw]
Subject: Re: [PATCH 0/4] coretemp: Fix spamming of ring buffer

On 3/29/23 16:11, Zhang, Rui wrote:
> On Wed, 2023-03-29 at 05:47 -0700, Guenter Roeck wrote:
>> On Wed, Mar 29, 2023 at 03:43:58AM +0000, Zhang, Rui wrote:
>>> Hi, Peter,
>>>
>>> CC the list.
>>>
>>> On Tue, 2023-03-28 at 22:37 +0200, Peter Ganzhorn wrote:
>>>> Dear Mr. Rui,
>>>> Dear Mr. Roeck,
>>>>
>>>> please consider accepting the attached patches or
>>>> modifying the coretemp code to stop spamming my syslog.
>>>> I would appreciate it very much if you can accept the patches.
>>>>
>>>> coretemp: Improve dynamic changes of TjMax
>>>> After introduction of dynamic TjMax changes in commit
>>>> c0c67f8761cec1fe36c21d85b1a5400ea7ac30cd
>>>> my syslog gets spammed with "TjMax is ... degrees C"
>>>> messages.
>>>> If TjMax is subject to change at any time, it won't be
>>>> set in tdata anymore and re-read every time from MSR.
>>>> This causes quite a lot of dev_dbg() messages to be issued.
>>>>
>>>> The following patches change the code to read TjMax
>>>> from the MSRs into tdata->tjmax (again) but allow for a
>>>> dynamic update at any time as well. (Patches 1 and 2)
>>>> This way a message will only be issued after actual changes.
>>>> Also I replaced the dev_dbg() with dev_notice (Patch 3) and
>>>> added a additional dev_notice for the case where TjMax is
>>>> set based on assumptions. (Patch 4)
>>>>
>>>>
>>>> If you do not want to accept my patches, removing the
>>>> dev_dbg() in get_tjmax() would be the most simple
>>>> solution I guess.
>>>>
>>> Please check if below patch solves your problem or not.
>>>
>>> From 9370ee5163a85f65230b5222f1f4dece59ce078a Mon Sep 17 00:00:00
>>> 2001
>>> From: Zhang Rui <[email protected]>
>>> Date: Wed, 29 Mar 2023 11:35:18 +0800
>>> Subject: [PATCH] hwmon: (coretemp) Avoid duplicate debug messages
>>>
>>> Avoid duplicate dev_dbg messages when tjmax value retrieved from
>>> MSR
>>> does not change.
>>>
>>> Signed-off-by: Zhang Rui <[email protected]>
>>> ---
>>> drivers/hwmon/coretemp.c | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
>>> index 30d77f451937..809456967b50 100644
>>> --- a/drivers/hwmon/coretemp.c
>>> +++ b/drivers/hwmon/coretemp.c
>>> @@ -267,6 +267,7 @@ static int get_tjmax(struct temp_data *tdata,
>>> struct device *dev)
>>> int err;
>>> u32 eax, edx;
>>> u32 val;
>>> + static u32 tjmax;
>> That would apply to every instance of this driver, meaning to every
>> CPU core. Is that really appropriate ?
>>
>> Guenter
>>
> Good point.
>
> MSR_IA32_TEMPERATURE_TARGET is package scope, and the cached tjmax
> should also be package scope, or else this message is shown for each
> cpu when tjmax changes in one package.
>
> Previously, the message is printed only once during driver probing time
> thus I'm wondering how useful this is.
>
> Maybe we can just delete it?
>
> thanks,
> rui
The proposed patch from Rui Zhang does solve the issue of spamming the
syslog.

I only get one message at boot:

[    1.370790] platform coretemp.0: TjMax is 96 degrees C

But if different packages have different tjmax values I guess the
spamming issue may return.

Personally I'd prefer to get a message once if tjmax changes for any
package.

Thank you both for the quick reaction.
Peter

2023-03-29 21:51:59

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 0/4] coretemp: Fix spamming of ring buffer

On Wed, Mar 29, 2023 at 02:11:36PM +0000, Zhang, Rui wrote:
> On Wed, 2023-03-29 at 05:47 -0700, Guenter Roeck wrote:
> > On Wed, Mar 29, 2023 at 03:43:58AM +0000, Zhang, Rui wrote:
> > > Hi, Peter,
> > >
> > > CC the list.
> > >
> > > On Tue, 2023-03-28 at 22:37 +0200, Peter Ganzhorn wrote:
> > > > Dear Mr. Rui,
> > > > Dear Mr. Roeck,
> > > >
> > > > please consider accepting the attached patches or
> > > > modifying the coretemp code to stop spamming my syslog.
> > > > I would appreciate it very much if you can accept the patches.
> > > >
> > > > coretemp: Improve dynamic changes of TjMax
> > > > After introduction of dynamic TjMax changes in commit
> > > > c0c67f8761cec1fe36c21d85b1a5400ea7ac30cd
> > > > my syslog gets spammed with "TjMax is ... degrees C"
> > > > messages.
> > > > If TjMax is subject to change at any time, it won't be
> > > > set in tdata anymore and re-read every time from MSR.
> > > > This causes quite a lot of dev_dbg() messages to be issued.
> > > >
> > > > The following patches change the code to read TjMax
> > > > from the MSRs into tdata->tjmax (again) but allow for a
> > > > dynamic update at any time as well. (Patches 1 and 2)
> > > > This way a message will only be issued after actual changes.
> > > > Also I replaced the dev_dbg() with dev_notice (Patch 3) and
> > > > added a additional dev_notice for the case where TjMax is
> > > > set based on assumptions. (Patch 4)
> > > >
> > > >
> > > > If you do not want to accept my patches, removing the
> > > > dev_dbg() in get_tjmax() would be the most simple
> > > > solution I guess.
> > > >
> > > Please check if below patch solves your problem or not.
> > >
> > > From 9370ee5163a85f65230b5222f1f4dece59ce078a Mon Sep 17 00:00:00
> > > 2001
> > > From: Zhang Rui <[email protected]>
> > > Date: Wed, 29 Mar 2023 11:35:18 +0800
> > > Subject: [PATCH] hwmon: (coretemp) Avoid duplicate debug messages
> > >
> > > Avoid duplicate dev_dbg messages when tjmax value retrieved from
> > > MSR
> > > does not change.
> > >
> > > Signed-off-by: Zhang Rui <[email protected]>
> > > ---
> > > drivers/hwmon/coretemp.c | 6 +++++-
> > > 1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
> > > index 30d77f451937..809456967b50 100644
> > > --- a/drivers/hwmon/coretemp.c
> > > +++ b/drivers/hwmon/coretemp.c
> > > @@ -267,6 +267,7 @@ static int get_tjmax(struct temp_data *tdata,
> > > struct device *dev)
> > > int err;
> > > u32 eax, edx;
> > > u32 val;
> > > + static u32 tjmax;
> >
> > That would apply to every instance of this driver, meaning to every
> > CPU core. Is that really appropriate ?
> >
> > Guenter
> >
> Good point.
>
> MSR_IA32_TEMPERATURE_TARGET is package scope, and the cached tjmax
> should also be package scope, or else this message is shown for each
> cpu when tjmax changes in one package.
>
> Previously, the message is printed only once during driver probing time
> thus I'm wondering how useful this is.
>
> Maybe we can just delete it?
>
Personally I would delete it. I don't see the value of clogging the kernel
log with it.

Guenter