The dmc520 driver requires that at least one interrupt line, out of the ten
possible, is configured. The driver prints an error and returns -EINVAL
from its .probe function if there are no interrupt lines configured.

Don't print a KERN_ERR level message for each interrupt line that's
unconfigured as that can confuse users into thinking that there is an
error condition.

Before this change, the following KERN_ERR level messages would be
reported if only dram_ecc_errc and dram_ecc_errd were configured in the
device tree:

  dmc520 68000000.dmc: IRQ ram_ecc_errc not found
  dmc520 68000000.dmc: IRQ ram_ecc_errd not found
  dmc520 68000000.dmc: IRQ failed_access not found
  dmc520 68000000.dmc: IRQ failed_prog not found
  dmc520 68000000.dmc: IRQ link_err not found
  dmc520 68000000.dmc: IRQ temperature_event not found
  dmc520 68000000.dmc: IRQ arch_fsm not found
  dmc520 68000000.dmc: IRQ phy_request not found

Fixes: 1088750d7839 ("EDAC: Add EDAC driver for DMC520")
Signed-off-by: Tyler Hicks <[email protected]>
Cc: <[email protected]>
Reported-by: Sinan Kaya <[email protected]>
---
drivers/edac/dmc520_edac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/edac/dmc520_edac.c b/drivers/edac/dmc520_edac.c
index b8a7d9594afd..1fa5ca57e9ec 100644
--- a/drivers/edac/dmc520_edac.c
+++ b/drivers/edac/dmc520_edac.c
@@ -489,7 +489,7 @@ static int dmc520_edac_probe(struct platform_device *pdev)
 	dev = &pdev->dev;
 
 	for (idx = 0; idx < NUMBER_OF_IRQS; idx++) {
-		irq = platform_get_irq_byname(pdev, dmc520_irq_configs[idx].name);
+		irq = platform_get_irq_byname_optional(pdev, dmc520_irq_configs[idx].name);
 		irqs[idx] = irq;
 		masks[idx] = dmc520_irq_configs[idx].mask;
 		if (irq >= 0) {
--
2.25.1
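For anyone skimming the thread, here is a rough userspace sketch (not kernel code) of why the one-line change silences the messages. Everything prefixed with fake_ is made up for illustration, the IRQ numbers are arbitrary, and the -ENXIO return for an absent name is an assumption about the lookup helpers rather than something stated in the patch. The ten names are the eight from the log output above plus the two configured lines.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define NUMBER_OF_IRQS 10

/* The eight names from the log output plus the two configured lines. */
static const char *const irq_names[NUMBER_OF_IRQS] = {
	"ram_ecc_errc", "ram_ecc_errd", "dram_ecc_errc", "dram_ecc_errd",
	"failed_access", "failed_prog", "link_err", "temperature_event",
	"arch_fsm", "phy_request",
};

/* Hypothetical stand-in for a platform device: a list of configured names. */
struct fake_pdev {
	const char *const *configured;
	int nr_configured;
	int errors_logged;	/* number of error-level messages emitted */
};

/* Model of the _optional lookup: silent; -ENXIO (assumed) when absent. */
static int fake_get_irq_byname_optional(struct fake_pdev *pdev, const char *name)
{
	for (int i = 0; i < pdev->nr_configured; i++)
		if (strcmp(pdev->configured[i], name) == 0)
			return 100 + i;	/* arbitrary, invented IRQ number */
	return -ENXIO;
}

/* Model of the non-optional lookup: same result plus an error message. */
static int fake_get_irq_byname(struct fake_pdev *pdev, const char *name)
{
	int irq = fake_get_irq_byname_optional(pdev, name);

	if (irq < 0) {
		fprintf(stderr, "IRQ %s not found\n", name);
		pdev->errors_logged++;
	}
	return irq;
}

/* Model of the probe loop: number of lines found, or -EINVAL if none. */
static int fake_probe(struct fake_pdev *pdev, int use_optional)
{
	int found = 0;

	assert(pdev && pdev->nr_configured >= 0);
	for (int idx = 0; idx < NUMBER_OF_IRQS; idx++) {
		int irq = use_optional ?
			fake_get_irq_byname_optional(pdev, irq_names[idx]) :
			fake_get_irq_byname(pdev, irq_names[idx]);

		if (irq >= 0)
			found++;
	}
	if (!found) {
		/* Message wording is illustrative, not the driver's. */
		fprintf(stderr, "no interrupt lines configured\n");
		pdev->errors_logged++;
		return -EINVAL;
	}
	return found;
}
```

With only dram_ecc_errc and dram_ecc_errd configured, the non-optional path counts eight error messages and the optional path counts none, while both find the same two lines. The single error for "no lines at all" is kept in either case.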
> -----Original Message-----
> From: Tyler Hicks <[email protected]>
> Sent: Tuesday, January 11, 2022 8:38 AM
> To: Lei Wang (DPLAT) <[email protected]>; Borislav Petkov
> <[email protected]>; Tony Luck <[email protected]>; Mauro Carvalho Chehab
> <[email protected]>
> Cc: Sinan Kaya <[email protected]>; Shiping Ji <[email protected]>;
> James Morse <[email protected]>; Robert Richter <[email protected]>;
> [email protected]; [email protected]
> Subject: [PATCH] EDAC/dmc520: Don't print an error for each unconfigured
> interrupt line
>
> The dmc520 driver requires that at least one interrupt line, out of the ten
> possible, is configured. The driver prints an error and returns -EINVAL from
> its .probe function if there are no interrupt lines configured.
>
> Don't print a KERN_ERR level message for each interrupt line that's
> unconfigured as that can confuse users into thinking that there is an error
> condition.
>
> Before this change, the following KERN_ERR level messages would be reported
> if only dram_ecc_errc and dram_ecc_errd were configured in the device tree:
>
> dmc520 68000000.dmc: IRQ ram_ecc_errc not found
> dmc520 68000000.dmc: IRQ ram_ecc_errd not found
> dmc520 68000000.dmc: IRQ failed_access not found
> dmc520 68000000.dmc: IRQ failed_prog not found
> dmc520 68000000.dmc: IRQ link_err not found
> dmc520 68000000.dmc: IRQ temperature_event not found
> dmc520 68000000.dmc: IRQ arch_fsm not found
> dmc520 68000000.dmc: IRQ phy_request not found
>
> Fixes: 1088750d7839 ("EDAC: Add EDAC driver for DMC520")
> Signed-off-by: Tyler Hicks <[email protected]>
Looks good. EDAC-CORE maintainers, please take the patch through your tree.
Thanks!
Acked-by: Lei Wang <[email protected]>
On Tue, Jan 11, 2022 at 10:38:00AM -0600, Tyler Hicks wrote:
> The dmc520 driver requires that at least one interrupt line, out of the ten
> possible, is configured. The driver prints an error and returns -EINVAL
> from its .probe function if there are no interrupt lines configured.
>
> Don't print a KERN_ERR level message for each interrupt line that's
> unconfigured as that can confuse users into thinking that there is an
> error condition.
>
> Before this change, the following KERN_ERR level messages would be
> reported if only dram_ecc_errc and dram_ecc_errd were configured in the
> device tree:
>
> dmc520 68000000.dmc: IRQ ram_ecc_errc not found
> dmc520 68000000.dmc: IRQ ram_ecc_errd not found
> dmc520 68000000.dmc: IRQ failed_access not found
> dmc520 68000000.dmc: IRQ failed_prog not found
> dmc520 68000000.dmc: IRQ link_err not found
> dmc520 68000000.dmc: IRQ temperature_event not found
> dmc520 68000000.dmc: IRQ arch_fsm not found
> dmc520 68000000.dmc: IRQ phy_request not found
>
> Fixes: 1088750d7839 ("EDAC: Add EDAC driver for DMC520")
> Signed-off-by: Tyler Hicks <[email protected]>
> Cc: <[email protected]>
Why stable? AFAICT, this is fixing only the spew of some error messages
but the driver is still functional.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 2022-01-16 19:29:46, Borislav Petkov wrote:
> Why stable? AFAICT, this is fixing only the spew of some error messages
> but the driver is still functional.
KERN_ERR messages trip log scanners and cause concern that the
kernel/hardware is not configured or working correctly. They also add a
little bit of ongoing stress into kernel maintainers' lives, as we
prepare and test kernel updates, since they show up as red text in
journalctl output that we have to think about regularly. Multiple
KERN_ERR messages, 8 in this case, can also be considered a little worse
than a single error message.
I feel like this trivial fix is worth taking into stable rather than
suppressing these errors (mentally and in log scanners) for years.
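To illustrate the log scanner point, here is a minimal sketch of the kind of check such a tool might perform on /dev/kmsg-style records, where the leading number is the syslog priority and its low three bits are the level (3 for KERN_ERR). The function names are hypothetical, and the sequence numbers and timestamps in any sample records are invented; real scanners will differ.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define LOGLEVEL_ERR 3	/* numeric level of KERN_ERR */

/* Level (0-7) of a "prio,seq,timestamp,flags;message" record, or -1. */
static int record_level(const char *record)
{
	char *end;
	long prio = strtol(record, &end, 10);

	if (end == record || *end != ',' || prio < 0)
		return -1;	/* not a record we understand */
	return (int)(prio & 7);	/* low three bits hold the level */
}

/* A scanner that alerts on KERN_ERR and anything more severe. */
static int scanner_would_alert(const char *record)
{
	int level = record_level(record);

	return level >= 0 && level <= LOGLEVEL_ERR;
}

/* Count how many records in a batch would raise an alert. */
static int count_alerts(const char *const *records, int n)
{
	int alerts = 0;

	assert(records != NULL || n == 0);
	for (int i = 0; i < n; i++)
		alerts += scanner_would_alert(records[i]);
	return alerts;
}
```

Under this model each of the "IRQ ... not found" lines counts as a separate alert, which is the cost being described above; the same text logged at a lower severity would not trip the check.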
Tyler
On Tue, Jan 18, 2022 at 09:28:16AM -0600, Tyler Hicks wrote:
> KERN_ERR messages trip log scanners and cause concern that the
> kernel/hardware is not configured or working correctly. They also add a
> little bit of ongoing stress into kernel maintainers' lives, as we
> prepare and test kernel updates, since they show up as red text in
> journalctl output that we have to think about regularly. Multiple
> KERN_ERR messages, 8 in this case, can also be considered a little worse
> than a single error message.
It sounds to me like you wanna read
Documentation/process/stable-kernel-rules.rst
first.
> I feel like this trivial fix is worth taking into stable rather than
> suppressing these errors (mentally and in log scanners) for years.
Years?
In any case, sorry, no, I don't consider this stable material.
Thx.
--
Regards/Gruss,
Boris.
On 2022-01-18 18:28:16, Borislav Petkov wrote:
> It sounds to me like you wanna read
>
> Documentation/process/stable-kernel-rules.rst
>
> first.
I'm familiar with it and the sort of commits that flow into stable.
> > I feel like this trivial fix is worth taking into stable rather than
> > suppressing these errors (mentally and in log scanners) for years.
>
> Years?
Yes, years. v5.10 is supported through 2026.
> In any case, sorry, no, I don't consider this stable material.
The bar varies by subsystem maintainer but this wouldn't be the first
logging fix that made it into a stable branch. From the linux-5.10.y
branch of linux-stable:
ddb13ddacc60 scsi: pm80xx: Fix misleading log statement in pm8001_mpi_get_nvmd_resp()
526261c1b706 amd/display: downgrade validation failure log level
9a3f52f73c04 bnxt_en: Improve logging of error recovery settings information.
5f7bda9ba8d7 leds: lm3697: Don't spam logs when probe is deferred
8b195380cd07 staging: fbtft: Don't spam logs when probe is deferred
...
But you do the hard work of maintaining the subsystem tree so you get to
call the shots about where fixes are routed. :) Thanks for applying the
change!
Tyler
On Tue, Jan 18, 2022 at 01:54:01PM -0600, Tyler Hicks wrote:
> The bar varies by subsystem maintainer but this wouldn't be the first
> logging fix that made it into a stable branch. From the linux-5.10.y
> branch of linux-stable:
>
> ddb13ddacc60 scsi: pm80xx: Fix misleading log statement in pm8001_mpi_get_nvmd_resp()
> 526261c1b706 amd/display: downgrade validation failure log level
> 9a3f52f73c04 bnxt_en: Improve logging of error recovery settings information.
> 5f7bda9ba8d7 leds: lm3697: Don't spam logs when probe is deferred
> 8b195380cd07 staging: fbtft: Don't spam logs when probe is deferred
> ...
Well, lemme add the stable folks for comment then - they might have had
their reasons.
( Or Sasha's AI went nuts. Which I've witnessed a bunch of times
already.)
If I look at the stable-kernel-rules.rst file, the only rule that
*maybe*, *probably* applies here is
"- It must fix a real bug that bothers people"
But this one is formulated so broadly that it makes me wanna ignore
it. Because *anything* can bother people - even spelling mistakes but
then a later rule says no spelling fixes.
Don't get me wrong - I don't mind having the stable tag where really
needed. But here it is questionable. And we have those stable rules for
a reason - if we start bending them and ignoring them then we might
just as well backport everything that applies and have parallel kernel
streams where the version means nothing. Basically a distro kernel. :-P
So let's see what the stable folks say first.
Thx.
--
Regards/Gruss,
Boris.
On Tue, Jan 18, 2022 at 10:04:30PM +0100, Borislav Petkov wrote:
> Well, lemme add the stable folks for comment then - they might have had
> their reasons.
>
> ( Or Sasha's AI went nuts. Which I've witnessed a bunch of times
> already.)
>
> If I look at the stable-kernel-rules.rst file, the only rule that
> *maybe*, *probably* applies here is
>
> "- It must fix a real bug that bothers people"
>
> But this one is formulated so broadly so that it makes me wanna ignore
> it. Because *anything* can bother people - even spelling mistakes but
> then a later rule says no spelling fixes.
>
> Don't get me wrong - I don't mind having the stable tag where really
> needed. But here it is questionable. And we have those stable rules for
> a reason - if we start bending them and ignoring them then we might
> just as well backport everything that applies and have parallel kernel
> streams where the version means nothing. Basically a distro kernel. :-P
>
> So let's see what the stable folks say first.
I will be glad to take these types of patches if the subsystem
maintainer thinks it will help things out, or if they are tired of
getting emails about the misleading messages. In this case, I don't
think either of those things is relevant, so I don't see why the patch
should be backported.
For this specific change, I do NOT think it should be backported at all,
mostly for the reason that people are still arguing over the whole
platform_get_*_optional() mess that we currently have. Let's not go and
backport anything right now to stable trees until we have all of that
sorted out, as it looks like it all might be changing again. See:
https://lore.kernel.org/r/[email protected]
for all of the gory details and the 300+ emails written on the topic so
far.
Tyler, feel free to jump in to that thread if you want, it's a mess...
thanks,
greg k-h
On Wed, Jan 19, 2022 at 10:17:52AM +0100, Greg Kroah-Hartman wrote:
> For this specific change, I do NOT think it should be backported at all,
> mostly for the reason that people are still arguing over the whole
> platform_get_*_optional() mess that we currently have. Let's not go and
> backport anything right now to stable trees until we have all of that
> sorted out, as it looks like it all might be changing again. See:
> https://lore.kernel.org/r/[email protected]
> for all of the gory details and the 300+ emails written on the topic so
> far.
It sounds to me I should not even take this patch upstream yet,
considering that's still ongoing...
--
Regards/Gruss,
Boris.
On Wed, Jan 19, 2022 at 10:37:51AM +0100, Borislav Petkov wrote:
> It sounds to me I should not even take this patch upstream yet,
> considering that's still ongoing...
Yes, I would not take that just yet at all. Let's let the api argument
settle down a bit first.
thanks,
greg k-h
On 2022-01-19 11:28:08, Greg Kroah-Hartman wrote:
> Yes, I would not take that just yet at all. Let's let the api argument
> settle down a bit first.
The API argument seems to have fizzled out in v2:
https://lore.kernel.org/lkml/[email protected]/
Can this fix be merged since there seem to be no API changes coming
soon? Boris, feel free to strip off the cc stable tag.
Tyler
On 2022-04-18 23:13:36, Borislav Petkov wrote:
> On Mon, Apr 18, 2022 at 03:40:29PM -0500, Tyler Hicks wrote:
> > > The API argument seems to have fizzled out in v2:
> > >
> > > https://lore.kernel.org/lkml/[email protected]/
>
> I don't see those two upstream yet, on a quick glance. Perhaps in Greg's tree?
>
> Greg, what's the latest with that platform_get_*_optional() fun?
>
> Also, the second of those two patches above has:
>
> + * Return: non-zero IRQ number on success, 0 if IRQ wasn't found, negative error
> + * number on failure.
> */
> int platform_get_irq_byname_optional(struct platform_device *dev,
>
> and your patch does:
>
> + irq = platform_get_irq_byname_optional(pdev, dmc520_irq_configs[idx].name);
> irqs[idx] = irq;
>
> so on failure, it would still write the negative error value in
> irqs[idx].
>
> How can that be right?
The patches to modify the API have become stale. There have been no
new comments or revisions since Feb. What I'm proposing is to proceed
with merging this simple fix and let the folks discussing the API
changes adjust the use in the dmc520 driver if/when they decide to
revive the API changes.
Tyler
On 2022-04-04 16:56:58, Tyler Hicks wrote:
> The API argument seems to have fizzled out in v2:
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Can this fix be merged since there seem to be no API changes coming
> soon? Boris, feel free to strip off the cc stable tag.
Hi Boris - I just double checked that this still looks correct and
applies cleanly to linux-next. Anything I can do on my end to help get
this little fix merged into the ras.git tree? Thanks!
Tyler
On Mon, Apr 18, 2022 at 03:40:29PM -0500, Tyler Hicks wrote:
> > The API argument seems to have fizzled out in v2:
> >
> > https://lore.kernel.org/lkml/[email protected]/
I don't see those two upstream yet, on a quick glance. Perhaps in Greg's tree?
Greg, what's the latest with that platform_get_*_optional() fun?
Also, the second of those two patches above has:
+ * Return: non-zero IRQ number on success, 0 if IRQ wasn't found, negative error
+ * number on failure.
*/
int platform_get_irq_byname_optional(struct platform_device *dev,
and your patch does:
+ irq = platform_get_irq_byname_optional(pdev, dmc520_irq_configs[idx].name);
irqs[idx] = irq;
so on failure, it would still write the negative error value in
irqs[idx].
How can that be right?
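To make the concern concrete, here is a small userspace sketch of how a caller could separate the three cases if the proposed return convention quoted above were adopted (positive: IRQ number, 0: not found, negative: real failure). collect_irqs() and patch_context_check() are hypothetical names, and this is only one possible way a caller might adapt, not what the driver does today.

```c
#include <assert.h>
#include <errno.h>

/*
 * Sketch only: consume lookup results under the PROPOSED convention.
 * results[] stands in for the values returned by each lookup.
 */
static int collect_irqs(const int *results, int n, int *irqs,
			unsigned int *found_mask)
{
	assert(n >= 0 && n <= 32);	/* one mask bit per interrupt line */
	*found_mask = 0;
	for (int idx = 0; idx < n; idx++) {
		int ret = results[idx];

		if (ret < 0)
			return ret;	/* propagate a genuine error */
		irqs[idx] = ret;	/* 0 marks "not configured" */
		if (ret > 0)
			*found_mask |= 1u << idx;
	}
	return *found_mask ? 0 : -EINVAL;	/* still require one line */
}

/* The check visible in the patch context, for comparison. */
static int patch_context_check(int irq)
{
	return irq >= 0;
}
```

Note that the ">= 0" check from the patch context would treat a 0 return as a discovered interrupt under the proposed convention, so that check would also need adjusting if the API change ever lands.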
--
Regards/Gruss,
Boris.
On Mon, Apr 18, 2022 at 04:34:53PM -0500, Tyler Hicks wrote:
> The patches to modify the API have become stale. There have been no
> new comments or revisions since Feb. What I'm proposing is to proceed
> with merging this simple fix and let the folks discussing the API
> changes adjust the use in the dmc520 driver if/when they decide to
> revive the API changes.
Ok, fair enough.
Queued, thanks.
--
Regards/Gruss,
Boris.