2006-05-24 17:43:05

by Doug Thompson

[permalink] [raw]
Subject: [PATCH 0/6] EDAC Patch Set

This patch set applies to kernel 2.6.17-rc4 fine and to 2.6.17-rc4-mm1 with slight
fuzz in the MAINTAINERs file.

The following set of patches to EDAC provide various cleanups of the EDAC core and
memory controller drivers. Most of which came from Dave Peterson and from others.

edac-pci_dep.patch
Change MC drivers from using CVS revisions strings for their version number,
Now each driver has its own local string.

Remove some PCI dependencies from the core EDAC module. Most of the code
changes here are from a patch by Dave Jiang.
It may be best to eventually move the PCI-specific code into a separate source file.

edac-mc_numbers_1_of_2.patch
This is part 1 of a 2-part patch set. The code changes are split into two
parts to make the patches more readable.

Remove add_mc_to_global_list(). In next patch, this function will be
reimplemented with different semantics.

edac-mc_numbers_2_of_2.patch
This is part 2 of a 2-part patch set.

- Reimplement add_mc_to_global_list() with semantics that allow the caller to
determine the ID number for a mem_ctl_info structure. Then modify
edac_mc_add_mc() so that the caller specifies the ID number for the new
mem_ctl_info structure. Platform-specific code should be able to assign the
ID numbers in a platform-specific manner. For instance, on Opteron it makes
sense to have the ID of the mem_ctl_info structure match the ID of the node
that the memory controller belongs to.

- Modify callers of edac_mc_add_mc() so they use the new semantics.

edac-probe1_cleanup_1_of_2.patch
This is part 1 of a 2-part patch set. The code changes are split into two
parts to make the patches more readable.

Add lower-level functions that handle various parts of the initialization done
by the xxx_probe1() functions. Some of the xxx_probe1() functions are much
too long and complicated (see "Chapter 5: Functions" in
Documentation/CodingStyle).
This is part 2 of a 2-part patch set.

Modify the xxx_probe1() functions so they call the lower-level functions
created by the first patch in the set.

edac-maintainers.patch
--------
Removed Dave Peterson as per his request as maintainer.
Add Mark Gross as maintainer for E752X MC driver

Signed-off-by: doug thompson <[email protected]>

drivers/edac/amd76x_edac.c | 97 ++++---
drivers/edac/e752x_edac.c | 345 ++++++++++++++-------------
drivers/edac/e7xxx_edac.c | 184 +++++++-------
drivers/edac/edac_mc.c | 56 ++--
drivers/edac/edac_mc.h | 20 +
drivers/edac/i82860_edac.c | 132 +++++-----
drivers/edac/i82875p_edac.c | 254 +++++++++++--------
drivers/edac/r82600_edac.c | 139 +++++-----
linux-2.6.17-rc3/drivers/edac/edac_mc.c | 43 ---
linux-2.6.17-rc4/MAINTAINERS | 14 -
linux-2.6.17-rc4/drivers/edac/amd76x_edac.c | 5
linux-2.6.17-rc4/drivers/edac/e752x_edac.c | 5
linux-2.6.17-rc4/drivers/edac/e7xxx_edac.c | 5
linux-2.6.17-rc4/drivers/edac/edac_mc.c | 46 +++
linux-2.6.17-rc4/drivers/edac/edac_mc.h | 2
linux-2.6.17-rc4/drivers/edac/i82860_edac.c | 5
linux-2.6.17-rc4/drivers/edac/i82875p_edac.c | 5
linux-2.6.17-rc4/drivers/edac/r82600_edac.c | 5
18 files changed, 755 insertions(+), 607 deletions(-)


2006-05-24 18:18:14

by Jurgen Kramer

[permalink] [raw]
Subject: Re: [PATCH 0/6] EDAC Patch Set

On Wed, 2006-05-24 at 10:43 -0700, Doug Thompson wrote:
> This patch set applies to kernel 2.6.17-rc4 fine and to 2.6.17-rc4-mm1 with slight
> fuzz in the MAINTAINERs file.
>
> The following set of patches to EDAC provide various cleanups of the EDAC core and
> memory controller drivers. Most of which came from Dave Peterson and from others.
>
> edac-pci_dep.patch
> Change MC drivers from using CVS revisions strings for their version number,
> Now each driver has its own local string.
>
> Remove some PCI dependencies from the core EDAC module. Most of the code
> changes here are from a patch by Dave Jiang.
> It may be best to eventually move the PCI-specific code into a separate source file.
>
> edac-mc_numbers_1_of_2.patch
> This is part 1 of a 2-part patch set. The code changes are split into two
> parts to make the patches more readable.
>
> Remove add_mc_to_global_list(). In next patch, this function will be
> reimplemented with different semantics.
>
> edac-mc_numbers_2_of_2.patch
> This is part 2 of a 2-part patch set.
>
> - Reimplement add_mc_to_global_list() with semantics that allow the caller to
> determine the ID number for a mem_ctl_info structure. Then modify
> edac_mc_add_mc() so that the caller specifies the ID number for the new
> mem_ctl_info structure. Platform-specific code should be able to assign the
> ID numbers in a platform-specific manner. For instance, on Opteron it makes
> sense to have the ID of the mem_ctl_info structure match the ID of the node
> that the memory controller belongs to.
>
> - Modify callers of edac_mc_add_mc() so they use the new semantics.
>
> edac-probe1_cleanup_1_of_2.patch
> This is part 1 of a 2-part patch set. The code changes are split into two
> parts to make the patches more readable.
>
> Add lower-level functions that handle various parts of the initialization done
> by the xxx_probe1() functions. Some of the xxx_probe1() functions are much
> too long and complicated (see "Chapter 5: Functions" in
> Documentation/CodingStyle).
> This is part 2 of a 2-part patch set.
>
> Modify the xxx_probe1() functions so they call the lower-level functions
> created by the first patch in the set.

Will this patchset fix/suppress the "Non-Fatal Error PCI Express B"
messages I see with the E7525 edac?

I am running 2.6.16 (or more specific FC5 2.6.16-1.2111) with seems to
already include this version:

MC: drivers/edac/edac_mc.c version edac_mc Ver: 2.0.0 May 4 2006

This version still floods my syslog with "Non-Fatal Error...." messages.

Jurgen


2006-05-24 18:55:15

by Doug Thompson

[permalink] [raw]
Subject: Re: [PATCH 0/6] EDAC Patch Set



--- Jurgen Kramer <[email protected]> wrote:
>
> Will this patchset fix/suppress the "Non-Fatal Error PCI Express B"
> messages I see with the E7525 edac?

No not yet. These patches are part of the set we gathered in the queue after EDAC
was put into the kernel as a result of various other feedback. The Non-Fatal noise
has been placed in the queue of work to do. Dave Peterson, who was co-maintainer of
EDAC, has moved on, so I have picking up the TODO slack and flushing these patches
out the door so I can start with a somewhat cleaner slate.

The good news is I have found a maintainer for the E7525 MC driver (I don't have
access to a mobo with that chipset, so I have a problem in verifying any mods I do
work) who has agreed to work with that driver. Thanks to mark gross for taking that
on. He and I have discussed this noise issue.

doug thompson

>
> I am running 2.6.16 (or more specific FC5 2.6.16-1.2111) with seems to
> already include this version:
>
> MC: drivers/edac/edac_mc.c version edac_mc Ver: 2.0.0 May 4 2006
>
> This version still floods my syslog with "Non-Fatal Error...." messages.
>
> Jurgen

2006-05-25 04:42:31

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 0/6] EDAC Patch Set

drivers/edac/e752x_edac.c: In function `e752x_probe1':
drivers/edac/e752x_edac.c:932: warning: `mci' might be used uninitialized in this function
drivers/edac/e752x_edac.c:933: warning: `pvt' might be used uninitialized in this function

And these are indeed both box-killing bugs. I'm sure you were seeing this
warning as well.

There's this, plus the compile error, plus the ifdef proliferation. I'll
wait for version 2, thanks.

2006-05-25 14:59:53

by mark gross

[permalink] [raw]
Subject: Re: [PATCH 0/6] EDAC Patch Set

On Wed, May 24, 2006 at 11:55:12AM -0700, Doug Thompson wrote:
>
>
> --- Jurgen Kramer <[email protected]> wrote:
> >
> > Will this patchset fix/suppress the "Non-Fatal Error PCI Express B"
> > messages I see with the E7525 edac?
>
> No not yet. These patches are part of the set we gathered in the queue after EDAC
> was put into the kernel as a result of various other feedback. The Non-Fatal noise
> has been placed in the queue of work to do. Dave Peterson, who was co-maintainer of
> EDAC, has moved on, so I have picking up the TODO slack and flushing these patches
> out the door so I can start with a somewhat cleaner slate.
>
> The good news is I have found a maintainer for the E7525 MC driver (I don't have
> access to a mobo with that chipset, so I have a problem in verifying any mods I do
> work) who has agreed to work with that driver. Thanks to mark gross for taking that
> on. He and I have discussed this noise issue.
>

I have a couple of platforms that reproduce the non-fatal pci express noise.

As I mentioned to Doug, I will dig into this issue after OLS. However; if
anyone has any ideas to share with me on it I'll take any advice I can get. I
first would like to get a good root cause on why these things are coming from
the PCI code atfer loading the edac_e752x driver.

--mgross

> doug thompson
>
> >
> > I am running 2.6.16 (or more specific FC5 2.6.16-1.2111) with seems to
> > already include this version:
> >
> > MC: drivers/edac/edac_mc.c version edac_mc Ver: 2.0.0 May 4 2006
> >
> > This version still floods my syslog with "Non-Fatal Error...." messages.
> >
> > Jurgen
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2006-05-25 15:00:42

by mark gross

[permalink] [raw]
Subject: Re: [PATCH 0/6] EDAC Patch Set

On Wed, May 24, 2006 at 09:42:03PM -0700, Andrew Morton wrote:
> drivers/edac/e752x_edac.c: In function `e752x_probe1':
> drivers/edac/e752x_edac.c:932: warning: `mci' might be used uninitialized in this function
> drivers/edac/e752x_edac.c:933: warning: `pvt' might be used uninitialized in this function
>
> And these are indeed both box-killing bugs. I'm sure you were seeing this
> warning as well.
>
> There's this, plus the compile error, plus the ifdef proliferation. I'll
> wait for version 2, thanks.

Doug, can you take these?

--mgross