2021-12-08 17:44:17

by Yazen Ghannam

[permalink] [raw]
Subject: [PATCH 0/4] AMD Family 19h Models 10h-1Fh Updates

Hi all,

This set adds support for AMD Family 19h Models 10h-1Fh and A0h-AFh.

Patch 1 adds some new memory types into EDAC.

Patch 2 adds the PCI IDs, family structures, etc. for the new models.

Patch 3 is a small fixup on how some registers are checked.

Patch 4 adds register offset and other minor changes introduced with
these new models.

This set is based on the following branches:
- tip/master
- ras/edac-for-next
- groeck/linux-staging/hwmon-next

The following commit in hwmon-next is needed for functional support of
this set.

49e90c39d0be ("x86/amd_nb: Add AMD Family 19h Models (10h-1Fh) and (A0h-AFh) PCI IDs")

Thanks,
Yazen

Yazen Ghannam (4):
EDAC: Add RDDR5 and LRDDR5 memory types
EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh
EDAC/amd64: Check register values from all UMCs
EDAC/amd64: Add DDR5 support and related register changes

drivers/edac/amd64_edac.c | 93 +++++++++++++++++++++++++++++++++++----
drivers/edac/amd64_edac.h | 16 ++++++-
drivers/edac/edac_mc.c | 2 +
include/linux/edac.h | 6 +++
4 files changed, 107 insertions(+), 10 deletions(-)

--
2.25.1



2021-12-08 17:44:20

by Yazen Ghannam

[permalink] [raw]
Subject: [PATCH 1/4] EDAC: Add RDDR5 and LRDDR5 memory types

Include Registered-DDR5 and Load-Reduced DDR5 in the list of memory
types.

Signed-off-by: Yazen Ghannam <[email protected]>
---
drivers/edac/edac_mc.c | 2 ++
include/linux/edac.h | 6 ++++++
2 files changed, 8 insertions(+)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 9f82ca295353..9d9aabdec96b 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -162,6 +162,8 @@ const char * const edac_mem_types[] = {
[MEM_LPDDR4] = "Low-Power-DDR4-RAM",
[MEM_LRDDR4] = "Load-Reduced-DDR4-RAM",
[MEM_DDR5] = "Unbuffered-DDR5",
+ [MEM_RDDR5] = "Registered-DDR5",
+ [MEM_LRDDR5] = "Load-Reduced-DDR5-RAM",
[MEM_NVDIMM] = "Non-volatile-RAM",
[MEM_WIO2] = "Wide-IO-2",
[MEM_HBM2] = "High-bandwidth-memory-Gen2",
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 4207d06996a4..e730b3468719 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -182,6 +182,8 @@ static inline char *mc_event_error_type(const unsigned int err_type)
* @MEM_LRDDR4: Load-Reduced DDR4 memory.
* @MEM_LPDDR4: Low-Power DDR4 memory.
* @MEM_DDR5: Unbuffered DDR5 RAM
+ * @MEM_RDDR5: Registered DDR5 RAM
+ * @MEM_LRDDR5: Load-Reduced DDR5 memory.
* @MEM_NVDIMM: Non-volatile RAM
* @MEM_WIO2: Wide I/O 2.
* @MEM_HBM2: High bandwidth Memory Gen 2.
@@ -211,6 +213,8 @@ enum mem_type {
MEM_LRDDR4,
MEM_LPDDR4,
MEM_DDR5,
+ MEM_RDDR5,
+ MEM_LRDDR5,
MEM_NVDIMM,
MEM_WIO2,
MEM_HBM2,
@@ -239,6 +243,8 @@ enum mem_type {
#define MEM_FLAG_LRDDR4 BIT(MEM_LRDDR4)
#define MEM_FLAG_LPDDR4 BIT(MEM_LPDDR4)
#define MEM_FLAG_DDR5 BIT(MEM_DDR5)
+#define MEM_FLAG_RDDR5 BIT(MEM_RDDR5)
+#define MEM_FLAG_LRDDR5 BIT(MEM_LRDDR5)
#define MEM_FLAG_NVDIMM BIT(MEM_NVDIMM)
#define MEM_FLAG_WIO2 BIT(MEM_WIO2)
#define MEM_FLAG_HBM2 BIT(MEM_HBM2)
--
2.25.1


2021-12-08 17:44:22

by Yazen Ghannam

[permalink] [raw]
Subject: [PATCH 2/4] EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh

Add a new family type for AMD Family 19h Models 10h to 1Fh. Use this new
family type for Models A0h to AFh also.

Increase the maximum number of controllers from 8 to 12.

Signed-off-by: Yazen Ghannam <[email protected]>
---
drivers/edac/amd64_edac.c | 21 ++++++++++++++++++++-
drivers/edac/amd64_edac.h | 5 ++++-
2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index ca0c67bc25c6..ff29267e46a6 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2925,6 +2925,16 @@ static struct amd64_family_type family_types[] = {
.dbam_to_cs = f17_addr_mask_to_cs_size,
}
},
+ [F19_M10H_CPUS] = {
+ .ctl_name = "F19h_M10h",
+ .f0_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F0,
+ .f6_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F6,
+ .max_mcs = 12,
+ .ops = {
+ .early_channel_count = f17_early_channel_count,
+ .dbam_to_cs = f17_addr_mask_to_cs_size,
+ }
+ },
};

/*
@@ -3962,11 +3972,20 @@ static struct amd64_family_type *per_family_init(struct amd64_pvt *pvt)
break;

case 0x19:
- if (pvt->model >= 0x20 && pvt->model <= 0x2f) {
+ if (pvt->model >= 0x10 && pvt->model <= 0x1f) {
+ fam_type = &family_types[F19_M10H_CPUS];
+ pvt->ops = &family_types[F19_M10H_CPUS].ops;
+ break;
+ } else if (pvt->model >= 0x20 && pvt->model <= 0x2f) {
fam_type = &family_types[F17_M70H_CPUS];
pvt->ops = &family_types[F17_M70H_CPUS].ops;
fam_type->ctl_name = "F19h_M20h";
break;
+ } else if (pvt->model >= 0xa0 && pvt->model <= 0xaf) {
+ fam_type = &family_types[F19_M10H_CPUS];
+ pvt->ops = &family_types[F19_M10H_CPUS].ops;
+ fam_type->ctl_name = "F19h_MA0h";
+ break;
}
fam_type = &family_types[F19_CPUS];
pvt->ops = &family_types[F19_CPUS].ops;
diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
index 85aa820bc165..650cab401e21 100644
--- a/drivers/edac/amd64_edac.h
+++ b/drivers/edac/amd64_edac.h
@@ -96,7 +96,7 @@
/* Hardware limit on ChipSelect rows per MC and processors per system */
#define NUM_CHIPSELECTS 8
#define DRAM_RANGES 8
-#define NUM_CONTROLLERS 8
+#define NUM_CONTROLLERS 12

#define ON true
#define OFF false
@@ -126,6 +126,8 @@
#define PCI_DEVICE_ID_AMD_17H_M70H_DF_F6 0x1446
#define PCI_DEVICE_ID_AMD_19H_DF_F0 0x1650
#define PCI_DEVICE_ID_AMD_19H_DF_F6 0x1656
+#define PCI_DEVICE_ID_AMD_19H_M10H_DF_F0 0x14ad
+#define PCI_DEVICE_ID_AMD_19H_M10H_DF_F6 0x14b3

/*
* Function 1 - Address Map
@@ -298,6 +300,7 @@ enum amd_families {
F17_M60H_CPUS,
F17_M70H_CPUS,
F19_CPUS,
+ F19_M10H_CPUS,
NUM_FAMILIES,
};

--
2.25.1


2021-12-08 17:44:24

by Yazen Ghannam

[permalink] [raw]
Subject: [PATCH 3/4] EDAC/amd64: Check register values from all UMCs

Loop over all UMCs and create bitmasks to check the values of the
DIMM_CFG and UMC_CFG registers rather than just checking the values from
the first two UMCs.

Signed-off-by: Yazen Ghannam <[email protected]>
---
drivers/edac/amd64_edac.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index ff29267e46a6..1df763128483 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1621,9 +1621,16 @@ static void determine_memory_type(struct amd64_pvt *pvt)
u32 dram_ctrl, dcsm;

if (pvt->umc) {
- if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5))
+ u32 umc_cfg = 0, dimm_cfg = 0, i = 0;
+
+ for_each_umc(i) {
+ umc_cfg |= pvt->umc[i].umc_cfg;
+ dimm_cfg |= pvt->umc[i].dimm_cfg;
+ }
+
+ if (dimm_cfg & BIT(5))
pvt->dram_type = MEM_LRDDR4;
- else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4))
+ else if (dimm_cfg & BIT(4))
pvt->dram_type = MEM_RDDR4;
else
pvt->dram_type = MEM_DDR4;
--
2.25.1


2021-12-08 17:44:31

by Yazen Ghannam

[permalink] [raw]
Subject: [PATCH 4/4] EDAC/amd64: Add DDR5 support and related register changes

Future AMD systems will support DDR5.

Add support for changes in register addresses for these systems.

Introduce a "family flags" bitmask that can be used to indicate any
special behavior needed on a per-family basis.

Signed-off-by: Yazen Ghannam <[email protected]>
---
drivers/edac/amd64_edac.c | 61 +++++++++++++++++++++++++++++++++++----
drivers/edac/amd64_edac.h | 11 +++++++
2 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 1df763128483..e37a8e0cef7e 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -15,6 +15,36 @@ static struct msr __percpu *msrs;

static struct amd64_family_type *fam_type;

+/* Family flag helpers */
+static inline bool has_ddr5(void)
+{
+ return fam_type->flags.has_ddr5;
+}
+
+static inline u64 get_addr_cfg(void)
+{
+ if (has_ddr5())
+ return UMCCH_ADDR_CFG_DDR5;
+
+ return UMCCH_ADDR_CFG;
+}
+
+static inline u64 get_addr_mask_sec(void)
+{
+ if (has_ddr5())
+ return UMCCH_ADDR_MASK_SEC_DDR5;
+
+ return UMCCH_ADDR_MASK_SEC;
+}
+
+static inline u64 get_dimm_cfg(void)
+{
+ if (has_ddr5())
+ return UMCCH_DIMM_CFG_DDR5;
+
+ return UMCCH_DIMM_CFG;
+}
+
/* Per-node stuff */
static struct ecc_settings **ecc_stngs;

@@ -1429,8 +1459,10 @@ static void __dump_misc_regs_df(struct amd64_pvt *pvt)
edac_dbg(1, "UMC%d x16 DIMMs present: %s\n",
i, (umc->dimm_cfg & BIT(7)) ? "yes" : "no");

- if (pvt->dram_type == MEM_LRDDR4) {
- amd_smn_read(pvt->mc_node_id, umc_base + UMCCH_ADDR_CFG, &tmp);
+ if (pvt->dram_type == MEM_LRDDR4 || pvt->dram_type == MEM_LRDDR5) {
+ amd_smn_read(pvt->mc_node_id,
+ umc_base + get_addr_cfg(),
+ &tmp);
edac_dbg(1, "UMC%d LRDIMM %dx rank multiply\n",
i, 1 << ((tmp >> 4) & 0x3));
}
@@ -1505,7 +1537,7 @@ static void prep_chip_selects(struct amd64_pvt *pvt)

for_each_umc(umc) {
pvt->csels[umc].b_cnt = 4;
- pvt->csels[umc].m_cnt = 2;
+ pvt->csels[umc].m_cnt = has_ddr5() ? 4 : 2;
}

} else {
@@ -1545,7 +1577,7 @@ static void read_umc_base_mask(struct amd64_pvt *pvt)
}

umc_mask_reg = get_umc_base(umc) + UMCCH_ADDR_MASK;
- umc_mask_reg_sec = get_umc_base(umc) + UMCCH_ADDR_MASK_SEC;
+ umc_mask_reg_sec = get_umc_base(umc) + get_addr_mask_sec();

for_each_chip_select_mask(cs, umc, pvt) {
mask = &pvt->csels[umc].csmasks[cs];
@@ -1628,6 +1660,17 @@ static void determine_memory_type(struct amd64_pvt *pvt)
dimm_cfg |= pvt->umc[i].dimm_cfg;
}

+ /* Check if system supports DDR5 and has DDR5 DIMMs in use. */
+ if (has_ddr5() && (umc_cfg & BIT(0))) {
+ if (dimm_cfg & BIT(5))
+ pvt->dram_type = MEM_LRDDR5;
+ else if (dimm_cfg & BIT(4))
+ pvt->dram_type = MEM_RDDR5;
+ else
+ pvt->dram_type = MEM_DDR5;
+ return;
+ }
+
if (dimm_cfg & BIT(5))
pvt->dram_type = MEM_LRDDR4;
else if (dimm_cfg & BIT(4))
@@ -2174,8 +2217,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
* There is one mask per DIMM, and two Chip Selects per DIMM.
* CS0 and CS1 -> DIMM0
* CS2 and CS3 -> DIMM1
+ *
+ * Systems with DDR5 support have one mask per Chip Select.
*/
- dimm = csrow_nr >> 1;
+ if (has_ddr5())
+ dimm = csrow_nr;
+ else
+ dimm = csrow_nr >> 1;

/* Asymmetric dual-rank DIMM support. */
if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
@@ -2937,6 +2985,7 @@ static struct amd64_family_type family_types[] = {
.f0_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F0,
.f6_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F6,
.max_mcs = 12,
+ .flags.has_ddr5 = 1,
.ops = {
.early_channel_count = f17_early_channel_count,
.dbam_to_cs = f17_addr_mask_to_cs_size,
@@ -3365,7 +3414,7 @@ static void __read_mc_regs_df(struct amd64_pvt *pvt)
umc_base = get_umc_base(i);
umc = &pvt->umc[i];

- amd_smn_read(nid, umc_base + UMCCH_DIMM_CFG, &umc->dimm_cfg);
+ amd_smn_read(nid, umc_base + get_dimm_cfg(), &umc->dimm_cfg);
amd_smn_read(nid, umc_base + UMCCH_UMC_CFG, &umc->umc_cfg);
amd_smn_read(nid, umc_base + UMCCH_SDP_CTRL, &umc->sdp_ctrl);
amd_smn_read(nid, umc_base + UMCCH_ECC_CTRL, &umc->ecc_ctrl);
diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
index 650cab401e21..48cba95451cb 100644
--- a/drivers/edac/amd64_edac.h
+++ b/drivers/edac/amd64_edac.h
@@ -271,8 +271,11 @@
#define UMCCH_BASE_ADDR_SEC 0x10
#define UMCCH_ADDR_MASK 0x20
#define UMCCH_ADDR_MASK_SEC 0x28
+#define UMCCH_ADDR_MASK_SEC_DDR5 0x30
#define UMCCH_ADDR_CFG 0x30
+#define UMCCH_ADDR_CFG_DDR5 0x40
#define UMCCH_DIMM_CFG 0x80
+#define UMCCH_DIMM_CFG_DDR5 0x90
#define UMCCH_UMC_CFG 0x100
#define UMCCH_SDP_CTRL 0x104
#define UMCCH_ECC_CTRL 0x14C
@@ -477,11 +480,19 @@ struct low_ops {
unsigned cs_mode, int cs_mask_nr);
};

+struct amd64_family_flags {
+ /* Indicates that the family supports DDR5 and associated register changes. */
+ __u64 has_ddr5 : 1,
+
+ __reserved : 63;
+};
+
struct amd64_family_type {
const char *ctl_name;
u16 f0_id, f1_id, f2_id, f6_id;
/* Maximum number of memory controllers per die/node. */
u8 max_mcs;
+ struct amd64_family_flags flags;
struct low_ops ops;
};

--
2.25.1


2021-12-10 12:34:26

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/4] EDAC/amd64: Check register values from all UMCs

On Wed, Dec 08, 2021 at 05:43:55PM +0000, Yazen Ghannam wrote:
> Loop over all UMCs and create bitmasks to check the values of the
> DIMM_CFG and UMC_CFG registers rather than just checking the values from
> the first two UMCs.

Do not talk about what your patch does in the commit message - that
should hopefully be visible in the diff itself. Rather, talk about *why*
you're doing what you're doing.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-12-10 12:41:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4/4] EDAC/amd64: Add DDR5 support and related register changes

On Wed, Dec 08, 2021 at 05:43:56PM +0000, Yazen Ghannam wrote:
> Future AMD systems will support DDR5.
>
> Add support for changes in register addresses for these systems.
>
> Introduce a "family flags" bitmask that can be used to indicate any
> special behavior needed on a per-family basis.
>
> Signed-off-by: Yazen Ghannam <[email protected]>
> ---
> drivers/edac/amd64_edac.c | 61 +++++++++++++++++++++++++++++++++++----
> drivers/edac/amd64_edac.h | 11 +++++++
> 2 files changed, 66 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 1df763128483..e37a8e0cef7e 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -15,6 +15,36 @@ static struct msr __percpu *msrs;
>
> static struct amd64_family_type *fam_type;
>
> +/* Family flag helpers */
> +static inline bool has_ddr5(void)
> +{
> + return fam_type->flags.has_ddr5;

A flag about ddr5 *and* a function of the same name. Kinda too much,
don't ya think?

> @@ -1628,6 +1660,17 @@ static void determine_memory_type(struct amd64_pvt *pvt)
> dimm_cfg |= pvt->umc[i].dimm_cfg;
> }
>
> + /* Check if system supports DDR5 and has DDR5 DIMMs in use. */
> + if (has_ddr5() && (umc_cfg & BIT(0))) {
> + if (dimm_cfg & BIT(5))
> + pvt->dram_type = MEM_LRDDR5;
> + else if (dimm_cfg & BIT(4))
> + pvt->dram_type = MEM_RDDR5;
> + else
> + pvt->dram_type = MEM_DDR5;
> + return;
> + }
> +
> if (dimm_cfg & BIT(5))
> pvt->dram_type = MEM_LRDDR4;
> else if (dimm_cfg & BIT(4))
> @@ -2174,8 +2217,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
> * There is one mask per DIMM, and two Chip Selects per DIMM.
> * CS0 and CS1 -> DIMM0
> * CS2 and CS3 -> DIMM1
> + *
> + * Systems with DDR5 support have one mask per Chip Select.
> */
> - dimm = csrow_nr >> 1;
> + if (has_ddr5())
> + dimm = csrow_nr;
> + else
> + dimm = csrow_nr >> 1;
>
> /* Asymmetric dual-rank DIMM support. */
> if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
> @@ -2937,6 +2985,7 @@ static struct amd64_family_type family_types[] = {
> .f0_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F0,
> .f6_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F6,
> .max_mcs = 12,
> + .flags.has_ddr5 = 1,

So judging by the name, this means that model 0x10 has DDR5. But I think
you wanna say whether it supports DDR5 or not?

Or does M10 support DDR5 only?

But it doesn't look like it from the comment above:

"Check if system supports DDR5 and has DDR5 DIMMs in use."

So why is this thing set statically only for this model instead of
detecting from the hw whether there are ddr5 or ddr5 DIMMs and what it
supports?

And then you can use the defines you just added in patch 1.

I'm confused.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-12-10 12:44:13

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 0/4] AMD Family 19h Models 10h-1Fh Updates

On Wed, Dec 08, 2021 at 05:43:52PM +0000, Yazen Ghannam wrote:
> Yazen Ghannam (4):
> EDAC: Add RDDR5 and LRDDR5 memory types
> EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh

First two applied, thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-12-13 17:23:15

by Yazen Ghannam

[permalink] [raw]
Subject: Re: [PATCH 0/4] AMD Family 19h Models 10h-1Fh Updates

On Fri, Dec 10, 2021 at 01:44:11PM +0100, Borislav Petkov wrote:
> On Wed, Dec 08, 2021 at 05:43:52PM +0000, Yazen Ghannam wrote:
> > Yazen Ghannam (4):
> > EDAC: Add RDDR5 and LRDDR5 memory types
> > EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh
>
> First two applied, thx.
>

Thank you!

-Yazen

2021-12-13 17:24:36

by Yazen Ghannam

[permalink] [raw]
Subject: Re: [PATCH 3/4] EDAC/amd64: Check register values from all UMCs

On Fri, Dec 10, 2021 at 01:34:19PM +0100, Borislav Petkov wrote:
> On Wed, Dec 08, 2021 at 05:43:55PM +0000, Yazen Ghannam wrote:
> > Loop over all UMCs and create bitmasks to check the values of the
> > DIMM_CFG and UMC_CFG registers rather than just checking the values from
> > the first two UMCs.
>
> Do not talk about what your patch does in the commit message - that
> should hopefully be visible in the diff itself. Rather, talk about *why*
> you're doing what you're doing.
>

Understood.

Thanks,
Yazen

2021-12-13 17:47:10

by Yazen Ghannam

[permalink] [raw]
Subject: Re: [PATCH 4/4] EDAC/amd64: Add DDR5 support and related register changes

On Fri, Dec 10, 2021 at 01:41:26PM +0100, Borislav Petkov wrote:
> On Wed, Dec 08, 2021 at 05:43:56PM +0000, Yazen Ghannam wrote:
> > Future AMD systems will support DDR5.
> >
> > Add support for changes in register addresses for these systems.
> >
> > Introduce a "family flags" bitmask that can be used to indicate any
> > special behavior needed on a per-family basis.
> >
> > Signed-off-by: Yazen Ghannam <[email protected]>
> > ---
> > drivers/edac/amd64_edac.c | 61 +++++++++++++++++++++++++++++++++++----
> > drivers/edac/amd64_edac.h | 11 +++++++
> > 2 files changed, 66 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> > index 1df763128483..e37a8e0cef7e 100644
> > --- a/drivers/edac/amd64_edac.c
> > +++ b/drivers/edac/amd64_edac.c
> > @@ -15,6 +15,36 @@ static struct msr __percpu *msrs;
> >
> > static struct amd64_family_type *fam_type;
> >
> > +/* Family flag helpers */
> > +static inline bool has_ddr5(void)
> > +{
> > + return fam_type->flags.has_ddr5;
>
> A flag about ddr5 *and* a function of the same name. Kinda too much,
> don't ya think?
>

Yeah, you're right. I didn't think about that. I think I'll drop this function
and just check the flag directly.

> > @@ -1628,6 +1660,17 @@ static void determine_memory_type(struct amd64_pvt *pvt)
> > dimm_cfg |= pvt->umc[i].dimm_cfg;
> > }
> >
> > + /* Check if system supports DDR5 and has DDR5 DIMMs in use. */
> > + if (has_ddr5() && (umc_cfg & BIT(0))) {
> > + if (dimm_cfg & BIT(5))
> > + pvt->dram_type = MEM_LRDDR5;
> > + else if (dimm_cfg & BIT(4))
> > + pvt->dram_type = MEM_RDDR5;
> > + else
> > + pvt->dram_type = MEM_DDR5;
> > + return;
> > + }
> > +
> > if (dimm_cfg & BIT(5))
> > pvt->dram_type = MEM_LRDDR4;
> > else if (dimm_cfg & BIT(4))
> > @@ -2174,8 +2217,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
> > * There is one mask per DIMM, and two Chip Selects per DIMM.
> > * CS0 and CS1 -> DIMM0
> > * CS2 and CS3 -> DIMM1
> > + *
> > + * Systems with DDR5 support have one mask per Chip Select.
> > */
> > - dimm = csrow_nr >> 1;
> > + if (has_ddr5())
> > + dimm = csrow_nr;
> > + else
> > + dimm = csrow_nr >> 1;
> >
> > /* Asymmetric dual-rank DIMM support. */
> > if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
> > @@ -2937,6 +2985,7 @@ static struct amd64_family_type family_types[] = {
> > .f0_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F0,
> > .f6_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F6,
> > .max_mcs = 12,
> > + .flags.has_ddr5 = 1,
>
> So judging by the name, this means that model 0x10 has DDR5. But I think
> you wanna say whether it supports DDR5 or not?
>
> Or does M10 support DDR5 only?
>
> But it doesn't look like it from the comment above:
>
> "Check if system supports DDR5 and has DDR5 DIMMs in use."
>
> So why is this thing set statically only for this model instead of
> detecting from the hw whether there are ddr5 or ddr5 DIMMs and what it
> supports?
>
> And then you can use the defines you just added in patch 1.
>
> I'm confused.
>

Yeah, sorry it's not clear. The purpose of the flag is to indicate some minor
changes that show up with future systems like register offsets changes, etc. I
didn't want to tie the name to a specific model or core name. I went with DDR5
as a new feature that shows up with these changes, but they're not directly
tied to DDR5.

But yes, a system may support DDR5 and DDR4. And this can be detected from the
hardware.

What do you think about calling the flag "uses_f19h_m10h_offsets" or something
like that? I was trying to avoid family/model in the name, but the code
already does this all over. And the convention has been to call something by
the first family/model where it shows up.

Thanks,
Yazen

2021-12-14 16:22:36

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4/4] EDAC/amd64: Add DDR5 support and related register changes

On Mon, Dec 13, 2021 at 05:46:55PM +0000, Yazen Ghannam wrote:
> Yeah, sorry it's not clear. The purpose of the flag is to indicate some minor
> changes that show up with future systems like register offsets changes, etc. I
> didn't want to tie the name to a specific model or core name. I went with DDR5
> as a new feature that shows up with these changes, but they're not directly
> tied to DDR5.
>
> But yes, a system may support DDR5 and DDR4. And this can be detected from the
> hardware.
>
> What do you think about calling the flag "uses_f19h_m10h_offsets" or something
> like that? I was trying to avoid family/model in the name, but the code
> already does this all over. And the convention has been to call something by
> the first family/model where it shows up.

Good question.

So AFAIU, these register offset changes are probably going to propagate
beyond F19M10... In any case, they won't be tied to the family/model
so your flag idea is in the right direction, AFAICT. I'd do something
shorter, though, so that the code accessing it is short'n'sweet:

if (pvt->flags.f19h_regs_ng) - "new generation" regs :-)

or even

if (pvt->flags.zn_new_regs_fmt)

or whatever that's called. The GPU UMC is called UMC_v2 so I guess

if (pvt->flags.zn_regs_v2)

:-)

You get the idea...

With an ample explanation in a comment what that means, ofc.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette