2022-08-25 04:49:35

by Manivannan Sadhasivam

Subject: [PATCH v3 0/5] Fix crash when using Qcom LLCC/EDAC drivers

Hello,

This series fixes the crash seen on the Qualcomm SM8450 chipset with the
LLCC/EDAC drivers. The problem was due to the Qcom EDAC driver using the
fixed LLCC register offsets for detecting the LLCC errors.

This seems to have worked for SoCs prior to SM8450. But on SM8450, the
LLCC register offsets were changed, so accessing the fixed offsets causes
a crash on this platform.

To fix this issue, and to make the driver work on future SoCs, let's
pass the LLCC offsets from the Qcom LLCC driver based on the individual
SoCs and let the EDAC driver make use of them.
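
As a rough illustration of the approach described above (a userspace
sketch, not code from this series), the idea can be modelled as a per-SoC
config that points at one of several offset tables. The struct and function
names below are simplified, hypothetical stand-ins; only the four register
values are taken from patch 2/5.

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed-down stand-in for struct llcc_edac_reg_offset (two fields only). */
struct edac_reg_offset {
	uint32_t cmn_interrupt_2_enable;
	uint32_t drp_interrupt_enable;
};

/* Offsets starting from LLCC v1.0.0 (values as listed in patch 2/5). */
static const struct edac_reg_offset llcc_v1_edac_reg_offset = {
	.cmn_interrupt_2_enable = 0x3003c,
	.drp_interrupt_enable = 0x4100c,
};

/* Offsets starting from LLCC v2.1.0, i.e. SM8450 (values from patch 2/5). */
static const struct edac_reg_offset llcc_v2_1_edac_reg_offset = {
	.cmn_interrupt_2_enable = 0x3403c,
	.drp_interrupt_enable = 0x5002c,
};

/* Hypothetical per-SoC config, standing in for struct qcom_llcc_config. */
struct soc_cfg {
	const char *name;
	const struct edac_reg_offset *edac_reg_offset;
};

static const struct soc_cfg sdm845_cfg = { "sdm845", &llcc_v1_edac_reg_offset };
static const struct soc_cfg sm8450_cfg = { "sm8450", &llcc_v2_1_edac_reg_offset };

/* Consumer side: look the offset up through the config, never hardcode it. */
static uint32_t drp_interrupt_enable_reg(const struct soc_cfg *cfg)
{
	assert(cfg && cfg->edac_reg_offset);
	return cfg->edac_reg_offset->drp_interrupt_enable;
}

/* Usage: the same lookup yields a different offset on each SoC. */
static void demo(void)
{
	assert(drp_interrupt_enable_reg(&sdm845_cfg) == 0x4100c);
	assert(drp_interrupt_enable_reg(&sm8450_cfg) == 0x5002c);
}
```

Supporting a future SoC would then mean adding one more table and pointing
its config at it, with no change on the consumer side.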

This series has been tested on an SM8450-based dev board.

Thanks,
Mani

Changes in v3:

* Used an LLCC version based naming convention for the register offsets
instead of a SoC specific one, as suggested by Sai
* Fixed the existing reg_offset naming convention to clearly represent
the LLCC version from which the offsets were changed
* Added Sai's Acked-by to MAINTAINERS patch
* Added a new patch that removes an extra error no assignment

Changes in v2:

* Volunteered myself as a maintainer for the EDAC driver since the current
maintainers have left Qualcomm and I couldn't get hold of them.

Manivannan Sadhasivam (5):
soc: qcom: llcc: Rename reg_offset structs to reflect LLCC version
soc: qcom: llcc: Pass LLCC version based register offsets to EDAC
driver
EDAC/qcom: Get rid of hardcoded register offsets
EDAC/qcom: Remove extra error no assignment in qcom_llcc_core_setup()
MAINTAINERS: Add myself as the maintainer for qcom_edac driver

MAINTAINERS | 3 +-
drivers/edac/qcom_edac.c | 119 ++++++++++++++---------------
drivers/soc/qcom/llcc-qcom.c | 92 +++++++++++++++++++---
include/linux/soc/qcom/llcc-qcom.h | 36 +++++++--
4 files changed, 170 insertions(+), 80 deletions(-)

--
2.25.1


2022-08-25 04:51:24

by Manivannan Sadhasivam

Subject: [PATCH v3 4/5] EDAC/qcom: Remove extra error no assignment in qcom_llcc_core_setup()

If the ret variable is initialized with -EINVAL, then there is no need to
assign it again in the default case of qcom_llcc_core_setup().
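
The pattern described above can be sketched in isolation as follows. This
is a minimal userspace illustration, not the driver code: the error types
and the fake_write() helper are hypothetical stand-ins for the LLCC error
types and the regmap writes.

```c
#include <assert.h>
#include <errno.h>

enum { ERR_DRAM, ERR_TRAM };	/* hypothetical error types */

/* Hypothetical stand-in for a register write that succeeds. */
static int fake_write(void)
{
	return 0;
}

static int clear_error_status(int err_type)
{
	int ret = -EINVAL;	/* covers every path that never overwrites it */

	switch (err_type) {
	case ERR_DRAM:
	case ERR_TRAM:
		ret = fake_write();
		break;
	default:
		/* no assignment needed: ret still holds -EINVAL here */
		break;
	}

	return ret;
}

/* Usage: a known type returns the write result, an unknown one -EINVAL. */
static void demo(void)
{
	assert(clear_error_status(ERR_DRAM) == 0);
	assert(clear_error_status(99) == -EINVAL);
}
```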

Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/edac/qcom_edac.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
index 04df70b7fea3..0b6ca1f20b51 100644
--- a/drivers/edac/qcom_edac.c
+++ b/drivers/edac/qcom_edac.c
@@ -126,7 +126,7 @@ static int qcom_llcc_core_setup(struct llcc_drv_data *drv, struct regmap *llcc_b
static int
qcom_llcc_clear_error_status(int err_type, struct llcc_drv_data *drv)
{
- int ret = 0;
+ int ret = -EINVAL;

switch (err_type) {
case LLCC_DRAM_CE:
@@ -158,7 +158,6 @@ qcom_llcc_clear_error_status(int err_type, struct llcc_drv_data *drv)
return ret;
break;
default:
- ret = -EINVAL;
edac_printk(KERN_CRIT, EDAC_LLCC, "Unexpected error type: %d\n",
err_type);
}
--
2.25.1

2022-08-25 04:56:16

by Manivannan Sadhasivam

Subject: [PATCH v3 3/5] EDAC/qcom: Get rid of hardcoded register offsets

The LLCC EDAC register offsets vary between SoCs. Hardcoding the
register offsets won't work and will often result in a crash due to
accessing the wrong locations.

Hence, get the register offsets from the LLCC driver matching the
individual SoCs.
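
To make the effect concrete, here is a small userspace sketch (not part of
the patch) of how a per-bank syndrome register address is formed once the
offset comes from a table: bank base + first syndrome register offset +
i * 4. The two drp_ecc_sb_err_syn0 values are those listed in patch 2/5;
the bank base addresses and the trimmed-down struct are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define DRP_SYN_REG_CNT	8	/* as defined in qcom_edac.c */

/* Only the field needed here; mirrors drp_ecc_sb_err_syn0 from patch 2/5. */
struct edac_reg_offset {
	uint32_t drp_ecc_sb_err_syn0;
};

static const struct edac_reg_offset llcc_v1 = { .drp_ecc_sb_err_syn0 = 0x4204c };
static const struct edac_reg_offset llcc_v2_1 = { .drp_ecc_sb_err_syn0 = 0x520fc };

/* Address of the i-th DRP single-bit syndrome register within one bank. */
static uint32_t synd_reg_addr(const struct edac_reg_offset *off,
			      uint32_t bank_base, unsigned int i)
{
	assert(i < DRP_SYN_REG_CNT);
	return bank_base + off->drp_ecc_sb_err_syn0 + i * 4;
}

/* Usage: same code path, different tables (bank bases are made up). */
static void demo(void)
{
	assert(synd_reg_addr(&llcc_v1, 0x0, 0) == 0x4204c);
	assert(synd_reg_addr(&llcc_v2_1, 0x80000, 7) == 0xd2118);
}
```

With a hardcoded v1 offset, the second lookup would instead land on
0x80000 + 0x4204c + 28, which is not where the v2.1 table places that
register.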

Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/edac/qcom_edac.c | 116 ++++++++++++++---------------
include/linux/soc/qcom/llcc-qcom.h | 6 --
2 files changed, 58 insertions(+), 64 deletions(-)

diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
index 97a27e42dd61..04df70b7fea3 100644
--- a/drivers/edac/qcom_edac.c
+++ b/drivers/edac/qcom_edac.c
@@ -21,30 +21,9 @@
#define TRP_SYN_REG_CNT 6
#define DRP_SYN_REG_CNT 8

-#define LLCC_COMMON_STATUS0 0x0003000c
#define LLCC_LB_CNT_MASK GENMASK(31, 28)
#define LLCC_LB_CNT_SHIFT 28

-/* Single & double bit syndrome register offsets */
-#define TRP_ECC_SB_ERR_SYN0 0x0002304c
-#define TRP_ECC_DB_ERR_SYN0 0x00020370
-#define DRP_ECC_SB_ERR_SYN0 0x0004204c
-#define DRP_ECC_DB_ERR_SYN0 0x00042070
-
-/* Error register offsets */
-#define TRP_ECC_ERROR_STATUS1 0x00020348
-#define TRP_ECC_ERROR_STATUS0 0x00020344
-#define DRP_ECC_ERROR_STATUS1 0x00042048
-#define DRP_ECC_ERROR_STATUS0 0x00042044
-
-/* TRP, DRP interrupt register offsets */
-#define DRP_INTERRUPT_STATUS 0x00041000
-#define TRP_INTERRUPT_0_STATUS 0x00020480
-#define DRP_INTERRUPT_CLEAR 0x00041008
-#define DRP_ECC_ERROR_CNTR_CLEAR 0x00040004
-#define TRP_INTERRUPT_0_CLEAR 0x00020484
-#define TRP_ECC_ERROR_CNTR_CLEAR 0x00020440
-
/* Mask and shift macros */
#define ECC_DB_ERR_COUNT_MASK GENMASK(4, 0)
#define ECC_DB_ERR_WAYS_MASK GENMASK(31, 16)
@@ -60,15 +39,6 @@
#define DRP_TRP_INT_CLEAR GENMASK(1, 0)
#define DRP_TRP_CNT_CLEAR GENMASK(1, 0)

-/* Config registers offsets*/
-#define DRP_ECC_ERROR_CFG 0x00040000
-
-/* Tag RAM, Data RAM interrupt register offsets */
-#define CMN_INTERRUPT_0_ENABLE 0x0003001c
-#define CMN_INTERRUPT_2_ENABLE 0x0003003c
-#define TRP_INTERRUPT_0_ENABLE 0x00020488
-#define DRP_INTERRUPT_ENABLE 0x0004100c
-
#define SB_ERROR_THRESHOLD 0x1
#define SB_ERROR_THRESHOLD_SHIFT 24
#define SB_DB_TRP_INTERRUPT_ENABLE 0x3
@@ -86,9 +56,6 @@ enum {
static const struct llcc_edac_reg_data edac_reg_data[] = {
[LLCC_DRAM_CE] = {
.name = "DRAM Single-bit",
- .synd_reg = DRP_ECC_SB_ERR_SYN0,
- .count_status_reg = DRP_ECC_ERROR_STATUS1,
- .ways_status_reg = DRP_ECC_ERROR_STATUS0,
.reg_cnt = DRP_SYN_REG_CNT,
.count_mask = ECC_SB_ERR_COUNT_MASK,
.ways_mask = ECC_SB_ERR_WAYS_MASK,
@@ -96,9 +63,6 @@ static const struct llcc_edac_reg_data edac_reg_data[] = {
},
[LLCC_DRAM_UE] = {
.name = "DRAM Double-bit",
- .synd_reg = DRP_ECC_DB_ERR_SYN0,
- .count_status_reg = DRP_ECC_ERROR_STATUS1,
- .ways_status_reg = DRP_ECC_ERROR_STATUS0,
.reg_cnt = DRP_SYN_REG_CNT,
.count_mask = ECC_DB_ERR_COUNT_MASK,
.ways_mask = ECC_DB_ERR_WAYS_MASK,
@@ -106,9 +70,6 @@ static const struct llcc_edac_reg_data edac_reg_data[] = {
},
[LLCC_TRAM_CE] = {
.name = "TRAM Single-bit",
- .synd_reg = TRP_ECC_SB_ERR_SYN0,
- .count_status_reg = TRP_ECC_ERROR_STATUS1,
- .ways_status_reg = TRP_ECC_ERROR_STATUS0,
.reg_cnt = TRP_SYN_REG_CNT,
.count_mask = ECC_SB_ERR_COUNT_MASK,
.ways_mask = ECC_SB_ERR_WAYS_MASK,
@@ -116,9 +77,6 @@ static const struct llcc_edac_reg_data edac_reg_data[] = {
},
[LLCC_TRAM_UE] = {
.name = "TRAM Double-bit",
- .synd_reg = TRP_ECC_DB_ERR_SYN0,
- .count_status_reg = TRP_ECC_ERROR_STATUS1,
- .ways_status_reg = TRP_ECC_ERROR_STATUS0,
.reg_cnt = TRP_SYN_REG_CNT,
.count_mask = ECC_DB_ERR_COUNT_MASK,
.ways_mask = ECC_DB_ERR_WAYS_MASK,
@@ -126,7 +84,7 @@ static const struct llcc_edac_reg_data edac_reg_data[] = {
},
};

-static int qcom_llcc_core_setup(struct regmap *llcc_bcast_regmap)
+static int qcom_llcc_core_setup(struct llcc_drv_data *drv, struct regmap *llcc_bcast_regmap)
{
u32 sb_err_threshold;
int ret;
@@ -135,31 +93,31 @@ static int qcom_llcc_core_setup(struct regmap *llcc_bcast_regmap)
* Configure interrupt enable registers such that Tag, Data RAM related
* interrupts are propagated to interrupt controller for servicing
*/
- ret = regmap_update_bits(llcc_bcast_regmap, CMN_INTERRUPT_2_ENABLE,
+ ret = regmap_update_bits(llcc_bcast_regmap, drv->edac_reg_offset->cmn_interrupt_2_enable,
TRP0_INTERRUPT_ENABLE,
TRP0_INTERRUPT_ENABLE);
if (ret)
return ret;

- ret = regmap_update_bits(llcc_bcast_regmap, TRP_INTERRUPT_0_ENABLE,
+ ret = regmap_update_bits(llcc_bcast_regmap, drv->edac_reg_offset->trp_interrupt_0_enable,
SB_DB_TRP_INTERRUPT_ENABLE,
SB_DB_TRP_INTERRUPT_ENABLE);
if (ret)
return ret;

sb_err_threshold = (SB_ERROR_THRESHOLD << SB_ERROR_THRESHOLD_SHIFT);
- ret = regmap_write(llcc_bcast_regmap, DRP_ECC_ERROR_CFG,
+ ret = regmap_write(llcc_bcast_regmap, drv->edac_reg_offset->drp_ecc_error_cfg,
sb_err_threshold);
if (ret)
return ret;

- ret = regmap_update_bits(llcc_bcast_regmap, CMN_INTERRUPT_2_ENABLE,
+ ret = regmap_update_bits(llcc_bcast_regmap, drv->edac_reg_offset->cmn_interrupt_2_enable,
DRP0_INTERRUPT_ENABLE,
DRP0_INTERRUPT_ENABLE);
if (ret)
return ret;

- ret = regmap_write(llcc_bcast_regmap, DRP_INTERRUPT_ENABLE,
+ ret = regmap_write(llcc_bcast_regmap, drv->edac_reg_offset->drp_interrupt_enable,
SB_DB_DRP_INTERRUPT_ENABLE);
return ret;
}
@@ -173,24 +131,28 @@ qcom_llcc_clear_error_status(int err_type, struct llcc_drv_data *drv)
switch (err_type) {
case LLCC_DRAM_CE:
case LLCC_DRAM_UE:
- ret = regmap_write(drv->bcast_regmap, DRP_INTERRUPT_CLEAR,
+ ret = regmap_write(drv->bcast_regmap,
+ drv->edac_reg_offset->drp_interrupt_clear,
DRP_TRP_INT_CLEAR);
if (ret)
return ret;

- ret = regmap_write(drv->bcast_regmap, DRP_ECC_ERROR_CNTR_CLEAR,
+ ret = regmap_write(drv->bcast_regmap,
+ drv->edac_reg_offset->drp_ecc_error_cntr_clear,
DRP_TRP_CNT_CLEAR);
if (ret)
return ret;
break;
case LLCC_TRAM_CE:
case LLCC_TRAM_UE:
- ret = regmap_write(drv->bcast_regmap, TRP_INTERRUPT_0_CLEAR,
+ ret = regmap_write(drv->bcast_regmap,
+ drv->edac_reg_offset->trp_interrupt_0_clear,
DRP_TRP_INT_CLEAR);
if (ret)
return ret;

- ret = regmap_write(drv->bcast_regmap, TRP_ECC_ERROR_CNTR_CLEAR,
+ ret = regmap_write(drv->bcast_regmap,
+ drv->edac_reg_offset->trp_ecc_error_cntr_clear,
DRP_TRP_CNT_CLEAR);
if (ret)
return ret;
@@ -203,16 +165,54 @@ qcom_llcc_clear_error_status(int err_type, struct llcc_drv_data *drv)
return ret;
}

+struct qcom_llcc_syn_regs {
+ u32 synd_reg;
+ u32 count_status_reg;
+ u32 ways_status_reg;
+};
+
+static void get_reg_offsets(struct llcc_drv_data *drv, int err_type,
+ struct qcom_llcc_syn_regs *syn_regs)
+{
+ const struct llcc_edac_reg_offset *edac_reg_offset = drv->edac_reg_offset;
+
+ switch (err_type) {
+ case LLCC_DRAM_CE:
+ syn_regs->synd_reg = edac_reg_offset->drp_ecc_sb_err_syn0;
+ syn_regs->count_status_reg = edac_reg_offset->drp_ecc_error_status1;
+ syn_regs->ways_status_reg = edac_reg_offset->drp_ecc_error_status0;
+ break;
+ case LLCC_DRAM_UE:
+ syn_regs->synd_reg = edac_reg_offset->drp_ecc_db_err_syn0;
+ syn_regs->count_status_reg = edac_reg_offset->drp_ecc_error_status1;
+ syn_regs->ways_status_reg = edac_reg_offset->drp_ecc_error_status0;
+ break;
+ case LLCC_TRAM_CE:
+ syn_regs->synd_reg = edac_reg_offset->trp_ecc_sb_err_syn0;
+ syn_regs->count_status_reg = edac_reg_offset->trp_ecc_error_status1;
+ syn_regs->ways_status_reg = edac_reg_offset->trp_ecc_error_status0;
+ break;
+ case LLCC_TRAM_UE:
+ syn_regs->synd_reg = edac_reg_offset->trp_ecc_db_err_syn0;
+ syn_regs->count_status_reg = edac_reg_offset->trp_ecc_error_status1;
+ syn_regs->ways_status_reg = edac_reg_offset->trp_ecc_error_status0;
+ break;
+ }
+}
+
/* Dump Syndrome registers data for Tag RAM, Data RAM bit errors*/
static int
dump_syn_reg_values(struct llcc_drv_data *drv, u32 bank, int err_type)
{
struct llcc_edac_reg_data reg_data = edac_reg_data[err_type];
+ struct qcom_llcc_syn_regs regs = { };
int err_cnt, err_ways, ret, i;
u32 synd_reg, synd_val;

+ get_reg_offsets(drv, err_type, &regs);
+
for (i = 0; i < reg_data.reg_cnt; i++) {
- synd_reg = reg_data.synd_reg + (i * 4);
+ synd_reg = regs.synd_reg + (i * 4);
ret = regmap_read(drv->regmap, drv->offsets[bank] + synd_reg,
&synd_val);
if (ret)
@@ -223,7 +223,7 @@ dump_syn_reg_values(struct llcc_drv_data *drv, u32 bank, int err_type)
}

ret = regmap_read(drv->regmap,
- drv->offsets[bank] + reg_data.count_status_reg,
+ drv->offsets[bank] + regs.count_status_reg,
&err_cnt);
if (ret)
goto clear;
@@ -234,7 +234,7 @@ dump_syn_reg_values(struct llcc_drv_data *drv, u32 bank, int err_type)
reg_data.name, err_cnt);

ret = regmap_read(drv->regmap,
- drv->offsets[bank] + reg_data.ways_status_reg,
+ drv->offsets[bank] + regs.ways_status_reg,
&err_ways);
if (ret)
goto clear;
@@ -297,7 +297,7 @@ llcc_ecc_irq_handler(int irq, void *edev_ctl)
/* Iterate over the banks and look for Tag RAM or Data RAM errors */
for (i = 0; i < drv->num_banks; i++) {
ret = regmap_read(drv->regmap,
- drv->offsets[i] + DRP_INTERRUPT_STATUS,
+ drv->offsets[i] + drv->edac_reg_offset->drp_interrupt_status,
&drp_error);

if (!ret && (drp_error & SB_ECC_ERROR)) {
@@ -313,7 +313,7 @@ llcc_ecc_irq_handler(int irq, void *edev_ctl)
irq_rc = IRQ_HANDLED;

ret = regmap_read(drv->regmap,
- drv->offsets[i] + TRP_INTERRUPT_0_STATUS,
+ drv->offsets[i] + drv->edac_reg_offset->trp_interrupt_0_status,
&trp_error);

if (!ret && (trp_error & SB_ECC_ERROR)) {
@@ -340,7 +340,7 @@ static int qcom_llcc_edac_probe(struct platform_device *pdev)
int ecc_irq;
int rc;

- rc = qcom_llcc_core_setup(llcc_driv_data->bcast_regmap);
+ rc = qcom_llcc_core_setup(llcc_driv_data, llcc_driv_data->bcast_regmap);
if (rc)
return rc;

diff --git a/include/linux/soc/qcom/llcc-qcom.h b/include/linux/soc/qcom/llcc-qcom.h
index bc2fb8343a94..d5b2d58e8857 100644
--- a/include/linux/soc/qcom/llcc-qcom.h
+++ b/include/linux/soc/qcom/llcc-qcom.h
@@ -57,9 +57,6 @@ struct llcc_slice_desc {
/**
* struct llcc_edac_reg_data - llcc edac registers data for each error type
* @name: Name of the error
- * @synd_reg: Syndrome register address
- * @count_status_reg: Status register address to read the error count
- * @ways_status_reg: Status register address to read the error ways
* @reg_cnt: Number of registers
* @count_mask: Mask value to get the error count
* @ways_mask: Mask value to get the error ways
@@ -68,9 +65,6 @@ struct llcc_slice_desc {
*/
struct llcc_edac_reg_data {
char *name;
- u64 synd_reg;
- u64 count_status_reg;
- u64 ways_status_reg;
u32 reg_cnt;
u32 count_mask;
u32 ways_mask;
--
2.25.1

2022-08-25 04:57:54

by Manivannan Sadhasivam

Subject: [PATCH v3 2/5] soc: qcom: llcc: Pass LLCC version based register offsets to EDAC driver

The LLCC EDAC register offsets vary between SoCs. Until now, the
EDAC driver used hardcoded register offsets. But this caused a crash
on the SM8450 SoC, where the register offsets have been changed.

So to avoid this crash and also to make it easy to accommodate changes for
new SoCs, let's pass the LLCC version specific register offsets to the
EDAC driver.

Currently, two sets of offsets are used. The first, starting from LLCC
version v1.0.0, is used by all SoCs other than SM8450. The second,
starting from LLCC version v2.1.0, is used by SM8450.

Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
drivers/soc/qcom/llcc-qcom.c | 66 ++++++++++++++++++++++++++++++
include/linux/soc/qcom/llcc-qcom.h | 30 ++++++++++++++
2 files changed, 96 insertions(+)

diff --git a/drivers/soc/qcom/llcc-qcom.c b/drivers/soc/qcom/llcc-qcom.c
index 0dc2bb0c23cc..8b7e8118f3ce 100644
--- a/drivers/soc/qcom/llcc-qcom.c
+++ b/drivers/soc/qcom/llcc-qcom.c
@@ -104,6 +104,7 @@ struct qcom_llcc_config {
int size;
bool need_llcc_cfg;
const u32 *reg_offset;
+ const struct llcc_edac_reg_offset *edac_reg_offset;
};

enum llcc_reg_offset {
@@ -296,6 +297,60 @@ static const struct llcc_slice_config sm8450_data[] = {
{LLCC_AENPU, 8, 2048, 1, 1, 0xFFFF, 0x0, 0, 0, 0, 0, 0, 0, 0 },
};

+static const struct llcc_edac_reg_offset llcc_v1_edac_reg_offset = {
+ .trp_ecc_error_status0 = 0x20344,
+ .trp_ecc_error_status1 = 0x20348,
+ .trp_ecc_sb_err_syn0 = 0x2304c,
+ .trp_ecc_db_err_syn0 = 0x20370,
+ .trp_ecc_error_cntr_clear = 0x20440,
+ .trp_interrupt_0_status = 0x20480,
+ .trp_interrupt_0_clear = 0x20484,
+ .trp_interrupt_0_enable = 0x20488,
+
+ /* LLCC Common registers */
+ .cmn_status0 = 0x3000c,
+ .cmn_interrupt_0_enable = 0x3001c,
+ .cmn_interrupt_2_enable = 0x3003c,
+
+ /* LLCC DRP registers */
+ .drp_ecc_error_cfg = 0x40000,
+ .drp_ecc_error_cntr_clear = 0x40004,
+ .drp_interrupt_status = 0x41000,
+ .drp_interrupt_clear = 0x41008,
+ .drp_interrupt_enable = 0x4100c,
+ .drp_ecc_error_status0 = 0x42044,
+ .drp_ecc_error_status1 = 0x42048,
+ .drp_ecc_sb_err_syn0 = 0x4204c,
+ .drp_ecc_db_err_syn0 = 0x42070,
+};
+
+static const struct llcc_edac_reg_offset llcc_v2_1_edac_reg_offset = {
+ .trp_ecc_error_status0 = 0x20344,
+ .trp_ecc_error_status1 = 0x20348,
+ .trp_ecc_sb_err_syn0 = 0x2034c,
+ .trp_ecc_db_err_syn0 = 0x20370,
+ .trp_ecc_error_cntr_clear = 0x20440,
+ .trp_interrupt_0_status = 0x20480,
+ .trp_interrupt_0_clear = 0x20484,
+ .trp_interrupt_0_enable = 0x20488,
+
+ /* LLCC Common registers */
+ .cmn_status0 = 0x3400c,
+ .cmn_interrupt_0_enable = 0x3401c,
+ .cmn_interrupt_2_enable = 0x3403c,
+
+ /* LLCC DRP registers */
+ .drp_ecc_error_cfg = 0x50000,
+ .drp_ecc_error_cntr_clear = 0x50004,
+ .drp_interrupt_status = 0x50020,
+ .drp_interrupt_clear = 0x50028,
+ .drp_interrupt_enable = 0x5002c,
+ .drp_ecc_error_status0 = 0x520f4,
+ .drp_ecc_error_status1 = 0x520f8,
+ .drp_ecc_sb_err_syn0 = 0x520fc,
+ .drp_ecc_db_err_syn0 = 0x52120,
+};
+
/* LLCC register offset starting from v1.0.0 */
static const u32 llcc_v1_reg_offset[] = {
[LLCC_COMMON_HW_INFO] = 0x00030000,
@@ -313,6 +368,7 @@ static const struct qcom_llcc_config sc7180_cfg = {
.size = ARRAY_SIZE(sc7180_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sc7280_cfg = {
@@ -320,6 +376,7 @@ static const struct qcom_llcc_config sc7280_cfg = {
.size = ARRAY_SIZE(sc7280_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sc8180x_cfg = {
@@ -327,6 +384,7 @@ static const struct qcom_llcc_config sc8180x_cfg = {
.size = ARRAY_SIZE(sc8180x_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sc8280xp_cfg = {
@@ -334,6 +392,7 @@ static const struct qcom_llcc_config sc8280xp_cfg = {
.size = ARRAY_SIZE(sc8280xp_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sdm845_cfg = {
@@ -341,6 +400,7 @@ static const struct qcom_llcc_config sdm845_cfg = {
.size = ARRAY_SIZE(sdm845_data),
.need_llcc_cfg = false,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sm6350_cfg = {
@@ -348,6 +408,7 @@ static const struct qcom_llcc_config sm6350_cfg = {
.size = ARRAY_SIZE(sm6350_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sm8150_cfg = {
@@ -355,6 +416,7 @@ static const struct qcom_llcc_config sm8150_cfg = {
.size = ARRAY_SIZE(sm8150_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sm8250_cfg = {
@@ -362,6 +424,7 @@ static const struct qcom_llcc_config sm8250_cfg = {
.size = ARRAY_SIZE(sm8250_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sm8350_cfg = {
@@ -369,6 +432,7 @@ static const struct qcom_llcc_config sm8350_cfg = {
.size = ARRAY_SIZE(sm8350_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v1_reg_offset,
+ .edac_reg_offset = &llcc_v1_edac_reg_offset,
};

static const struct qcom_llcc_config sm8450_cfg = {
@@ -376,6 +440,7 @@ static const struct qcom_llcc_config sm8450_cfg = {
.size = ARRAY_SIZE(sm8450_data),
.need_llcc_cfg = true,
.reg_offset = llcc_v2_1_reg_offset,
+ .edac_reg_offset = &llcc_v2_1_edac_reg_offset,
};

static struct llcc_drv_data *drv_data = (void *) -EPROBE_DEFER;
@@ -776,6 +841,7 @@ static int qcom_llcc_probe(struct platform_device *pdev)

drv_data->cfg = llcc_cfg;
drv_data->cfg_size = sz;
+ drv_data->edac_reg_offset = cfg->edac_reg_offset;
mutex_init(&drv_data->lock);
platform_set_drvdata(pdev, drv_data);

diff --git a/include/linux/soc/qcom/llcc-qcom.h b/include/linux/soc/qcom/llcc-qcom.h
index 9ed5384c5ca1..bc2fb8343a94 100644
--- a/include/linux/soc/qcom/llcc-qcom.h
+++ b/include/linux/soc/qcom/llcc-qcom.h
@@ -78,11 +78,40 @@ struct llcc_edac_reg_data {
u8 ways_shift;
};

+struct llcc_edac_reg_offset {
+ /* LLCC TRP registers */
+ u32 trp_ecc_error_status0;
+ u32 trp_ecc_error_status1;
+ u32 trp_ecc_sb_err_syn0;
+ u32 trp_ecc_db_err_syn0;
+ u32 trp_ecc_error_cntr_clear;
+ u32 trp_interrupt_0_status;
+ u32 trp_interrupt_0_clear;
+ u32 trp_interrupt_0_enable;
+
+ /* LLCC Common registers */
+ u32 cmn_status0;
+ u32 cmn_interrupt_0_enable;
+ u32 cmn_interrupt_2_enable;
+
+ /* LLCC DRP registers */
+ u32 drp_ecc_error_cfg;
+ u32 drp_ecc_error_cntr_clear;
+ u32 drp_interrupt_status;
+ u32 drp_interrupt_clear;
+ u32 drp_interrupt_enable;
+ u32 drp_ecc_error_status0;
+ u32 drp_ecc_error_status1;
+ u32 drp_ecc_sb_err_syn0;
+ u32 drp_ecc_db_err_syn0;
+};
+
/**
* struct llcc_drv_data - Data associated with the llcc driver
* @regmap: regmap associated with the llcc device
* @bcast_regmap: regmap associated with llcc broadcast offset
* @cfg: pointer to the data structure for slice configuration
+ * @edac_reg_offset: Offset of the LLCC EDAC registers
* @lock: mutex associated with each slice
* @cfg_size: size of the config data table
* @max_slices: max slices as read from device tree
@@ -96,6 +125,7 @@ struct llcc_drv_data {
struct regmap *regmap;
struct regmap *bcast_regmap;
const struct llcc_slice_config *cfg;
+ const struct llcc_edac_reg_offset *edac_reg_offset;
struct mutex lock;
u32 cfg_size;
u32 max_slices;
--
2.25.1

2022-08-25 05:08:04

by Sai Prakash Ranjan

Subject: Re: [PATCH v3 3/5] EDAC/qcom: Get rid of hardcoded register offsets

On 8/25/2022 10:08 AM, Manivannan Sadhasivam wrote:
> The LLCC EDAC register offsets vary between SoCs. Hardcoding the
> register offsets won't work and will often result in a crash due to
> accessing the wrong locations.
>
> Hence, get the register offsets from the LLCC driver matching the
> individual SoCs.
>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/edac/qcom_edac.c | 116 ++++++++++++++---------------
> include/linux/soc/qcom/llcc-qcom.h | 6 --
> 2 files changed, 58 insertions(+), 64 deletions(-)

Reviewed-by: Sai Prakash Ranjan <[email protected]>


Thanks,
Sai


2022-08-25 05:11:30

by Manivannan Sadhasivam

Subject: [PATCH v3 5/5] MAINTAINERS: Add myself as the maintainer for qcom_edac driver

The current maintainers have left Qualcomm and their email addresses are
bouncing. Since I couldn't get hold of them, I'm volunteering to maintain
this driver.

Acked-by: Sai Prakash Ranjan <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
MAINTAINERS | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8a5012ba6ff9..026dd33b106c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7432,8 +7432,7 @@ S: Maintained
F: drivers/edac/pnd2_edac.[ch]

EDAC-QCOM
-M: Channagoud Kadabi <[email protected]>
-M: Venkata Narendra Kumar Gutta <[email protected]>
+M: Manivannan Sadhasivam <[email protected]>
L: [email protected]
L: [email protected]
S: Maintained
--
2.25.1

2022-08-25 05:21:50

by Sai Prakash Ranjan

Subject: Re: [PATCH v3 4/5] EDAC/qcom: Remove extra error no assignment in qcom_llcc_core_setup()

On 8/25/2022 10:08 AM, Manivannan Sadhasivam wrote:
> If the ret variable is initialized with -EINVAL, then there is no need to
> assign it again in the default case of qcom_llcc_core_setup().

Nit: I think you meant in qcom_llcc_clear_error_status().

>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/edac/qcom_edac.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)

Reviewed-by: Sai Prakash Ranjan <[email protected]>

> diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
> index 04df70b7fea3..0b6ca1f20b51 100644
> --- a/drivers/edac/qcom_edac.c
> +++ b/drivers/edac/qcom_edac.c
> @@ -126,7 +126,7 @@ static int qcom_llcc_core_setup(struct llcc_drv_data *drv, struct regmap *llcc_b
> static int
> qcom_llcc_clear_error_status(int err_type, struct llcc_drv_data *drv)
> {
> - int ret = 0;
> + int ret = -EINVAL;
>
> switch (err_type) {
> case LLCC_DRAM_CE:
> @@ -158,7 +158,6 @@ qcom_llcc_clear_error_status(int err_type, struct llcc_drv_data *drv)
> return ret;
> break;
> default:
> - ret = -EINVAL;
> edac_printk(KERN_CRIT, EDAC_LLCC, "Unexpected error type: %d\n",
> err_type);
> }
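
For readers following along, here is a minimal userspace sketch of the pattern this patch relies on, not the driver code itself. The enum names follow qcom_edac.c, but `clear_error_status()` and `fake_clear()` are hypothetical stand-ins for the real function and its regmap writes. With `ret` initialized to -EINVAL, the default case no longer needs its own assignment.

```c
#include <errno.h>

/* Error types, named after the enum in qcom_edac.c. */
enum {
	LLCC_DRAM_CE,
	LLCC_DRAM_UE,
	LLCC_TRAM_CE,
	LLCC_TRAM_UE,
};

/* Hypothetical stand-in for the regmap writes; always succeeds here. */
static int fake_clear(int err_type)
{
	(void)err_type;
	return 0;
}

static int clear_error_status(int err_type)
{
	int ret = -EINVAL;	/* result for an unknown error type */

	switch (err_type) {
	case LLCC_DRAM_CE:
	case LLCC_DRAM_UE:
	case LLCC_TRAM_CE:
	case LLCC_TRAM_UE:
		ret = fake_clear(err_type);
		break;
	default:
		/* No assignment needed: ret is still -EINVAL. */
		break;
	}

	return ret;
}
```

Every known error type overwrites `ret` with the result of the clear operation, so the initial value is only ever returned from the default path.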

2022-08-25 05:58:37

by Sai Prakash Ranjan

Subject: Re: [PATCH v3 2/5] soc: qcom: llcc: Pass LLCC version based register offsets to EDAC driver

On 8/25/2022 10:08 AM, Manivannan Sadhasivam wrote:
> The LLCC EDAC register offsets vary between SoCs. Until now, the
> EDAC driver used hardcoded register offsets. But this caused a crash
> on the SM8450 SoC, where the register offsets have been changed.
>
> So to avoid this crash and also to make it easy to accommodate changes for
> new SoCs, let's pass the LLCC version specific register offsets to the
> EDAC driver.
>
> Currently, two sets of offsets are used. One starts from LLCC version
> v1.0.0 and is used by all SoCs other than SM8450. For SM8450, the set
> starting from LLCC version v2.1.0 is used.
>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/soc/qcom/llcc-qcom.c | 66 ++++++++++++++++++++++++++++++
> include/linux/soc/qcom/llcc-qcom.h | 30 ++++++++++++++
> 2 files changed, 96 insertions(+)

Reviewed-by: Sai Prakash Ranjan <[email protected]>


Thanks,
Sai
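
A minimal userspace sketch of the idea described in this patch, assuming a simple version comparison; it is not the code from llcc-qcom.c. The struct name mirrors `llcc_edac_reg_offset` from the series, but `pick_edac_offsets()`, `drp_status_addr()`, the `LLCC_VERSION()` encoding, and the v2.1.0 values are hypothetical placeholders. Only the two v1.0.0 values are taken from the hardcoded defines that patch 3/5 removes.

```c
#include <stdint.h>

/* Subset of the per-version EDAC register offsets (field names follow the series). */
struct llcc_edac_reg_offset {
	uint32_t drp_interrupt_status;
	uint32_t trp_interrupt_0_status;
};

/* v1.0.0: values from the defines that patch 3/5 removes from qcom_edac.c. */
static const struct llcc_edac_reg_offset edac_v1_0_0 = {
	.drp_interrupt_status   = 0x00041000,
	.trp_interrupt_0_status = 0x00020480,
};

/* v2.1.0: PLACEHOLDER values for illustration only, not the real SM8450 offsets. */
static const struct llcc_edac_reg_offset edac_v2_1_0 = {
	.drp_interrupt_status   = 0xdead0000,
	.trp_interrupt_0_status = 0xdead0004,
};

/* Hypothetical encoding: major in bits 31:24, minor in 23:16, patch in 15:0. */
#define LLCC_VERSION(maj, min, patch) \
	(((uint32_t)(maj) << 24) | ((uint32_t)(min) << 16) | (uint32_t)(patch))

/* Select the offset table once, based on the LLCC hardware version. */
static const struct llcc_edac_reg_offset *pick_edac_offsets(uint32_t version)
{
	if (version >= LLCC_VERSION(2, 1, 0))
		return &edac_v2_1_0;
	return &edac_v1_0_0;
}

/* A consumer adds the per-bank base to the version-specific offset. */
static uint32_t drp_status_addr(uint32_t bank_offset,
				const struct llcc_edac_reg_offset *regs)
{
	return bank_offset + regs->drp_interrupt_status;
}
```

The point of the design is that the consumer never names an absolute offset: it only dereferences the table that the provider selected, so a new LLCC version needs a new table rather than changes in the EDAC code.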

2022-08-30 03:14:55

by Bjorn Andersson

Subject: Re: (subset) [PATCH v3 0/5] Fix crash when using Qcom LLCC/EDAC drivers

On Thu, 25 Aug 2022 10:08:54 +0530, Manivannan Sadhasivam wrote:
> This series fixes the crash seen on the Qualcomm SM8450 chipset with the
> LLCC/EDAC drivers. The problem was due to the Qcom EDAC driver using the
> fixed LLCC register offsets for detecting the LLCC errors.
>
> This seems to have worked for SoCs till SM8450. But in SM8450, the LLCC
> register offsets were changed. So accessing the fixed offsets causes the
> crash on this platform.
>
> [...]

Applied, thanks!

[1/5] soc: qcom: llcc: Rename reg_offset structs to reflect LLCC version
commit: 5365cea199c70d6abedc2e1be850c03e990f1829
[2/5] soc: qcom: llcc: Pass LLCC version based register offsets to EDAC driver
commit: c13d7d261e361dbaf5adbdc216ee4a1204c48001

Best regards,
--
Bjorn Andersson <[email protected]>

2022-08-30 03:31:52

by Bjorn Andersson

Subject: Re: [PATCH v3 3/5] EDAC/qcom: Get rid of hardcoded register offsets

On Thu, Aug 25, 2022 at 10:08:57AM +0530, Manivannan Sadhasivam wrote:
> The LLCC EDAC register offsets vary between SoCs. Hardcoding the
> register offsets won't work and will often result in a crash due to
> accessing the wrong locations.
>
> Hence, get the register offsets from the LLCC driver matching the
> individual SoCs.
>

I have applied patches 1 and 2 to the Qualcomm tree. A tag containing
them is available at:

https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git
tags/[email protected]

Regards,
Bjorn

> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/edac/qcom_edac.c | 116 ++++++++++++++---------------
> include/linux/soc/qcom/llcc-qcom.h | 6 --
> 2 files changed, 58 insertions(+), 64 deletions(-)
>
> diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
> index 97a27e42dd61..04df70b7fea3 100644
> --- a/drivers/edac/qcom_edac.c
> +++ b/drivers/edac/qcom_edac.c
> @@ -21,30 +21,9 @@
> #define TRP_SYN_REG_CNT 6
> #define DRP_SYN_REG_CNT 8
>
> -#define LLCC_COMMON_STATUS0 0x0003000c
> #define LLCC_LB_CNT_MASK GENMASK(31, 28)
> #define LLCC_LB_CNT_SHIFT 28
>
> -/* Single & double bit syndrome register offsets */
> -#define TRP_ECC_SB_ERR_SYN0 0x0002304c
> -#define TRP_ECC_DB_ERR_SYN0 0x00020370
> -#define DRP_ECC_SB_ERR_SYN0 0x0004204c
> -#define DRP_ECC_DB_ERR_SYN0 0x00042070
> -
> -/* Error register offsets */
> -#define TRP_ECC_ERROR_STATUS1 0x00020348
> -#define TRP_ECC_ERROR_STATUS0 0x00020344
> -#define DRP_ECC_ERROR_STATUS1 0x00042048
> -#define DRP_ECC_ERROR_STATUS0 0x00042044
> -
> -/* TRP, DRP interrupt register offsets */
> -#define DRP_INTERRUPT_STATUS 0x00041000
> -#define TRP_INTERRUPT_0_STATUS 0x00020480
> -#define DRP_INTERRUPT_CLEAR 0x00041008
> -#define DRP_ECC_ERROR_CNTR_CLEAR 0x00040004
> -#define TRP_INTERRUPT_0_CLEAR 0x00020484
> -#define TRP_ECC_ERROR_CNTR_CLEAR 0x00020440
> -
> /* Mask and shift macros */
> #define ECC_DB_ERR_COUNT_MASK GENMASK(4, 0)
> #define ECC_DB_ERR_WAYS_MASK GENMASK(31, 16)
> @@ -60,15 +39,6 @@
> #define DRP_TRP_INT_CLEAR GENMASK(1, 0)
> #define DRP_TRP_CNT_CLEAR GENMASK(1, 0)
>
> -/* Config registers offsets*/
> -#define DRP_ECC_ERROR_CFG 0x00040000
> -
> -/* Tag RAM, Data RAM interrupt register offsets */
> -#define CMN_INTERRUPT_0_ENABLE 0x0003001c
> -#define CMN_INTERRUPT_2_ENABLE 0x0003003c
> -#define TRP_INTERRUPT_0_ENABLE 0x00020488
> -#define DRP_INTERRUPT_ENABLE 0x0004100c
> -
> #define SB_ERROR_THRESHOLD 0x1
> #define SB_ERROR_THRESHOLD_SHIFT 24
> #define SB_DB_TRP_INTERRUPT_ENABLE 0x3
> @@ -86,9 +56,6 @@ enum {
> static const struct llcc_edac_reg_data edac_reg_data[] = {
> [LLCC_DRAM_CE] = {
> .name = "DRAM Single-bit",
> - .synd_reg = DRP_ECC_SB_ERR_SYN0,
> - .count_status_reg = DRP_ECC_ERROR_STATUS1,
> - .ways_status_reg = DRP_ECC_ERROR_STATUS0,
> .reg_cnt = DRP_SYN_REG_CNT,
> .count_mask = ECC_SB_ERR_COUNT_MASK,
> .ways_mask = ECC_SB_ERR_WAYS_MASK,
> @@ -96,9 +63,6 @@ static const struct llcc_edac_reg_data edac_reg_data[] = {
> },
> [LLCC_DRAM_UE] = {
> .name = "DRAM Double-bit",
> - .synd_reg = DRP_ECC_DB_ERR_SYN0,
> - .count_status_reg = DRP_ECC_ERROR_STATUS1,
> - .ways_status_reg = DRP_ECC_ERROR_STATUS0,
> .reg_cnt = DRP_SYN_REG_CNT,
> .count_mask = ECC_DB_ERR_COUNT_MASK,
> .ways_mask = ECC_DB_ERR_WAYS_MASK,
> @@ -106,9 +70,6 @@ static const struct llcc_edac_reg_data edac_reg_data[] = {
> },
> [LLCC_TRAM_CE] = {
> .name = "TRAM Single-bit",
> - .synd_reg = TRP_ECC_SB_ERR_SYN0,
> - .count_status_reg = TRP_ECC_ERROR_STATUS1,
> - .ways_status_reg = TRP_ECC_ERROR_STATUS0,
> .reg_cnt = TRP_SYN_REG_CNT,
> .count_mask = ECC_SB_ERR_COUNT_MASK,
> .ways_mask = ECC_SB_ERR_WAYS_MASK,
> @@ -116,9 +77,6 @@ static const struct llcc_edac_reg_data edac_reg_data[] = {
> },
> [LLCC_TRAM_UE] = {
> .name = "TRAM Double-bit",
> - .synd_reg = TRP_ECC_DB_ERR_SYN0,
> - .count_status_reg = TRP_ECC_ERROR_STATUS1,
> - .ways_status_reg = TRP_ECC_ERROR_STATUS0,
> .reg_cnt = TRP_SYN_REG_CNT,
> .count_mask = ECC_DB_ERR_COUNT_MASK,
> .ways_mask = ECC_DB_ERR_WAYS_MASK,
> @@ -126,7 +84,7 @@ static const struct llcc_edac_reg_data edac_reg_data[] = {
> },
> };
>
> -static int qcom_llcc_core_setup(struct regmap *llcc_bcast_regmap)
> +static int qcom_llcc_core_setup(struct llcc_drv_data *drv, struct regmap *llcc_bcast_regmap)
> {
> u32 sb_err_threshold;
> int ret;
> @@ -135,31 +93,31 @@ static int qcom_llcc_core_setup(struct regmap *llcc_bcast_regmap)
> * Configure interrupt enable registers such that Tag, Data RAM related
> * interrupts are propagated to interrupt controller for servicing
> */
> - ret = regmap_update_bits(llcc_bcast_regmap, CMN_INTERRUPT_2_ENABLE,
> + ret = regmap_update_bits(llcc_bcast_regmap, drv->edac_reg_offset->cmn_interrupt_2_enable,
> TRP0_INTERRUPT_ENABLE,
> TRP0_INTERRUPT_ENABLE);
> if (ret)
> return ret;
>
> - ret = regmap_update_bits(llcc_bcast_regmap, TRP_INTERRUPT_0_ENABLE,
> + ret = regmap_update_bits(llcc_bcast_regmap, drv->edac_reg_offset->trp_interrupt_0_enable,
> SB_DB_TRP_INTERRUPT_ENABLE,
> SB_DB_TRP_INTERRUPT_ENABLE);
> if (ret)
> return ret;
>
> sb_err_threshold = (SB_ERROR_THRESHOLD << SB_ERROR_THRESHOLD_SHIFT);
> - ret = regmap_write(llcc_bcast_regmap, DRP_ECC_ERROR_CFG,
> + ret = regmap_write(llcc_bcast_regmap, drv->edac_reg_offset->drp_ecc_error_cfg,
> sb_err_threshold);
> if (ret)
> return ret;
>
> - ret = regmap_update_bits(llcc_bcast_regmap, CMN_INTERRUPT_2_ENABLE,
> + ret = regmap_update_bits(llcc_bcast_regmap, drv->edac_reg_offset->cmn_interrupt_2_enable,
> DRP0_INTERRUPT_ENABLE,
> DRP0_INTERRUPT_ENABLE);
> if (ret)
> return ret;
>
> - ret = regmap_write(llcc_bcast_regmap, DRP_INTERRUPT_ENABLE,
> + ret = regmap_write(llcc_bcast_regmap, drv->edac_reg_offset->drp_interrupt_enable,
> SB_DB_DRP_INTERRUPT_ENABLE);
> return ret;
> }
> @@ -173,24 +131,28 @@ qcom_llcc_clear_error_status(int err_type, struct llcc_drv_data *drv)
> switch (err_type) {
> case LLCC_DRAM_CE:
> case LLCC_DRAM_UE:
> - ret = regmap_write(drv->bcast_regmap, DRP_INTERRUPT_CLEAR,
> + ret = regmap_write(drv->bcast_regmap,
> + drv->edac_reg_offset->drp_interrupt_clear,
> DRP_TRP_INT_CLEAR);
> if (ret)
> return ret;
>
> - ret = regmap_write(drv->bcast_regmap, DRP_ECC_ERROR_CNTR_CLEAR,
> + ret = regmap_write(drv->bcast_regmap,
> + drv->edac_reg_offset->drp_ecc_error_cntr_clear,
> DRP_TRP_CNT_CLEAR);
> if (ret)
> return ret;
> break;
> case LLCC_TRAM_CE:
> case LLCC_TRAM_UE:
> - ret = regmap_write(drv->bcast_regmap, TRP_INTERRUPT_0_CLEAR,
> + ret = regmap_write(drv->bcast_regmap,
> + drv->edac_reg_offset->trp_interrupt_0_clear,
> DRP_TRP_INT_CLEAR);
> if (ret)
> return ret;
>
> - ret = regmap_write(drv->bcast_regmap, TRP_ECC_ERROR_CNTR_CLEAR,
> + ret = regmap_write(drv->bcast_regmap,
> + drv->edac_reg_offset->trp_ecc_error_cntr_clear,
> DRP_TRP_CNT_CLEAR);
> if (ret)
> return ret;
> @@ -203,16 +165,54 @@ qcom_llcc_clear_error_status(int err_type, struct llcc_drv_data *drv)
> return ret;
> }
>
> +struct qcom_llcc_syn_regs {
> + u32 synd_reg;
> + u32 count_status_reg;
> + u32 ways_status_reg;
> +};
> +
> +static void get_reg_offsets(struct llcc_drv_data *drv, int err_type,
> + struct qcom_llcc_syn_regs *syn_regs)
> +{
> + const struct llcc_edac_reg_offset *edac_reg_offset = drv->edac_reg_offset;
> +
> + switch (err_type) {
> + case LLCC_DRAM_CE:
> + syn_regs->synd_reg = edac_reg_offset->drp_ecc_sb_err_syn0;
> + syn_regs->count_status_reg = edac_reg_offset->drp_ecc_error_status1;
> + syn_regs->ways_status_reg = edac_reg_offset->drp_ecc_error_status0;
> + break;
> + case LLCC_DRAM_UE:
> + syn_regs->synd_reg = edac_reg_offset->drp_ecc_db_err_syn0;
> + syn_regs->count_status_reg = edac_reg_offset->drp_ecc_error_status1;
> + syn_regs->ways_status_reg = edac_reg_offset->drp_ecc_error_status0;
> + break;
> + case LLCC_TRAM_CE:
> + syn_regs->synd_reg = edac_reg_offset->trp_ecc_sb_err_syn0;
> + syn_regs->count_status_reg = edac_reg_offset->trp_ecc_error_status1;
> + syn_regs->ways_status_reg = edac_reg_offset->trp_ecc_error_status0;
> + break;
> + case LLCC_TRAM_UE:
> + syn_regs->synd_reg = edac_reg_offset->trp_ecc_db_err_syn0;
> + syn_regs->count_status_reg = edac_reg_offset->trp_ecc_error_status1;
> + syn_regs->ways_status_reg = edac_reg_offset->trp_ecc_error_status0;
> + break;
> + }
> +}
> +
> /* Dump Syndrome registers data for Tag RAM, Data RAM bit errors*/
> static int
> dump_syn_reg_values(struct llcc_drv_data *drv, u32 bank, int err_type)
> {
> struct llcc_edac_reg_data reg_data = edac_reg_data[err_type];
> + struct qcom_llcc_syn_regs regs = { };
> int err_cnt, err_ways, ret, i;
> u32 synd_reg, synd_val;
>
> + get_reg_offsets(drv, err_type, &regs);
> +
> for (i = 0; i < reg_data.reg_cnt; i++) {
> - synd_reg = reg_data.synd_reg + (i * 4);
> + synd_reg = regs.synd_reg + (i * 4);
> ret = regmap_read(drv->regmap, drv->offsets[bank] + synd_reg,
> &synd_val);
> if (ret)
> @@ -223,7 +223,7 @@ dump_syn_reg_values(struct llcc_drv_data *drv, u32 bank, int err_type)
> }
>
> ret = regmap_read(drv->regmap,
> - drv->offsets[bank] + reg_data.count_status_reg,
> + drv->offsets[bank] + regs.count_status_reg,
> &err_cnt);
> if (ret)
> goto clear;
> @@ -234,7 +234,7 @@ dump_syn_reg_values(struct llcc_drv_data *drv, u32 bank, int err_type)
> reg_data.name, err_cnt);
>
> ret = regmap_read(drv->regmap,
> - drv->offsets[bank] + reg_data.ways_status_reg,
> + drv->offsets[bank] + regs.ways_status_reg,
> &err_ways);
> if (ret)
> goto clear;
> @@ -297,7 +297,7 @@ llcc_ecc_irq_handler(int irq, void *edev_ctl)
> /* Iterate over the banks and look for Tag RAM or Data RAM errors */
> for (i = 0; i < drv->num_banks; i++) {
> ret = regmap_read(drv->regmap,
> - drv->offsets[i] + DRP_INTERRUPT_STATUS,
> + drv->offsets[i] + drv->edac_reg_offset->drp_interrupt_status,
> &drp_error);
>
> if (!ret && (drp_error & SB_ECC_ERROR)) {
> @@ -313,7 +313,7 @@ llcc_ecc_irq_handler(int irq, void *edev_ctl)
> irq_rc = IRQ_HANDLED;
>
> ret = regmap_read(drv->regmap,
> - drv->offsets[i] + TRP_INTERRUPT_0_STATUS,
> + drv->offsets[i] + drv->edac_reg_offset->trp_interrupt_0_status,
> &trp_error);
>
> if (!ret && (trp_error & SB_ECC_ERROR)) {
> @@ -340,7 +340,7 @@ static int qcom_llcc_edac_probe(struct platform_device *pdev)
> int ecc_irq;
> int rc;
>
> - rc = qcom_llcc_core_setup(llcc_driv_data->bcast_regmap);
> + rc = qcom_llcc_core_setup(llcc_driv_data, llcc_driv_data->bcast_regmap);
> if (rc)
> return rc;
>
> diff --git a/include/linux/soc/qcom/llcc-qcom.h b/include/linux/soc/qcom/llcc-qcom.h
> index bc2fb8343a94..d5b2d58e8857 100644
> --- a/include/linux/soc/qcom/llcc-qcom.h
> +++ b/include/linux/soc/qcom/llcc-qcom.h
> @@ -57,9 +57,6 @@ struct llcc_slice_desc {
> /**
> * struct llcc_edac_reg_data - llcc edac registers data for each error type
> * @name: Name of the error
> - * @synd_reg: Syndrome register address
> - * @count_status_reg: Status register address to read the error count
> - * @ways_status_reg: Status register address to read the error ways
> * @reg_cnt: Number of registers
> * @count_mask: Mask value to get the error count
> * @ways_mask: Mask value to get the error ways
> @@ -68,9 +65,6 @@ struct llcc_slice_desc {
> */
> struct llcc_edac_reg_data {
> char *name;
> - u64 synd_reg;
> - u64 count_status_reg;
> - u64 ways_status_reg;
> u32 reg_cnt;
> u32 count_mask;
> u32 ways_mask;
> --
> 2.25.1
>
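
As a rough illustration of the address arithmetic in the `dump_syn_reg_values()` hunk above, the following userspace sketch computes the syndrome register addresses for one bank. `syn_reg_addrs()` is a hypothetical helper, and the bank offset used in the usage note is an assumed value; the syndrome base 0x0004204c and the count of 8 come from the removed `DRP_ECC_SB_ERR_SYN0` define and `DRP_SYN_REG_CNT`.

```c
#include <stddef.h>
#include <stdint.h>

#define DRP_SYN_REG_CNT 8	/* number of DRP syndrome registers, as in qcom_edac.c */

/*
 * Fill out[] with the address of each syndrome register for one bank.
 * Registers are 32 bits wide, so consecutive ones are 4 bytes apart.
 * Returns the number of entries written.
 */
static size_t syn_reg_addrs(uint32_t bank_offset, uint32_t synd_reg,
			    uint32_t reg_cnt, uint32_t *out)
{
	uint32_t i;

	for (i = 0; i < reg_cnt; i++)
		out[i] = bank_offset + synd_reg + (i * 4);

	return reg_cnt;
}
```

With an assumed bank offset of 0x100000 and the v1.0.0 single-bit DRP syndrome base, the first address is 0x14204c and the eighth is 0x142068. After the patch only `synd_reg` changes source: it comes from the version-specific table instead of a fixed define.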

2022-10-03 07:14:09

by Manivannan Sadhasivam

Subject: Re: [PATCH v3 0/5] Fix crash when using Qcom LLCC/EDAC drivers

On Thu, Aug 25, 2022 at 10:08:54AM +0530, Manivannan Sadhasivam wrote:
> Hello,
>
> This series fixes the crash seen on the Qualcomm SM8450 chipset with the
> LLCC/EDAC drivers. The problem was due to the Qcom EDAC driver using the
> fixed LLCC register offsets for detecting the LLCC errors.
>
> This seems to have worked for SoCs till SM8450. But in SM8450, the LLCC
> register offsets were changed. So accessing the fixed offsets causes the
> crash on this platform.
>
> So for fixing this issue, and also to make it work on future SoCs, let's
> pass the LLCC offsets from the Qcom LLCC driver based on the individual
> SoCs and let the EDAC driver make use of them.
>
> This series has been tested on SM8450 based dev board.
>

Since the LLCC patches are already merged, can we get the EDAC patches to be
merged for v6.1?

Thanks,
Mani

> Thanks,
> Mani
>
> Changes in v3:
>
> * Instead of using SoC specific register offset naming convention, used
> LLCC version based as suggested by Sai
> * Fixed the existing reg_offset naming convention to clearly represent
> the LLCC version from which the offsets were changed
> * Added Sai's Acked-by to MAINTAINERS patch
> * Added a new patch that removes an extra error no assignment
>
> Changes in v2:
>
> * Volunteered myself as a maintainer for the EDAC driver since the current
> maintainers have left Qualcomm and I couldn't get hold of them.
>
> Manivannan Sadhasivam (5):
> soc: qcom: llcc: Rename reg_offset structs to reflect LLCC version
> soc: qcom: llcc: Pass LLCC version based register offsets to EDAC
> driver
> EDAC/qcom: Get rid of hardcoded register offsets
> EDAC/qcom: Remove extra error no assignment in qcom_llcc_core_setup()
> MAINTAINERS: Add myself as the maintainer for qcom_edac driver
>
> MAINTAINERS | 3 +-
> drivers/edac/qcom_edac.c | 119 ++++++++++++++---------------
> drivers/soc/qcom/llcc-qcom.c | 92 +++++++++++++++++++---
> include/linux/soc/qcom/llcc-qcom.h | 36 +++++++--
> 4 files changed, 170 insertions(+), 80 deletions(-)
>
> --
> 2.25.1
>

--
மணிவண்ணன் சதாசிவம்

2022-10-03 10:30:54

by Borislav Petkov

Subject: Re: [PATCH v3 0/5] Fix crash when using Qcom LLCC/EDAC drivers

On Mon, Oct 03, 2022 at 12:34:15PM +0530, Manivannan Sadhasivam wrote:
> Since the LLCC patches are already merged, can we get the EDAC patches to be
> merged for v6.1?

It is too late for 6.1. I'll take a look at them after the merge window closes.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2022-10-26 16:04:47

by Borislav Petkov

Subject: Re: [PATCH v3 3/5] EDAC/qcom: Get rid of hardcoded register offsets

On Thu, Aug 25, 2022 at 10:08:57AM +0530, Manivannan Sadhasivam wrote:
> The LLCC EDAC register offsets vary between SoCs. Hardcoding the
> register offsets won't work and will often result in a crash due to
> accessing the wrong locations.
>
> Hence, get the register offsets from the LLCC driver matching the
> individual SoCs.
>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> drivers/edac/qcom_edac.c | 116 ++++++++++++++---------------
> include/linux/soc/qcom/llcc-qcom.h | 6 --
> 2 files changed, 58 insertions(+), 64 deletions(-)

I can't take those:

ERROR: modpost: "__devm_regmap_init_mmio_clk" [drivers/soc/qcom/llcc-qcom.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:126: Module.symvers] Error 1
make: *** [Makefile:1944: modpost] Error 2

You'd have to rediff them against the latest Linus -rc tag and test them
properly.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette