The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
accessing the CSRs (Control and Status Registers) of each LLCC bank.
This stride only works for some SoCs like SDM845, for which driver support
was initially added.
But the later SoCs use different register strides that vary between the
banks, with holes in between. So it is not possible to use a single register
stride for accessing the CSRs of each bank. Doing so could result in a
crash with the current drivers. So far this crash has not been reported since
the EDAC_QCOM driver is not enabled in the ARM64 defconfig and no one has
tested the driver extensively by triggering the EDAC IRQ (that's where each
bank's CSRs are accessed).
To fix this issue, let's obtain the base address of each LLCC bank from
devicetree and get rid of the fixed stride.
This series affects multiple platforms but I have only tested this on
SM8250 and SM8450. Testing on other platforms is welcome.
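To illustrate the difference, here is a minimal user-space sketch (not the driver code). The stride value SKETCH_BANK_STRIDE and all function names are assumptions made up for illustration; the bank addresses are the SM8450 ones from the sm8450.dtsi patch in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stride, for illustration only; not taken from this series. */
#define SKETCH_BANK_STRIDE 0x80000u

/* Old scheme: one base address plus bank index times a fixed stride. */
uint64_t bank_addr_fixed_stride(uint64_t llcc_base, unsigned int bank)
{
	return llcc_base + (uint64_t)bank * SKETCH_BANK_STRIDE;
}

/* New scheme: each bank has its own base address, as listed in devicetree. */
static const uint64_t sm8450_bank_base[] = {
	0x19200000, 0x19600000, 0x19300000, 0x19700000,
};

uint64_t bank_addr_from_dt(unsigned int bank)
{
	size_t num_banks = sizeof(sm8450_bank_base) / sizeof(sm8450_bank_base[0]);

	assert(bank < num_banks);
	return sm8450_bank_base[bank];
}
```

With these values the two schemes agree only for bank 0. For bank 1 the fixed stride yields 0x19280000 while the devicetree entry says 0x19600000, so a CSR access computed from the stride would land in the wrong place.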
Thanks,
Mani
Changes in v2:
* Removed reg-names property and used index of reg property to parse LLCC
bank base address (Bjorn)
* Collected Ack from Sai for binding
* Added a new patch for polling mode (Luca)
* Renamed subject of patches targeting SC7180 and SM6350
Manivannan Sadhasivam (13):
dt-bindings: arm: msm: Update the maintainers for LLCC
dt-bindings: arm: msm: Fix register regions used for LLCC banks
arm64: dts: qcom: sdm845: Fix the base addresses of LLCC banks
arm64: dts: qcom: sc7180: Remove reg-names property from LLCC node
arm64: dts: qcom: sc7280: Fix the base addresses of LLCC banks
arm64: dts: qcom: sc8280xp: Fix the base addresses of LLCC banks
arm64: dts: qcom: sm8150: Fix the base addresses of LLCC banks
arm64: dts: qcom: sm8250: Fix the base addresses of LLCC banks
arm64: dts: qcom: sm8350: Fix the base addresses of LLCC banks
arm64: dts: qcom: sm8450: Fix the base addresses of LLCC banks
arm64: dts: qcom: sm6350: Remove reg-names property from LLCC node
qcom: llcc/edac: Fix the base address used for accessing LLCC banks
qcom: llcc/edac: Support polling mode for ECC handling
.../bindings/arm/msm/qcom,llcc.yaml | 100 +++++++++++++++---
arch/arm64/boot/dts/qcom/sc7180.dtsi | 1 -
arch/arm64/boot/dts/qcom/sc7280.dtsi | 4 +-
arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 7 +-
arch/arm64/boot/dts/qcom/sdm845.dtsi | 5 +-
arch/arm64/boot/dts/qcom/sm6350.dtsi | 1 -
arch/arm64/boot/dts/qcom/sm8150.dtsi | 5 +-
arch/arm64/boot/dts/qcom/sm8250.dtsi | 5 +-
arch/arm64/boot/dts/qcom/sm8350.dtsi | 5 +-
arch/arm64/boot/dts/qcom/sm8450.dtsi | 5 +-
drivers/edac/qcom_edac.c | 51 +++++----
drivers/soc/qcom/llcc-qcom.c | 85 ++++++++-------
include/linux/soc/qcom/llcc-qcom.h | 6 +-
13 files changed, 186 insertions(+), 94 deletions(-)
--
2.25.1
The LLCC block has several banks, each with a different base address
and holes in between. So it is not correct to cover these
banks with a single offset/size. Instead, each bank's base
address needs to be specified in devicetree with the exact size.
Also, let's get rid of the reg-names property as it is not needed anymore.
The driver is expected to parse the reg property based on index to get the
address of each LLCC bank.
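Index-based lookup can be sketched in plain C as follows. This is only an illustration, not the driver's parsing code: the struct and function names are hypothetical, and it assumes the bank regions come first and the broadcast region is the last reg entry, which is how the new SM8450 reg property in this patch is laid out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one <base size> pair of the reg property. */
struct reg_region {
	uint64_t base;
	uint64_t size;
};

/* The reg entries this patch lists for SM8450. */
const struct reg_region sm8450_llcc_reg[] = {
	{ 0x19200000, 0x80000 },
	{ 0x19600000, 0x80000 },
	{ 0x19300000, 0x80000 },
	{ 0x19700000, 0x80000 },
	{ 0x19a00000, 0x80000 },	/* broadcast region, assumed to be last */
};

/* Assumption: every entry except the last one describes a bank. */
size_t llcc_num_banks(size_t num_regions)
{
	assert(num_regions >= 2);
	return num_regions - 1;
}

/* Look up a bank purely by its index in the reg list, no names needed. */
const struct reg_region *llcc_bank_region(const struct reg_region *regs,
					  size_t num_regions, size_t bank)
{
	assert(bank < llcc_num_banks(num_regions));
	return &regs[bank];
}

/* The broadcast region is taken from the final index. */
const struct reg_region *llcc_broadcast_region(const struct reg_region *regs,
					       size_t num_regions)
{
	assert(num_regions >= 2);
	return &regs[num_regions - 1];
}
```

Because the position in the list carries the meaning, the order of the reg entries becomes part of the binding once reg-names is dropped.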
Cc: <[email protected]> # 5.18
Fixes: 1dc3e50eb680 ("arm64: dts: qcom: sm8450: Add LLCC/system-cache-controller node")
Reported-by: Parikshit Pareek <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
arch/arm64/boot/dts/qcom/sm8450.dtsi | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index 570475040d95..30685857021a 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -3640,8 +3640,9 @@ gem_noc: interconnect@19100000 {
system-cache-controller@19200000 {
compatible = "qcom,sm8450-llcc";
- reg = <0 0x19200000 0 0x580000>, <0 0x19a00000 0 0x80000>;
- reg-names = "llcc_base", "llcc_broadcast_base";
+ reg = <0 0x19200000 0 0x80000>, <0 0x19600000 0 0x80000>,
+ <0 0x19300000 0 0x80000>, <0 0x19700000 0 0x80000>,
+ <0 0x19a00000 0 0x80000>;
interrupts = <GIC_SPI 266 IRQ_TYPE_LEVEL_HIGH>;
};
--
2.25.1
Rishabh Bhatnagar has left Qualcomm, and there is no evidence of him
continuing as maintainer under a new identity. So his entry needs to be removed.
Also, Sai Prakash Ranjan's email address should be updated to use
the quicinc domain.
Cc: Sai Prakash Ranjan <[email protected]>
Acked-by: Sai Prakash Ranjan <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
---
Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml b/Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml
index 38efcad56dbd..d1df49ffcc1b 100644
--- a/Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml
+++ b/Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml
@@ -7,8 +7,7 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
title: Last Level Cache Controller
maintainers:
- - Rishabh Bhatnagar <[email protected]>
- - Sai Prakash Ranjan <[email protected]>
+ - Sai Prakash Ranjan <[email protected]>
description: |
LLCC (Last Level Cache Controller) provides last level of cache memory in SoC,
--
2.25.1
On Mon, Dec 12, 2022 at 06:02:58PM +0530, Manivannan Sadhasivam wrote:
> The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
> accessing the (Control and Status Regsiters) CSRs of each LLCC bank.
> This offset only works for some SoCs like SDM845 for which driver support
> was initially added.
>
> But the later SoCs use different register stride that vary between the
> banks with holes in-between. So it is not possible to use a single register
> stride for accessing the CSRs of each bank. By doing so could result in a
> crash with the current drivers. So far this crash is not reported since
> EDAC_QCOM driver is not enabled in ARM64 defconfig and no one tested the
> driver extensively by triggering the EDAC IRQ (that's where each bank
> CSRs are accessed).
>
> For fixing this issue, let's obtain the base address of each LLCC bank from
> devicetree and get rid of the fixed stride.
>
> This series affects multiple platforms but I have only tested this on
> SM8250 and SM8450. Testing on other platforms is welcomed.
>
Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride
I took this for a quick spin on the qdrive3 I've got access to without
any issue:
[root@localhost ~]# modprobe qcom_edac
[root@localhost ~]# dmesg | grep -i edac
[ 0.620723] EDAC MC: Ver: 3.0.0
[ 1.165417] ghes_edac: GHES probing device list is empty
[ 594.688103] EDAC DEVICE0: Giving out device to module qcom_llcc_edac controller llcc: DEV qcom_llcc_edac (INTERRUPT)
[root@localhost ~]# cat /proc/interrupts | grep ecc
174: 0 0 0 0 0 0 0 0 GICv3 614 Level llcc_ecc
[root@localhost ~]#
Potentially stupid question, but are users expected to manually load the
driver as I did? I don't see how it would be loaded automatically in the
current state, but thought it was funny that I needed to modprobe
myself.
Please let me know if you want me to do any further testing!
Thanks,
Andrew
On 12/12/2022 6:03 PM, Manivannan Sadhasivam wrote:
> The LLCC block has several banks each with a different base address
> and holes in between. So it is not a correct approach to cover these
> banks with a single offset/size. Instead, the individual bank's base
> address needs to be specified in devicetree with the exact size.
>
> Also, let's get rid of reg-names property as it is not needed anymore.
> The driver is expected to parse the reg field based on index to get the
> addresses of each LLCC banks.
>
> Cc: <[email protected]> # 5.18
> Fixes: 1dc3e50eb680 ("arm64: dts: qcom: sm8450: Add LLCC/system-cache-controller node")
> Reported-by: Parikshit Pareek <[email protected]>
> Signed-off-by: Manivannan Sadhasivam <[email protected]>
> ---
> arch/arm64/boot/dts/qcom/sm8450.dtsi | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> index 570475040d95..30685857021a 100644
> --- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> @@ -3640,8 +3640,9 @@ gem_noc: interconnect@19100000 {
>
> system-cache-controller@19200000 {
> compatible = "qcom,sm8450-llcc";
> - reg = <0 0x19200000 0 0x580000>, <0 0x19a00000 0 0x80000>;
> - reg-names = "llcc_base", "llcc_broadcast_base";
> + reg = <0 0x19200000 0 0x80000>, <0 0x19600000 0 0x80000>,
> + <0 0x19300000 0 0x80000>, <0 0x19700000 0 0x80000>,
> + <0 0x19a00000 0 0x80000>;
> interrupts = <GIC_SPI 266 IRQ_TYPE_LEVEL_HIGH>;
> };
>
Reviewed-by: Sai Prakash Ranjan <[email protected]>
On Mon, Dec 12, 2022 at 01:23:40PM -0600, Andrew Halaney wrote:
> On Mon, Dec 12, 2022 at 06:02:58PM +0530, Manivannan Sadhasivam wrote:
> > The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
> > accessing the (Control and Status Regsiters) CSRs of each LLCC bank.
> > This offset only works for some SoCs like SDM845 for which driver support
> > was initially added.
> >
> > But the later SoCs use different register stride that vary between the
> > banks with holes in-between. So it is not possible to use a single register
> > stride for accessing the CSRs of each bank. By doing so could result in a
> > crash with the current drivers. So far this crash is not reported since
> > EDAC_QCOM driver is not enabled in ARM64 defconfig and no one tested the
> > driver extensively by triggering the EDAC IRQ (that's where each bank
> > CSRs are accessed).
> >
> > For fixing this issue, let's obtain the base address of each LLCC bank from
> > devicetree and get rid of the fixed stride.
> >
> > This series affects multiple platforms but I have only tested this on
> > SM8250 and SM8450. Testing on other platforms is welcomed.
> >
>
> Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride
>
Thanks!
> I took this for a quick spin on the qdrive3 I've got access to without
> any issue:
>
> [root@localhost ~]# modprobe qcom_edac
> [root@localhost ~]# dmesg | grep -i edac
> [ 0.620723] EDAC MC: Ver: 3.0.0
> [ 1.165417] ghes_edac: GHES probing device list is empty
> [ 594.688103] EDAC DEVICE0: Giving out device to module qcom_llcc_edac controller llcc: DEV qcom_llcc_edac (INTERRUPT)
> [root@localhost ~]# cat /proc/interrupts | grep ecc
> 174: 0 0 0 0 0 0 0 0 GICv3 614 Level llcc_ecc
> [root@localhost ~]#
>
> Potentially stupid question, but are users expected to manually load the
> driver as I did? I don't see how it would be loaded automatically in the
> current state, but thought it was funny that I needed to modprobe
> myself.
>
> Please let me know if you want me to do any more further testing!
>
Well, I always ended up using the driver as a built-in. I do build it as a
module for build testing but never really used it as a module, so I didn't
catch this issue. This is due to the module alias not being exported by the
qcom_edac driver. The diff below allows the kernel to autoload it:
diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
index f7afb5375293..13919d01c22d 100644
--- a/drivers/edac/qcom_edac.c
+++ b/drivers/edac/qcom_edac.c
@@ -419,3 +419,4 @@ module_platform_driver(qcom_llcc_edac_driver);
MODULE_DESCRIPTION("QCOM EDAC driver");
MODULE_LICENSE("GPL v2");
+MODULE_ALIAS("platform:qcom_llcc_edac");
Please test and let me know. I will add this as a new patch in the next version.
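Why the alias matters can be sketched in user-space C. This is a simplified model, not kernel or kmod code: the function names are hypothetical, and it only does exact string comparison, whereas real alias matching also supports wildcards. It assumes the platform device advertises a modalias of the form "platform:<device name>", with the device name qcom_llcc_edac as seen in the dmesg output above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Build the modalias string a platform device would advertise. */
void platform_modalias(const char *dev_name, char *buf, size_t len)
{
	int n = snprintf(buf, len, "platform:%s", dev_name);

	/* The sketch treats truncation as a programming error. */
	assert(n >= 0 && (size_t)n < len);
}

/* Simplified autoload check: does any alias of the module equal the modalias? */
bool module_matches(const char *const *aliases, size_t num_aliases,
		    const char *modalias)
{
	for (size_t i = 0; i < num_aliases; i++) {
		if (strcmp(aliases[i], modalias) == 0)
			return true;
	}
	return false;
}
```

In this model a module with an empty alias list never matches "platform:qcom_llcc_edac", so it has to be loaded by hand with modprobe; once the alias from the diff is present, the lookup succeeds.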
Thanks,
Mani
> Thanks,
> Andrew
>
--
மணிவண்ணன் சதாசிவம்
On 12/12/2022 13:32, Manivannan Sadhasivam wrote:
> Rishabh Bhatnagar has left Qualcomm, and there is no evidence of him
> maintaining with a new identity. So his entry needs to be removed.
>
> Also, Sai Prakash Ranjan's email address should be updated to use
> quicinc domain.
>
Acked-by: Krzysztof Kozlowski <[email protected]>
Best regards,
Krzysztof
On Tue, Dec 13, 2022 at 10:58:02AM +0530, Manivannan Sadhasivam wrote:
> On Mon, Dec 12, 2022 at 01:23:40PM -0600, Andrew Halaney wrote:
> > On Mon, Dec 12, 2022 at 06:02:58PM +0530, Manivannan Sadhasivam wrote:
> > > The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
> > > accessing the (Control and Status Regsiters) CSRs of each LLCC bank.
> > > This offset only works for some SoCs like SDM845 for which driver support
> > > was initially added.
> > >
> > > But the later SoCs use different register stride that vary between the
> > > banks with holes in-between. So it is not possible to use a single register
> > > stride for accessing the CSRs of each bank. By doing so could result in a
> > > crash with the current drivers. So far this crash is not reported since
> > > EDAC_QCOM driver is not enabled in ARM64 defconfig and no one tested the
> > > driver extensively by triggering the EDAC IRQ (that's where each bank
> > > CSRs are accessed).
> > >
> > > For fixing this issue, let's obtain the base address of each LLCC bank from
> > > devicetree and get rid of the fixed stride.
> > >
> > > This series affects multiple platforms but I have only tested this on
> > > SM8250 and SM8450. Testing on other platforms is welcomed.
> > >
> >
> > Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride
> >
>
> Thanks!
>
> > I took this for a quick spin on the qdrive3 I've got access to without
> > any issue:
> >
> > [root@localhost ~]# modprobe qcom_edac
> > [root@localhost ~]# dmesg | grep -i edac
> > [ 0.620723] EDAC MC: Ver: 3.0.0
> > [ 1.165417] ghes_edac: GHES probing device list is empty
> > [ 594.688103] EDAC DEVICE0: Giving out device to module qcom_llcc_edac controller llcc: DEV qcom_llcc_edac (INTERRUPT)
> > [root@localhost ~]# cat /proc/interrupts | grep ecc
> > 174: 0 0 0 0 0 0 0 0 GICv3 614 Level llcc_ecc
> > [root@localhost ~]#
> >
> > Potentially stupid question, but are users expected to manually load the
> > driver as I did? I don't see how it would be loaded automatically in the
> > current state, but thought it was funny that I needed to modprobe
> > myself.
> >
> > Please let me know if you want me to do any more further testing!
> >
>
> Well, I always ended up using the driver as a built-in. I do make it module for
> build test but never really used it as a module, so didn't catch this issue.
>
> This is due to the module alias not exported by the qcom_edac driver. Below
> diff allows kernel to autoload it:
>
> diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
> index f7afb5375293..13919d01c22d 100644
> --- a/drivers/edac/qcom_edac.c
> +++ b/drivers/edac/qcom_edac.c
> @@ -419,3 +419,4 @@ module_platform_driver(qcom_llcc_edac_driver);
>
> MODULE_DESCRIPTION("QCOM EDAC driver");
> MODULE_LICENSE("GPL v2");
> +MODULE_ALIAS("platform:qcom_llcc_edac");
>
> Please test and let me know. I will add this as a new patch in next version.
>
Thanks Mani, that gets things working for me. For that patch:
Reviewed-by: Andrew Halaney <[email protected]>
Tested-by: Andrew Halaney <[email protected]>
My personal opinion, but that probably deserves a Fixes: tag too!
On 13/12/2022 06:28, Manivannan Sadhasivam wrote:
> On Mon, Dec 12, 2022 at 01:23:40PM -0600, Andrew Halaney wrote:
>> On Mon, Dec 12, 2022 at 06:02:58PM +0530, Manivannan Sadhasivam wrote:
>>> The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
>>> accessing the (Control and Status Regsiters) CSRs of each LLCC bank.
>>> This offset only works for some SoCs like SDM845 for which driver support
>>> was initially added.
>>>
>>> But the later SoCs use different register stride that vary between the
>>> banks with holes in-between. So it is not possible to use a single register
>>> stride for accessing the CSRs of each bank. By doing so could result in a
>>> crash with the current drivers. So far this crash is not reported since
>>> EDAC_QCOM driver is not enabled in ARM64 defconfig and no one tested the
>>> driver extensively by triggering the EDAC IRQ (that's where each bank
>>> CSRs are accessed).
>>>
>>> For fixing this issue, let's obtain the base address of each LLCC bank from
>>> devicetree and get rid of the fixed stride.
>>>
>>> This series affects multiple platforms but I have only tested this on
>>> SM8250 and SM8450. Testing on other platforms is welcomed.
>>>
>>
>> Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride
>>
>
> Thanks!
>
>> I took this for a quick spin on the qdrive3 I've got access to without
>> any issue:
>>
>> [root@localhost ~]# modprobe qcom_edac
>> [root@localhost ~]# dmesg | grep -i edac
>> [ 0.620723] EDAC MC: Ver: 3.0.0
>> [ 1.165417] ghes_edac: GHES probing device list is empty
>> [ 594.688103] EDAC DEVICE0: Giving out device to module qcom_llcc_edac controller llcc: DEV qcom_llcc_edac (INTERRUPT)
>> [root@localhost ~]# cat /proc/interrupts | grep ecc
>> 174: 0 0 0 0 0 0 0 0 GICv3 614 Level llcc_ecc
>> [root@localhost ~]#
>>
>> Potentially stupid question, but are users expected to manually load the
>> driver as I did? I don't see how it would be loaded automatically in the
>> current state, but thought it was funny that I needed to modprobe
>> myself.
>>
>> Please let me know if you want me to do any more further testing!
>>
>
> Well, I always ended up using the driver as a built-in. I do make it module for
> build test but never really used it as a module, so didn't catch this issue.
>
> This is due to the module alias not exported by the qcom_edac driver. Below
> diff allows kernel to autoload it:
>
> diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
> index f7afb5375293..13919d01c22d 100644
> --- a/drivers/edac/qcom_edac.c
> +++ b/drivers/edac/qcom_edac.c
> @@ -419,3 +419,4 @@ module_platform_driver(qcom_llcc_edac_driver);
>
> MODULE_DESCRIPTION("QCOM EDAC driver");
> MODULE_LICENSE("GPL v2");
> +MODULE_ALIAS("platform:qcom_llcc_edac");
While this is a way to fix it, instead of creating aliases for wrong
names, either a correct name should be used or the driver should receive an
ID table.
Best regards,
Krzysztof
On Tue, Dec 13, 2022 at 05:54:56PM +0100, Krzysztof Kozlowski wrote:
> On 13/12/2022 06:28, Manivannan Sadhasivam wrote:
> > On Mon, Dec 12, 2022 at 01:23:40PM -0600, Andrew Halaney wrote:
> >> On Mon, Dec 12, 2022 at 06:02:58PM +0530, Manivannan Sadhasivam wrote:
> >>> The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
> >>> accessing the (Control and Status Regsiters) CSRs of each LLCC bank.
> >>> This offset only works for some SoCs like SDM845 for which driver support
> >>> was initially added.
> >>>
> >>> But the later SoCs use different register stride that vary between the
> >>> banks with holes in-between. So it is not possible to use a single register
> >>> stride for accessing the CSRs of each bank. By doing so could result in a
> >>> crash with the current drivers. So far this crash is not reported since
> >>> EDAC_QCOM driver is not enabled in ARM64 defconfig and no one tested the
> >>> driver extensively by triggering the EDAC IRQ (that's where each bank
> >>> CSRs are accessed).
> >>>
> >>> For fixing this issue, let's obtain the base address of each LLCC bank from
> >>> devicetree and get rid of the fixed stride.
> >>>
> >>> This series affects multiple platforms but I have only tested this on
> >>> SM8250 and SM8450. Testing on other platforms is welcomed.
> >>>
> >>
> >> Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride
> >>
> >
> > Thanks!
> >
> >> I took this for a quick spin on the qdrive3 I've got access to without
> >> any issue:
> >>
> >> [root@localhost ~]# modprobe qcom_edac
> >> [root@localhost ~]# dmesg | grep -i edac
> >> [ 0.620723] EDAC MC: Ver: 3.0.0
> >> [ 1.165417] ghes_edac: GHES probing device list is empty
> >> [ 594.688103] EDAC DEVICE0: Giving out device to module qcom_llcc_edac controller llcc: DEV qcom_llcc_edac (INTERRUPT)
> >> [root@localhost ~]# cat /proc/interrupts | grep ecc
> >> 174: 0 0 0 0 0 0 0 0 GICv3 614 Level llcc_ecc
> >> [root@localhost ~]#
> >>
> >> Potentially stupid question, but are users expected to manually load the
> >> driver as I did? I don't see how it would be loaded automatically in the
> >> current state, but thought it was funny that I needed to modprobe
> >> myself.
> >>
> >> Please let me know if you want me to do any more further testing!
> >>
> >
> > Well, I always ended up using the driver as a built-in. I do make it module for
> > build test but never really used it as a module, so didn't catch this issue.
> >
> > This is due to the module alias not exported by the qcom_edac driver. Below
> > diff allows kernel to autoload it:
> >
> > diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
> > index f7afb5375293..13919d01c22d 100644
> > --- a/drivers/edac/qcom_edac.c
> > +++ b/drivers/edac/qcom_edac.c
> > @@ -419,3 +419,4 @@ module_platform_driver(qcom_llcc_edac_driver);
> >
> > MODULE_DESCRIPTION("QCOM EDAC driver");
> > MODULE_LICENSE("GPL v2");
> > +MODULE_ALIAS("platform:qcom_llcc_edac");
>
> While this is a way to fix it, but instead of creating aliases for wrong
> names, either a correct name should be used or driver should receive ID
> table.
>
I'm not sure how you'd fix it with a _correct_ name here. Also, the ID table is
overkill since there is only one driver that is making use of it. And
moreover, there is no definite ID to use.
Thanks,
Mani
> Best regards,
> Krzysztof
>
--
மணிவண்ணன் சதாசிவம்
On 13/12/2022 18:57, Manivannan Sadhasivam wrote:
> On Tue, Dec 13, 2022 at 05:54:56PM +0100, Krzysztof Kozlowski wrote:
>> On 13/12/2022 06:28, Manivannan Sadhasivam wrote:
>>> On Mon, Dec 12, 2022 at 01:23:40PM -0600, Andrew Halaney wrote:
>>>> On Mon, Dec 12, 2022 at 06:02:58PM +0530, Manivannan Sadhasivam wrote:
>>>>> The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
>>>>> accessing the (Control and Status Regsiters) CSRs of each LLCC bank.
>>>>> This offset only works for some SoCs like SDM845 for which driver support
>>>>> was initially added.
>>>>>
>>>>> But the later SoCs use different register stride that vary between the
>>>>> banks with holes in-between. So it is not possible to use a single register
>>>>> stride for accessing the CSRs of each bank. By doing so could result in a
>>>>> crash with the current drivers. So far this crash is not reported since
>>>>> EDAC_QCOM driver is not enabled in ARM64 defconfig and no one tested the
>>>>> driver extensively by triggering the EDAC IRQ (that's where each bank
>>>>> CSRs are accessed).
>>>>>
>>>>> For fixing this issue, let's obtain the base address of each LLCC bank from
>>>>> devicetree and get rid of the fixed stride.
>>>>>
>>>>> This series affects multiple platforms but I have only tested this on
>>>>> SM8250 and SM8450. Testing on other platforms is welcomed.
>>>>>
>>>>
>>>> Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride
>>>>
>>>
>>> Thanks!
>>>
>>>> I took this for a quick spin on the qdrive3 I've got access to without
>>>> any issue:
>>>>
>>>> [root@localhost ~]# modprobe qcom_edac
>>>> [root@localhost ~]# dmesg | grep -i edac
>>>> [ 0.620723] EDAC MC: Ver: 3.0.0
>>>> [ 1.165417] ghes_edac: GHES probing device list is empty
>>>> [ 594.688103] EDAC DEVICE0: Giving out device to module qcom_llcc_edac controller llcc: DEV qcom_llcc_edac (INTERRUPT)
>>>> [root@localhost ~]# cat /proc/interrupts | grep ecc
>>>> 174: 0 0 0 0 0 0 0 0 GICv3 614 Level llcc_ecc
>>>> [root@localhost ~]#
>>>>
>>>> Potentially stupid question, but are users expected to manually load the
>>>> driver as I did? I don't see how it would be loaded automatically in the
>>>> current state, but thought it was funny that I needed to modprobe
>>>> myself.
>>>>
>>>> Please let me know if you want me to do any more further testing!
>>>>
>>>
>>> Well, I always ended up using the driver as a built-in. I do make it module for
>>> build test but never really used it as a module, so didn't catch this issue.
>>>
>>> This is due to the module alias not exported by the qcom_edac driver. Below
>>> diff allows kernel to autoload it:
>>>
>>> diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
>>> index f7afb5375293..13919d01c22d 100644
>>> --- a/drivers/edac/qcom_edac.c
>>> +++ b/drivers/edac/qcom_edac.c
>>> @@ -419,3 +419,4 @@ module_platform_driver(qcom_llcc_edac_driver);
>>>
>>> MODULE_DESCRIPTION("QCOM EDAC driver");
>>> MODULE_LICENSE("GPL v2");
>>> +MODULE_ALIAS("platform:qcom_llcc_edac");
>>
>> While this is a way to fix it, but instead of creating aliases for wrong
>> names, either a correct name should be used or driver should receive ID
>> table.
>>
>
> I'm not sure how you'd fix it with a _correct_ name here.
Hm, I assumed that it would be enough if the driver name matched the device
name. Currently these two are not in sync. Maybe it's not enough when
built as a module?
> Also, the id table is
> an overkill since there is only one driver that is making use of it. And
> moreover, there is no definite ID to use.
Every driver that supports a single device usually has an ID table and it's
not a problem...
Best regards,
Krzysztof
On Tue, Dec 13, 2022 at 07:47:17PM +0100, Krzysztof Kozlowski wrote:
> On 13/12/2022 18:57, Manivannan Sadhasivam wrote:
> > On Tue, Dec 13, 2022 at 05:54:56PM +0100, Krzysztof Kozlowski wrote:
> >> On 13/12/2022 06:28, Manivannan Sadhasivam wrote:
> >>> On Mon, Dec 12, 2022 at 01:23:40PM -0600, Andrew Halaney wrote:
> >>>> On Mon, Dec 12, 2022 at 06:02:58PM +0530, Manivannan Sadhasivam wrote:
> >>>>> The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
> >>>>> accessing the (Control and Status Regsiters) CSRs of each LLCC bank.
> >>>>> This offset only works for some SoCs like SDM845 for which driver support
> >>>>> was initially added.
> >>>>>
> >>>>> But the later SoCs use different register stride that vary between the
> >>>>> banks with holes in-between. So it is not possible to use a single register
> >>>>> stride for accessing the CSRs of each bank. By doing so could result in a
> >>>>> crash with the current drivers. So far this crash is not reported since
> >>>>> EDAC_QCOM driver is not enabled in ARM64 defconfig and no one tested the
> >>>>> driver extensively by triggering the EDAC IRQ (that's where each bank
> >>>>> CSRs are accessed).
> >>>>>
> >>>>> For fixing this issue, let's obtain the base address of each LLCC bank from
> >>>>> devicetree and get rid of the fixed stride.
> >>>>>
> >>>>> This series affects multiple platforms but I have only tested this on
> >>>>> SM8250 and SM8450. Testing on other platforms is welcomed.
> >>>>>
> >>>>
> >>>> Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride
> >>>>
> >>>
> >>> Thanks!
> >>>
> >>>> I took this for a quick spin on the qdrive3 I've got access to without
> >>>> any issue:
> >>>>
> >>>> [root@localhost ~]# modprobe qcom_edac
> >>>> [root@localhost ~]# dmesg | grep -i edac
> >>>> [ 0.620723] EDAC MC: Ver: 3.0.0
> >>>> [ 1.165417] ghes_edac: GHES probing device list is empty
> >>>> [ 594.688103] EDAC DEVICE0: Giving out device to module qcom_llcc_edac controller llcc: DEV qcom_llcc_edac (INTERRUPT)
> >>>> [root@localhost ~]# cat /proc/interrupts | grep ecc
> >>>> 174: 0 0 0 0 0 0 0 0 GICv3 614 Level llcc_ecc
> >>>> [root@localhost ~]#
> >>>>
> >>>> Potentially stupid question, but are users expected to manually load the
> >>>> driver as I did? I don't see how it would be loaded automatically in the
> >>>> current state, but thought it was funny that I needed to modprobe
> >>>> myself.
> >>>>
> >>>> Please let me know if you want me to do any more further testing!
> >>>>
> >>>
> >>> Well, I always ended up using the driver as a built-in. I do make it module for
> >>> build test but never really used it as a module, so didn't catch this issue.
> >>>
> >>> This is due to the module alias not exported by the qcom_edac driver. Below
> >>> diff allows kernel to autoload it:
> >>>
> >>> diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
> >>> index f7afb5375293..13919d01c22d 100644
> >>> --- a/drivers/edac/qcom_edac.c
> >>> +++ b/drivers/edac/qcom_edac.c
> >>> @@ -419,3 +419,4 @@ module_platform_driver(qcom_llcc_edac_driver);
> >>>
> >>> MODULE_DESCRIPTION("QCOM EDAC driver");
> >>> MODULE_LICENSE("GPL v2");
> >>> +MODULE_ALIAS("platform:qcom_llcc_edac");
> >>
> >> While this is a way to fix it, but instead of creating aliases for wrong
> >> names, either a correct name should be used or driver should receive ID
> >> table.
> >>
> >
> > I'm not sure how you'd fix it with a _correct_ name here.
>
> Hm, I assumed that it would be enough if driver name would match device
> name. Currently these two are not in sync. Maybe it's not enough when
> built as module?
>
Right, for a module it is not enough and that's why we need id_table/alias.
> > Also, the id table is
> > an overkill since there is only one driver that is making use of it. And
> > moreover, there is no definite ID to use.
>
> Every driver with a single device support has usually ID table and it's
> not a problem...
>
Are you referring to the OF/ACPI ID table? Or something else?
Thanks,
Mani
> Best regards,
> Krzysztof
>
--
மணிவண்ணன் சதாசிவம்
On 19/12/2022 14:50, Manivannan Sadhasivam wrote:
>
>>> Also, the id table is
>>> an overkill since there is only one driver that is making use of it. And
>>> moreover, there is no definite ID to use.
>>
>> Every driver with a single device support has usually ID table and it's
>> not a problem...
>>
>
> Are you referring to OF/ACPI ID table? Or something else?
No, I refer to the driver ID table (I2C, platform, whatever the driver is).
Best regards,
Krzysztof
On 19/12/2022 15:16, Manivannan Sadhasivam wrote:
> On Mon, Dec 19, 2022 at 03:11:36PM +0100, Krzysztof Kozlowski wrote:
>> On 19/12/2022 14:50, Manivannan Sadhasivam wrote:
>>>
>>>>> Also, the id table is
>>>>> an overkill since there is only one driver that is making use of it. And
>>>>> moreover, there is no definite ID to use.
>>>>
>>>> Every driver with a single device support has usually ID table and it's
>>>> not a problem...
>>>>
>>>
>>> Are you referring to OF/ACPI ID table? Or something else?
>>
>> No, I refer to the driver ID table (I2C, platform whatever the driver is).
>>
>
> Yeah, that's what I wanted to avoid here. The ID table makes sense if you have
> a bus like I2C or a separate subsystem but here LLCC is an individual driver.
> So creating a separate ID table is an overkill IMO.
Why is this overkill? It is just a few lines and many, many drivers have it,
even duplicated (for legacy reasons) with OF tables.
ALIAS is not the way to go around the ID table because essentially you are
re-implementing it.
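The "re-implementing" point can be sketched in user-space C. This is a mock, not kernel code: the struct, table, and function names are hypothetical, and it assumes (as far as I understand the module tooling) that a platform ID table exported with MODULE_DEVICE_TABLE(platform, ...) ends up as one "platform:<name>" alias per entry.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space stand-in for a platform ID table entry. */
struct platform_id_sketch {
	char name[20];
};

/* Hypothetical table for the EDAC driver; an empty name ends the list. */
const struct platform_id_sketch qcom_llcc_edac_ids[] = {
	{ "qcom_llcc_edac" },
	{ "" },
};

/* Derive one "platform:<name>" alias per table entry, up to max_out. */
size_t aliases_from_id_table(const struct platform_id_sketch *ids,
			     char out[][64], size_t max_out)
{
	size_t n = 0;

	while (ids[n].name[0] != '\0' && n < max_out) {
		int len = snprintf(out[n], 64, "platform:%s", ids[n].name);

		/* The sketch treats truncation as a programming error. */
		assert(len >= 0 && len < 64);
		n++;
	}
	return n;
}
```

Under that assumption the single-entry table produces exactly the string that the proposed MODULE_ALIAS line spells out by hand, which is why the two approaches overlap.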
Best regards,
Krzysztof
On Mon, Dec 19, 2022 at 03:11:36PM +0100, Krzysztof Kozlowski wrote:
> On 19/12/2022 14:50, Manivannan Sadhasivam wrote:
> >
> >>> Also, the id table is
> >>> an overkill since there is only one driver that is making use of it. And
> >>> moreover, there is no definite ID to use.
> >>
> >> Every driver with a single device support has usually ID table and it's
> >> not a problem...
> >>
> >
> > Are you referring to OF/ACPI ID table? Or something else?
>
> No, I refer to the driver ID table (I2C, platform whatever the driver is).
>
Yeah, that's what I wanted to avoid here. The ID table makes sense if you have
a bus like I2C or a separate subsystem, but here LLCC is an individual driver.
So creating a separate ID table is overkill IMO.
Thanks,
Mani
> Best regards,
> Krzysztof
>
--
மணிவண்ணன் சதாசிவம்
On Mon, 19 Dec 2022 at 16:17, Manivannan Sadhasivam
<[email protected]> wrote:
>
> On Mon, Dec 19, 2022 at 03:11:36PM +0100, Krzysztof Kozlowski wrote:
> > On 19/12/2022 14:50, Manivannan Sadhasivam wrote:
> > >
> > >>> Also, the id table is
> > >>> an overkill since there is only one driver that is making use of it. And
> > >>> moreover, there is no definite ID to use.
> > >>
> > >> Every driver with a single device support has usually ID table and it's
> > >> not a problem...
> > >>
> > >
> > > Are you referring to OF/ACPI ID table? Or something else?
> >
> > No, I refer to the driver ID table (I2C, platform whatever the driver is).
> >
>
> Yeah, that's what I wanted to avoid here. The ID table makes sense if you have
> a bus like I2C or a separate subsystem but here LLCC is an individual driver.
> So creating a separate ID table is an overkill IMO.
Well, struct platform_device_id is used quite a lot together with the
MODULE_DEVICE_TABLE(platform, _ids);
On the other hand:
$ git grep MODULE_ALIAS.*platform: | wc -l
1308
$ git grep MODULE_DEVICE_TABLE.*platform | wc -l
236
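
For reference, the two approaches end up at the same "platform:<name>" module
alias; the ID table just keeps the name in one place and also drives matching.
Below is a minimal userspace sketch of that idea, not kernel code: the struct
is a stand-in for the kernel's struct platform_device_id, the device name
"qcom_llcc_edac" is assumed from the dmesg output in this thread, and
match_id(), build_alias() and the table name are hypothetical helpers made up
for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct platform_device_id. */
struct platform_device_id {
	char name[20];
	unsigned long driver_data;
};

/*
 * Hypothetical ID table. The name is an assumption taken from the dmesg
 * output quoted in this thread. An empty entry terminates the table.
 */
const struct platform_device_id qcom_llcc_edac_id_table[] = {
	{ .name = "qcom_llcc_edac" },
	{ .name = "" } /* sentinel */
};

/*
 * Name-based matching in the spirit of what the platform bus does with an
 * ID table: walk the entries until the sentinel and compare names.
 */
const struct platform_device_id *
match_id(const struct platform_device_id *id, const char *dev_name)
{
	for (; id->name[0]; id++) {
		if (strcmp(dev_name, id->name) == 0)
			return id;
	}
	return NULL;
}

/*
 * Build the module alias string that corresponds to one table entry.
 * A hand-written MODULE_ALIAS("platform:<name>") spells out the same string.
 * Returns the number of characters snprintf() would have written.
 */
int build_alias(const struct platform_device_id *id, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "platform:%s", id->name);
}
```

Calling match_id(qcom_llcc_edac_id_table, "qcom_llcc_edac") returns the first
entry, and build_alias() on that entry yields the alias that userspace would
use to autoload the module. In a real driver the table would be wired up with
MODULE_DEVICE_TABLE(platform, ...) and the .id_table field of the
platform_driver.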
--
With best wishes
Dmitry
On Mon, Dec 19, 2022 at 06:49:39PM +0200, Dmitry Baryshkov wrote:
> On Mon, 19 Dec 2022 at 16:17, Manivannan Sadhasivam
> <[email protected]> wrote:
> >
> > On Mon, Dec 19, 2022 at 03:11:36PM +0100, Krzysztof Kozlowski wrote:
> > > On 19/12/2022 14:50, Manivannan Sadhasivam wrote:
> > > >
> > > >>> Also, the id table is
> > > >>> an overkill since there is only one driver that is making use of it. And
> > > >>> moreover, there is no definite ID to use.
> > > >>
> > > >> Every driver with a single device support has usually ID table and it's
> > > >> not a problem...
> > > >>
> > > >
> > > > Are you referring to OF/ACPI ID table? Or something else?
> > >
> > > No, I refer to the driver ID table (I2C, platform whatever the driver is).
> > >
> >
> > Yeah, that's what I wanted to avoid here. The ID table makes sense if you have
> > a bus like I2C or a separate subsystem but here LLCC is an individual driver.
> > So creating a separate ID table is an overkill IMO.
>
> Well, struct platform_device_id is used quite a lot together with the
> MODULE_DEVICE_TABLE(platform, _ids);
>
> On the other hand:
>
> $ git grep MODULE_ALIAS.*platform: | wc -l
> 1308
> $ git grep MODULE_DEVICE_TABLE.*platform | wc -l
> 236
>
Hmm. I think I will just go with platform_device_id in the next version.
Thanks,
Mani
> --
> With best wishes
> Dmitry
--
மணிவண்ணன் சதாசிவம்
Hi Andrew,
On Mon, Dec 12, 2022 at 01:23:40PM -0600, Andrew Halaney wrote:
> On Mon, Dec 12, 2022 at 06:02:58PM +0530, Manivannan Sadhasivam wrote:
> > The Qualcomm LLCC/EDAC drivers were using a fixed register stride for
> > accessing the Control and Status Registers (CSRs) of each LLCC bank.
> > This offset only works for some SoCs like SDM845 for which driver support
> > was initially added.
> >
> > But later SoCs use register strides that vary between the
> > banks, with holes in between. So it is not possible to use a single register
> > stride for accessing the CSRs of each bank. Doing so could result in a
> > crash with the current drivers. So far this crash has not been reported since
> > EDAC_QCOM driver is not enabled in ARM64 defconfig and no one tested the
> > driver extensively by triggering the EDAC IRQ (that's where each bank
> > CSRs are accessed).
> >
> > For fixing this issue, let's obtain the base address of each LLCC bank from
> > devicetree and get rid of the fixed stride.
> >
> > This series affects multiple platforms but I have only tested this on
> > SM8250 and SM8450. Testing on other platforms is welcomed.
> >
>
> Tested-by: Andrew Halaney <[email protected]> # sa8540p-ride
>
I dropped your Tested-by tag in v3 since some of the patch content has
changed. Please test v3 and share your feedback.
Thanks,
Mani
> I took this for a quick spin on the qdrive3 I've got access to without
> any issue:
>
> [root@localhost ~]# modprobe qcom_edac
> [root@localhost ~]# dmesg | grep -i edac
> [ 0.620723] EDAC MC: Ver: 3.0.0
> [ 1.165417] ghes_edac: GHES probing device list is empty
> [ 594.688103] EDAC DEVICE0: Giving out device to module qcom_llcc_edac controller llcc: DEV qcom_llcc_edac (INTERRUPT)
> [root@localhost ~]# cat /proc/interrupts | grep ecc
> 174: 0 0 0 0 0 0 0 0 GICv3 614 Level llcc_ecc
> [root@localhost ~]#
>
> Potentially stupid question, but are users expected to manually load the
> driver as I did? I don't see how it would be loaded automatically in the
> current state, but thought it was funny that I needed to modprobe
> myself.
>
> Please let me know if you want me to do any more further testing!
>
> Thanks,
> Andrew
>
--
மணிவண்ணன் சதாசிவம்