2023-11-18 13:47:05

by Liming Sun

Subject: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC

This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
intermittent eMMC timeout issue reported on some cards under eMMC
stress test.

Reported error message:
dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110

Signed-off-by: Liming Sun <[email protected]>
---
drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
index 3a3bae6948a8..3c8fe8aec558 100644
--- a/drivers/mmc/host/sdhci-of-dwcmshc.c
+++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
@@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data sdhci_dwcmshc_pdata = {
 #ifdef CONFIG_ACPI
 static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
 	.ops = &sdhci_dwcmshc_ops,
-	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
+	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
+		  SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
 	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
 		   SDHCI_QUIRK2_ACMD23_BROKEN,
 };
--
2.30.1


2023-11-20 06:49:39

by Adrian Hunter

Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC

On 18/11/23 15:46, Liming Sun wrote:
> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> intermittent eMMC timeout issue reported on some cards under eMMC
> stress test.
>
> Reported error message:
> dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110

Were you able to determine the root cause? For example,
is the host controller timeout correct, is the eMMC
providing correct timeout values, is the mmc subsystem
calculating a correct value, is sdhci programming a correct
value?

If there are problems outside the host controller then we
need to address them also.

>
> Signed-off-by: Liming Sun <[email protected]>

Fixes tag?

> ---
> drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
> index 3a3bae6948a8..3c8fe8aec558 100644
> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data sdhci_dwcmshc_pdata = {
> #ifdef CONFIG_ACPI
> static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
> .ops = &sdhci_dwcmshc_ops,
> - .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> + .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> + SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
> .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> SDHCI_QUIRK2_ACMD23_BROKEN,
> };

2023-11-20 15:19:16

by Liming Sun

Subject: RE: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC



> -----Original Message-----
> From: Adrian Hunter <[email protected]>
> Sent: Monday, November 20, 2023 1:49 AM
> To: Liming Sun <[email protected]>; Ulf Hansson <[email protected]>;
> David Thompson <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
> BlueField-3 SoC
>
> On 18/11/23 15:46, Liming Sun wrote:
> > This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> > intermittent eMMC timeout issue reported on some cards under eMMC
> > stress test.
> >
> > Reported error message:
> > dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>
> Were you able to determine the root cause? For example,
> is the host controller timeout correct, is the eMMC
> providing correct timeout values, is the mmc subsystem
> calculating a correct value, is sdhci programming a correct
> value?
>
> If there are problems outside the host controller then we
> need to address them also.

It is caused by the host controller timeout, but it is hard to tell whether the
configuration provided by the card is good enough, since the failure is
intermittent under stress test and the SoC needs to work with eMMC parts from
different vendors. The UEFI eMMC driver uses a similar max timeout (0xe) to
avoid this issue. This commit tries to use the existing quirk; I think it would
also work if there is another way to adjust the TOUT_CNT register. Any concerns
or suggestions?
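
For reference, a minimal userspace sketch of what the TOUT_CNT value means.
It assumes the standard SDHCI encoding, where the data timeout is the timeout
clock (TMCLK) period multiplied by 2^(13 + count) and 0xE is the largest valid
count. The helper names and the 1 MHz timeout clock used in the note below are
illustrative assumptions, not values taken from the BlueField-3 hardware or
from the driver.

```c
#include <stdint.h>

/* Largest valid Data Timeout Counter Value in the SDHCI encoding. */
#define TOUT_CNT_MAX 0xE

/*
 * Hypothetical helper: timeout in microseconds for a given counter value,
 * with the timeout clock given in kHz. Returns 0 for invalid input.
 */
static uint64_t tout_cnt_to_us(unsigned int count, uint32_t tmclk_khz)
{
	if (count > TOUT_CNT_MAX || tmclk_khz == 0)
		return 0;
	/* 2^(13 + count) clock cycles; cycles * 1000 / kHz gives microseconds. */
	return (((uint64_t)1 << (13 + count)) * 1000) / tmclk_khz;
}

/*
 * Hypothetical helper: smallest counter value whose timeout covers
 * target_us, clamped to TOUT_CNT_MAX when the target cannot be reached.
 */
static unsigned int tout_cnt_for_us(uint64_t target_us, uint32_t tmclk_khz)
{
	unsigned int count;

	for (count = 0; count < TOUT_CNT_MAX; count++)
		if (tout_cnt_to_us(count, tmclk_khz) >= target_us)
			break;
	return count;
}
```

With an assumed 1 MHz timeout clock, count 0x0 gives about 8 ms and count 0xE
gives about 134 s. As far as I understand the sdhci core, setting
SDHCI_QUIRK_BROKEN_TIMEOUT_VAL makes it skip this kind of calculation and
always program the maximum count.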

>
> >
> > Signed-off-by: Liming Sun <[email protected]>
>
> Fixes tag?

Will update it in v2.

>
> > ---
> > drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
> b/drivers/mmc/host/sdhci-of-dwcmshc.c
> > index 3a3bae6948a8..3c8fe8aec558 100644
> > --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> > +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> > @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
> sdhci_dwcmshc_pdata = {
> > #ifdef CONFIG_ACPI
> > static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
> > .ops = &sdhci_dwcmshc_ops,
> > - .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> > + .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> > + SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
> > .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> > SDHCI_QUIRK2_ACMD23_BROKEN,
> > };

2023-11-21 08:09:47

by Adrian Hunter

Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC

On 20/11/23 17:18, Liming Sun wrote:
>
>
>> -----Original Message-----
>> From: Adrian Hunter <[email protected]>
>> Sent: Monday, November 20, 2023 1:49 AM
>> To: Liming Sun <[email protected]>; Ulf Hansson <[email protected]>;
>> David Thompson <[email protected]>
>> Cc: [email protected]; [email protected]
>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
>> BlueField-3 SoC
>>
>> On 18/11/23 15:46, Liming Sun wrote:
>>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
>>> intermittent eMMC timeout issue reported on some cards under eMMC
>>> stress test.
>>>
>>> Reported error message:
>>> dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>>
>> Were you able to determine the root cause? For example,
>> is the host controller timeout correct, is the eMMC
>> providing correct timeout values, is the mmc subsystem
>> calculating a correct value, is sdhci programming a correct
>> value?
>>
>> If there are problems outside the host controller then we
>> need to address them also.
>
> It is caused by the host controller timeout, but is hard to tell whether the
> configuration provided by the card is good enough since it's
> intermittent under stress test the SoC needs to work with different eMMC vendors.
> In UEFI eMMC driver similar max timeout (0xe) is used to avoid such
> issue. This commit tries to use existing quirk, which I think that it would work
> if there is another way to adjust the TOUT_CNT register. Any concern or suggestions?

If cards are providing timeout values that are too low under stress,
it would be better to fix it in the mmc subsystem so that all host
controllers can benefit.

>
>>
>>>
>>> Signed-off-by: Liming Sun <[email protected]>
>>
>> Fixes tag?
>
> Will update it in v2.
>
>>
>>> ---
>>> drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
>> b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> index 3a3bae6948a8..3c8fe8aec558 100644
>>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
>> sdhci_dwcmshc_pdata = {
>>> #ifdef CONFIG_ACPI
>>> static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>>> .ops = &sdhci_dwcmshc_ops,
>>> - .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
>>> + .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
>>> + SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>>> .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>>> SDHCI_QUIRK2_ACMD23_BROKEN,
>>> };
>

2023-11-27 13:36:30

by Christian Loehle

Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC

On 18/11/2023 13:46, Liming Sun wrote:
> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> intermittent eMMC timeout issue reported on some cards under eMMC
> stress test.
>
> Reported error message:
> dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>
> Signed-off-by: Liming Sun <[email protected]>
> ---
> drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c b/drivers/mmc/host/sdhci-of-dwcmshc.c
> index 3a3bae6948a8..3c8fe8aec558 100644
> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data sdhci_dwcmshc_pdata = {
> #ifdef CONFIG_ACPI
> static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
> .ops = &sdhci_dwcmshc_ops,
> - .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> + .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> + SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
> .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> SDHCI_QUIRK2_ACMD23_BROKEN,
> };

__mmc_blk_ioctl_cmd: data error?
What stress test are you running that issues ioctl commands?
On which commands does the timeout occur?
Anyway, you should be able to increase the timeout in the ioctl structure
directly, i.e. in userspace, or does that not work?

2023-11-30 13:19:48

by Liming Sun

Subject: RE: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC



> -----Original Message-----
> From: Christian Loehle <[email protected]>
> Sent: Monday, November 27, 2023 8:36 AM
> To: Liming Sun <[email protected]>; Adrian Hunter
> <[email protected]>; Ulf Hansson <[email protected]>; David
> Thompson <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
> BlueField-3 SoC
>
> On 18/11/2023 13:46, Liming Sun wrote:
> > This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> > intermittent eMMC timeout issue reported on some cards under eMMC
> > stress test.
> >
> > Reported error message:
> > dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
> >
> > Signed-off-by: Liming Sun <[email protected]>
> > ---
> > drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
> b/drivers/mmc/host/sdhci-of-dwcmshc.c
> > index 3a3bae6948a8..3c8fe8aec558 100644
> > --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> > +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> > @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
> sdhci_dwcmshc_pdata = {
> > #ifdef CONFIG_ACPI
> > static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
> > .ops = &sdhci_dwcmshc_ops,
> > - .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> > + .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> > + SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
> > .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> > SDHCI_QUIRK2_ACMD23_BROKEN,
> > };
>
> __mmc_blk_ioctl_cmd: data error ?
> What stresstest are you running that issues ioctl commands?
> On which commands does the timeout occur?
> Anyway you should be able to increase the timeout in ioctl structure
> directly, i.e. in userspace, or does that not work?

It's running a stress test with a tool like "fio --name=randrw_stress_round_1 --ioengine=libaio --direct=1 --time_based=1 --end_fsync=1 --ramp_time=5 --norandommap=1 --randrepeat=0 --group_reporting=1 --numjobs=4 --iodepth=128 --rw=randrw --overwrite=1 --runtime=36000 --bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K/3:76K/3 --filename=/dev/mmcblk0"
The tool (application) is either the user's own or some standard tool.

2023-12-11 11:39:12

by Adrian Hunter

Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC

On 30/11/23 15:19, Liming Sun wrote:
>
>
>> -----Original Message-----
>> From: Christian Loehle <[email protected]>
>> Sent: Monday, November 27, 2023 8:36 AM
>> To: Liming Sun <[email protected]>; Adrian Hunter
>> <[email protected]>; Ulf Hansson <[email protected]>; David
>> Thompson <[email protected]>
>> Cc: [email protected]; [email protected]
>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
>> BlueField-3 SoC
>>
>> On 18/11/2023 13:46, Liming Sun wrote:
>>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
>>> intermittent eMMC timeout issue reported on some cards under eMMC
>>> stress test.
>>>
>>> Reported error message:
>>> dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>>>
>>> Signed-off-by: Liming Sun <[email protected]>
>>> ---
>>> drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
>> b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> index 3a3bae6948a8..3c8fe8aec558 100644
>>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
>> sdhci_dwcmshc_pdata = {
>>> #ifdef CONFIG_ACPI
>>> static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>>> .ops = &sdhci_dwcmshc_ops,
>>> - .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
>>> + .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
>>> + SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>>> .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>>> SDHCI_QUIRK2_ACMD23_BROKEN,
>>> };
>>
>> __mmc_blk_ioctl_cmd: data error ?
>> What stresstest are you running that issues ioctl commands?
>> On which commands does the timeout occur?
>> Anyway you should be able to increase the timeout in ioctl structure
>> directly, i.e. in userspace, or does that not work?
>
> It's running stress test with tool like "fio --name=randrw_stress_round_1 --ioengine=libaio --direct=1 --time_based=1 --end_fsync=1 --ramp_time=5 --norandommap=1 --randrepeat=0 --group_reporting=1 --numjobs=4 --iodepth=128 --rw=randrw --overwrite=1 --runtime=36000 --bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K/3:76K/3 --filename=/dev/mmcblk0"
> The tool(application) is owned by user or with some standard tool.

fio does not send mmc ioctls, so I am also a bit confused about
how you get "__mmc_blk_ioctl_cmd: data error -110" ?

2023-12-19 21:18:48

by Liming Sun

Subject: RE: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC



> -----Original Message-----
> From: Adrian Hunter <[email protected]>
> Sent: Monday, December 11, 2023 6:39 AM
> To: Liming Sun <[email protected]>; Christian Loehle
> <[email protected]>; Ulf Hansson <[email protected]>; David
> Thompson <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
> BlueField-3 SoC
>
> On 30/11/23 15:19, Liming Sun wrote:
> >
> >
> >> -----Original Message-----
> >> From: Christian Loehle <[email protected]>
> >> Sent: Monday, November 27, 2023 8:36 AM
> >> To: Liming Sun <[email protected]>; Adrian Hunter
> >> <[email protected]>; Ulf Hansson <[email protected]>; David
> >> Thompson <[email protected]>
> >> Cc: [email protected]; [email protected]
> >> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk
> for
> >> BlueField-3 SoC
> >>
> >> On 18/11/2023 13:46, Liming Sun wrote:
> >>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
> >>> intermittent eMMC timeout issue reported on some cards under eMMC
> >>> stress test.
> >>>
> >>> Reported error message:
> >>> dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
> >>>
> >>> Signed-off-by: Liming Sun <[email protected]>
> >>> ---
> >>> drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
> >>> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
> >> b/drivers/mmc/host/sdhci-of-dwcmshc.c
> >>> index 3a3bae6948a8..3c8fe8aec558 100644
> >>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
> >>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
> >>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
> >> sdhci_dwcmshc_pdata = {
> >>> #ifdef CONFIG_ACPI
> >>> static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
> >>> .ops = &sdhci_dwcmshc_ops,
> >>> - .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
> >>> + .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
> >>> + SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
> >>> .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> >>> SDHCI_QUIRK2_ACMD23_BROKEN,
> >>> };
> >>
> >> __mmc_blk_ioctl_cmd: data error ?
> >> What stresstest are you running that issues ioctl commands?
> >> On which commands does the timeout occur?
> >> Anyway you should be able to increase the timeout in ioctl structure
> >> directly, i.e. in userspace, or does that not work?
> >
> > It's running stress test with tool like "fio --name=randrw_stress_round_1 --
> ioengine=libaio --direct=1 --time_based=1 --end_fsync=1 --ramp_time=5 --
> norandommap=1 --randrepeat=0 --group_reporting=1 --numjobs=4 --
> iodepth=128 --rw=randrw --overwrite=1 --runtime=36000 --
> bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K
> /3:76K/3 --filename=/dev/mmcblk0"
> > The tool(application) is owned by user or with some standard tool.
>
> fio does not send mmc ioctls, so I am also a bit confused about
> how you get "__mmc_blk_ioctl_cmd: data error -110" ?

There are other activities or background tasks going on. I assume other MMC
accesses are affected by the fio stress load and time out. Does that make sense?

2024-01-04 09:25:18

by Adrian Hunter

Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for BlueField-3 SoC

On 19/12/23 23:18, Liming Sun wrote:
>
>
>> -----Original Message-----
>> From: Adrian Hunter <[email protected]>
>> Sent: Monday, December 11, 2023 6:39 AM
>> To: Liming Sun <[email protected]>; Christian Loehle
>> <[email protected]>; Ulf Hansson <[email protected]>; David
>> Thompson <[email protected]>
>> Cc: [email protected]; [email protected]
>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk for
>> BlueField-3 SoC
>>
>> On 30/11/23 15:19, Liming Sun wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Christian Loehle <[email protected]>
>>>> Sent: Monday, November 27, 2023 8:36 AM
>>>> To: Liming Sun <[email protected]>; Adrian Hunter
>>>> <[email protected]>; Ulf Hansson <[email protected]>; David
>>>> Thompson <[email protected]>
>>>> Cc: [email protected]; [email protected]
>>>> Subject: Re: [PATCH v1 1/1] mmc: sdhci-of-dwcmshc: Enable timeout quirk
>> for
>>>> BlueField-3 SoC
>>>>
>>>> On 18/11/2023 13:46, Liming Sun wrote:
>>>>> This commit enables SDHCI_QUIRK_BROKEN_TIMEOUT_VAL to solve the
>>>>> intermittent eMMC timeout issue reported on some cards under eMMC
>>>>> stress test.
>>>>>
>>>>> Reported error message:
>>>>> dwcmshc MLNXBF30:00: __mmc_blk_ioctl_cmd: data error -110
>>>>>
>>>>> Signed-off-by: Liming Sun <[email protected]>
>>>>> ---
>>>>> drivers/mmc/host/sdhci-of-dwcmshc.c | 3 ++-
>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>> b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> index 3a3bae6948a8..3c8fe8aec558 100644
>>>>> --- a/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> +++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
>>>>> @@ -365,7 +365,8 @@ static const struct sdhci_pltfm_data
>>>> sdhci_dwcmshc_pdata = {
>>>>> #ifdef CONFIG_ACPI
>>>>> static const struct sdhci_pltfm_data sdhci_dwcmshc_bf3_pdata = {
>>>>> .ops = &sdhci_dwcmshc_ops,
>>>>> - .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
>>>>> + .quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN |
>>>>> + SDHCI_QUIRK_BROKEN_TIMEOUT_VAL,
>>>>> .quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
>>>>> SDHCI_QUIRK2_ACMD23_BROKEN,
>>>>> };
>>>>
>>>> __mmc_blk_ioctl_cmd: data error ?
>>>> What stresstest are you running that issues ioctl commands?
>>>> On which commands does the timeout occur?
>>>> Anyway you should be able to increase the timeout in ioctl structure
>>>> directly, i.e. in userspace, or does that not work?
>>>
>>> It's running stress test with tool like "fio --name=randrw_stress_round_1 --
>> ioengine=libaio --direct=1 --time_based=1 --end_fsync=1 --ramp_time=5 --
>> norandommap=1 --randrepeat=0 --group_reporting=1 --numjobs=4 --
>> iodepth=128 --rw=randrw --overwrite=1 --runtime=36000 --
>> bssplit=4K/44:8K/1:12K/1:16K/1:24K/1:28K/1:32K/1:40K/32:64K/5:68K/7:72K
>> /3:76K/3 --filename=/dev/mmcblk0"
>>> The tool(application) is owned by user or with some standard tool.
>>
>> fio does not send mmc ioctls, so I am also a bit confused about
>> how you get "__mmc_blk_ioctl_cmd: data error -110" ?
>
> There are other activities or background task going on. I assume it's other
> MMC access which are affected by the stress FIO and got timeout. Would it make sense?
>

It depends on whether the IOCTL is overriding the timeout. In
struct mmc_ioc_cmd there is data_timeout_ns which overrides the
mmc core data timeout calculated by mmc_set_data_timeout(). There
is also cmd_timeout_ms for commands. You need to check whether
"__mmc_blk_ioctl_cmd: data error -110" is because data_timeout_ns
was set too low (but non-zero) by the caller of the IOCTL.
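
A minimal userspace sketch of the check described above, assuming the uapi
header <linux/mmc/ioctl.h> is available. The two helper names are hypothetical
and exist only for illustration; the fields data_timeout_ns and cmd_timeout_ms
are the ones named above. Both fields are unsigned int in the uapi struct, so a
data timeout given in nanoseconds cannot exceed roughly 4.29 seconds, and the
sketch rejects larger requests instead of letting them wrap around.

```c
#include <limits.h>
#include <string.h>
#include <linux/mmc/ioctl.h>

/*
 * Hypothetical helper: a non-zero data_timeout_ns means the ioctl caller
 * overrides the data timeout that the mmc core would otherwise calculate.
 */
static int ioc_overrides_data_timeout(const struct mmc_ioc_cmd *ic)
{
	return ic->data_timeout_ns != 0;
}

/*
 * Hypothetical helper: set both timeouts from millisecond values.
 * Returns -1 and leaves *ic untouched if the data timeout does not fit
 * into the 32-bit nanosecond field.
 */
static int ioc_set_timeouts(struct mmc_ioc_cmd *ic,
			    unsigned int data_ms, unsigned int cmd_ms)
{
	unsigned long long ns = (unsigned long long)data_ms * 1000000ULL;

	if (ns > UINT_MAX)
		return -1;
	ic->data_timeout_ns = (unsigned int)ns;
	ic->cmd_timeout_ms = cmd_ms;
	return 0;
}
```

A caller would fill in the remaining fields of struct mmc_ioc_cmd (opcode,
arg, flags and so on) and pass it to ioctl() with MMC_IOC_CMD on the block
device. Leaving data_timeout_ns at zero keeps the timeout calculated by the
mmc core.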