Subject: [PATCH] mmc: rpmb: do not force a retune before RPMB switch

Requesting a retune before switching to the RPMB partition has been
observed to cause CRC errors on the RPMB reads (-EILSEQ).

Since RPMB reads cannot be retried, clients are directly affected by
these errors.

This commit disables the retune request prior to switching to the RPMB
partition: mmc_retune_pause() no longer triggers a retune before the
pause period begins.

This was verified with the sdhci-of-arasan driver (ZynqMP) configured
for HS200 using two separate eMMC cards (DG4064 and 064GB2). In both
cases, the error was easy to reproduce, triggering every few tens of
reads.

Signed-off-by: Jorge Ramirez-Ortiz <[email protected]>
Acked-by: Avri Altman <[email protected]>
---
v2:
mmc_retune_pause() no longer can trigger a retune.
Keeping Avri Altman Acked-by since they are functionally equivalent.
v1:
modify mmc_retune_pause to optionally trigger a retune.

drivers/mmc/core/host.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 096093f7be00..ed44920e92df 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -119,13 +119,12 @@ void mmc_retune_enable(struct mmc_host *host)

 /*
  * Pause re-tuning for a small set of operations. The pause begins after the
- * next command and after first doing re-tuning.
+ * next command.
  */
 void mmc_retune_pause(struct mmc_host *host)
 {
 	if (!host->retune_paused) {
 		host->retune_paused = 1;
-		mmc_retune_needed(host);
 		mmc_retune_hold(host);
 	}
 }
--
2.34.1


2024-01-03 08:08:30

by Adrian Hunter

Subject: Re: [PATCH] mmc: rpmb: do not force a retune before RPMB switch

On 11/12/23 18:55, Jorge Ramirez-Ortiz wrote:
> Requesting a retune before switching to the RPMB partition has been
> observed to cause CRC errors on the RPMB reads (-EILSEQ).
>
> Since RPMB reads cannot be retried, clients are directly affected by
> these errors.
>
> This commit disables the retune request prior to switching to the RPMB
> partition: mmc_retune_pause() no longer triggers a retune before the
> pause period begins.
>
> This was verified with the sdhci-of-arasan driver (ZynqMP) configured
> for HS200 using two separate eMMC cards (DG4064 and 064GB2). In both
> cases, the error was easy to reproduce, triggering every few tens of
> reads.
>
> Signed-off-by: Jorge Ramirez-Ortiz <[email protected]>
> Acked-by: Avri Altman <[email protected]>

Acked-by: Adrian Hunter <[email protected]>

> ---
> v2:
> mmc_retune_pause() no longer can trigger a retune.
> Keeping Avri Altman Acked-by since they are functionally equivalent.
> v1:
> modify mmc_retune_pause to optionally trigger a retune.
>
> drivers/mmc/core/host.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
> index 096093f7be00..ed44920e92df 100644
> --- a/drivers/mmc/core/host.c
> +++ b/drivers/mmc/core/host.c
> @@ -119,13 +119,12 @@ void mmc_retune_enable(struct mmc_host *host)
>
>  /*
>   * Pause re-tuning for a small set of operations. The pause begins after the
> - * next command and after first doing re-tuning.
> + * next command.
>   */
>  void mmc_retune_pause(struct mmc_host *host)
>  {
>  	if (!host->retune_paused) {
>  		host->retune_paused = 1;
> -		mmc_retune_needed(host);
>  		mmc_retune_hold(host);
>  	}
>  }
> --
> 2.34.1


2024-01-03 10:35:57

by Ulf Hansson

Subject: Re: [PATCH] mmc: rpmb: do not force a retune before RPMB switch

On Mon, 11 Dec 2023 at 17:55, Jorge Ramirez-Ortiz <[email protected]> wrote:
>
> Requesting a retune before switching to the RPMB partition has been
> observed to cause CRC errors on the RPMB reads (-EILSEQ).
>
> Since RPMB reads cannot be retried, clients are directly affected by
> these errors.
>
> This commit disables the retune request prior to switching to the RPMB
> partition: mmc_retune_pause() no longer triggers a retune before the
> pause period begins.
>
> This was verified with the sdhci-of-arasan driver (ZynqMP) configured
> for HS200 using two separate eMMC cards (DG4064 and 064GB2). In both
> cases, the error was easy to reproduce, triggering every few tens of
> reads.
>
> Signed-off-by: Jorge Ramirez-Ortiz <[email protected]>
> Acked-by: Avri Altman <[email protected]>

This seems reasonable, but I would like to see some justification from
a performance point of view in the commit message too.

Moreover, please bump the version number of the patch at each
iteration and add a version summary of what has changed. That helps
the review process.

Kind regards
Uffe

> ---
> v2:
> mmc_retune_pause() no longer can trigger a retune.
> Keeping Avri Altman Acked-by since they are functionally equivalent.
> v1:
> modify mmc_retune_pause to optionally trigger a retune.
>
> drivers/mmc/core/host.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
> index 096093f7be00..ed44920e92df 100644
> --- a/drivers/mmc/core/host.c
> +++ b/drivers/mmc/core/host.c
> @@ -119,13 +119,12 @@ void mmc_retune_enable(struct mmc_host *host)
>
>  /*
>   * Pause re-tuning for a small set of operations. The pause begins after the
> - * next command and after first doing re-tuning.
> + * next command.
>   */
>  void mmc_retune_pause(struct mmc_host *host)
>  {
>  	if (!host->retune_paused) {
>  		host->retune_paused = 1;
> -		mmc_retune_needed(host);
>  		mmc_retune_hold(host);
>  	}
>  }
> --
> 2.34.1

2024-01-05 13:01:21

by Michal Simek

Subject: Re: [PATCH] mmc: rpmb: do not force a retune before RPMB switch



On 1/5/24 09:49, Jorge Ramirez-Ortiz, Foundries wrote:
> On 04/01/24 20:34:09, Adrian Hunter wrote:
>> On 3/01/24 11:20, Jorge Ramirez-Ortiz, Foundries wrote:
>>> On 03/01/24 10:03:38, Adrian Hunter wrote:
>>>> Thanks for doing that! That seems to explain the mystery.
>>>>
>>>> You could hack the test to get an idea of how many successful
>>>> iterations there are before getting an error.
>>>>
>>>> For SDHCI, one difference between tuning and re-tuning is the
>>>> setting of bit-7 "Sampling Clock Select" of "Host Control 2 Register".
>>>> It is initially 0 and then set to 1 after the successful tuning.
>>>> Essentially, leaving it set to 1 is meant to speed up the re-tuning.
>>>> You could try setting it to zero instead, and see if that helps.
>>>> e.g.
>>>>
>>>> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
>>>> index c79f73459915..714d8cc39709 100644
>>>> --- a/drivers/mmc/host/sdhci.c
>>>> +++ b/drivers/mmc/host/sdhci.c
>>>> @@ -2732,6 +2732,7 @@ void sdhci_start_tuning(struct sdhci_host *host)
>>>>  	ctrl |= SDHCI_CTRL_EXEC_TUNING;
>>>>  	if (host->quirks2 & SDHCI_QUIRK2_TUNING_WORK_AROUND)
>>>>  		ctrl |= SDHCI_CTRL_TUNED_CLK;
>>>> +	ctrl &= ~SDHCI_CTRL_TUNED_CLK;
>>>>  	sdhci_writew(host, ctrl, SDHCI_HOST_CONTROL2);
>>>>
>>>>  	/*
>>>>
>>>
>>>
>>> Yes with that change, the re-tuning reliability test does pass.
>>>
>>> root@uz3cg-dwg-sec:/sys/kernel/debug/mmc0# echo 52 > /sys/kernel/debug/mmc0/mmc0\:0001/test
>>> [ 237.833585] mmc0: Starting tests of card mmc0:0001...
>>> [ 237.838759] mmc0: Test case 52. Re-tuning reliability...
>>> [ 267.845403] mmc0: Result: OK
>>> [ 267.848365] mmc0: Tests completed.
>>>
>>>
>>> Unfortunately I still see the error when looping on RPMB reads.
>>>
>>> For instance with this test script
>>> $ while true; do rpmb_read m4hash; usleep 300; done
>>>
>>> I can see the error triggering on the serial port after a minute or so.
>>> [ 151.682907] sdhci-arasan ff160000.mmc: __mmc_blk_ioctl_cmd: data error -84
>>>
>>> Causing OP-TEE to panic since the RPMB read returns an error
>>> E/TC:? 0
>>> E/TC:? 0 TA panicked with code 0xffff0000
>>> E/LD: Status of TA 22250a54-0bf1-48fe-8002-7b20f1c9c9b1
>>> E/LD: arch: aarch64
>>> [...]
>>>
>>> If anything else springs to mind I am happy to test it, of course. There are
>>> so many tunables in this subsystem that experience in this area has exponential
>>> value (and I don't have much).
>>>
>>> Would it make sense to reject re-tuning requests unless a minimum number
>>> of jiffies have passed? Should I try that as a change?
>>>
>>> Or maybe delay the RPMB access a bit longer after a retune request?
>>
>> It seems re-tuning is not working properly, so ideally the
>> SoC vendor / driver implementer would provide a solution.
>
>
> Makes sense to me too. I am copying Michal on the DL.

We will take a look at it.

Thanks,
Michal