2015-06-16 10:37:37

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH 0/3] omap_hsmmc: Fix card enumeration failure on


Hi,

When using omap_hsmmc driver, if sd-card repeatedly plug unplugged
multiple times quickly, card enumeration stops after few iterations.
This can be easily reproduced on DRA74X EVM which uses omap_hsmmc driver.
This patch series addresses the above problem. The first patch fixes irq
handler to report all DTOs to mmc-core. Second patch adds handling for
BADA, DEB and CEB interrupts. The last patch introduces driver specific
card detect irq handler to cleanup pending requests before card removal.

Tested on DRA74X amd DRA72X and AM437X-GP EVMs, by repeated intense
plug/unplug iterations.


Kishon Vijay Abraham I (1):
mmc: host: omap_hsmmc: Fix DTO and DCRC handling

Vignesh R (2):
mmc: host: omap_hsmmc: Handle BADA, DEB and CEB interrupts
mmc: host: omap_hsmmc: Add custom card detect irq handler

drivers/mmc/host/omap_hsmmc.c | 84 ++++++++++++++++++++++++++++++++---
1 file changed, 78 insertions(+), 6 deletions(-)

--
2.4.1


2015-06-16 10:38:24

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH 1/3] mmc: host: omap_hsmmc: Fix DTO and DCRC handling

From: Kishon Vijay Abraham I <[email protected]>

DTO/DCRC errors were not being informed to the mmc core since
commit ae4bf788ee9b ("mmc: omap_hsmmc: consolidate error report handling of
HSMMC IRQ"). This commit made sure 'end_trans' is never set on DTO/DCRC
errors. This is because after this commit 'host->data' is checked after
it has been cleared to NULL by omap_hsmmc_dma_cleanup().

Because 'end_trans' is never set, omap_hsmmc_xfer_done() is never invoked
making core layer not to be aware of DTO/DCRC errors. Because of this
any command invoked after DTO/DCRC error leads to a hang.

Fix this by checking for 'host->data' before it is actually cleared.

Fixes: ae4bf788ee9b ("mmc: omap_hsmmc: consolidate error report handling of
HSMMC IRQ")

CC: [email protected]
Signed-off-by: Kishon Vijay Abraham I <[email protected]>
Signed-off-by: Vignesh R <[email protected]>
---
drivers/mmc/host/omap_hsmmc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index 9df2b6801f76..d0abdffb0d7c 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -1062,6 +1062,10 @@ static void omap_hsmmc_do_irq(struct omap_hsmmc_host *host, int status)

if (status & (CTO_EN | CCRC_EN))
end_cmd = 1;
+ if (host->data || host->response_busy) {
+ end_trans = !end_cmd;
+ host->response_busy = 0;
+ }
if (status & (CTO_EN | DTO_EN))
hsmmc_command_incomplete(host, -ETIMEDOUT, end_cmd);
else if (status & (CCRC_EN | DCRC_EN))
@@ -1081,10 +1085,6 @@ static void omap_hsmmc_do_irq(struct omap_hsmmc_host *host, int status)
}
dev_dbg(mmc_dev(host->mmc), "AC12 err: 0x%x\n", ac12);
}
- if (host->data || host->response_busy) {
- end_trans = !end_cmd;
- host->response_busy = 0;
- }
}

OMAP_HSMMC_WRITE(host->base, STAT, status);
--
2.4.1

2015-06-16 10:37:43

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH 2/3] mmc: host: omap_hsmmc: Handle BADA, DEB and CEB interrupts

Sometimes BADA, DEB or CEB error interrupts occur when sd card is
unplugged during data transfer. These interrupts are currently ignored
by the interrupt handler. But, this results in card not being
recognised on subsequent insertion. This is because mmcqd is waiting
forever for the data transfer(for which error occurred) to complete.
Fix this, by reporting BADA, DEB, CEB errors to mmc-core as -EILSEQ, so
that the core can do appropriate handling.

Signed-off-by: Vignesh R <[email protected]>
---
drivers/mmc/host/omap_hsmmc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index d0abdffb0d7c..fb4bfefd9250 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -1068,7 +1068,8 @@ static void omap_hsmmc_do_irq(struct omap_hsmmc_host *host, int status)
}
if (status & (CTO_EN | DTO_EN))
hsmmc_command_incomplete(host, -ETIMEDOUT, end_cmd);
- else if (status & (CCRC_EN | DCRC_EN))
+ else if (status & (CCRC_EN | DCRC_EN | DEB_EN | CEB_EN |
+ BADA_EN))
hsmmc_command_incomplete(host, -EILSEQ, end_cmd);

if (status & ACE_EN) {
--
2.4.1

2015-06-16 10:37:56

by Vignesh Raghavendra

[permalink] [raw]
Subject: [PATCH 3/3] mmc: host: omap_hsmmc: Add custom card detect irq handler

Usually when there is an error in transfer, DTO/CTO or other error
interrupts are raised. But if the card is unplugged in the middle of a
data transfer, it is observed that, neither completion(success) or
timeout(error) interrupts are raised. Hence, the mmc-core is waiting
for-ever for the transfer to complete. This results failure to recognise
sd card on the next insertion.
The only way to solve this is to introduce code to detect this condition
and recover on card insertion (in hsmmc specific cd_irq). Hence,
introduce cd_irq and add code to clear mmc_request that is pending from
the failed transaction.

Signed-off-by: Vignesh R <[email protected]>
---
drivers/mmc/host/omap_hsmmc.c | 73 ++++++++++++++++++++++++++++++++++-
1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
index fb4bfefd9250..ec1fff3c0c9c 100644
--- a/drivers/mmc/host/omap_hsmmc.c
+++ b/drivers/mmc/host/omap_hsmmc.c
@@ -221,6 +221,12 @@ struct omap_hsmmc_host {
#define HSMMC_WAKE_IRQ_ENABLED (1 << 2)
struct omap_hsmmc_next next_data;
struct omap_hsmmc_platform_data *pdata;
+ /*
+ * flag to determine whether card was removed during data
+ * transfer
+ */
+ bool transfer_incomplete;
+

/* return MMC cover switch state, can be NULL if not supported.
*
@@ -867,6 +873,26 @@ static void omap_hsmmc_request_done(struct omap_hsmmc_host *host, struct mmc_req
}

/*
+ * Cleanup incomplete card removal sequence. This will make sure the
+ * next card enumeration is clean.
+ */
+static void omap_hsmmc_request_clear(struct omap_hsmmc_host *host,
+ struct mmc_request *mrq)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&host->irq_lock, flags);
+ host->req_in_progress = 0;
+ host->dma_ch = -1;
+ spin_unlock_irqrestore(&host->irq_lock, flags);
+
+ mmc_request_done(host->mmc, mrq);
+ if (host->mmc->card)
+ mmc_card_set_removed(host->mmc->card);
+ host->mrq = NULL;
+}
+
+/*
* Notify the transfer complete to MMC core
*/
static void
@@ -1248,6 +1274,47 @@ static irqreturn_t omap_hsmmc_cover_irq(int irq, void *dev_id)
return IRQ_HANDLED;
}

+/*
+ * irq handler to notify the core about card insertion/removal
+ */
+static irqreturn_t omap_hsmmc_cd_irq(int irq, void *dev_id)
+{
+ struct omap_hsmmc_host *host = mmc_priv(dev_id);
+ int carddetect = mmc_gpio_get_cd(host->mmc);
+ struct mmc_request *mrq = host->mrq;
+
+ /*
+ * If the card was removed in the middle of data transfer last
+ * time, the TC/CC/timeout interrupt is not raised due to which
+ * mmc_request is not cleared. Hence, this card insertion will
+ * still see pending mmc_request. Clear the request to make sure
+ * that this card enumeration is successful.
+ */
+ if (!carddetect && mrq && host->transfer_incomplete) {
+ omap_hsmmc_disable_irq(host);
+ dev_info(host->dev,
+ "card removed during transfer last time\n");
+ hsmmc_command_incomplete(host, -ENOMEDIUM, 1);
+ omap_hsmmc_request_clear(host, host->mrq);
+ dev_info(host->dev, "recovery done\n");
+ }
+ host->transfer_incomplete = false;
+
+ mmc_detect_change(host->mmc, (HZ * 200) / 1000);
+
+ /*
+ * The current mmc_request is usually null before card removal
+ * sequence is complete. It may not be null if TC/CC interrupt
+ * never happens due to removal of card during a data
+ * transfer. Set a flag to indicate mmc_request was not null
+ * in order to do cleanup on next card insertion.
+ */
+ if (carddetect && mrq)
+ host->transfer_incomplete = true;
+
+ return IRQ_HANDLED;
+}
+
static void omap_hsmmc_dma_callback(void *param)
{
struct omap_hsmmc_host *host = param;
@@ -1918,7 +1985,7 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
struct mmc_host *mmc;
struct omap_hsmmc_host *host = NULL;
struct resource *res;
- int ret, irq;
+ int ret, irq, len;
const struct of_device_id *match;
dma_cap_mask_t mask;
unsigned tx_req, rx_req;
@@ -1980,6 +2047,10 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
if (ret)
goto err_gpio;

+ /* register cd_irq, if cd-gpios property is specified in dt */
+ if (of_find_property(host->dev->of_node, "cd-gpios", &len))
+ mmc_gpio_set_cd_isr(mmc, omap_hsmmc_cd_irq);
+
platform_set_drvdata(pdev, host);

if (pdev->dev.of_node)
--
2.4.1

2015-06-20 22:22:05

by Andreas Fenkart

[permalink] [raw]
Subject: Re: [PATCH 1/3] mmc: host: omap_hsmmc: Fix DTO and DCRC handling

Hi Vignesh,

2015-06-16 12:37 GMT+02:00 Vignesh R <[email protected]>:
> From: Kishon Vijay Abraham I <[email protected]>
>
> DTO/DCRC errors were not being informed to the mmc core since
> commit ae4bf788ee9b ("mmc: omap_hsmmc: consolidate error report handling of
> HSMMC IRQ"). This commit made sure 'end_trans' is never set on DTO/DCRC
> errors. This is because after this commit 'host->data' is checked after
> it has been cleared to NULL by omap_hsmmc_dma_cleanup().
>
> Because 'end_trans' is never set, omap_hsmmc_xfer_done() is never invoked
> making core layer not to be aware of DTO/DCRC errors. Because of this
> any command invoked after DTO/DCRC error leads to a hang.
>
> Fix this by checking for 'host->data' before it is actually cleared.

This really fixes the problem, thanks for the analysis
TESTED-BY

>
> Fixes: ae4bf788ee9b ("mmc: omap_hsmmc: consolidate error report handling of
> HSMMC IRQ")
>
> CC: [email protected]
> Signed-off-by: Kishon Vijay Abraham I <[email protected]>
> Signed-off-by: Vignesh R <[email protected]>
> ---
> drivers/mmc/host/omap_hsmmc.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
> index 9df2b6801f76..d0abdffb0d7c 100644
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -1062,6 +1062,10 @@ static void omap_hsmmc_do_irq(struct omap_hsmmc_host *host, int status)
>
> if (status & (CTO_EN | CCRC_EN))
> end_cmd = 1;
> + if (host->data || host->response_busy) {
> + end_trans = !end_cmd;
> + host->response_busy = 0;
> + }
> if (status & (CTO_EN | DTO_EN))
> hsmmc_command_incomplete(host, -ETIMEDOUT, end_cmd);
> else if (status & (CCRC_EN | DCRC_EN))
> @@ -1081,10 +1085,6 @@ static void omap_hsmmc_do_irq(struct omap_hsmmc_host *host, int status)
> }
> dev_dbg(mmc_dev(host->mmc), "AC12 err: 0x%x\n", ac12);
> }
> - if (host->data || host->response_busy) {
> - end_trans = !end_cmd;
> - host->response_busy = 0;
> - }
> }
>
> OMAP_HSMMC_WRITE(host->base, STAT, status);
> --
> 2.4.1
>

2015-06-20 22:23:12

by Andreas Fenkart

[permalink] [raw]
Subject: Re: [PATCH 2/3] mmc: host: omap_hsmmc: Handle BADA, DEB and CEB interrupts

TESTED-BY

2015-06-16 12:37 GMT+02:00 Vignesh R <[email protected]>:
> Sometimes BADA, DEB or CEB error interrupts occur when sd card is
> unplugged during data transfer. These interrupts are currently ignored
> by the interrupt handler. But, this results in card not being
> recognised on subsequent insertion. This is because mmcqd is waiting
> forever for the data transfer(for which error occurred) to complete.
> Fix this, by reporting BADA, DEB, CEB errors to mmc-core as -EILSEQ, so
> that the core can do appropriate handling.
>
> Signed-off-by: Vignesh R <[email protected]>
> ---
> drivers/mmc/host/omap_hsmmc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
> index d0abdffb0d7c..fb4bfefd9250 100644
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -1068,7 +1068,8 @@ static void omap_hsmmc_do_irq(struct omap_hsmmc_host *host, int status)
> }
> if (status & (CTO_EN | DTO_EN))
> hsmmc_command_incomplete(host, -ETIMEDOUT, end_cmd);
> - else if (status & (CCRC_EN | DCRC_EN))
> + else if (status & (CCRC_EN | DCRC_EN | DEB_EN | CEB_EN |
> + BADA_EN))
> hsmmc_command_incomplete(host, -EILSEQ, end_cmd);
>
> if (status & ACE_EN) {
> --
> 2.4.1
>

2015-06-20 22:46:04

by Andreas Fenkart

[permalink] [raw]
Subject: Re: [PATCH 3/3] mmc: host: omap_hsmmc: Add custom card detect irq handler

I haven't managed to produce a hang without this patch

see also comments below

2015-06-16 12:37 GMT+02:00 Vignesh R <[email protected]>:
> Usually when there is an error in transfer, DTO/CTO or other error
> interrupts are raised. But if the card is unplugged in the middle of a
> data transfer, it is observed that, neither completion(success) or
> timeout(error) interrupts are raised. Hence, the mmc-core is waiting
> for-ever for the transfer to complete. This results failure to recognise
> sd card on the next insertion.
> The only way to solve this is to introduce code to detect this condition
> and recover on card insertion (in hsmmc specific cd_irq). Hence,
> introduce cd_irq and add code to clear mmc_request that is pending from
> the failed transaction.
>
> Signed-off-by: Vignesh R <[email protected]>
> ---
> drivers/mmc/host/omap_hsmmc.c | 73 ++++++++++++++++++++++++++++++++++-
> 1 file changed, 72 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
> index fb4bfefd9250..ec1fff3c0c9c 100644
> --- a/drivers/mmc/host/omap_hsmmc.c
> +++ b/drivers/mmc/host/omap_hsmmc.c
> @@ -221,6 +221,12 @@ struct omap_hsmmc_host {
> #define HSMMC_WAKE_IRQ_ENABLED (1 << 2)
> struct omap_hsmmc_next next_data;
> struct omap_hsmmc_platform_data *pdata;
> + /*
> + * flag to determine whether card was removed during data
> + * transfer
> + */
> + bool transfer_incomplete;
> +
>
> /* return MMC cover switch state, can be NULL if not supported.
> *
> @@ -867,6 +873,26 @@ static void omap_hsmmc_request_done(struct omap_hsmmc_host *host, struct mmc_req
> }
>
> /*
> + * Cleanup incomplete card removal sequence. This will make sure the
> + * next card enumeration is clean.
> + */
> +static void omap_hsmmc_request_clear(struct omap_hsmmc_host *host,
> + struct mmc_request *mrq)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&host->irq_lock, flags);
> + host->req_in_progress = 0;
> + host->dma_ch = -1;
> + spin_unlock_irqrestore(&host->irq_lock, flags);
> +
> + mmc_request_done(host->mmc, mrq);
> + if (host->mmc->card)
> + mmc_card_set_removed(host->mmc->card);
> + host->mrq = NULL;
> +}
> +
> +/*
> * Notify the transfer complete to MMC core
> */
> static void
> @@ -1248,6 +1274,47 @@ static irqreturn_t omap_hsmmc_cover_irq(int irq, void *dev_id)
> return IRQ_HANDLED;
> }
>
> +/*
> + * irq handler to notify the core about card insertion/removal
> + */
> +static irqreturn_t omap_hsmmc_cd_irq(int irq, void *dev_id)
> +{

Move this code to 'omap_hsmmc_get_cd' function. Or rather clear
any pending transfer upon '.init_card/omap_hsmmc_init_card'

I guess other hosts also have some housekeeping upon unexpected
card removals. So I guess there is some generic code to this problem.
Did you check?

> + struct omap_hsmmc_host *host = mmc_priv(dev_id);
> + int carddetect = mmc_gpio_get_cd(host->mmc);
> + struct mmc_request *mrq = host->mrq;
> +
> + /*
> + * If the card was removed in the middle of data transfer last
> + * time, the TC/CC/timeout interrupt is not raised due to which
> + * mmc_request is not cleared. Hence, this card insertion will
> + * still see pending mmc_request. Clear the request to make sure
> + * that this card enumeration is successful.
> + */
> + if (!carddetect && mrq && host->transfer_incomplete) {
> + omap_hsmmc_disable_irq(host);
> + dev_info(host->dev,
> + "card removed during transfer last time\n");
> + hsmmc_command_incomplete(host, -ENOMEDIUM, 1);
> + omap_hsmmc_request_clear(host, host->mrq);
> + dev_info(host->dev, "recovery done\n");
> + }
> + host->transfer_incomplete = false;
> +
> + mmc_detect_change(host->mmc, (HZ * 200) / 1000);
> +
> + /*
> + * The current mmc_request is usually null before card removal
> + * sequence is complete. It may not be null if TC/CC interrupt
> + * never happens due to removal of card during a data
> + * transfer. Set a flag to indicate mmc_request was not null
> + * in order to do cleanup on next card insertion.
> + */
> + if (carddetect && mrq)
> + host->transfer_incomplete = true;
> +
> + return IRQ_HANDLED;
> +}
> +
> static void omap_hsmmc_dma_callback(void *param)
> {
> struct omap_hsmmc_host *host = param;
> @@ -1918,7 +1985,7 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
> struct mmc_host *mmc;
> struct omap_hsmmc_host *host = NULL;
> struct resource *res;
> - int ret, irq;
> + int ret, irq, len;
> const struct of_device_id *match;
> dma_cap_mask_t mask;
> unsigned tx_req, rx_req;
> @@ -1980,6 +2047,10 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
> if (ret)
> goto err_gpio;
>
> + /* register cd_irq, if cd-gpios property is specified in dt */
> + if (of_find_property(host->dev->of_node, "cd-gpios", &len))
> + mmc_gpio_set_cd_isr(mmc, omap_hsmmc_cd_irq);
> +

nack, this is done by mmc_of_parse

> platform_set_drvdata(pdev, host);
>
> if (pdev->dev.of_node)
> --
> 2.4.1
>

2015-06-22 13:18:51

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH 3/3] mmc: host: omap_hsmmc: Add custom card detect irq handler

Hi Andreas,


Thanks for testing out these patches.

On Sunday 21 June 2015 04:15 AM, Andreas Fenkart wrote:
> I haven't managed to produce a hang without this patch

Reproducing this hang is not straight forward. It takes 40-50 card
insertion/removal to see this case (sometimes even 100 tries).

>
> see also comments below
>
> 2015-06-16 12:37 GMT+02:00 Vignesh R <[email protected]>:
>> Usually when there is an error in transfer, DTO/CTO or other error
>> interrupts are raised. But if the card is unplugged in the middle of a
>> data transfer, it is observed that, neither completion(success) or
>> timeout(error) interrupts are raised. Hence, the mmc-core is waiting
>> for-ever for the transfer to complete. This results failure to recognise
>> sd card on the next insertion.
>> The only way to solve this is to introduce code to detect this condition
>> and recover on card insertion (in hsmmc specific cd_irq). Hence,
>> introduce cd_irq and add code to clear mmc_request that is pending from
>> the failed transaction.
>>
>> Signed-off-by: Vignesh R <[email protected]>
>> ---
>> drivers/mmc/host/omap_hsmmc.c | 73 ++++++++++++++++++++++++++++++++++-
>> 1 file changed, 72 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
>> index fb4bfefd9250..ec1fff3c0c9c 100644
>> --- a/drivers/mmc/host/omap_hsmmc.c
>> +++ b/drivers/mmc/host/omap_hsmmc.c
>> @@ -221,6 +221,12 @@ struct omap_hsmmc_host {
>> #define HSMMC_WAKE_IRQ_ENABLED (1 << 2)
>> struct omap_hsmmc_next next_data;
>> struct omap_hsmmc_platform_data *pdata;
>> + /*
>> + * flag to determine whether card was removed during data
>> + * transfer
>> + */
>> + bool transfer_incomplete;
>> +
>>
>> /* return MMC cover switch state, can be NULL if not supported.
>> *
>> @@ -867,6 +873,26 @@ static void omap_hsmmc_request_done(struct omap_hsmmc_host *host, struct mmc_req
>> }
>>
>> /*
>> + * Cleanup incomplete card removal sequence. This will make sure the
>> + * next card enumeration is clean.
>> + */
>> +static void omap_hsmmc_request_clear(struct omap_hsmmc_host *host,
>> + struct mmc_request *mrq)
>> +{
>> + unsigned long flags;
>> +
>> + spin_lock_irqsave(&host->irq_lock, flags);
>> + host->req_in_progress = 0;
>> + host->dma_ch = -1;
>> + spin_unlock_irqrestore(&host->irq_lock, flags);
>> +
>> + mmc_request_done(host->mmc, mrq);
>> + if (host->mmc->card)
>> + mmc_card_set_removed(host->mmc->card);
>> + host->mrq = NULL;
>> +}
>> +
>> +/*
>> * Notify the transfer complete to MMC core
>> */
>> static void
>> @@ -1248,6 +1274,47 @@ static irqreturn_t omap_hsmmc_cover_irq(int irq, void *dev_id)
>> return IRQ_HANDLED;
>> }
>>
>> +/*
>> + * irq handler to notify the core about card insertion/removal
>> + */
>> +static irqreturn_t omap_hsmmc_cd_irq(int irq, void *dev_id)
>> +{
>
> Move this code to 'omap_hsmmc_get_cd' function. Or rather clear
> any pending transfer upon '.init_card/omap_hsmmc_init_card'

I tried using omap_hsmmc_init initially, but here is the problem:

card inserted --> cd_irq_isr--> schedule mmc_rescan()
--- after some time ---
mmc_rescan() -->
mmc_sd_alive() -->
-- card removed ---
mmc_send_status -->
mmc_wait_for_req()-->
wait_for_completion()
^^^ Here the mmc_rescan thread waits forever because, it doesn't get
timeout interrupt for the cmd/req it sent (because card was removed).

But calls to omap_hsmmc_card_init or omap_hsmmc_get_cd are in the same
mmc_rescan thread. Hence, moving the recovery code to init_card does not
help.

>
> I guess other hosts also have some housekeeping upon unexpected
> card removals. So I guess there is some generic code to this problem.
> Did you check?

I did try to find generic code in mmc-core, but couldn't find any.
However, I did see some driver specific cleanups related to unexpected
card removal in sdhci.c. sdhci handles this in .card_event call back.
This does not help in my case as .card_event is also called from
mmc_rescan. I think cleanup code (in sdhci driver) for unexpected card
removal was introduced when .card_event call was outside mmc_rescan.

>
>> + struct omap_hsmmc_host *host = mmc_priv(dev_id);
>> + int carddetect = mmc_gpio_get_cd(host->mmc);
>> + struct mmc_request *mrq = host->mrq;
>> +
>> + /*
>> + * If the card was removed in the middle of data transfer last
>> + * time, the TC/CC/timeout interrupt is not raised due to which
>> + * mmc_request is not cleared. Hence, this card insertion will
>> + * still see pending mmc_request. Clear the request to make sure
>> + * that this card enumeration is successful.
>> + */
>> + if (!carddetect && mrq && host->transfer_incomplete) {
>> + omap_hsmmc_disable_irq(host);
>> + dev_info(host->dev,
>> + "card removed during transfer last time\n");
>> + hsmmc_command_incomplete(host, -ENOMEDIUM, 1);
>> + omap_hsmmc_request_clear(host, host->mrq);
>> + dev_info(host->dev, "recovery done\n");
>> + }
>> + host->transfer_incomplete = false;
>> +
>> + mmc_detect_change(host->mmc, (HZ * 200) / 1000);
>> +
>> + /*
>> + * The current mmc_request is usually null before card removal
>> + * sequence is complete. It may not be null if TC/CC interrupt
>> + * never happens due to removal of card during a data
>> + * transfer. Set a flag to indicate mmc_request was not null
>> + * in order to do cleanup on next card insertion.
>> + */
>> + if (carddetect && mrq)
>> + host->transfer_incomplete = true;
>> +
>> + return IRQ_HANDLED;
>> +}
>> +
>> static void omap_hsmmc_dma_callback(void *param)
>> {
>> struct omap_hsmmc_host *host = param;
>> @@ -1918,7 +1985,7 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
>> struct mmc_host *mmc;
>> struct omap_hsmmc_host *host = NULL;
>> struct resource *res;
>> - int ret, irq;
>> + int ret, irq, len;
>> const struct of_device_id *match;
>> dma_cap_mask_t mask;
>> unsigned tx_req, rx_req;
>> @@ -1980,6 +2047,10 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
>> if (ret)
>> goto err_gpio;
>>
>> + /* register cd_irq, if cd-gpios property is specified in dt */
>> + if (of_find_property(host->dev->of_node, "cd-gpios", &len))
>> + mmc_gpio_set_cd_isr(mmc, omap_hsmmc_cd_irq);
>> +
>
> nack, this is done by mmc_of_parse

mmc_of_parse parses the cd-gpios and registers a generic cd_irq_isr but
there is no way to set driver specific cd_irq. So, the above code is
required.

>
>> platform_set_drvdata(pdev, host);
>>
>> if (pdev->dev.of_node)
>> --
>> 2.4.1
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
>

Regards
Vignesh

2015-07-06 06:13:06

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH 0/3] omap_hsmmc: Fix card enumeration failure on



On Tuesday 16 June 2015 04:07 PM, Vignesh R wrote:
>
> Hi,
>
> When using omap_hsmmc driver, if sd-card repeatedly plug unplugged
> multiple times quickly, card enumeration stops after few iterations.
> This can be easily reproduced on DRA74X EVM which uses omap_hsmmc driver.
> This patch series addresses the above problem. The first patch fixes irq
> handler to report all DTOs to mmc-core. Second patch adds handling for
> BADA, DEB and CEB interrupts. The last patch introduces driver specific
> card detect irq handler to cleanup pending requests before card removal.
>
> Tested on DRA74X amd DRA72X and AM437X-GP EVMs, by repeated intense
> plug/unplug iterations.

Gentle ping...

--
Regards
Vignesh

2015-07-11 10:26:13

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH 0/3] omap_hsmmc: Fix card enumeration failure on



On 6/16/2015 4:07 PM, Vignesh R wrote:
>
> Hi,
>
> When using omap_hsmmc driver, if sd-card repeatedly plug unplugged
> multiple times quickly, card enumeration stops after few iterations.
> This can be easily reproduced on DRA74X EVM which uses omap_hsmmc driver.
> This patch series addresses the above problem. The first patch fixes irq
> handler to report all DTOs to mmc-core. Second patch adds handling for
> BADA, DEB and CEB interrupts. The last patch introduces driver specific
> card detect irq handler to cleanup pending requests before card removal.
>
> Tested on DRA74X amd DRA72X and AM437X-GP EVMs, by repeated intense
> plug/unplug iterations.
>

Gentle ping.. These patches fix bugs related to interrupt handling in
omap_hsmmc. It also fixes bugs with unexpected card removal during data
transfer. Can this go into 4.2-rc??

>
> Kishon Vijay Abraham I (1):
> mmc: host: omap_hsmmc: Fix DTO and DCRC handling
>
> Vignesh R (2):
> mmc: host: omap_hsmmc: Handle BADA, DEB and CEB interrupts
> mmc: host: omap_hsmmc: Add custom card detect irq handler
>
> drivers/mmc/host/omap_hsmmc.c | 84 ++++++++++++++++++++++++++++++++---
> 1 file changed, 78 insertions(+), 6 deletions(-)
>

2015-07-20 14:21:04

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH 0/3] omap_hsmmc: Fix card enumeration failure on

On 16 June 2015 at 12:37, Vignesh R <[email protected]> wrote:
>
> Hi,
>
> When using omap_hsmmc driver, if sd-card repeatedly plug unplugged
> multiple times quickly, card enumeration stops after few iterations.
> This can be easily reproduced on DRA74X EVM which uses omap_hsmmc driver.
> This patch series addresses the above problem. The first patch fixes irq
> handler to report all DTOs to mmc-core. Second patch adds handling for
> BADA, DEB and CEB interrupts. The last patch introduces driver specific
> card detect irq handler to cleanup pending requests before card removal.
>
> Tested on DRA74X amd DRA72X and AM437X-GP EVMs, by repeated intense
> plug/unplug iterations.
>
>
> Kishon Vijay Abraham I (1):
> mmc: host: omap_hsmmc: Fix DTO and DCRC handling
>
> Vignesh R (2):
> mmc: host: omap_hsmmc: Handle BADA, DEB and CEB interrupts
> mmc: host: omap_hsmmc: Add custom card detect irq handler
>
> drivers/mmc/host/omap_hsmmc.c | 84 ++++++++++++++++++++++++++++++++---
> 1 file changed, 78 insertions(+), 6 deletions(-)
>
> --
> 2.4.1
>

I have applied patch1 and patch 2 for fixes. Regarding patch 3, let's
give Andreas some more time to comment.

Thanks and sorry for the delay!

Kind regards
Uffe

2015-07-21 08:16:08

by Andreas Fenkart

[permalink] [raw]
Subject: Re: [PATCH 3/3] mmc: host: omap_hsmmc: Add custom card detect irq handler

Hi Vignesh,

Generally I don't like this patch, it will make it harder, not easier,
to maintain the omap hsmmc driver. Also given that the bug occurs
rarely, people will be reluctant to clean it up later or accept
patches.

see also comments below

2015-06-22 15:18 GMT+02:00 Vignesh R <[email protected]>:
>
>
> On Sunday 21 June 2015 04:15 AM, Andreas Fenkart wrote:
>> I haven't managed to produce a hang without this patch
>
> Reproducing this hang is not straight forward. It takes 40-50 card
> insertion/removal to see this case (sometimes even 100 tries).
>
>>
>> see also comments below
>>
>> 2015-06-16 12:37 GMT+02:00 Vignesh R <[email protected]>:
>>> Usually when there is an error in transfer, DTO/CTO or other error
>>> interrupts are raised. But if the card is unplugged in the middle of a
>>> data transfer, it is observed that, neither completion(success) or
>>> timeout(error) interrupts are raised. Hence, the mmc-core is waiting
>>> for-ever for the transfer to complete. This results failure to recognise
>>> sd card on the next insertion.
>>> The only way to solve this is to introduce code to detect this condition
>>> and recover on card insertion (in hsmmc specific cd_irq). Hence,
>>> introduce cd_irq and add code to clear mmc_request that is pending from
>>> the failed transaction.
>>>
>>> Signed-off-by: Vignesh R <[email protected]>
>>> ---
>>> drivers/mmc/host/omap_hsmmc.c | 73 ++++++++++++++++++++++++++++++++++-
>>> 1 file changed, 72 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c
>>> index fb4bfefd9250..ec1fff3c0c9c 100644
>>> --- a/drivers/mmc/host/omap_hsmmc.c
>>> +++ b/drivers/mmc/host/omap_hsmmc.c
>>> @@ -221,6 +221,12 @@ struct omap_hsmmc_host {
>>> #define HSMMC_WAKE_IRQ_ENABLED (1 << 2)
>>> struct omap_hsmmc_next next_data;
>>> struct omap_hsmmc_platform_data *pdata;
>>> + /*
>>> + * flag to determine whether card was removed during data
>>> + * transfer
>>> + */
>>> + bool transfer_incomplete;
>>> +
>>>
>>> /* return MMC cover switch state, can be NULL if not supported.
>>> *
>>> @@ -867,6 +873,26 @@ static void omap_hsmmc_request_done(struct omap_hsmmc_host *host, struct mmc_req
>>> }
>>>
>>> /*
>>> + * Cleanup incomplete card removal sequence. This will make sure the
>>> + * next card enumeration is clean.
>>> + */
>>> +static void omap_hsmmc_request_clear(struct omap_hsmmc_host *host,
>>> + struct mmc_request *mrq)
>>> +{
>>> + unsigned long flags;
>>> +
>>> + spin_lock_irqsave(&host->irq_lock, flags);
>>> + host->req_in_progress = 0;
>>> + host->dma_ch = -1;
>>> + spin_unlock_irqrestore(&host->irq_lock, flags);
>>> +
>>> + mmc_request_done(host->mmc, mrq);
>>> + if (host->mmc->card)
>>> + mmc_card_set_removed(host->mmc->card);
>>> + host->mrq = NULL;
>>> +}
>>> +
>>> +/*
>>> * Notify the transfer complete to MMC core
>>> */
>>> static void
>>> @@ -1248,6 +1274,47 @@ static irqreturn_t omap_hsmmc_cover_irq(int irq, void *dev_id)
>>> return IRQ_HANDLED;
>>> }
>>>
>>> +/*
>>> + * irq handler to notify the core about card insertion/removal
>>> + */
>>> +static irqreturn_t omap_hsmmc_cd_irq(int irq, void *dev_id)
>>> +{
>>
>> Move this code to 'omap_hsmmc_get_cd' function. Or rather clear
>> any pending transfer upon '.init_card/omap_hsmmc_init_card'
>
> I tried using omap_hsmmc_init initially, but here is the problem:
>
> card inserted --> cd_irq_isr--> schedule mmc_rescan()
> --- after some time ---
> mmc_rescan() -->
> mmc_sd_alive() -->
> -- card removed ---
> mmc_send_status -->
> mmc_wait_for_req()-->
> wait_for_completion()
> ^^^ Here the mmc_rescan thread waits forever because, it doesn't get
> timeout interrupt for the cmd/req it sent (because card was removed).

ok, I see

> But calls to omap_hsmmc_card_init or omap_hsmmc_get_cd are in the same
> mmc_rescan thread. Hence, moving the recovery code to init_card does not
> help.

what about clearing any pending transfer in
- mmc_gpio_cd_irqt, or
- mmc_detect_change

e.g. trigger the later mentioned .card_event callback from those
functions, instead mmc_rescan? Then you can install your
omap_hsmmc_request_clear as the card_event callback. This makes your
custom isr handler redundant, actually your isr handler became
standard.

>> I guess other hosts also have some housekeeping upon unexpected
>> card removals. So I guess there is some generic code to this problem.
>> Did you check?
>
> I did try to find generic code in mmc-core, but couldn't find any.
> However, I did see some driver specific cleanups related to unexpected
> card removal in sdhci.c. sdhci handles this in .card_event call back.
> This does not help in my case as .card_event is also called from
> mmc_rescan. I think cleanup code (in sdhci driver) for unexpected card
> removal was introduced when .card_event call was outside mmc_rescan.

interesting

>>> + struct omap_hsmmc_host *host = mmc_priv(dev_id);
>>> + int carddetect = mmc_gpio_get_cd(host->mmc);
>>> + struct mmc_request *mrq = host->mrq;
>>> +
>>> + /*
>>> + * If the card was removed in the middle of data transfer last
>>> + * time, the TC/CC/timeout interrupt is not raised due to which
>>> + * mmc_request is not cleared. Hence, this card insertion will
>>> + * still see pending mmc_request. Clear the request to make sure
>>> + * that this card enumeration is successful.
>>> + */
>>> + if (!carddetect && mrq && host->transfer_incomplete) {
>>> + omap_hsmmc_disable_irq(host);
>>> + dev_info(host->dev,
>>> + "card removed during transfer last time\n");
>>> + hsmmc_command_incomplete(host, -ENOMEDIUM, 1);
>>> + omap_hsmmc_request_clear(host, host->mrq);
>>> + dev_info(host->dev, "recovery done\n");
>>> + }
>>> + host->transfer_incomplete = false;
>>> +
>>> + mmc_detect_change(host->mmc, (HZ * 200) / 1000);
>>> +
>>> + /*
>>> + * The current mmc_request is usually null before card removal
>>> + * sequence is complete. It may not be null if TC/CC interrupt
>>> + * never happens due to removal of card during a data
>>> + * transfer. Set a flag to indicate mmc_request was not null
>>> + * in order to do cleanup on next card insertion.
>>> + */
>>> + if (carddetect && mrq)
>>> + host->transfer_incomplete = true;
>>> +
>>> + return IRQ_HANDLED;
>>> +}
>>> +
>>> static void omap_hsmmc_dma_callback(void *param)
>>> {
>>> struct omap_hsmmc_host *host = param;
>>> @@ -1918,7 +1985,7 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
>>> struct mmc_host *mmc;
>>> struct omap_hsmmc_host *host = NULL;
>>> struct resource *res;
>>> - int ret, irq;
>>> + int ret, irq, len;
>>> const struct of_device_id *match;
>>> dma_cap_mask_t mask;
>>> unsigned tx_req, rx_req;
>>> @@ -1980,6 +2047,10 @@ static int omap_hsmmc_probe(struct platform_device *pdev)
>>> if (ret)
>>> goto err_gpio;
>>>
>>> + /* register cd_irq, if cd-gpios property is specified in dt */
>>> + if (of_find_property(host->dev->of_node, "cd-gpios", &len))
>>> + mmc_gpio_set_cd_isr(mmc, omap_hsmmc_cd_irq);
>>> +
>>
>> nack, this is done by mmc_of_parse
>
> mmc_of_parse parses the cd-gpios and registers a generic cd_irq_isr but
> there is no way to set driver specific cd_irq. So, the above code is
> required.

I think we don't need a custom isr.

>>> platform_set_drvdata(pdev, host);
>>>
>>> if (pdev->dev.of_node)
>>> --
>>> 2.4.1
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
>>
>
> Regards
> Vignesh

2015-07-22 07:31:34

by Vignesh Raghavendra

[permalink] [raw]
Subject: Re: [PATCH 3/3] mmc: host: omap_hsmmc: Add custom card detect irq handler

Hi Andreas,

On 07/21/2015 01:46 PM, Andreas Fenkart wrote:
> Hi Vignesh,
>
> Generally I don't like this patch, it will make it harder, not easier,
> to maintain the omap hsmmc driver. Also given that the bug occurs
> rarely, people will be reluctant to clean it up later or accept
> patches.
>
> see also comments below
>
[snip]
> 2015-06-22 15:18 GMT+02:00 Vignesh R <[email protected]>:
>>
>
>> But calls to omap_hsmmc_card_init or omap_hsmmc_get_cd are in the same
>> mmc_rescan thread. Hence, moving the recovery code to init_card does not
>> help.
>
> what about clearing any pending transfer in
> - mmc_gpio_cd_irqt, or
> - mmc_detect_change
>
> e.g. trigger the later mentioned .card_event callback from those
> functions, instead mmc_rescan? Then you can install your
> omap_hsmmc_request_clear as the card_event callback. This makes your
> custom isr handler redundant, actually your isr handler became
> standard.
>

I looked at the commit fa372a51cb5f93800f711473e5a36e0e0c9a8f00 which
moved .card_event out of mmc_gpio_cd_irqt. It points to two threads
where discussion to move .card_event to mmc_rescan happened[1][2]. The
concern there was this callback was being called from "atomic context"
but, I don't understand how threaded irq is "atomic context". (I am not
sure, if this is because those drivers have irqs disabled till threaded
irq is complete) But, I believe moving card_event callback back to
mmc_gpio_cd_irqt may break some drivers.
I will look into this further.
Do you have any insight on this commit? Thanks!

[1]https://lkml.org/lkml/2014/3/19/79
[2]https://lkml.org/lkml/2013/8/19/539

--
Regards
Vignesh