Date: Wed, 27 Apr 2016 12:02:52 +0200
Subject: Re: [PATCH] mmc: mediatek: fix request blocked by cancel_delayed_work
From: Ulf Hansson
To: Chaotian Jing
Cc: Matthias Brugger, Nicolas Boichat, Douglas Anderson,
    Geert Uytterhoeven, linux-mmc, linux-arm-kernel@lists.infradead.org,
    linux-mediatek@lists.infradead.org, linux-kernel@vger.kernel.org,
    srv_heupstream, Sascha Hauer
In-Reply-To: <1461404630.7081.9.camel@mhfsdcap03>
References: <1460963609-16179-1-git-send-email-chaotian.jing@mediatek.com>
    <1461404630.7081.9.camel@mhfsdcap03>

On 23 April 2016 at 11:43, Chaotian Jing wrote:
> Hi,
> On Fri, 2016-04-22 at 14:24 +0200, Ulf Hansson wrote:
>> On 18 April 2016 at 09:13, Chaotian Jing wrote:
>> > there are 2 points will cause could not call mmc_request_done()
>> > and eventually cause the caller thread blocked.
>> >
>> > A. if card was busy, cancel_delayed_work() will return false because
>> > the delay work has not been scheduled, in this case, need put
>> > mod_delayed_work() in front of msdc_cmd_is_ready()
>> >
>> > B. if a request really need more than 5s(Some Sandisk TF card), it will
>> > use cancel_delayed_work() to cancel itself, and also return false, so use
>> > in_interrupt() to avoid this case
>> >
>> > Signed-off-by: Chaotian Jing
>> > ---
>> >  drivers/mmc/host/mtk-sd.c | 11 ++++++++---
>> >  1 file changed, 8 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/mmc/host/mtk-sd.c b/drivers/mmc/host/mtk-sd.c
>> > index b17f30d..1511b1b 100644
>> > --- a/drivers/mmc/host/mtk-sd.c
>> > +++ b/drivers/mmc/host/mtk-sd.c
>> > @@ -724,7 +724,7 @@ static void msdc_request_done(struct msdc_host *host, struct mmc_request *mrq)
>> >         bool ret;
>> >
>> >         ret = cancel_delayed_work(&host->req_timeout);
>> > -       if (!ret) {
>> > +       if (!ret && in_interrupt()) {
>> >                 /* delay work already running */
>> >                 return;
>> >         }
>> > @@ -824,7 +824,12 @@ static inline bool msdc_cmd_is_ready(struct msdc_host *host,
>> >         }
>> >
>> >         if (mmc_resp_type(cmd) == MMC_RSP_R1B || cmd->data) {
>> > -               tmo = jiffies + msecs_to_jiffies(20);
>> > +               /*
>> > +                * 2550ms is from EXT_CSD[248], after switch to hs200,
>> > +                * using CMD13 to polling card status, it will get response
>> > +                * of 0x800, but EMMC still pull-low DAT0.
>> > +                */
>>
>> Seems like you are solving a eMMC specific issue on your driver?
>>
>> Perhaps we should try to use a card quirk instead?
>
> Actually, this is a Bug of __mmc_switch(), Per JEDEC Spec, while switch
> speed mode, should not use CMD13 to get card status, as it's response
> cannot reflect that if card was busy now, for this CMD6 switch HS200

There is a statement applicable to all HS modes, which says it's *not
recommended* but *if* used, CRC errors shall be ignored. That's what we
have been doing so far, but perhaps that isn't good enough for
HS200/400.

> case, I tried some Samsung/Sandisk/KSI eMMC, issue CMD13 will always get
> 0x800, even eMMC has already changed to transfer state and DAT0 is high,
> the response of CMD13 is also 0x800, and will never be 0x900.

What do you mean by never? I assume it would when you extend the
timeout?

Does your host driver make sure to ignore CRC errors in this case? Just
to be sure that isn't the problem.

> So, in __mmc_switch(), it's a bug to use CMD13 to know that if card has
> already changed to transfer state.

Whether it's a bug or not, it seems like we have eMMC cards that we
have trouble supporting because of the way we have interpreted the
spec. So let's try to fix them!

> But, Our host do not support MMC_CAP_WAIT_WHILE_BUSY, that's why we hit
> this issue.

Okay, I see.

Let's try to change the behaviour in __mmc_switch() to prevent it from
sending CMD13 before the card stops signalling busy on DAT0, when
switching to HS200/HS400 mode.

What I have in mind is:

1. When the host controller doesn't support MMC_CAP_WAIT_WHILE_BUSY, we
would need to wait for a fixed timeout before we send CMD13. In this
case, do you know if the "generic_cmd6_time" is working for the eMMC
devices that you had issues with?

2. In addition to the above solution, for those hosts that support the
->card_busy() ops, but not MMC_CAP_WAIT_WHILE_BUSY, we can invoke
->card_busy() in a polling manner. Of course the above timeout should
also be considered, as we need to stop polling at some point.

I noticed the Mediatek mmc host driver supports the ->card_busy() ops,
so I think you can try 1) first, then extend it to 2).

Does it make sense?

[...]

Kind regards
Uffe
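
To make options 1) and 2) above concrete, here is a minimal userspace
sketch of the waiting logic. It is not the __mmc_switch() code and not a
proposed patch: struct fake_host, fake_card_busy(), fake_sleep_ms() and
wait_before_cmd13() are hypothetical names made up for illustration, and
time is simulated rather than slept. Only the R1 status bit layout
(READY_FOR_DATA in bit 8, current state in bits 12:9, "tran" = 4) is
taken from the card status definition, which is why 0x900 means "tran
and ready" while 0x800 means "tran but not ready".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* R1 card status fields as returned by CMD13. */
#define R1_READY_FOR_DATA   (1u << 8)
#define R1_CURRENT_STATE(x) (((x) >> 9) & 0xfu)
#define R1_STATE_TRAN       4u

/* Hypothetical stand-in for a host controller; not a kernel structure. */
struct fake_host {
	bool has_card_busy;    /* host offers a ->card_busy()-like hook */
	unsigned busy_ms_left; /* simulated time the card keeps DAT0 low */
	unsigned elapsed_ms;   /* simulated time spent waiting */
};

/* Simulated ->card_busy(): true while DAT0 is still held low. */
static bool fake_card_busy(const struct fake_host *h)
{
	return h->busy_ms_left > 0;
}

/* Simulated sleep: advances the clock and lets the card make progress. */
static void fake_sleep_ms(struct fake_host *h, unsigned ms)
{
	h->elapsed_ms += ms;
	h->busy_ms_left = h->busy_ms_left > ms ? h->busy_ms_left - ms : 0;
}

/*
 * Wait until it should be safe to send CMD13 after the CMD6 switch.
 * Returns 0 when the caller may send CMD13, -1 if polling timed out.
 */
static int wait_before_cmd13(struct fake_host *h, unsigned timeout_ms,
			     unsigned poll_ms)
{
	unsigned waited = 0;

	assert(poll_ms > 0); /* a zero step would never make progress */

	if (!h->has_card_busy) {
		/* Option 1: no busy detection, so wait the full timeout. */
		fake_sleep_ms(h, timeout_ms);
		return 0;
	}

	/* Option 2: poll the busy hook, bounded by the same timeout. */
	while (fake_card_busy(h)) {
		if (waited >= timeout_ms)
			return -1;
		fake_sleep_ms(h, poll_ms);
		waited += poll_ms;
	}
	return 0;
}

/* True when a CMD13 status reports transfer state and READY_FOR_DATA. */
static bool status_is_tran_ready(uint32_t status)
{
	return R1_CURRENT_STATE(status) == R1_STATE_TRAN &&
	       (status & R1_READY_FOR_DATA) != 0;
}
```

Note that the fixed wait in option 1) cannot tell whether the card
really finished, so it is only as good as the timeout value chosen; the
polling variant can both return early and report a timeout.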