Date: Thu, 31 Mar 2016 11:12:50 -0700
Subject: Re: [PATCH] mmc: dw_mmc: Wait for data transfer after response errors
From: Doug Anderson
To: Russell King - ARM Linux
Cc: Enric Balletbo Serra, "linux-mmc@vger.kernel.org", "linux-kernel@vger.kernel.org", Alim Akhtar, Jaehoon Chung, Ulf Hansson, Alim Akhtar, Sonny Rao, Andrew Bresticker, Heiko Stuebner, Addy Ke, Alexandru Stan, Chris Zhong, Caesar Wang, Javier Martinez Canillas
In-Reply-To: <20160324162220.GV19428@n2100.arm.linux.org.uk>
References: <20160324153056.GT19428@n2100.arm.linux.org.uk> <20160324162220.GV19428@n2100.arm.linux.org.uk>

Hi,

On Thu, Mar 24, 2016 at 9:22 AM, Russell King - ARM Linux wrote:
> That's why, for sdhci, I came to the conclusion that waiting for the
> data transfer to complete or timeout was the best solution for SDHCI.
>
> Maybe, if sending a STOP command does cause card firmware issues, then:
>
> 1) it provides evidence that trying to send a stop command on response
>    CRC error is the wrong thing to do (it was talked about making SDHCI
>    do this.)
>
> 2) it suggests that the solution I came up with for SDHCI is the better
>    solution, rather than trying to immediately recover the situation by
>    sending a STOP command.
>
> Maybe dw-mmc can do something similar, but with the lack of data transfer
> timeout, maybe it's possible to do something with a kernel timer instead,
> and check what the hardware is doing after a response CRC error?

Since the problem only reproduced with a single model of a single
brand of card, it is probably a card firmware issue as you say.
Presumably if we put this card in an exynos that had tuning enabled
we'd see similar issues.  I haven't had a chance to test this.

dw_mmc does have a data transfer timeout, but for some reason it
doesn't seem to fire reliably on Rockchip.  Sounds like Shawn may be
digging here?  In the past I've found that the same code running on
rk3288 and exynos would fire the timeout on exynos but not on rk3288.
That's the reason why rk3288 has a special quirk to enable a software
timer.

In the case of a CRC error, perhaps the controller isn't sending a data
timeout because it already told us about the CRC error (so no need to
report further?), but I guess the quirk on the Rockchip platform makes
it work.

Sounds like the right fix is to enable a timer after the CRC error
(similar to the DTO quirk) and not send a STOP.  Even if it's not a
firmware problem, keeping dw_mmc behaving more like SDHCI is a good
idea: SD cards out there presumably aren't tested against all
controllers, so we'll have the most compatibility if all controllers
behave the same (even if the spec technically allows them to behave
otherwise).

-Doug
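
To make the proposal above concrete (arm a software timer after a
response CRC error and wait for the data phase, rather than sending a
STOP), here is a minimal userspace sketch in C.  It is not dw_mmc code
and makes no claim about how the driver is structured: every name in it
(model_host, model_handle_event, the EV_* events and ST_* states) is
hypothetical, and the "timer" is just a flag standing in for a kernel
timer.

```c
/*
 * Userspace model only; this is NOT dw_mmc driver code.
 * Every identifier below is hypothetical and chosen for illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_event {
	EV_RESP_CRC_ERROR,	/* command response failed its CRC check */
	EV_DATA_OVER,		/* controller reported end of data transfer */
	EV_SW_TIMER_EXPIRED,	/* software fallback timer fired */
};

enum model_state {
	ST_SENDING_CMD,		/* command issued, waiting for its response */
	ST_WAIT_DATA,		/* response failed; letting data phase finish */
	ST_DONE,		/* request can be completed */
};

struct model_host {
	enum model_state state;
	bool sw_timer_armed;	/* stands in for a kernel timer */
	bool resp_crc_error;	/* error to report once the request ends */
	bool data_timed_out;	/* set only if the fallback timer fired */
};

void model_host_init(struct model_host *host)
{
	assert(host != NULL);
	host->state = ST_SENDING_CMD;
	host->sw_timer_armed = false;
	host->resp_crc_error = false;
	host->data_timed_out = false;
}

void model_handle_event(struct model_host *host, enum model_event ev)
{
	assert(host != NULL);

	switch (host->state) {
	case ST_SENDING_CMD:
		if (ev == EV_RESP_CRC_ERROR) {
			/* Remember the error, but do not send a STOP here. */
			host->resp_crc_error = true;
			/* Arm the fallback in case data-over never arrives. */
			host->sw_timer_armed = true;
			host->state = ST_WAIT_DATA;
		}
		break;
	case ST_WAIT_DATA:
		if (ev == EV_DATA_OVER) {
			/* Hardware finished on its own; cancel the fallback. */
			host->sw_timer_armed = false;
			host->state = ST_DONE;
		} else if (ev == EV_SW_TIMER_EXPIRED && host->sw_timer_armed) {
			/* Hardware stayed silent; treat it as a data timeout. */
			host->sw_timer_armed = false;
			host->data_timed_out = true;
			host->state = ST_DONE;
		}
		break;
	case ST_DONE:
		/* Late events after completion are ignored. */
		break;
	}
}
```

In this model the request ends in exactly one of two ways after the
CRC error: the controller reports data-over and the timer is cancelled,
or the timer expires and the request is finished as a data timeout.
Neither path issues a STOP.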