Date: Thu, 24 Mar 2016 09:06:45 -0700
Subject: Re: [PATCH] mmc: dw_mmc: Wait for data transfer after response errors
From: Doug Anderson
To: Russell King - ARM Linux
Cc: Enric Balletbo Serra, "linux-mmc@vger.kernel.org",
    "linux-kernel@vger.kernel.org", Alim Akhtar, Jaehoon Chung,
    Ulf Hansson, Alim Akhtar, Sonny Rao, Andrew Bresticker,
    Heiko Stuebner, Addy Ke, Alexandru Stan, Chris Zhong, Caesar Wang,
    Javier Martinez Canillas
In-Reply-To: <20160324153056.GT19428@n2100.arm.linux.org.uk>
References: <20160324153056.GT19428@n2100.arm.linux.org.uk>

Russell,

On Thu, Mar 24, 2016 at 8:30 AM, Russell King - ARM Linux wrote:
> On Thu, Mar 24, 2016 at 12:26:43PM +0100, Enric Balletbo Serra wrote:
>> I just saw that Russell introduced a patch [1] that will land on 4.6.
>> I think that patch solves the same issue that we're trying to fix, but
>> for sdhci controller.
>
> It doesn't sound like the same issue to me, though it was a long while
> back when I was looking at sdhci, so I may be misremembering.
>
>> The problem that we have on peach-pi, with our patch applied, is that
>> when we get a response CRC error on a command and we move to start
>> sending data, the transfer doesn't receives a timeout interrupt (I
>> don't know why). As I told, on rockchip works due the DTO quirk.
>> exynos is not using this quirk.
>> Also, please correct me if I'm wrong,
>> looks like the sdhci controller has a timer to signal the command
>> timed out.
>
> From what I remember, the problem I was seeing is that SDHCI sends a
> command (iirc, a tuning command), and receives a response CRC error.
> The card, however, knows nothing about the CRC error, so it moves into
> the transfer state.
>
> Meanwhile, SDHCI stopped processing the command, resetting the SDHCI
> controller and reporting the error to the upper layers.
>
> Then, a new command gets queued, issued to the card, and this fails
> because the card is still in transfer state.  This totally screws up
> the SDHCI UHS tuning.
>
> This is not the only SDHCI UHS tuning bug: others exist which do not
> yet have patches, where we can get spurious false positives/false
> negatives for various tuning steps which totally confuse the code.
>
> From what you say above, your issue is that you get a response CRC
> error, but the dw MMC masks the data side, which sounds like a
> different solution is needed.

What I was seeing was that when the controller saw the CRC error it
tried to abort with a "stop" command.  You can see the
"send_stop_abort(host, data)" in dw_mmc.c.  Then I saw:

> Sending the stop command after the "response CRC error" would
> then throw the system into a confused state causing all future
> tuning phases to report failure.

Presumably this is similar to what you saw: the host saw the CRC error
but the card knew nothing about it.  Sending the stop command during
this time confused the card.  Presumably the card was in transfer
state during this time?

-Doug
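
To make the suspected failure mode concrete, here is a toy user-space
model of it.  This is only a sketch and is not dw_mmc code: the enum
names, the two "policies" and the function card_in_transfer_at_next_cmd()
are all hypothetical, and the card is reduced to two states.  It assumes
the card received the command intact (so it enters the data phase) while
only the host saw the response CRC error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified card states (not the full MMC state set). */
enum card_state { CARD_IDLE, CARD_TRANSFER };

/* Hypothetical host policies after a response CRC error on a data command. */
enum host_policy {
	POLICY_STOP_IMMEDIATELY,   /* issue the stop/abort right away */
	POLICY_WAIT_FOR_DATA_OVER, /* let the data phase finish first */
};

/*
 * Returns true if the card is still in the transfer state when the host
 * issues its next command, which is the situation suspected of confusing
 * the card during tuning.
 */
bool card_in_transfer_at_next_cmd(enum host_policy policy)
{
	/* The card saw a valid command, so it starts the data phase. */
	enum card_state card = CARD_TRANSFER;

	assert(policy == POLICY_STOP_IMMEDIATELY ||
	       policy == POLICY_WAIT_FOR_DATA_OVER);

	/* Only the host saw the response CRC error; the card is unaware. */
	if (policy == POLICY_WAIT_FOR_DATA_OVER) {
		/* Assumption: the data phase completes (or times out) here. */
		card = CARD_IDLE;
	}

	return card == CARD_TRANSFER;
}
```

In this model the "stop immediately" policy always reaches the card
mid-transfer, while waiting for the data phase to end does not.  Whether
the real card is in transfer state at that point is the open question
above; the model only encodes that assumption.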