Date: Thu, 24 Mar 2016 15:30:56 +0000
From: Russell King - ARM Linux
To: Enric Balletbo Serra
Cc: Doug Anderson, "linux-mmc@vger.kernel.org", "linux-kernel@vger.kernel.org", Alim Akhtar, Jaehoon Chung, Ulf Hansson, Alim Akhtar, Sonny Rao, Andrew Bresticker, Heiko Stuebner, Addy Ke, Alexandru Stan, Chris Zhong, Caesar Wang, Javier Martinez Canillas
Subject: Re: [PATCH] mmc: dw_mmc: Wait for data transfer after response errors
Message-ID: <20160324153056.GT19428@n2100.arm.linux.org.uk>

On Thu, Mar 24, 2016 at 12:26:43PM +0100, Enric Balletbo Serra wrote:
> I just saw that Russell introduced a patch [1] that will land on 4.6.
> I think that patch solves the same issue that we're trying to fix, but
> for sdhci controller.

It doesn't sound like the same issue to me, though it was a long while
back when I was looking at sdhci, so I may be misremembering.

> The problem that we have on peach-pi, with our patch applied, is that
> when we get a response CRC error on a command and we move to start
> sending data, the transfer doesn't receives a timeout interrupt (I
> don't know why). As I told, on rockchip works due the DTO quirk.
> exynos is not using this quirk. Also, please correct me if I'm wrong,
> looks like the sdhci controller has a timer to signal the command
> timed out.

From what I remember, the problem I was seeing is that SDHCI sends a
command (iirc, a tuning command), and receives a response CRC error.
The card, however, knows nothing about the CRC error, so it moves into
the transfer state. Meanwhile, SDHCI stops processing the command,
resets the SDHCI controller and reports the error to the upper layers.
Then, a new command gets queued and issued to the card, and this fails
because the card is still in transfer state.

This totally screws up the SDHCI UHS tuning. This is not the only SDHCI
UHS tuning bug: others exist which do not yet have patches, where we can
get spurious false positives/false negatives for various tuning steps
which totally confuse the code.

From what you say above, your issue is that you get a response CRC
error, but the dw MMC masks the data side, so it sounds like a different
solution is needed.

The MMC block driver error handling is fairly robust, but there is no
core error handling (because the error handling is not obvious.) So
any command not emanating from the MMC block driver that invokes a
transfer from the card which fails won't have a stop command sent for
it.

Maybe that's a weakness of the core MMC code: when I originally designed
that part of the MMC code, my thoughts were to leave error handling to
the higher levels (such as MMC block) because it's dependent on those
higher levels. E.g., the various status bits which report errors,
whether a stop command needs to be issued, etc.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
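
As a rough illustration of the failure sequence described in the reply (a
response CRC error that only the host sees, a card left busy with the data
phase, and the next command failing), here is a toy C model. Every name in
it is hypothetical: it is not dw_mmc, sdhci, or MMC core code, the two-state
card is a deliberate simplification, and the "stop" only loosely stands in
for a real stop command.

```c
#include <stdbool.h>

/* Toy model only: none of these names are kernel or SD-spec identifiers. */
enum toy_card_state { TOY_CARD_READY, TOY_CARD_SENDING_DATA };
enum toy_cmd { TOY_CMD_READ, TOY_CMD_STOP };

struct toy_card {
	enum toy_card_state state;
};

/* Card side: the card never learns that the host saw a response CRC error. */
static bool toy_card_accepts(struct toy_card *card, enum toy_cmd cmd)
{
	if (cmd == TOY_CMD_STOP) {
		/* A stop always brings the card back to the ready state. */
		card->state = TOY_CARD_READY;
		return true;
	}
	if (card->state != TOY_CARD_READY)
		return false;	/* still busy with the previous data phase */
	card->state = TOY_CARD_SENDING_DATA;
	return true;
}

/*
 * Host side: issue a data command. If the host sees a response CRC error
 * it abandons the request; whether it also sends a stop is the policy
 * being compared here.
 */
static bool toy_host_issue(struct toy_card *card, bool resp_crc_error,
			   bool send_stop_on_error)
{
	if (!toy_card_accepts(card, TOY_CMD_READ))
		return false;	/* card rejected the command */

	if (resp_crc_error) {
		if (send_stop_on_error)
			toy_card_accepts(card, TOY_CMD_STOP);
		return false;	/* error reported to the upper layer */
	}

	/* No error: the data phase completes and the card becomes ready. */
	card->state = TOY_CARD_READY;
	return true;
}

/* Does the command issued after a response CRC error succeed? */
bool toy_next_command_ok(bool send_stop_on_error)
{
	struct toy_card card = { .state = TOY_CARD_READY };

	toy_host_issue(&card, true, send_stop_on_error);	/* CRC error */
	return toy_host_issue(&card, false, send_stop_on_error);
}
```

In this sketch, toy_next_command_ok(false) models the case where nobody
sends a stop after the error, so the follow-up command is rejected, while
toy_next_command_ok(true) models a caller that cleans up first. Which layer
should own that cleanup is exactly the open question in the thread; the
model takes no position on it.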