Date: Fri, 30 Nov 2018 15:00:28 +0800
From: "Du, Alek"
To: linux-mmc@vger.kernel.org, adrian.hunter@intel.com, ulf.hansson@linaro.org
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH] sdhci: fix the fake timeout bug
Message-ID: <20181130150028.732896d8@xdu1-mobl>
Organization: Intel APAC R&D

From b893df3a1a937bd5fe22d39ceae93454a2e5e0e4 Mon Sep 17 00:00:00 2001
From: Alek Du
Date: Fri, 30 Nov 2018 14:02:28 +0800
Subject: [PATCH] sdhci: fix the fake timeout bug

We observed fake timeouts on some devices; the log looks like this:

[ 7586.290201] mmc1: Timeout waiting for hardware cmd interrupt.
[ 7586.290420] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
...
[ 7586.291774] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000203

From the clock control register dump, we are pretty sure the clock was
stabilized.

In other cases, we also observed:

[ 7596.530171] mmc1: Timeout waiting for hardware cmd interrupt.

and

[ 1956.534634] mmc1: Reset 0x2 never completed.

But we are pretty sure the mmc controller works perfectly under low
system load.

After checking the sdhci code, we found that the timeout check has a
small window in which the CPU can be scheduled out; when it comes back,
the original time set or check is no longer valid.

Signed-off-by: Alek Du
---
 drivers/mmc/host/sdhci.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 99bdae53fa2e..f88c49fc574e 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -218,12 +218,17 @@ void sdhci_reset(struct sdhci_host *host, u8 mask)
 	/* hw clears the bit when it's done */
 	while (sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask) {
 		if (ktime_after(ktime_get(), timeout)) {
+			/* check it again, since there is a window between
+			   bit check and time check */
+			if (!(sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask))
+				break;
 			pr_err("%s: Reset 0x%x never completed.\n",
 				mmc_hostname(host->mmc), (int)mask);
 			sdhci_dumpregs(host);
 			return;
+		} else {
+			udelay(10);
 		}
-		udelay(10);
 	}
 }
 EXPORT_SYMBOL_GPL(sdhci_reset);
@@ -1395,9 +1400,10 @@ void sdhci_send_command(struct sdhci_host *host, struct mmc_command *cmd)
 		timeout += DIV_ROUND_UP(cmd->busy_timeout, 1000) * HZ + HZ;
 	else
 		timeout += 10 * HZ;
-	sdhci_mod_timer(host, cmd->mrq, timeout);
 
 	sdhci_writew(host, SDHCI_MAKE_CMD(cmd->opcode, flags), SDHCI_COMMAND);
+	/* setup timer after command to avoid fake timeout */
+	sdhci_mod_timer(host, cmd->mrq, timeout);
 }
 EXPORT_SYMBOL_GPL(sdhci_send_command);
 
@@ -1611,12 +1617,19 @@ void sdhci_enable_clk(struct sdhci_host *host, u16 clk)
 	while (!((clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL))
 		& SDHCI_CLOCK_INT_STABLE)) {
 		if (ktime_after(ktime_get(), timeout)) {
+			/* check it again since there is a window between
+			   status check and time check */
+			if ((clk = sdhci_readw(host, SDHCI_CLOCK_CONTROL))
+				& SDHCI_CLOCK_INT_STABLE)
+				break;
 			pr_err("%s: Internal clock never stabilised.\n",
 				mmc_hostname(host->mmc));
 			sdhci_dumpregs(host);
 			return;
 		}
-		udelay(10);
+		else {
+			udelay(10);
+		}
 	}
 
 	clk |= SDHCI_CLOCK_CARD_EN;
-- 
2.17.1