Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp3597952pxj; Mon, 21 Jun 2021 02:18:39 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyCAPKXvKgX55i3eFLep4XaOZJju68mr9UF7pXNkbtLL3wfj56AKLXmg+2BYg2f4SIyibDg X-Received: by 2002:a05:6402:152:: with SMTP id s18mr19810337edu.221.1624267119555; Mon, 21 Jun 2021 02:18:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1624267119; cv=none; d=google.com; s=arc-20160816; b=V9GzYfe5OjgUbbOjkmqVf11lVfyITPC6WAMqlpAlqmDbycY+ELdnfv/dYu5jSP05zy 6HxdHAYGMizPvZnR9xYxEvCsRpjqzNtYq3DcBZLlFpQdlA9hjyFgX9CZ6Xxcu0qt+LzD P/pAU3nC2/rHpiXJVLCkkQB2RWoxOsp2f0/gVF5Cht/k6E8icIvpWWIjVmlvIoRSBm5w XGqI/DMsvuChQ2oqtbLomkQNwddzw7/bqGMav3AGDWaKioaC+OStynPBIc7qPRXSlx2+ ai+3uopi8A2MCY9iPH9e4fxe9Or1OPeDOqg2aPG8bsZM3XSjxonW3m6oL7NsmFN2igC0 Eu6A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:organization :from:references:cc:to:subject:ironport-sdr:ironport-sdr; bh=RSxMFekXs5u0XR6LtG+eypS9kMRy7ys8erY8LuGG4i0=; b=0g29DBTF+V+ZloY5Y+RKgc1DMAGlfq+M+tlsbRLkdfKK/fkKurfIxeYIvtYYKQ1jdP ddHcyZGPT/idgs5IM5OAAnW4OcxLWU3h+usiKGwVKuFnDKWgiqrLsKaWUI5M1QDW0Rhj 5odx7iMc5hi7LWCz73pBRvkAvBT6DinCzt9xlIabsXLRhUx7WsTHHzf5o+FmJIQdgwLo IixDe1jeHwZHQxFgTa1kaX7w2tf/88ZvqoVtxWVXHqJH4UVICB/l0RYmzVNZFP8FfV7r NKM7sNeNgry7jKfeG1YadETMJ5gdumY82+6wFjHGj8aaquqridAx3sdD08Pf3CSvQQb7 pAuQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h12si1440620ejx.90.2021.06.21.02.18.17; Mon, 21 Jun 2021 02:18:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230232AbhFUJS4 (ORCPT + 99 others); Mon, 21 Jun 2021 05:18:56 -0400 Received: from mga02.intel.com ([134.134.136.20]:17479 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230161AbhFUJSz (ORCPT ); Mon, 21 Jun 2021 05:18:55 -0400 IronPort-SDR: XyePhxm1nkJun3Xy0gNIRPvtT6XTyi7jgTOSl4Z6UNYvw8v6+WsbPZZ0OduErklIBCvwAeI274 mOKLyvgI7IzA== X-IronPort-AV: E=McAfee;i="6200,9189,10021"; a="193942767" X-IronPort-AV: E=Sophos;i="5.83,289,1616482800"; d="scan'208";a="193942767" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Jun 2021 02:16:40 -0700 IronPort-SDR: iQmkomCwDDdVXXDaqkNApQLjRWgLJtyXG4+cnhHyF0BlK9tAHMpsCJ1Fdp9xpNHUNCeP3fGcLy ANjGWplN0H2Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,289,1616482800"; d="scan'208";a="405563804" Received: from ahunter-desktop.fi.intel.com (HELO [10.237.72.79]) ([10.237.72.79]) by orsmga006.jf.intel.com with ESMTP; 21 Jun 2021 02:16:35 -0700 Subject: Re: [PATCH V1] mmc: sdhci: Update the software timeout value for sdhc To: Doug Anderson , Veerabhadrarao Badiganti Cc: Shaik Sajida Bhanu , Ulf Hansson , Asutosh Das , Sahitya Tummala , Ram Prakash Gupta , Sayali Lokhande , sartgarg@codeaurora.org, Rajendra Nayak , cang@codeaurora.org, pragalla@codeaurora.org, nitirawa@codeaurora.org, Linux MMC List , LKML , linux-arm-msm , Andy Gross , Bjorn Andersson , Rob Herring References: <1623835535-30871-1-git-send-email-sbhanu@codeaurora.org> <585e003c-0342-4691-ab6d-8c6a930f9404@codeaurora.org> From: Adrian Hunter Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki, Business Identity Code: 0357606 - 4, Domiciled in Helsinki Message-ID: <1c61c5dc-4857-6089-0588-81c36dee1fd6@intel.com> Date: Mon, 21 Jun 2021 12:16:58 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 19/06/21 1:49 am, Doug Anderson wrote: > Hi, > > On Fri, Jun 18, 2021 at 8:31 AM Veerabhadrarao Badiganti > wrote: >> >> >> On 6/16/2021 2:55 PM, Shaik Sajida Bhanu wrote: >>> Whenever SDHC run at clock rate 50MHZ or below, the hardware data >>> timeout value will be 21.47secs, which is approx. 22secs and we have >>> a current software timeout value as 10secs. We have to set software >>> timeout value more than the hardware data timeout value to avioid seeing >>> the below register dumps. >>> >>> [ 332.953670] mmc2: Timeout waiting for hardware interrupt. >>> [ 332.959608] mmc2: sdhci: ============ SDHCI REGISTER DUMP =========== >>> [ 332.966450] mmc2: sdhci: Sys addr: 0x00000000 | Version: 0x00007202 >>> [ 332.973256] mmc2: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000001 >>> [ 332.980054] mmc2: sdhci: Argument: 0x00000000 | Trn mode: 0x00000027 >>> [ 332.986864] mmc2: sdhci: Present: 0x01f801f6 | Host ctl: 0x0000001f >>> [ 332.993671] mmc2: sdhci: Power: 0x00000001 | Blk gap: 0x00000000 >>> [ 333.000583] mmc2: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007 >>> [ 333.007386] mmc2: sdhci: Timeout: 0x0000000e | Int stat: 0x00000000 >>> [ 333.014182] mmc2: sdhci: Int enab: 0x03ff100b | Sig enab: 0x03ff100b >>> [ 333.020976] mmc2: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000 >>> [ 333.027771] mmc2: sdhci: Caps: 0x322dc8b2 | Caps_1: 0x0000808f >>> [ 333.034561] mmc2: sdhci: Cmd: 0x0000183a | Max curr: 0x00000000 >>> [ 333.041359] mmc2: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x00000000 >>> [ 333.048157] mmc2: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000 >>> [ 333.054945] mmc2: sdhci: Host ctl2: 0x00000000 >>> [ 333.059657] mmc2: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: >>> 0x0000000ffffff218 >>> [ 333.067178] mmc2: sdhci_msm: ----------- VENDOR REGISTER DUMP >>> ----------- >>> [ 333.074343] mmc2: sdhci_msm: DLL sts: 0x00000000 | DLL cfg: >>> 0x6000642c | DLL cfg2: 0x0020a000 >>> [ 333.083417] mmc2: sdhci_msm: DLL cfg3: 0x00000000 | DLL usr ctl: >>> 0x00000000 | DDR cfg: 0x80040873 >>> [ 333.092850] mmc2: sdhci_msm: Vndr func: 0x00008a9c | Vndr func2 : >>> 0xf88218a8 Vndr func3: 0x02626040 >>> [ 333.102371] mmc2: sdhci: ============================================ >>> >>> So, set software timeout value more than hardware timeout value. >>> >>> Signed-off-by: Shaik Sajida Bhanu >>> --- >>> drivers/mmc/host/sdhci.c | 9 ++++++++- >>> 1 file changed, 8 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c >>> index bf238ad..1386f7d 100644 >>> --- a/drivers/mmc/host/sdhci.c >>> +++ b/drivers/mmc/host/sdhci.c >>> @@ -1670,7 +1670,14 @@ static bool sdhci_send_command(struct sdhci_host *host, struct mmc_command *cmd) >>> else if (!cmd->data && cmd->busy_timeout > 9000) >>> timeout += DIV_ROUND_UP(cmd->busy_timeout, 1000) * HZ + HZ; >>> else >>> - timeout += 10 * HZ; >>> + /* >>> + * In some of the conditions hardware data timeout value could be >>> + * approx 21.5 seconds and driver is setting software data timeout >>> + * value less than the hardware data timeout value and software data >>> + * timeout value should be more than the hardware data timeout value. >>> + * So, set software data timeout value more than 21.5 sec i.e. 22sec. >>> + */ >>> + timeout += 22 * HZ; >> >> This timeout is qcom SDHC specific. >> I think right way is to, define your own set_timeout op and update >> host->data_timeout >> in that as per qcom SDHC requirements. > > It is? Off-thread Shaik was indicating that the problem had to do with > the inaccuracy of the "SDHCI_TIMEOUT_CONTROL" register. That seems to > be in the common SDHCI code. Specifically looking at > sdhci_calc_timeout() it can be seen that the possible hardware values > for the timeout double each time, so if you need a hardware timeout > that's slightly higher than one of the possible values you might end > up with a hardware timeout that's almost double what you want. > > Assuming that the problem actually is with the inaccuracy of > SDHCI_TIMEOUT_CONTROL (I didn't walk through and validate), it > actually seems like we should generally be doubling the value we were > programming for the software timeout (in other words, not just ones > that are <= 9 seconds). I haven't done all the math, but I presume the > reason that we need 22 instead of 20 is some type of extra fudge > factor somewhere? > > Maybe the only reason that Qualcomm hits this is due to the PLL that's > sourcing the SDHCI controller at a non-standard rate? > > I suppose another reason maybe why people aren't hitting it is just > the random chance of what rate the integer overflow in > mmc_set_data_timeout() leaves you at? I pointed this out to Shiak and > was hoping a patch would be included for that, but I can always try to > write one later if not. To be concrete, I have > card->csd.taac_ns=5000000 and card->csd.r2w_factor=5. Multiplying > things out (and accounting for mult=100 for SD cards), I end up with a > timeout_ns of 0x3b9aca000 (16 seconds) which doesn't fit in the 32-bit > data->timeout_ns earlier. The truncation was making it look like the > card requested a max timeout of 3,115,098,112 ns = ~3.1 seconds. The software timeout is meant to catch cases where the hardware timeout has failed, but SDHCI also supports a mechanism to use software timeouts when the timeout is too big for the hardware - refer SDHCI_QUIRK2_DISABLE_HW_TIMEOUT and sdhci_calc_sw_timeout() Historically, R/W timeouts were never longer than 2.5 seconds. (5ms x 5 x 100 == 2.5s), so 10s has been enough ever since the driver was written around 2006 Longer timeouts are for erase etc, but they are now always set in cmd->busy_timeout. So 22s is a strangely long timeout for R/W operations, and I would say even 10s is longer than we really want. One option would be to make it a variable that host controllers can re-assign if needed e.g. diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c index 6aaf5c3ce34c..e5b4317937ed 100644 --- a/drivers/mmc/host/sdhci.c +++ b/drivers/mmc/host/sdhci.c @@ -1670,7 +1670,7 @@ static bool sdhci_send_command(struct sdhci_host *host, struct mmc_command *cmd) else if (!cmd->data && cmd->busy_timeout > 9000) timeout += DIV_ROUND_UP(cmd->busy_timeout, 1000) * HZ + HZ; else - timeout += 10 * HZ; + timeout += host->fallback_timeout_secs * HZ; sdhci_mod_timer(host, cmd->mrq, timeout); if (host->use_external_dma) @@ -3936,6 +3936,8 @@ struct sdhci_host *sdhci_alloc_host(struct device *dev, */ host->adma_table_cnt = SDHCI_MAX_SEGS * 2 + 1; + host->fallback_timeout_secs = 10; + return host; }