From: Ulf Hansson
Date: Tue, 25 Feb 2020 15:26:16 +0100
Subject: Re: LKFT: arm x15: mmc1: cache flush error -110
To: Jon Hunter, Faiz Abbas
Cc: Bitan Biswas, Adrian Hunter, Naresh Kamboju, Jens Axboe, Alexei Starovoitov, linux-block, lkft-triage@lists.linaro.org, open list, "linux-mmc@vger.kernel.org", Arnd Bergmann, John Stultz, Thierry Reding

+ Faiz Abbas

On Tue, 25 Feb 2020 at 12:41, Jon Hunter wrote:
>
>
> On 25/02/2020 10:04, Jon Hunter wrote:
>
> ...
>
> >>> I find that from the commit the changes in mmc_flush_cache below is
> >>> the cause.
> >>>
> >>> ##
> >>> @@ -961,7 +963,8 @@ int mmc_flush_cache(struct mmc_card *card)
> >>>                         (card->ext_csd.cache_size > 0) &&
> >>>                         (card->ext_csd.cache_ctrl & 1)) {
> >>>                 err = mmc_switch(card, EXT_CSD_CMD_SET_NORMAL,
> >>> -                                EXT_CSD_FLUSH_CACHE, 1, 0);
> >>> +                                EXT_CSD_FLUSH_CACHE, 1,
> >>> +                                MMC_CACHE_FLUSH_TIMEOUT_MS);
> >
> >
> > I no longer see the issue on reverting the above hunk as Bitan suggested
> > but now I see the following (which is expected) ...
> >
> > WARNING KERN mmc1: unspecified timeout for CMD6 - use generic
>
> For Tegra, the default timeout used when no timeout is specified for CMD6
> is 100mS.
> So hard-coding the following also appears to workaround the
> problem on Tegra ...

Interesting.

>
> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> index 868653bc1555..5155e0240fca 100644
> --- a/drivers/mmc/core/mmc_ops.c
> +++ b/drivers/mmc/core/mmc_ops.c
> @@ -992,7 +992,7 @@ int mmc_flush_cache(struct mmc_card *card)
>                         (card->ext_csd.cache_size > 0) &&
>                         (card->ext_csd.cache_ctrl & 1)) {
>                 err = mmc_switch(card, EXT_CSD_CMD_SET_NORMAL,
> -                                EXT_CSD_FLUSH_CACHE, 1, 0);
> +                                EXT_CSD_FLUSH_CACHE, 1, 100);
>                 if (err)
>                         pr_err("%s: cache flush error %d\n",
>                                         mmc_hostname(card->host), err);
>
> So the problem appears to be causing by the timeout being too long rather
> than not long enough.
>
> Looking more at the code, I think now that we are hitting the condition
> ...
>
> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> index 868653bc1555..feae82b1ff35 100644
> --- a/drivers/mmc/core/mmc_ops.c
> +++ b/drivers/mmc/core/mmc_ops.c
> @@ -579,8 +579,10 @@ int __mmc_switch(struct mmc_card *card, u8 set, u8 index, u8 value,
>          * the host to avoid HW busy detection, by converting to a R1 response
>          * instead of a R1B.
>          */
> -       if (host->max_busy_timeout && (timeout_ms > host->max_busy_timeout))
> +       if (host->max_busy_timeout && (timeout_ms > host->max_busy_timeout)) {
> +               pr_warn("%s: timeout (%d) > max busy timeout (%d)", mmc_hostname(host), timeout_ms, host->max_busy_timeout);
>                 use_r1b_resp = false;
> +       }
>
>
> With the above I see ...
>
> WARNING KERN mmc1: timeout (1600) > max busy timeout (672)
>
> So with the longer timeout we are not using/requesting the response.

You are most likely correct.

However, from the core point of view, the response is still
requested, only that we don't want the driver to wait for the card to
stop signaling busy. Instead we want to deal with that via "polling"
from the core.
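
To make the condition quoted above concrete, here is a minimal
standalone sketch in C of how the choice between an R1B and an R1
response falls out of the two timeouts. It is not the kernel code:
struct host_sketch and want_r1b_resp() are hypothetical names
introduced only for illustration, and only the comparison itself
mirrors the quoted __mmc_switch() hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one field of struct mmc_host used here. */
struct host_sketch {
	unsigned int max_busy_timeout;	/* in ms; 0 means no limit reported */
};

/*
 * Returns true when an R1B response (host waits for busy in hardware)
 * would be kept, false when the request is converted to R1 and busy
 * completion is left to polling from the core.
 * The comparison mirrors the condition in the quoted diff.
 */
bool want_r1b_resp(const struct host_sketch *host, unsigned int timeout_ms)
{
	if (host->max_busy_timeout && timeout_ms > host->max_busy_timeout)
		return false;

	return true;
}
```

With the values from the log above (max busy timeout 672 ms), a
1600 ms request returns false, i.e. an R1 response with busy handling
left to the core, whereas the hard-coded 100 ms returns true. A timeout
of 0 also returns true under this comparison.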

This is a rather worrying behaviour, as it seems like the host driver
doesn't really follow these expectations from the core point of view.
And mmc_flush_cache() is not the only case, as we have erase, bkops,
sanitize, etc. Are all of these working, or are they not really well
tested?

Earlier, before my three patches, if the timeout_ms parameter provided
to __mmc_switch() was zero, which was the case for mmc_flush_cache(),
__mmc_switch() simply skipped validating it against
host->max_busy_timeout, which was wrong. In any case, this also meant
that an R1B response was always used for mmc_flush_cache(), as you also
indicated above. Perhaps this is the critical part where things can go
wrong.

BTW, have you tried erase commands for the sdhci tegra driver? If those
are working fine, do you have any special treatment for them?

I have looped in Faiz, as sdhci-omap seems to suffer from very similar
problems. One thing I noted for sdhci-omap is that MMC_ERASE commands
are treated in a special manner in sdhci_omap_set_timeout(). This
indicates that there is something fishy going on. Faiz, can you please
comment on this?

Kind regards
Uffe