Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp4355718ybf; Wed, 4 Mar 2020 02:19:05 -0800 (PST) X-Google-Smtp-Source: ADFU+vuC2tIWMMo+WtqzL+Sh2h9ZoekDkb2JEM7Ir1Z8aTiENt2OpCoISnhtcFRGvJncNLdPJN1w X-Received: by 2002:aca:4f17:: with SMTP id d23mr1231859oib.145.1583317145338; Wed, 04 Mar 2020 02:19:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583317145; cv=none; d=google.com; s=arc-20160816; b=WLqNWQmBKnu87iTmSZc1gfSK19Bb3cQ6QmicKhyXakoObzB/2neNgvV46lCeqhHTBR YsK1mSh1ecF7R7nSiTH7QsI4hYrCaosgwFeiOoEH4YkxlfG7UYQl9fLs8BVFT5cAPtbm i47KjbDAhI1wir9l+oiEWKB/ptAbilxzh/LGFxR1/ojGYLsdeH9+xsobytk/WgYTDZjk vpk5/y/xirY1a5SxjKIBf+sgSSfyGIThfLEVDgYDDVuzb4Y5lV7g0MZoU0EETYUXORIl NLuM0t+GzSBGZnAxqN9Tix/iJJpSU77k2uZRLae1/t3SSIFjOFOBuPO7nzA6h+shz5Gh eXYg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=ryX1DRtLYtnzOLBN2SFLNUQtTwzLIdm+5Qluhd2PgCc=; b=zUaZivmx22+LIWqGcYWMM6gTiAtpdmkSysw6V/24CWDc7ZAtC3vneVh19zjvuQIRR2 bfMNTbCZUpzwiwxav6PX6KH9L1RTDpgzZDvTNgUg58Te8jBr5BHCi35r9dYBNQzGs3iN fK3fucDMERpOPL0PzMHGRA7fDtneFLu1CXGWGciLhbLgmBsxFaWXhITJ3cpxmnruBPFB eA9IjAwAkUMt0XMamz2T4Ym/oOoJlj625p1HhxmVgWjf/TpnD32ht9SF1IA5R5iaJNQH e4ev8IHG9SnREdlyDOffMkUF6cv4XykdidWk0Zl47eYFAPuZrKH7tGhzZfPyOHNy/957 Q0BQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=qvCQbIM8; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 9si867981ois.90.2020.03.04.02.18.53; Wed, 04 Mar 2020 02:19:05 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=qvCQbIM8; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729118AbgCDKSm (ORCPT + 99 others); Wed, 4 Mar 2020 05:18:42 -0500 Received: from mail-vk1-f193.google.com ([209.85.221.193]:44404 "EHLO mail-vk1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728916AbgCDKSl (ORCPT ); Wed, 4 Mar 2020 05:18:41 -0500 Received: by mail-vk1-f193.google.com with SMTP id x62so383762vkg.11 for ; Wed, 04 Mar 2020 02:18:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=ryX1DRtLYtnzOLBN2SFLNUQtTwzLIdm+5Qluhd2PgCc=; b=qvCQbIM8kf9q/8f1t/hUVsUlD43NnuWvBQVKa4PTh4Acfy31W2AO1PG+00Jvd2mOGx b05ydmJff9n49CdPpWmhwSEWwbDKnY6n2xjTnxLrl2KLNGuXm9HUte9L/NiXolPPn65s kY788/7OGFAg91Wiihy1AH3iYFwnvdN490T4wbPoPCaAAhTbWUflPyJsd5HUHbg0sE/q YKzCRcJ0BDz8jj3tevy/nBesggEKoqNwB/Oi5nH4lRUxwcuByVA4xjd0eMvZiIEzzYiX wt9XdyiD/f9yTRoREY7syhpWshVoSJApGw+1bcBbum6LRyhfCOYVFItIC/WdTE0UhKIo ytZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=ryX1DRtLYtnzOLBN2SFLNUQtTwzLIdm+5Qluhd2PgCc=; b=p6XTJWzownk0R6W5ZzKkbPCpjkPB3DcWMxAGo6fhofdH3u1q6ujftYY0ImqbwtKhWN nl0PhGCebtpRwbAOMRQFYtxFv587GDP9YblQfp22+V4/htLgjHdBflcNVGDNy7/Y3CO8 UrsFmZT8MIpC0JljY1cqSkwxPqe+frnfO145DA0qx+klIZM0V1cabOgdJsKd2vJR4MPt dXi4CyIRJARl2ZpeemsJrrDJSNOBaLHC8ef0gYbXgpxx54/KNOzdqr18R4ygsHj56/A9 Kkf3HJOIDCxHLH6B5vazhyYKvQ95+1/kB7sp1/Q6vzck8k1EQXVe+TQD1mzNk4cupX3z N+EQ== X-Gm-Message-State: ANhLgQ0zsC7n1IO+fNvX3ZP3+NH0uMdGbU/Xgp9nw7m+5pp6oO9sKLGb CmKPQmQW9KyCW9GQvt4ekkQUqtqC8lwnVkqfYytAoQ== X-Received: by 2002:a1f:2c08:: with SMTP id s8mr907196vks.53.1583317120689; Wed, 04 Mar 2020 02:18:40 -0800 (PST) MIME-Version: 1.0 References: <6523119a-50ac-973a-d1cd-ab1569259411@nvidia.com> <0963b60f-15e7-4bc6-10df-6fc8003e4d42@nvidia.com> <34fd84d7-387b-b6f3-7fb3-aa490909e205@ti.com> <5e9b5646-bd48-e55b-54ee-1c2c41fc9218@nvidia.com> In-Reply-To: <5e9b5646-bd48-e55b-54ee-1c2c41fc9218@nvidia.com> From: Ulf Hansson Date: Wed, 4 Mar 2020 11:18:04 +0100 Message-ID: Subject: Re: LKFT: arm x15: mmc1: cache flush error -110 To: Sowjanya Komatineni Cc: Jon Hunter , Bitan Biswas , Adrian Hunter , Naresh Kamboju , Jens Axboe , Alexei Starovoitov , linux-block , lkft-triage@lists.linaro.org, open list , "linux-mmc@vger.kernel.org" , Arnd Bergmann , John Stultz , Faiz Abbas , Thierry Reding , Anders Roxell , Kishon Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org [...] > > So, from my side, me and Anders Roxell, have been collaborating on > testing the behaviour on a TI Beagleboard x15 (remotely with limited > debug options), which is using the sdhci-omap variant. I am trying to > get hold of an Nvidia jetson-TX2, but not found one yet. These are the > conclusions from the observed behaviour on the Beagleboard for the > CMD6 cache flush command. > > First, the reported host->max_busy_timeout is 2581 (ms) for the > sdhci-omap driver in this configuration. > > 1. As we all know by now, the cache flush command (CMD6) fails with > -110 currently. This is when MMC_CACHE_FLUSH_TIMEOUT_MS is set to 30 * > 1000 (30s), which means __mmc_switch() drops the MMC_RSP_BUSY flag > from the command. > > 2. Changing the MMC_CACHE_FLUSH_TIMEOUT_MS to 2000 (2s), means that > the MMC_RSP_BUSY flag becomes set by __mmc_switch, because of the > timeout_ms parameter is less than max_busy_timeout (2000 < 2581). > Then everything works fine. > > 3. Updating the code to again use 30s as the > MMC_CACHE_FLUSH_TIMEOUT_MS, but instead forcing the MMC_RSP_BUSY to be > set, even when the timeout_ms becomes greater than max_busy_timeout. > This also works fine. > > Clearly this indicates a problem that I think needs to be addressed in > the sdhci driver. However, of course I can revert the three discussed > patches to fix the problem, but that would only hide the issues and I > am sure we would then get back to this issue, sooner or later. > > To fix the problem in the sdhci driver, I would appreciate if someone > from TI and Nvidia can step in to help, as I don't have the HW on my > desk. > > Comments or other ideas of how to move forward? [...] > Hi Ulf, > > I could repro during suspend on Jetson TX1/TX2 as when it does mmc flush = cache. Okay, great. > > > Timeout I see is for switch status CMD13 after sending CMD6 as device sid= e CMD6 is still inflight while host sends CMD13 as we are using R1 response= type with timeout_ms changes to 30s. > > > > Earlier we used timeout_ms of 0 for CMD6 flush cache, and with it uses R1= B response type and host will wait for busy state followed by response from= device for CMD6 and then data lines go High. > > > > Now with timeout_ms changed to 30s, we use R1 response and SW waits for b= usy by checking for DAT0 line to go High. If I understand correctly, because of the timeout now set to 30s, MMC_RSP_BUSY becomes disabled in __mmc_switch() for your case in sdhci-tegra as well? In other words, mmc_poll_for_busy() is being called, which in your case means the ->card_busy() host ops (set to sdhci_card_busy() in your case) will be invoked to wait for the card to stop signal busy on DAT0. This indicates to me, that the ->card_busy() ops returns zero to inform that the card is *not* busy, even if the card actually signals busy? Is that correct? > > > > With R1B type, host design after sending command at end of completion aft= er end bit waits for 2 cycles for data line to go low (busy state from devi= ce) and waits for response cycles after which data lines will go back high = and then we issue switch status CMD13. > > > > With R1 type, host after sending command and at end of completion after e= nd bit, DATA lines will go high immediately as its R1 type and switch statu= s CMD13 gets issued but by this time it looks like CMD6 on device side is s= till in flight for sending status and data. So, yes, using R1 instead of R1B triggers a different behaviour, but according to the eMMC spec it's perfectly allowed to issue a CMD13 even if the card signals busy on DAT0. The CMD13 is not using the DATA lines, so this should work. If I understand correctly, your driver (and controller?) has issues with coping with this scenario. Is it something that can be fixed? > > > 30s timeout is the wait time for data0 line to go high and mmc_busy_statu= s will return success right away with R1 response type and SW sends switch = status CMD13 but during that time on device side looks like still processin= g CMD6 as we are not waiting for enough time when we use R1 response type. Right, as stated above, isn't sdhci_card_busy() working for your case? Can we fix it? > > > > > Actually we always use R1B with CMD6 as per spec. I fully agree that R1B is preferable, but it's not against the spec to send CMD13 to poll for busy. Moreover, we need to cope with the scenario when the host has specified a maximum timeout that isn't sufficiently long enough for the requested operation. Do you have another proposal for how to manage this, but disabling MMC_RSP_BUSY? Let's assume you driver would get a R1B for the CMD6 (we force it), then what timeout would the driver be using if we would set cmd.busy_timeout to 30ms? Kind regards Uffe