From: Ulf Hansson
Date: Tue, 20 Nov 2018 14:08:28 +0100
Subject: Re: [PATCH] mmc: core: Remove timeout when enabling cache
To: Sjoerd Simons
Cc: Wolfram Sang, Faiz Abbas, "linux-mmc@vger.kernel.org", kernel@collabora.com, Linux Kernel Mailing List, Hongjie Fang, Bastian Stender, Kyle Roeschley, Wolfram Sang, Shawn Lin, Harish Jenny K N, Simon Horman, Hal Emmerich
X-Mailing-List: linux-kernel@vger.kernel.org

+ Hal Emmerich

On 20 November 2018 at 12:38, Sjoerd Simons wrote:
> On Tue, 2018-11-20 at 11:23 +0100, Wolfram Sang wrote:
>> > > > That also happens to be one of the cards we deploy; However i did
>> > > > wonder about adding a quirk but decided against it as it was not clear
>> > > > to me from the specification that CACHE ON really is meant to complete
>> > > > within GENERIC_CMD6_TIMEOUT. That and i fret about ending up in
>> > > > hit-a-mole games as the failure is really quite tedious (boot failure).
>> > >
>> > > I agree that we should use the more defensive variant as a default.
>> > > I mean there should be no performance regression since most cards will
>> > > respond just faster, or? The only downside I could see is that we might
>> > > miss a real timeout with no bounds set and might get stuck?
>> >
>> > Well, you have a point, but still it's kind of nice to know which
>> > cards are behaving well and which ones don't. Hence I think I
>> > prefer to stick to using a quirk, unless you have a strong opinion.
>
> Not an incredibly strong opinion either; I just wonder if it's the
> right trade-off.
>
> If the quirk/work-around is not there while it should be, the impact is
> that you get an unusable card (which for eMMC is likely to mean a
> failure to boot the system). Which is somewhat unfortunate.
>
> If the work-around is there while it's not needed then there doesn't
> seem to be much of an impact at all; Apart from it not being reported
> to the user/developer/kernel community?
>
> In which case it might make more sense to put in a warning iff the card
> takes too long with a list of cards for which this is known?
>
>> No strong opinion. Especially not if you say it is in the spec (although
>> "must be sufficient" would be better than "should be" ;)). Also, I
>> assume this failure is reproducible and should turn up during
>> development? Compared to "happens once in a while randomly"?
>
> For the card in question it happens only on hard power off; The time it
> takes seems correlated to the state of the cache at hard power off (It
> takes substantially longer if there was a lot of I/O activity at
> the time of hard power off). With light I/O activity the current
> timeout is sometimes enough.
>
> So if you know the pattern, or just happen to hit it often in e.g.
> automated testing, it does show up during development. Otherwise it can
> appear to "happen once in a while randomly".

I don't quite follow. As far as I understand, the extended timeout is
needed when turning the cache on.
The above seems more related to flushing the cache, no? Flushing has no
timeout (also reported to be an issue [1]), and it happens either at
_mmc_hw_reset() or at _mmc_suspend(). What is the relation here?

>
> Unfortunately for me, it was really a case of getting reports that some
> boards started failing at some point which took a while to track back.
> Especially since it's a battery powered device (thus hard poweroffs are
> rather rare) and we allow the board manufacturer to select from various
> different eMMCs depending on price/availability at build time...
>
>> Yet, if we add a quirk for that, then we should probably mention it in
>> an error message when we hit -ETIMEDOUT for cache on ("does your card
>> need this quirk?")? It can be pretty time consuming to track this down
>> otherwise, I'd think.
>
> Yes please. It would be nice if someone happens to have the right
> contacts with Micron to see if it's a known issue for their cards in
> general or just this one.
>
> Also would be good to have a timeout higher than 1 second (or for
> these cards not have one?); In our testing thus far we've seen timeouts
> up to 850ms, but it's impossible to ensure that that's the true upper
> bound.

Using no limit for the timeout would mean we may hang for ~10 minutes
(MMC_OPS_TIMEOUT_MS) instead, no thanks. I am fine with, let's say, double
850ms (1700ms), to have some room.

Anyway, the point is, the timeouts in the spec are there for a reason.
Unfortunately, I think the spec is "lazy" in some other regards and doesn't
specify timeouts, which complicates things.

Kind regards
Uffe

[1] https://www.spinics.net/lists/linux-mmc/msg51815.html
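
For illustration only, the compromise discussed in this thread (keep the
spec timeout by default, give quirked cards double the observed 850ms, and
hint at the quirk on -ETIMEDOUT) could be sketched roughly as below. This
is standalone C, not kernel code: struct card_info, the quirk_long_cache_on
flag and both helper functions are hypothetical names made up for this
sketch, and the 250ms value used in the usage note is just an example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Numbers taken from the thread; the macro names are hypothetical. */
#define OBSERVED_CACHE_ON_MAX_MS 850u /* worst case seen in testing */
#define CACHE_ON_MARGIN_FACTOR   2u   /* "double, to have some room" */

/* Hypothetical stand-in for per-card data; not the kernel's mmc_card. */
struct card_info {
	unsigned int generic_cmd6_time_ms; /* timeout the card advertises */
	bool quirk_long_cache_on;          /* assumed quirk flag */
};

/* Pick the timeout to use for the cache-on switch command. */
static unsigned int cache_on_timeout_ms(const struct card_info *card)
{
	unsigned int quirk_ms = OBSERVED_CACHE_ON_MAX_MS * CACHE_ON_MARGIN_FACTOR;

	assert(card != NULL);

	/* Quirked cards get the extended bound, never less than the spec value. */
	if (card->quirk_long_cache_on && quirk_ms > card->generic_cmd6_time_ms)
		return quirk_ms;

	/* Everyone else keeps the bounded, spec-provided timeout. */
	return card->generic_cmd6_time_ms;
}

/* True when a cache-on failure should print a "does your card need this
 * quirk?" hint: only on timeout, and only if the quirk is not already set. */
static bool should_hint_quirk(const struct card_info *card, int err)
{
	assert(card != NULL);
	return err == -ETIMEDOUT && !card->quirk_long_cache_on;
}
```

With a card advertising 250ms and the quirk set, cache_on_timeout_ms()
would return 1700; without the quirk it stays at 250, and a -ETIMEDOUT
result would then trigger the hint. Either way the wait stays bounded,
which is the point of not dropping the timeout altogether.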