Subject: Re: [PATCH] mmc: core: Remove timeout when enabling cache
From: Sjoerd Simons
To: Wolfram Sang, Ulf Hansson
Cc: Faiz Abbas, "linux-mmc@vger.kernel.org", kernel@collabora.com,
    Linux Kernel Mailing List, Hongjie Fang, Bastian Stender,
    Kyle Roeschley, Wolfram Sang, Shawn Lin, Harish Jenny K N,
    Simon Horman
Date: Tue, 20 Nov 2018 12:38:05 +0100
In-Reply-To: <20181120102300.GA1056@kunai>
References: <20181106133007.12318-1-sjoerd.simons@collabora.co.uk>
    <9051c212-6e2a-bc39-3686-693e6cd87f1d@ti.com>
    <303b49cbb5b687d6b6a7ad4048eda459586c0806.camel@collabora.co.uk>
    <20181107084741.GA31092@kunai> <20181120102300.GA1056@kunai>
Organization: Collabora Ltd.
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2018-11-20 at 11:23 +0100, Wolfram Sang wrote:
> > > > That also happens to be one of the cards we deploy; However i did
> > > > wonder about adding a quirk but decided against it as it was not
> > > > clear to me from the specification that CACHE ON really is meant
> > > > to complete within GENERIC_CMD6_TIMEOUT. That and i fret about
> > > > ending up in hit-a-mole games as the failure is really quite
> > > > tedious (boot failure).
> > >
> > > I agree that we should use the more defensive variant as a default.
> > > I mean there should be no performance regression since most cards
> > > will respond just faster, or? The only downside I could see is that
> > > we might miss a real timeout with no bounds set and might get stuck?
> >
> > Well, you have a point, but still it's kind of nice to know which
> > cards are behaving well and which ones that doesn't. Hence I think I
> > prefer to stick using a quirk, unless you have a strong opinion.

Not an incredibly strong opinion either; I just wonder if it's the right
trade-off.

If the quirk/work-around is not there while it should be, the impact is
that you get an unusable card (which for eMMC is likely to mean a failure
to boot the system). Which is somewhat unfortunate.

If the work-around is there while it's not needed, then there doesn't
seem to be much of an impact at all, apart from it not being reported to
the user/developer/kernel community? In which case it might make more
sense to put in a warning iff the card takes too long, with a list of
cards for which this is known?

> No strong opinion.
> Especially not if you say it is in the spec (although "must be
> sufficient" would be better than "should be" ;)). Also, I assume this
> failure is reproducible and should turn up during development?
> Compared to "happens once in a while randomly"?

For the card in question it happens only after a hard power off. The time
it takes seems correlated to the state of the cache at hard power off (it
takes substantially longer if there was a lot of I/O activity at the time
of hard power off). With light I/O activity the current timeout is
sometimes enough.

So if you know the pattern, or just happen to hit it often in e.g.
automated testing, it does show up during development. Otherwise it can
appear to "happen once in a while randomly".

Unfortunately for me, it was really a case of getting reports that some
boards had started failing at some point, which took a while to track
down. Especially since it's a battery-powered device (thus hard poweroffs
are rather rare) and we allow the board manufacturer to select from
various different eMMCs depending on price/availability at build time...

> Yet, if we add a quirk for that, then we should probably mention it in
> an error message when we hit -ETIMEDOUT for cache on ("does your card
> need this quirk?")? It can be pretty time consuming to track this down
> otherwise, I'd think.

Yes please. It would be nice if someone happens to have the right
contacts with Micron to see if it's a known issue for their cards in
general or just this one. It would also be good to have a timeout higher
than 1 second (or, for these cards, not have one?). In our testing thus
far we've seen it take up to 850ms, but it's impossible to ensure that
that's the true upper bound.

-- 
Sjoerd Simons
Collabora Ltd.
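
To make the "warn if the card takes too long" idea from this thread a bit more concrete, here is a rough userspace sketch in C. It is not kernel code and does not use any mmc core API; the type and function names (cache_on_limits, classify_cache_on, report_cache_on) and the limit values are hypothetical and chosen only for illustration. The sketch assumes two bounds: a soft limit standing in for the generic CMD6 timeout, and a more generous hard limit so that a stuck card is still caught.

```c
#include <assert.h>
#include <stdio.h>

/* Outcome of a (simulated) cache-on switch. */
enum cache_on_result {
	CACHE_ON_OK,        /* finished within the soft limit */
	CACHE_ON_SLOW,      /* finished, but slower than the soft limit */
	CACHE_ON_TIMED_OUT  /* exceeded the hard limit */
};

/* Hypothetical pair of bounds; neither value is taken from the kernel. */
struct cache_on_limits {
	unsigned int soft_ms; /* stand-in for the generic CMD6 timeout */
	unsigned int hard_ms; /* upper bound so we never wait forever */
};

/* Classify how long enabling the cache took. */
static enum cache_on_result classify_cache_on(unsigned int elapsed_ms,
					      struct cache_on_limits lim)
{
	assert(lim.soft_ms <= lim.hard_ms);

	if (elapsed_ms > lim.hard_ms)
		return CACHE_ON_TIMED_OUT;
	if (elapsed_ms > lim.soft_ms)
		return CACHE_ON_SLOW;
	return CACHE_ON_OK;
}

/* Print a hint for slow or stuck cards and return the classification. */
static enum cache_on_result report_cache_on(const char *card,
					    unsigned int elapsed_ms,
					    struct cache_on_limits lim)
{
	enum cache_on_result res = classify_cache_on(elapsed_ms, lim);

	switch (res) {
	case CACHE_ON_TIMED_OUT:
		fprintf(stderr,
			"%s: cache on exceeded %u ms, does your card need a quirk?\n",
			card, lim.hard_ms);
		break;
	case CACHE_ON_SLOW:
		fprintf(stderr,
			"%s: cache on took %u ms (soft limit %u ms)\n",
			card, elapsed_ms, lim.soft_ms);
		break;
	case CACHE_ON_OK:
		break;
	}
	return res;
}
```

With hypothetical limits of 500 ms (soft) and 2000 ms (hard), the 850 ms seen in testing would be classified as CACHE_ON_SLOW: the card stays usable and the slow behaviour is still reported, which is the trade-off argued for above.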