Received: by 2002:a05:7412:f690:b0:e2:908c:2ebd with SMTP id ej16csp876783rdb; Fri, 20 Oct 2023 01:53:24 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEQWnJGqvjt2RmOeOlpcGQ9AMQoSDG/BdS9NOZHdOfOr1DkPgRVYayz1/VFVEVc0rSPquEn X-Received: by 2002:a17:902:e887:b0:1c6:3222:c67c with SMTP id w7-20020a170902e88700b001c63222c67cmr1394093plg.23.1697792003768; Fri, 20 Oct 2023 01:53:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697792003; cv=none; d=google.com; s=arc-20160816; b=vqwXNBsi0GsnBdfghIOgM/3vSAupjPj9VZ2xOwhm0L72xzy8/RhQ6QlciBv0wGMH5f jj/3Rz/e8rDrhmO88ES7ogOUhhAmszNb/ygjWbptTduAfBsp9RDb4cu9ggFW8KkxI8cU gXy+WOz2tVfgtYhIyZVTAVm62Nn7GPRXCQ0/iguXj9vwsv3fyk0xm3NP4jmRep2YUiz7 kV1gprU5cUuLXIgqVt0VwBVj4ky5YR/paWxqWnZztaOa79/DqvtKevFcnAmkpprpFvot O7+FL1PYnXtqtfgwQmotA4K+tPhFYrFiz49lTrXUQHWIO5VJClyG34Ufp45b5L7D2iFC U8Kg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=SZMWQONtyhFjJunBgu/jOPJGbtthXaYLNhoitCzo7fs=; fh=AOpJg3VyqOa+rQ2k76pYovQ7fGKDlfzKMqsfMrSsAB8=; b=uLMdKyIyEm6xNGVaiVF4hObJmZQQv+NfXq2r9djIWeC5qI8QHxTIsKxLbuWBYYfCDX CBDDl2eZyT5CY6y/MPd/VR46rsI1mdOiIXZoJ9lJsn2cznkUk+GemzCz3s8bMpPlZ0IQ EdWu5lQLy3UJaZ6naohvNcB3gUldKNAQ3jCp91erjNF7tBxpZpfH2XAHco0Im2fZpsBs 6A0bAITPBD0XLvu7kO1FNN1o0RQOj5TzQE3wGztfq78D179UjztUriukryBKvw0LuOo+ nVRtMQ5x0HLFmmTt9UbDS3Yfsm59+ayjmi7yhc2lDocQiupNCb2Yb+36oPiknlA0lTFq WRkA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=Tw9J9ESC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id l14-20020a170903244e00b001ca33bba8ecsi1520929pls.371.2023.10.20.01.53.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 20 Oct 2023 01:53:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=Tw9J9ESC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 96F03806884D; Fri, 20 Oct 2023 01:53:22 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1376530AbjJTIxU (ORCPT + 99 others); Fri, 20 Oct 2023 04:53:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1376487AbjJTIxT (ORCPT ); Fri, 20 Oct 2023 04:53:19 -0400 Received: from mail-ua1-x92a.google.com (mail-ua1-x92a.google.com [IPv6:2607:f8b0:4864:20::92a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D365A1A8 for ; Fri, 20 Oct 2023 01:53:16 -0700 (PDT) Received: by mail-ua1-x92a.google.com with SMTP id a1e0cc1a2514c-7ab9f1efecfso224859241.3 for ; Fri, 20 Oct 2023 01:53:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1697791996; x=1698396796; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=SZMWQONtyhFjJunBgu/jOPJGbtthXaYLNhoitCzo7fs=; b=Tw9J9ESCN8MSuL0Ne8uMvx/ua8IXwrU5vWsfMKUkb7UYTn1+zZXFSFoARGX11X4hzM JqGMSkgxRYUPVHLYuGVKgI1cNYRCBwGLaLXSJZknI5zEkKrTxg6u2JbmOFoTPcPdTfhz UuJdMLSyP3oM5MTze1kjaas0q7WWPe5QtICSA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697791996; x=1698396796; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SZMWQONtyhFjJunBgu/jOPJGbtthXaYLNhoitCzo7fs=; b=hWYG64YWLzU99YCegqpNC4urgsY8NzUFfJYyaghFBKbsHg7qRbpv3SOXKs2LeusTL5 a9q3qyEQq4PATNGMDmwtCVY67hiA2k5eeQjd3p6K5yHwRBpxRShy3JfNAp5TFvoiFbk0 GKlvhjpugKlfl0BR+Q3C/TQFwkuvVh2W7Vnh5RMLlYIqNLDCu/bq7BkIWIu7AuA0Hspb 93Vrsa6FSBLDf0mv3P+1BshZyvStHQ2azOvNBmSNiXGLQEgM/YErTfnO7izE/iFVBGIg jgzKFJS6MeL/KQhOS4gRFt2nd8p+wbEJGr0r1sQ75kUD+0SVwY1/CysQyfJHBaaZnb3o frng== X-Gm-Message-State: AOJu0YywSALIIOjE+Y5dDD4+Un1F/Wi0NDC2t+np29Moav1wpNRslcd3 YCqlthXW2G/6KWC/KdI9i5phMowJLGWTt/DAAoiTCA== X-Received: by 2002:a05:6102:1051:b0:44e:d6c3:51d4 with SMTP id h17-20020a056102105100b0044ed6c351d4mr1324954vsq.18.1697791995934; Fri, 20 Oct 2023 01:53:15 -0700 (PDT) MIME-Version: 1.0 References: <20231016095610.1095084-1-korneld@chromium.org> <613c51f0-c32e-4de5-9627-525d92fb06ed@intel.com> In-Reply-To: <613c51f0-c32e-4de5-9627-525d92fb06ed@intel.com> From: =?UTF-8?Q?Kornel_Dul=C4=99ba?= Date: Fri, 20 Oct 2023 10:53:05 +0200 Message-ID: Subject: Re: [PATCH] mmc: cqhci: Be more verbose in error irq handler To: Adrian Hunter Cc: linux-mmc@vger.kernel.org, linux-kernel@vger.kernel.org, Ritesh Harjani , Asutosh Das , Ulf Hansson , Radoslaw Biernacki , Guenter Roeck , Gwendal Grignou , upstream@semihalf.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Fri, 20 Oct 2023 01:53:22 -0700 (PDT) On Fri, Oct 20, 2023 at 9:41=E2=80=AFAM Adrian Hunter wrote: > > On 16/10/23 12:56, Kornel Dul=C4=99ba wrote: > > There are several reasons for controller to generate an error interrupt= . > > They include controller<->card timeout, and CRC mismatch error. > > Right now we only get one line in the logs stating that CQE recovery wa= s > > triggered, but with no information about what caused it. > > To figure out what happened be more verbose and dump the registers from > > irq error handler logic. > > This matches the behaviour of the software timeout logic, see > > cqhci_timeout. > > > > Signed-off-by: Kornel Dul=C4=99ba > > --- > > drivers/mmc/host/cqhci-core.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/mmc/host/cqhci-core.c b/drivers/mmc/host/cqhci-cor= e.c > > index b3d7d6d8d654..33abb4bd53b5 100644 > > --- a/drivers/mmc/host/cqhci-core.c > > +++ b/drivers/mmc/host/cqhci-core.c > > @@ -700,8 +700,9 @@ static void cqhci_error_irq(struct mmc_host *mmc, u= 32 status, int cmd_error, > > > > terri =3D cqhci_readl(cq_host, CQHCI_TERRI); > > > > - pr_debug("%s: cqhci: error IRQ status: 0x%08x cmd error %d data e= rror %d TERRI: 0x%08x\n", > > - mmc_hostname(mmc), status, cmd_error, data_error, terri)= ; > > + pr_warn("%s: cqhci: error IRQ status: 0x%08x cmd error %d data er= ror %d\n", > > + mmc_hostname(mmc), status, cmd_error, data_error); > > + cqhci_dumpregs(cq_host); > > For debugging, isn't dynamic debug seems more appropriate? Dynamic debug is an option, but my personal preference would be to just log more info in the error handler. To give you some background. We're seeing some "running CQE recovery" lines in the logs, followed by a dm_verity mismatch error. The reports come from the field, with no feasible way to reproduce the issue locally. I'd argue that logging only the info that CQE recovery was executed is not particularly helpful for someone looking into those logs. Ideally we would have more data about the state the controller was in when the error happened, or at least what caused the recovery to be triggered. The question here is how verbose should we be in this error scenario. Looking at other error scenarios, in the case of a software timeout we're dumping the controller registers. (cqhci_timeout) Hence I thought that I'd be appropriate to match that and do the same in CQE recovery logic. > > > > > /* Forget about errors when recovery has already been triggered *= / > > if (cq_host->recovery_halt) >