From: Radoslaw Biernacki
Date: Thu, 2 Nov 2023 15:18:33 +0100
Subject: Re: [PATCH 1/2] mmc: cqhci: Add a quirk to clear stale TC
To: Kornel Dulęba
Cc: Adrian Hunter, linux-mmc@vger.kernel.org, linux-kernel@vger.kernel.org, Ulf Hansson, Gwendal Grignou
References: <20231027145623.2258723-1-korneld@chromium.org> <20231027145623.2258723-2-korneld@chromium.org> <63e54bfd-9bb3-423b-a965-e0a9b399671c@intel.com>

On Thu, Nov 2, 2023 at 3:07 PM Kornel Dulęba wrote:
>
> On Thu, Nov 02, 2023 at 01:01:22PM +0200, Adrian Hunter wrote:
> > On 2/11/23 11:21, Kornel Dulęba wrote:
> > > On Mon, Oct 30, 2023 at 8:31 PM Adrian Hunter wrote:
> > >>
> > >> On 27/10/23 17:56, Kornel Dulęba wrote:
> > >>> This fix addresses a stale task completion event issued right after the
> > >>> CQE recovery. As it's a hardware issue the fix is done in form of a
> > >>> quirk.
> > >>>
> > >>> When an error interrupt is received, the driver runs the recovery logic.
> > >>> It halts the controller, clears all pending tasks, and then re-enables
> > >>> it. On some platforms a stale task completion event is observed,
> > >>> regardless of the CQHCI_CLEAR_ALL_TASKS bit being set.
> > >>>
> > >>> This results in either:
> > >>> a) Spurious TC completion event for an empty slot.
> > >>> b) Corrupted data being passed up the stack, as a result of premature
> > >>>    completion for a newly added task.
> > >>>
> > >>> To fix that re-enable the controller, clear task completion bits,
> > >>> interrupt status register and halt it again.
> > >>> This is done at the end of the recovery process, right before interrupts
> > >>> are re-enabled.
> > >>>
> > >>> Signed-off-by: Kornel Dulęba
> > >>> ---
> > >>>  drivers/mmc/host/cqhci-core.c | 42 +++++++++++++++++++++++++++++++++++
> > >>>  drivers/mmc/host/cqhci.h      |  1 +
> > >>>  2 files changed, 43 insertions(+)
> > >>>
> > >>> diff --git a/drivers/mmc/host/cqhci-core.c b/drivers/mmc/host/cqhci-core.c
> > >>> index b3d7d6d8d654..e534222df90c 100644
> > >>> --- a/drivers/mmc/host/cqhci-core.c
> > >>> +++ b/drivers/mmc/host/cqhci-core.c
> > >>> @@ -1062,6 +1062,45 @@ static void cqhci_recover_mrqs(struct cqhci_host *cq_host)
> > >>>  /* CQHCI could be expected to clear it's internal state pretty quickly */
> > >>>  #define CQHCI_CLEAR_TIMEOUT		20
> > >>>
> > >>> +/*
> > >>> + * During CQE recovery all pending tasks are cleared from the
> > >>> + * controller and its state is being reset.
> > >>> + * On some platforms the controller sets a task completion bit for
> > >>> + * a stale(previously cleared) task right after being re-enabled.
> > >>> + * This results in a spurious interrupt at best and corrupted data
> > >>> + * being passed up the stack at worst. The latter happens when
> > >>> + * the driver enqueues a new request on the problematic task slot
> > >>> + * before the "spurious" task completion interrupt is handled.
> > >>> + * To fix it:
> > >>> + * 1. Re-enable controller by clearing the halt flag.
> > >>> + * 2. Clear interrupt status and the task completion register.
> > >>> + * 3. Halt the controller again to be consistent with quirkless logic.
> > >>> + *
> > >>> + * This assumes that there are no pending requests on the queue.
> > >>> + */
> > >>> +static void cqhci_quirk_clear_stale_tc(struct cqhci_host *cq_host)
> > >>> +{
> > >>> +	u32 reg;
> > >>> +
> > >>> +	WARN_ON(cq_host->qcnt);
> > >>> +	cqhci_writel(cq_host, 0, CQHCI_CTL);
> > >>> +	if ((cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT)) {
> > >>> +		pr_err("%s: cqhci: CQE failed to exit halt state\n",
> > >>> +			mmc_hostname(cq_host->mmc));
> > >>> +	}
> > >>> +	reg = cqhci_readl(cq_host, CQHCI_TCN);
> > >>> +	cqhci_writel(cq_host, reg, CQHCI_TCN);
> > >>> +	reg = cqhci_readl(cq_host, CQHCI_IS);
> > >>> +	cqhci_writel(cq_host, reg, CQHCI_IS);
> > >>> +
> > >>> +	/*
> > >>> +	 * Halt the controller again.
> > >>> +	 * This is only needed so that we're consistent across quirk
> > >>> +	 * and quirkless logic.
> > >>> +	 */
> > >>> +	cqhci_halt(cq_host->mmc, CQHCI_FINISH_HALT_TIMEOUT);
> > >>> +}
> > >>
> > >> Thanks a lot for tracking this down!
> > >>
> > >> It could be that the "un-halt" starts a task, so it would be
> > >> better to force the "clear" to work if possible, which
> > >> should be the case if CQE is disabled.
> > >>
> > >> Would you mind trying the code below? Note the increased
> > >> CQHCI_START_HALT_TIMEOUT helps avoid trying to clear tasks
> > >> when CQE has not halted.
> > >
> > > I've run a quick test and it works just fine.
> >
> > Thank you!
> >
> > > Your approach looks better than what I proposed, since as you
> > > mentioned, doing it like this avoids some weird side effects, e.g. DMA
> > > to freed memory.
> > > Do you plan to include it in the other series that you posted yesterday?
> >
> > Yes I will do that
>
> Feel free to add "Tested-by: Kornel Dulęba" and
> maybe "Reported-by".

I do not want to be your advocate, Kornel, but I think you earned a
Co-developed-by. That was a lot of work.