Message-ID: <273f9bf0-0018-a34e-7bf0-2f6ad9aa73ee@intel.com>
Date: Wed, 1 Jun 2022 09:04:46 +0300
Subject: Re: [PATCH V1] mmc: core: Enable force hw reset
From: Adrian Hunter
To: "Sarthak Garg (QUIC)", "Kamasali Satyanarayan (Consultant) (QUIC)",
    quic_spathi, "ulf.hansson@linaro.org", "axboe@kernel.dk",
    "avri.altman@wdc.com", "kch@nvidia.com", "CLoehle@hyperstone.com",
    "swboyd@chromium.org", "digetx@gmail.com", "bigeasy@linutronix.de",
    "linux-mmc@vger.kernel.org", "linux-kernel@vger.kernel.org"
References: <1650961818-13452-1-git-send-email-quic_spathi@quicinc.com>
    <7db46c19-a92a-a13a-eb63-38e5ed31580f@intel.com>
    <618e533c-7155-3a8d-53f1-04c436a21364@intel.com>
In-Reply-To: <618e533c-7155-3a8d-53f1-04c436a21364@intel.com>
Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki,
    Business Identity Code: 0357606 - 4, Domiciled in Helsinki
X-Mailing-List: linux-kernel@vger.kernel.org

On 27/05/22 15:44, Adrian Hunter wrote:
> On 25/05/22 10:06, Sarthak Garg (QUIC) wrote:
>> Hi Adrian,
>>
>> Thanks for the review.
>> Please find comments inline.
>>
>> Thanks,
>> Sarthak
>>
>>> -----Original Message-----
>>> From: Kamasali Satyanarayan (Consultant) (QUIC)
>>> Sent: Tuesday, May 24, 2022 5:33 PM
>>> To: 'Adrian Hunter'; quic_spathi; ulf.hansson@linaro.org;
>>> riteshh@codeaurora.org; asutoshd@codeaurora.org; axboe@kernel.dk;
>>> avri.altman@wdc.com; kch@nvidia.com; CLoehle@hyperstone.com;
>>> swboyd@chromium.org; digetx@gmail.com; bigeasy@linutronix.de;
>>> linux-mmc@vger.kernel.org; linux-kernel@vger.kernel.org;
>>> Sarthak Garg (QUIC)
>>> Cc: Shaik Sajida Bhanu
>>> Subject: RE: [PATCH V1] mmc: core: Enable force hw reset
>>>
>>> Hi,
>>> These patches will be further taken by Sarthak.
>>>
>>> Thanks,
>>> Satya
>>>
>>> -----Original Message-----
>>> From: Adrian Hunter
>>> Sent: Wednesday, April 27, 2022 6:04 PM
>>> To: quic_spathi; ulf.hansson@linaro.org; riteshh@codeaurora.org;
>>> asutoshd@codeaurora.org; axboe@kernel.dk; avri.altman@wdc.com;
>>> kch@nvidia.com; CLoehle@hyperstone.com; swboyd@chromium.org;
>>> digetx@gmail.com; bigeasy@linutronix.de; linux-mmc@vger.kernel.org;
>>> linux-kernel@vger.kernel.org
>>> Cc: Shaik Sajida Bhanu; Kamasali Satyanarayan (Consultant) (QUIC)
>>> Subject: Re: [PATCH V1] mmc: core: Enable force hw reset
>>>
>>> On 26/04/22 11:30, Srinivasarao Pathipati wrote:
>>>> From: Shaik Sajida Bhanu
>>>>
>>>> During error recovery set need hw reset to handle ICE error where cqe
>>>> reset is must.
>>>
>>> How do you get ICE errors?  Doesn't it mean either the hardware is broken or
>>> the configuration is broken?
>>
>> This patch is not intended for ICE errors; we will update the commit text in V2.
>> A while back, intermittent recovery failures were observed, but after
>> forcing a hardware reset during error recovery we have not seen a single
>> recovery failure. This has made recovery more robust for us.
>> Any suggestions on how we can take it forward will be highly appreciated.
>
> We can definitely go forward, but with hopefully a little more
> explanation first.
>
> It is preferable to be able to explain why changes are being made.
>
> Do you have any logs or other information on the recovery failures?
> Are you able to reproduce the problem?
>
> I notice you always do mmc_blk_reset_success().  Does that mean you
> sometimes need several resets in a row?
>
> A potential issue that I notice, is that the recovery does not
> explicitly deal with the case that the card's command queue has
> been disabled e.g. due to RPMB access or IOCTL commands.  Are you
> using either of those?

Looking closer, the command queue is reenabled on the error path so that
is not a concern, but I did send:

	https://lore.kernel.org/linux-mmc/20220531171922.76080-1-adrian.hunter@intel.com/

For your case, if it is not about ICE errors, why not add only:

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index f4a1281658db..a2ee850a5c16 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1497,7 +1497,7 @@ void mmc_blk_cqe_recovery(struct mmc_queue *mq)
 	pr_debug("%s: CQE recovery start\n", mmc_hostname(host));
 
 	err = mmc_cqe_recovery(host);
-	if (err)
+	if (err || host->cqe_recovery_reset_always)
 		mmc_blk_reset(mq->blkdata, host, MMC_BLK_CQE_RECOVERY);
 	mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
 

And then just set it in your host controller driver probe function.

	host->cqe_recovery_reset_always = true;

>
>>>
>>>>
>>>> Signed-off-by: Shaik Sajida Bhanu
>>>> Signed-off-by: kamasali
>>>> Signed-off-by: Srinivasarao Pathipati
>>>> ---
>>>>  drivers/mmc/core/block.c      | 8 +++++---
>>>>  drivers/mmc/host/cqhci-core.c | 7 +++++--
>>>>  include/linux/mmc/host.h      | 1 +
>>>>  3 files changed, 11 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
>>>> index b35e7a9..f63bf33 100644
>>>> --- a/drivers/mmc/core/block.c
>>>> +++ b/drivers/mmc/core/block.c
>>>> @@ -1482,10 +1482,12 @@ void mmc_blk_cqe_recovery(struct mmc_queue *mq)
>>>>  	pr_debug("%s: CQE recovery start\n", mmc_hostname(host));
>>>>
>>>>  	err = mmc_cqe_recovery(host);
>>>> -	if (err)
>>>> +	if (err || host->need_hw_reset) {
>>>>  		mmc_blk_reset(mq->blkdata, host, MMC_BLK_CQE_RECOVERY);
>>>> -	else
>>>> -		mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
>>>> +		if (host->need_hw_reset)
>>>> +			host->need_hw_reset = false;
>>>> +	}
>>>> +	mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
>>>>
>>>>  	pr_debug("%s: CQE recovery done\n", mmc_hostname(host));
>>>>  }
>>>> diff --git a/drivers/mmc/host/cqhci-core.c b/drivers/mmc/host/cqhci-core.c
>>>> index b0d30c3..311b510 100644
>>>> --- a/drivers/mmc/host/cqhci-core.c
>>>> +++ b/drivers/mmc/host/cqhci-core.c
>>>> @@ -812,18 +812,21 @@ static void cqhci_finish_mrq(struct mmc_host *mmc, unsigned int tag)
>>>>  irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
>>>>  		      int data_error)
>>>>  {
>>>> -	u32 status;
>>>> +	u32 status, ice_err;
>>>>  	unsigned long tag = 0, comp_status;
>>>>  	struct cqhci_host *cq_host = mmc->cqe_private;
>>>>
>>>>  	status = cqhci_readl(cq_host, CQHCI_IS);
>>>>  	cqhci_writel(cq_host, status, CQHCI_IS);
>>>> +	ice_err = status & (CQHCI_IS_GCE | CQHCI_IS_ICCE);
>>>>
>>>>  	pr_debug("%s: cqhci: IRQ status: 0x%08x\n", mmc_hostname(mmc), status);
>>>>
>>>>  	if ((status & (CQHCI_IS_RED | CQHCI_IS_GCE | CQHCI_IS_ICCE)) ||
>>>> -	    cmd_error || data_error)
>>>> +	    cmd_error || data_error || ice_err){
>>>> +		mmc->need_hw_reset = true;
>>>>  		cqhci_error_irq(mmc, status, cmd_error, data_error);
>>>> +	}
>>>>
>>>>  	if (status & CQHCI_IS_TCC) {
>>>>  		/* read TCN and complete the request */
>>>> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
>>>> index c193c50..3d00bcf 100644
>>>> --- a/include/linux/mmc/host.h
>>>> +++ b/include/linux/mmc/host.h
>>>> @@ -492,6 +492,7 @@ struct mmc_host {
>>>>  	int			cqe_qdepth;
>>>>  	bool			cqe_enabled;
>>>>  	bool			cqe_on;
>>>> +	bool			need_hw_reset;
>>>>
>>>>  	/* Inline encryption support */
>>>>  #ifdef CONFIG_MMC_CRYPTO
>>>
>>
>
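
For illustration only, the decision made by the one-line change suggested
above can be modelled in plain user-space C.  This is a rough sketch, not
kernel code: struct model_host, model_cqe_recovery(), model_blk_reset() and
model_blk_cqe_recovery() are hypothetical stand-ins invented for this
example, and only the cqe_recovery_reset_always name comes from the
suggestion in this thread.  The model assumes a non-zero return value means
that recovery failed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few struct mmc_host fields discussed here. */
struct model_host {
	bool cqe_recovery_reset_always;	/* flag name suggested in this thread */
	int reset_count;		/* model only: counts simulated resets */
};

/* Model-only stub for mmc_cqe_recovery(): returns the error it is given. */
static int model_cqe_recovery(int simulated_err)
{
	return simulated_err;
}

/* Model-only stub for mmc_blk_reset(): just records that a reset happened. */
static void model_blk_reset(struct model_host *host)
{
	host->reset_count++;
}

/*
 * Mirrors the control flow of the suggested change: reset when recovery
 * failed, or unconditionally when the host driver has set the flag.
 */
static void model_blk_cqe_recovery(struct model_host *host, int simulated_err)
{
	int err = model_cqe_recovery(simulated_err);

	if (err || host->cqe_recovery_reset_always)
		model_blk_reset(host);
	/* The mmc_blk_reset_success() bookkeeping is left out of the model. */
}
```

With the flag clear, a reset happens only when recovery returns an error.
With the flag set, every recovery is followed by a reset, which appears to be
the behaviour you describe as having made recovery robust on your platform.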