From: Can Guo
To: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org,
    rnayak@codeaurora.org, linux-scsi@vger.kernel.org, kernel-team@android.com,
    saravanak@google.com, salyzyn@google.com, cang@codeaurora.org
Cc: Alim Akhtar, Avri Altman, "James E.J. Bottomley", "Martin K. Petersen",
    Stanley Chu, Bean Huo, Bart Van Assche,
    linux-kernel@vger.kernel.org (open list)
Subject: [PATCH 6/9] scsi: ufs: Recover hba runtime PM error in error handler
Date: Sun, 9 Aug 2020 05:15:52 -0700
Message-Id: <1596975355-39813-7-git-send-email-cang@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1596975355-39813-1-git-send-email-cang@codeaurora.org>
References: <1596975355-39813-1-git-send-email-cang@codeaurora.org>

The current error handler cannot recover from an hba runtime PM error if
ufshcd_suspend/resume has failed due to UFS errors, e.g. a hibern8
enter/exit error or an SSU cmd error. When this happens, the error handler
may fail to do a full reset and restore, because it always assumes that
powers, IRQs and clocks are ready after pm_runtime_get_sync() returns, but
they are not if ufshcd_resume fails [1]. Besides, if ufshcd_suspend/resume
fails due to a UFS error, the runtime PM framework saves the error value to
dev.power.runtime_error. After that, hba dev runtime suspend/resume will
not be invoked anymore unless runtime_error is cleared [2].

When ufshcd_suspend/resume fails due to UFS errors, for scenario [1], the
error handler cannot assume anything about pm_runtime_get_sync(), meaning
it should explicitly turn ON powers, IRQs and clocks again. To get hba
runtime PM working as expected again for scenario [2], the error handler
can clear runtime_error by calling pm_runtime_set_active() if full reset
and restore succeeds. More importantly, if pm_runtime_set_active() returns
no error, which means runtime_error has been cleared, we also need to
resume those scsi devices under hba in case any of them has failed to be
resumed due to the hba runtime resume failure. This is to unblock
blk_queue_enter in case there are bios waiting inside it.

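For illustration only, scenario [2] can be sketched as a small userspace
model. This is not kernel code: struct model_dev and every model_*() name
below are hypothetical, and the model deliberately ignores disable_depth,
locking and usage counts. It only shows the latching behaviour described
above: once a resume callback fails, the saved runtime_error blocks further
resume attempts until a set-active style call clears it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; NOT the kernel's struct dev_pm_info. */
enum model_rpm_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

struct model_dev {
	enum model_rpm_status status;
	int runtime_error;	/* latched error from a failed callback */
	bool resume_cb_fails;	/* simulates a UFS error during resume */
};

/* Model of a runtime resume request (assumed, simplified semantics). */
static int model_runtime_resume(struct model_dev *dev)
{
	if (dev->runtime_error)
		return -EINVAL;	/* latched: the callback is not invoked */
	if (dev->status == MODEL_RPM_ACTIVE)
		return 0;
	if (dev->resume_cb_fails) {
		dev->runtime_error = -EIO;	/* save the callback error */
		return -EIO;
	}
	dev->status = MODEL_RPM_ACTIVE;
	return 0;
}

/* Model of a set-active call: succeeds only if an error was latched. */
static int model_set_active(struct model_dev *dev)
{
	if (!dev->runtime_error)
		return -EAGAIN;	/* nothing to recover in this model */
	dev->runtime_error = 0;
	dev->status = MODEL_RPM_ACTIVE;
	return 0;
}

/* Walks through scenario [2]; returns 0 if every step behaves as modelled. */
int model_demo(void)
{
	struct model_dev dev = {
		.status = MODEL_RPM_SUSPENDED,
		.runtime_error = 0,
		.resume_cb_fails = true,
	};

	if (model_runtime_resume(&dev) != -EIO)
		return 1;	/* first resume fails and latches the error */
	dev.resume_cb_fails = false;	/* e.g. after a full reset and restore */
	if (model_runtime_resume(&dev) != -EINVAL)
		return 2;	/* still blocked by the latched error */
	if (model_set_active(&dev) != 0)
		return 3;	/* recovery clears the latched error */
	assert(dev.runtime_error == 0);
	if (model_runtime_resume(&dev) != 0)
		return 4;	/* runtime PM requests work again */
	return 0;
}
```

In this model, model_set_active() returning 0 plays the role of the
"returns no error" case in the description: it is the signal that an error
had been latched and that dependent devices may still be stuck suspended.
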
Signed-off-by: Can Guo
Reviewed-by: Bean Huo
Reviewed-by: Asutosh Das

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 2604016..ed24582 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include "ufshcd.h"
 #include "ufs_quirks.h"
 #include "unipro.h"
@@ -229,6 +230,10 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba);
 static int ufshcd_change_power_mode(struct ufs_hba *hba,
 			     struct ufs_pa_layer_attr *pwr_mode);
 static void ufshcd_schedule_eh_work(struct ufs_hba *hba);
+static int ufshcd_setup_hba_vreg(struct ufs_hba *hba, bool on);
+static int ufshcd_setup_vreg(struct ufs_hba *hba, bool on);
+static inline int ufshcd_config_vreg_hpm(struct ufs_hba *hba,
+					 struct ufs_vreg *vreg);
 static int ufshcd_wb_buf_flush_enable(struct ufs_hba *hba);
 static int ufshcd_wb_buf_flush_disable(struct ufs_hba *hba);
 static int ufshcd_wb_ctrl(struct ufs_hba *hba, bool enable);
@@ -5553,6 +5558,84 @@ static inline void ufshcd_schedule_eh_work(struct ufs_hba *hba)
 	}
 }
 
+static void ufshcd_err_handling_prepare(struct ufs_hba *hba)
+{
+	pm_runtime_get_sync(hba->dev);
+	if (pm_runtime_suspended(hba->dev)) {
+		/*
+		 * Don't assume anything of pm_runtime_get_sync(), if
+		 * resume fails, irq and clocks can be OFF, and powers
+		 * can be OFF or in LPM.
+		 */
+		ufshcd_setup_hba_vreg(hba, true);
+		ufshcd_enable_irq(hba);
+		ufshcd_setup_vreg(hba, true);
+		ufshcd_config_vreg_hpm(hba, hba->vreg_info.vccq);
+		ufshcd_config_vreg_hpm(hba, hba->vreg_info.vccq2);
+		ufshcd_hold(hba, false);
+		if (!ufshcd_is_clkgating_allowed(hba))
+			ufshcd_setup_clocks(hba, true);
+		ufshcd_release(hba);
+		ufshcd_vops_resume(hba, UFS_RUNTIME_PM);
+	} else {
+		ufshcd_hold(hba, false);
+		if (hba->clk_scaling.is_allowed) {
+			cancel_work_sync(&hba->clk_scaling.suspend_work);
+			cancel_work_sync(&hba->clk_scaling.resume_work);
+			ufshcd_suspend_clkscaling(hba);
+		}
+	}
+}
+
+static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
+{
+	ufshcd_release(hba);
+	if (hba->clk_scaling.is_allowed)
+		ufshcd_resume_clkscaling(hba);
+	pm_runtime_put(hba->dev);
+}
+
+static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba)
+{
+	return (hba->ufshcd_state == UFSHCD_STATE_ERROR ||
+		(!(hba->saved_err || hba->saved_uic_err || hba->force_reset ||
+			ufshcd_is_link_broken(hba))));
+}
+
+#ifdef CONFIG_PM
+static void ufshcd_recover_pm_error(struct ufs_hba *hba)
+{
+	struct Scsi_Host *shost = hba->host;
+	struct scsi_device *sdev;
+	struct request_queue *q;
+	int ret;
+
+	/*
+	 * Set RPM status of hba device to RPM_ACTIVE,
+	 * this also clears its runtime error.
+	 */
+	ret = pm_runtime_set_active(hba->dev);
+	/*
+	 * If hba device had runtime error, we also need to resume those
+	 * scsi devices under hba in case any of them has failed to be
+	 * resumed due to hba runtime resume failure. This is to unblock
+	 * blk_queue_enter in case there are bios waiting inside it.
+	 */
+	if (!ret) {
+		shost_for_each_device(sdev, shost) {
+			q = sdev->request_queue;
+			if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
+				       q->rpm_status == RPM_SUSPENDING))
+				pm_request_resume(q->dev);
+		}
+	}
+}
+#else
+static inline void ufshcd_recover_pm_error(struct ufs_hba *hba)
+{
+}
+#endif
+
 /**
  * ufshcd_err_handler - handle UFS errors that require s/w attention
  * @work: pointer to work structure
@@ -5570,9 +5653,7 @@ static void ufshcd_err_handler(struct work_struct *work)
 	hba = container_of(work, struct ufs_hba, eh_work);
 
 	spin_lock_irqsave(hba->host->host_lock, flags);
-	if (hba->ufshcd_state == UFSHCD_STATE_ERROR ||
-	    (!(hba->saved_err || hba->saved_uic_err || hba->force_reset ||
-	     ufshcd_is_link_broken(hba)))) {
+	if (ufshcd_err_handling_should_stop(hba)) {
 		if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
 			hba->ufshcd_state = UFSHCD_STATE_OPERATIONAL;
 		spin_unlock_irqrestore(hba->host->host_lock, flags);
@@ -5581,10 +5662,17 @@ static void ufshcd_err_handler(struct work_struct *work)
 	}
 	ufshcd_set_eh_in_progress(hba);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
-	pm_runtime_get_sync(hba->dev);
-	ufshcd_hold(hba, false);
-
+	ufshcd_err_handling_prepare(hba);
 	spin_lock_irqsave(hba->host->host_lock, flags);
+	/*
+	 * A full reset and restore might have happened after preparation
+	 * is finished, double check whether we should stop.
+	 */
+	if (ufshcd_err_handling_should_stop(hba)) {
+		if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
+			hba->ufshcd_state = UFSHCD_STATE_OPERATIONAL;
+		goto out;
+	}
 	hba->ufshcd_state = UFSHCD_STATE_RESET;
 
 	/* Complete requests that have door-bell cleared by h/w */
@@ -5662,10 +5750,12 @@ static void ufshcd_err_handler(struct work_struct *work)
 		hba->force_reset = false;
 		spin_unlock_irqrestore(hba->host->host_lock, flags);
 		err = ufshcd_reset_and_restore(hba);
-		spin_lock_irqsave(hba->host->host_lock, flags);
 		if (err)
 			dev_err(hba->dev, "%s: reset and restore failed with err %d\n",
 					__func__, err);
+		else
+			ufshcd_recover_pm_error(hba);
+		spin_lock_irqsave(hba->host->host_lock, flags);
 	}
 
 skip_err_handling:
@@ -5677,11 +5767,11 @@ static void ufshcd_err_handler(struct work_struct *work)
 				__func__, hba->saved_err, hba->saved_uic_err);
 	}
 
+out:
 	ufshcd_clear_eh_in_progress(hba);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 	ufshcd_scsi_unblock_requests(hba);
-	ufshcd_release(hba);
-	pm_runtime_put_sync(hba->dev);
+	ufshcd_err_handling_unprepare(hba);
 }
 
 /**
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project.