Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp2999709ybi; Mon, 17 Jun 2019 14:22:52 -0700 (PDT) X-Google-Smtp-Source: APXvYqxNoMANml7YkAk/e4TPYZIfDf7ufNXr/xdfopvoqGCD8YDKHmDFO0cJgf+1pAGU8eBIr7OQ X-Received: by 2002:a17:90a:3aed:: with SMTP id b100mr1146078pjc.63.1560806571957; Mon, 17 Jun 2019 14:22:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560806571; cv=none; d=google.com; s=arc-20160816; b=oRk7vbe5ZkOjQBrO2jnfjvHH0P3WhVUp9RLAKH8N3rbwPB9SoSpbG3F978DugP5bpz hmaosiJ4qbNq+qTCevAyEkOP5lc6hlEayldkThPlzBF5PwtD/SfedAQsBfY2SXFDAYY0 Uo7Z6XQLVELB5LdAeVok64FNQLy9/FulUVnqsi/j0F4z4hG7h4aWODStMlMrvoYEXcbs cwvMQBdo4PT4Syq8gVIGcqHCC3Hd/shmXJhsUBcOSZ2bSSr6QQBpDb4mTP27Ej9oO8XO ZQr5EB2gj2wkq39HzBgBCghdATGjT6pNkJf4V0dof2RLDwTB0w67rVpymmvvCm6sdC6M vDvA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=izjpNW+Kv4hUNoN7AFJI0bQVCODJW+CHF3+/FWeFQhk=; b=LBfPrYBri/5bbLklUxQZBS7dmhGiaRw5KEJYAMVhfLBBo0EOEVCj62jgp1zSFZliyS byZuimdFaQPCvzj0fClJAJU/nL0m6GBdwmxKTs3GOd1TG5x4Mu3vNhTjCbXloX5ZJ2af nvciKua5tZdZstHcxTP6xXbiaQ1CzdVMBvS2B0zzh5/bbqmMlih98ruR7M4p9kM5yHck fBMTTZYuHxmh+xO+qvofk/1ubss1785YGisApIWe2YWDtdLlKpBp1OUF1ctH+tLhSLbb z97jrglgvqhD27kjCCXrX8ggjJK5e7KR8VJ77nzUwEFQSdOs5Gvn5Qlju3pKuyju2ikT NtlA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=rFPa+wyC; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u4si4026935pgk.48.2019.06.17.14.22.37; Mon, 17 Jun 2019 14:22:51 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=rFPa+wyC; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729162AbfFQVUi (ORCPT + 99 others); Mon, 17 Jun 2019 17:20:38 -0400 Received: from mail.kernel.org ([198.145.29.99]:44882 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729157AbfFQVUg (ORCPT ); Mon, 17 Jun 2019 17:20:36 -0400 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id EA2B020861; Mon, 17 Jun 2019 21:20:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1560806435; bh=HCezjMQjWUHojgI7GRlZ24BxBV9CBvXrMfU/Ay8TccQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rFPa+wyC3jJ0bjG0aAEgbBfOLtgpCS9cstDrDDyekjAl54j+SPTFVc2SIovNFHLo+ g0A3LwrhkwDWcFzduIM0OQOwQA63ZJqHLU8o8jrVIJiMMSLVDBpaiF7MUIvkVAuB/t 4t+gZ6o9SpKH6FOqX0FzP0fL4RNDG8xnkHBURdkI= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Quinn Tran , Himanshu Madhani , "Ewan D. Milne" , "Martin K. Petersen" , Sasha Levin Subject: [PATCH 5.1 052/115] scsi: qla2xxx: Add cleanup for PCI EEH recovery Date: Mon, 17 Jun 2019 23:09:12 +0200 Message-Id: <20190617210803.104970120@linuxfoundation.org> X-Mailer: git-send-email 2.22.0 In-Reply-To: <20190617210759.929316339@linuxfoundation.org> References: <20190617210759.929316339@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org [ Upstream commit 5386a4e6c7fecd282d265a24d930a74ba3c5917b ] During EEH error recovery testing it was discovered that driver's reset() callback partially frees resources used by driver, leaving some stale memory. After reset() is done and when resume() callback in driver uses old data which results into error leaving adapter disabled due to PCIe error. This patch does cleanup for EEH recovery code path and prevents adapter from getting disabled. Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Reviewed-by: Ewan D. Milne Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/qla2xxx/qla_os.c | 221 +++++++++++++--------------------- 1 file changed, 82 insertions(+), 139 deletions(-) diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 91f576d743fe..d377e50a6c19 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -6838,6 +6838,78 @@ qla2x00_release_firmware(void) mutex_unlock(&qla_fw_lock); } +static void qla_pci_error_cleanup(scsi_qla_host_t *vha) +{ + struct qla_hw_data *ha = vha->hw; + scsi_qla_host_t *base_vha = pci_get_drvdata(ha->pdev); + struct qla_qpair *qpair = NULL; + struct scsi_qla_host *vp; + fc_port_t *fcport; + int i; + unsigned long flags; + + ha->chip_reset++; + + ha->base_qpair->chip_reset = ha->chip_reset; + for (i = 0; i < ha->max_qpairs; i++) { + if (ha->queue_pair_map[i]) + ha->queue_pair_map[i]->chip_reset = + ha->base_qpair->chip_reset; + } + + /* purge MBox commands */ + if (atomic_read(&ha->num_pend_mbx_stage3)) { + clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags); + complete(&ha->mbx_intr_comp); + } + + i = 0; + + while (atomic_read(&ha->num_pend_mbx_stage3) || + atomic_read(&ha->num_pend_mbx_stage2) || + atomic_read(&ha->num_pend_mbx_stage1)) { + msleep(20); + i++; + if (i > 50) + break; + } + + ha->flags.purge_mbox = 0; + + mutex_lock(&ha->mq_lock); + list_for_each_entry(qpair, &base_vha->qp_list, qp_list_elem) + qpair->online = 0; + mutex_unlock(&ha->mq_lock); + + qla2x00_mark_all_devices_lost(vha, 0); + + spin_lock_irqsave(&ha->vport_slock, flags); + list_for_each_entry(vp, &ha->vp_list, list) { + atomic_inc(&vp->vref_count); + spin_unlock_irqrestore(&ha->vport_slock, flags); + qla2x00_mark_all_devices_lost(vp, 0); + spin_lock_irqsave(&ha->vport_slock, flags); + atomic_dec(&vp->vref_count); + } + spin_unlock_irqrestore(&ha->vport_slock, flags); + + /* Clear all async request states across all VPs. */ + list_for_each_entry(fcport, &vha->vp_fcports, list) + fcport->flags &= ~(FCF_LOGIN_NEEDED | FCF_ASYNC_SENT); + + spin_lock_irqsave(&ha->vport_slock, flags); + list_for_each_entry(vp, &ha->vp_list, list) { + atomic_inc(&vp->vref_count); + spin_unlock_irqrestore(&ha->vport_slock, flags); + list_for_each_entry(fcport, &vp->vp_fcports, list) + fcport->flags &= ~(FCF_LOGIN_NEEDED | FCF_ASYNC_SENT); + spin_lock_irqsave(&ha->vport_slock, flags); + atomic_dec(&vp->vref_count); + } + spin_unlock_irqrestore(&ha->vport_slock, flags); +} + + static pci_ers_result_t qla2xxx_pci_error_detected(struct pci_dev *pdev, pci_channel_state_t state) { @@ -6863,20 +6935,7 @@ qla2xxx_pci_error_detected(struct pci_dev *pdev, pci_channel_state_t state) return PCI_ERS_RESULT_CAN_RECOVER; case pci_channel_io_frozen: ha->flags.eeh_busy = 1; - /* For ISP82XX complete any pending mailbox cmd */ - if (IS_QLA82XX(ha)) { - ha->flags.isp82xx_fw_hung = 1; - ql_dbg(ql_dbg_aer, vha, 0x9001, "Pci channel io frozen\n"); - qla82xx_clear_pending_mbx(vha); - } - qla2x00_free_irqs(vha); - pci_disable_device(pdev); - /* Return back all IOs */ - qla2x00_abort_all_cmds(vha, DID_RESET << 16); - if (ql2xmqsupport || ql2xnvmeenable) { - set_bit(QPAIR_ONLINE_CHECK_NEEDED, &vha->dpc_flags); - qla2xxx_wake_dpc(vha); - } + qla_pci_error_cleanup(vha); return PCI_ERS_RESULT_NEED_RESET; case pci_channel_io_perm_failure: ha->flags.pci_channel_io_perm_failure = 1; @@ -6930,122 +6989,14 @@ qla2xxx_pci_mmio_enabled(struct pci_dev *pdev) return PCI_ERS_RESULT_RECOVERED; } -static uint32_t -qla82xx_error_recovery(scsi_qla_host_t *base_vha) -{ - uint32_t rval = QLA_FUNCTION_FAILED; - uint32_t drv_active = 0; - struct qla_hw_data *ha = base_vha->hw; - int fn; - struct pci_dev *other_pdev = NULL; - - ql_dbg(ql_dbg_aer, base_vha, 0x9006, - "Entered %s.\n", __func__); - - set_bit(ABORT_ISP_ACTIVE, &base_vha->dpc_flags); - - if (base_vha->flags.online) { - /* Abort all outstanding commands, - * so as to be requeued later */ - qla2x00_abort_isp_cleanup(base_vha); - } - - - fn = PCI_FUNC(ha->pdev->devfn); - while (fn > 0) { - fn--; - ql_dbg(ql_dbg_aer, base_vha, 0x9007, - "Finding pci device at function = 0x%x.\n", fn); - other_pdev = - pci_get_domain_bus_and_slot(pci_domain_nr(ha->pdev->bus), - ha->pdev->bus->number, PCI_DEVFN(PCI_SLOT(ha->pdev->devfn), - fn)); - - if (!other_pdev) - continue; - if (atomic_read(&other_pdev->enable_cnt)) { - ql_dbg(ql_dbg_aer, base_vha, 0x9008, - "Found PCI func available and enable at 0x%x.\n", - fn); - pci_dev_put(other_pdev); - break; - } - pci_dev_put(other_pdev); - } - - if (!fn) { - /* Reset owner */ - ql_dbg(ql_dbg_aer, base_vha, 0x9009, - "This devfn is reset owner = 0x%x.\n", - ha->pdev->devfn); - qla82xx_idc_lock(ha); - - qla82xx_wr_32(ha, QLA82XX_CRB_DEV_STATE, - QLA8XXX_DEV_INITIALIZING); - - qla82xx_wr_32(ha, QLA82XX_CRB_DRV_IDC_VERSION, - QLA82XX_IDC_VERSION); - - drv_active = qla82xx_rd_32(ha, QLA82XX_CRB_DRV_ACTIVE); - ql_dbg(ql_dbg_aer, base_vha, 0x900a, - "drv_active = 0x%x.\n", drv_active); - - qla82xx_idc_unlock(ha); - /* Reset if device is not already reset - * drv_active would be 0 if a reset has already been done - */ - if (drv_active) - rval = qla82xx_start_firmware(base_vha); - else - rval = QLA_SUCCESS; - qla82xx_idc_lock(ha); - - if (rval != QLA_SUCCESS) { - ql_log(ql_log_info, base_vha, 0x900b, - "HW State: FAILED.\n"); - qla82xx_clear_drv_active(ha); - qla82xx_wr_32(ha, QLA82XX_CRB_DEV_STATE, - QLA8XXX_DEV_FAILED); - } else { - ql_log(ql_log_info, base_vha, 0x900c, - "HW State: READY.\n"); - qla82xx_wr_32(ha, QLA82XX_CRB_DEV_STATE, - QLA8XXX_DEV_READY); - qla82xx_idc_unlock(ha); - ha->flags.isp82xx_fw_hung = 0; - rval = qla82xx_restart_isp(base_vha); - qla82xx_idc_lock(ha); - /* Clear driver state register */ - qla82xx_wr_32(ha, QLA82XX_CRB_DRV_STATE, 0); - qla82xx_set_drv_active(base_vha); - } - qla82xx_idc_unlock(ha); - } else { - ql_dbg(ql_dbg_aer, base_vha, 0x900d, - "This devfn is not reset owner = 0x%x.\n", - ha->pdev->devfn); - if ((qla82xx_rd_32(ha, QLA82XX_CRB_DEV_STATE) == - QLA8XXX_DEV_READY)) { - ha->flags.isp82xx_fw_hung = 0; - rval = qla82xx_restart_isp(base_vha); - qla82xx_idc_lock(ha); - qla82xx_set_drv_active(base_vha); - qla82xx_idc_unlock(ha); - } - } - clear_bit(ABORT_ISP_ACTIVE, &base_vha->dpc_flags); - - return rval; -} - static pci_ers_result_t qla2xxx_pci_slot_reset(struct pci_dev *pdev) { pci_ers_result_t ret = PCI_ERS_RESULT_DISCONNECT; scsi_qla_host_t *base_vha = pci_get_drvdata(pdev); struct qla_hw_data *ha = base_vha->hw; - struct rsp_que *rsp; - int rc, retries = 10; + int rc; + struct qla_qpair *qpair = NULL; ql_dbg(ql_dbg_aer, base_vha, 0x9004, "Slot Reset.\n"); @@ -7074,24 +7025,16 @@ qla2xxx_pci_slot_reset(struct pci_dev *pdev) goto exit_slot_reset; } - rsp = ha->rsp_q_map[0]; - if (qla2x00_request_irqs(ha, rsp)) - goto exit_slot_reset; if (ha->isp_ops->pci_config(base_vha)) goto exit_slot_reset; - if (IS_QLA82XX(ha)) { - if (qla82xx_error_recovery(base_vha) == QLA_SUCCESS) { - ret = PCI_ERS_RESULT_RECOVERED; - goto exit_slot_reset; - } else - goto exit_slot_reset; - } - - while (ha->flags.mbox_busy && retries--) - msleep(1000); + mutex_lock(&ha->mq_lock); + list_for_each_entry(qpair, &base_vha->qp_list, qp_list_elem) + qpair->online = 1; + mutex_unlock(&ha->mq_lock); + base_vha->flags.online = 1; set_bit(ABORT_ISP_ACTIVE, &base_vha->dpc_flags); if (ha->isp_ops->abort_isp(base_vha) == QLA_SUCCESS) ret = PCI_ERS_RESULT_RECOVERED; @@ -7115,13 +7058,13 @@ qla2xxx_pci_resume(struct pci_dev *pdev) ql_dbg(ql_dbg_aer, base_vha, 0x900f, "pci_resume.\n"); + ha->flags.eeh_busy = 0; + ret = qla2x00_wait_for_hba_online(base_vha); if (ret != QLA_SUCCESS) { ql_log(ql_log_fatal, base_vha, 0x9002, "The device failed to resume I/O from slot/link_reset.\n"); } - - ha->flags.eeh_busy = 0; } static void -- 2.20.1