Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935637AbcKWMdj (ORCPT ); Wed, 23 Nov 2016 07:33:39 -0500 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:42358 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933392AbcKWMdi (ORCPT ); Wed, 23 Nov 2016 07:33:38 -0500 From: Mauricio Faria de Oliveira To: james.smart@broadcom.com, martin.petersen@oracle.com, James.Bottomley@HansenPartnership.com Cc: linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] lpfc: fix oops/BUG in lpfc_sli_ringtxcmpl_put() Date: Wed, 23 Nov 2016 10:33:19 -0200 X-Mailer: git-send-email 1.8.3.1 X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16112312-0020-0000-0000-000002679291 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16112312-0021-0000-0000-0000307D952E Message-Id: <1479904399-19496-1-git-send-email-mauricfo@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-11-23_01:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1609300000 definitions=main-1611230221 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3940 Lines: 104 The BUG_ON() recently introduced in lpfc_sli_ringtxcmpl_put() is hit in the lpfc_els_abort() > lpfc_sli_issue_abort_iotag() > lpfc_sli_abort_iotag_issue() function path [similar names], due to 'piocb->vport == NULL': BUG_ON(!piocb || !piocb->vport); This happens because lpfc_sli_abort_iotag_issue() doesn't set the 'abtsiocbp->vport' pointer -- but this is not the problem. Previously, lpfc_sli_ringtxcmpl_put() accessed 'piocb->vport' only if 'piocb->iocb.ulpCommand' is neither CMD_ABORT_XRI_CN nor CMD_CLOSE_XRI_CN, which are the only possible values for lpfc_sli_abort_iotag_issue(): lpfc_sli_ringtxcmpl_put(): if ((unlikely(pring->ringno == LPFC_ELS_RING)) && (piocb->iocb.ulpCommand != CMD_ABORT_XRI_CN) && (piocb->iocb.ulpCommand != CMD_CLOSE_XRI_CN) && (!(piocb->vport->load_flag & FC_UNLOADING))) lpfc_sli_abort_iotag_issue(): if (phba->link_state >= LPFC_LINK_UP) iabt->ulpCommand = CMD_ABORT_XRI_CN; else iabt->ulpCommand = CMD_CLOSE_XRI_CN; So, this function path would not have hit this possible NULL pointer dereference before. In order to fix this regression, move the second part of the BUG_ON() check prior to the pointer dereference that it does check for. For reference, this is the stack trace observed. The problem happened because an unsolicited event was received - a PLOGI was received after our PLOGI was issued but not yet complete, so the discovery state machine goes on to sw-abort our PLOGI. kernel BUG at drivers/scsi/lpfc/lpfc_sli.c:1326! Oops: Exception in kernel mode, sig: 5 [#1] <...> NIP [...] lpfc_sli_ringtxcmpl_put+0x1c/0xf0 [lpfc] LR [...] __lpfc_sli_issue_iocb_s4+0x188/0x200 [lpfc] Call Trace: [...] [...] __lpfc_sli_issue_iocb_s4+0xb0/0x200 [lpfc] (unreliable) [...] [...] lpfc_sli_issue_abort_iotag+0x2b4/0x350 [lpfc] [...] [...] lpfc_els_abort+0x1a8/0x4a0 [lpfc] [...] [...] lpfc_rcv_plogi+0x6d4/0x700 [lpfc] [...] [...] lpfc_rcv_plogi_plogi_issue+0xd8/0x1d0 [lpfc] [...] [...] lpfc_disc_state_machine+0xc0/0x2b0 [lpfc] [...] [...] lpfc_els_unsol_buffer+0xcc0/0x26c0 [lpfc] [...] [...] lpfc_els_unsol_event+0xa8/0x220 [lpfc] [...] [...] lpfc_complete_unsol_iocb+0xb8/0x138 [lpfc] [...] [...] lpfc_sli4_handle_received_buffer+0x6a0/0xec0 [lpfc] [...] [...] lpfc_sli_handle_slow_ring_event_s4+0x1c4/0x240 [lpfc] [...] [...] lpfc_sli_handle_slow_ring_event+0x24/0x40 [lpfc] [...] [...] lpfc_do_work+0xd88/0x1970 [lpfc] [...] [...] kthread+0x108/0x130 [...] [...] ret_from_kernel_thread+0x5c/0xbc <...> Cc: stable@vger.kernel.org # v4.8 Fixes: 22466da5b4b7 ("lpfc: Fix possible NULL pointer dereference") Signed-off-by: Mauricio Faria de Oliveira --- drivers/scsi/lpfc/lpfc_sli.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index c5326055beee..f4f77c5b0c83 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -1323,18 +1323,20 @@ struct lpfc_iocbq * { lockdep_assert_held(&phba->hbalock); - BUG_ON(!piocb || !piocb->vport); + BUG_ON(!piocb); list_add_tail(&piocb->list, &pring->txcmplq); piocb->iocb_flag |= LPFC_IO_ON_TXCMPLQ; if ((unlikely(pring->ringno == LPFC_ELS_RING)) && (piocb->iocb.ulpCommand != CMD_ABORT_XRI_CN) && - (piocb->iocb.ulpCommand != CMD_CLOSE_XRI_CN) && - (!(piocb->vport->load_flag & FC_UNLOADING))) - mod_timer(&piocb->vport->els_tmofunc, - jiffies + - msecs_to_jiffies(1000 * (phba->fc_ratov << 1))); + (piocb->iocb.ulpCommand != CMD_CLOSE_XRI_CN)) { + BUG_ON(!piocb->vport); + if (!(piocb->vport->load_flag & FC_UNLOADING)) + mod_timer(&piocb->vport->els_tmofunc, + jiffies + + msecs_to_jiffies(1000 * (phba->fc_ratov << 1))); + } return 0; } -- 1.8.3.1