Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S941090AbcKNWXB (ORCPT ); Mon, 14 Nov 2016 17:23:01 -0500 Received: from mail-cys01nam02on0071.outbound.protection.outlook.com ([104.47.37.71]:36997 "EHLO NAM02-CY1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S941041AbcKNWW4 (ORCPT ); Mon, 14 Nov 2016 17:22:56 -0500 From: "Madhani, Himanshu" To: Mauricio Faria de Oliveira , "qla2xxx-upstream@qlogic.com" CC: "martin.petersen@oracle.com" , "jejb@linux.vnet.ibm.com" , "linux-scsi@vger.kernel.org" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH] qla2xxx: do not abort all commands in the adapter during EEH recovery Thread-Topic: [PATCH] qla2xxx: do not abort all commands in the adapter during EEH recovery Thread-Index: AQHSPr3FMCd9D0bkkE2D6ropx8xPTKDYg0AA Date: Mon, 14 Nov 2016 22:07:01 +0000 Message-ID: <15B66EB6-63CD-46D3-B6A9-444CCD181D2B@cavium.com> References: <1479158782-4544-1-git-send-email-mauricfo@linux.vnet.ibm.com> In-Reply-To: <1479158782-4544-1-git-send-email-mauricfo@linux.vnet.ibm.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Himanshu.Madhani@cavium.com; x-ms-exchange-messagesentrepresentingtype: 1 x-originating-ip: [173.186.134.106] x-microsoft-exchange-diagnostics: 1;SN1PR0701MB1839;7:YkcCivM1EIYtYR+yOO1kE+4PNnNS1Oan9by3vNq9oRhcLnI97WoKg23feciO2FMsi0U0SPzbLKTwzyB6KbvJeJhqamcYhP1Z823ArDBiLfewMl4uab8y0WrEb+nHvhJ7Qa7DEhYFaQ8LtzlugPrHoOBg/++PfurHfuhmXqm4+cQmshPs2hCZwDOMZtqVJVUehwg+bJlIUFYzuYFtxnwouQliWb4X3k0xTXDWh3BDztp+ZV/uUMrvqc3Qw4Ptwr1tOgozUm0Ulnbg1l1CkJnyGcxyWtxGMPLbUXTQ1qfhFcMO8cf4iPefQLDOmIRM1fK0x0hBWer0wcQta8Cvi7CXU5g/p6Pl+lU5UMXlA9BflT0= x-ms-office365-filtering-correlation-id: dea81389-774e-48c9-47de-08d40cda8f9f x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:(22001);SRVR:SN1PR0701MB1839; x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(104084551191319); x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(6060326)(601004)(2401047)(5005006)(8121501046)(10201501046)(3002001)(6061324);SRVR:SN1PR0701MB1839;BCL:0;PCL:0;RULEID:;SRVR:SN1PR0701MB1839; x-forefront-prvs: 0126A32F74 x-forefront-antispam-report: SFV:NSPM;SFS:(10009020)(979002)(6009001)(7916002)(377454003)(189002)(24454002)(199003)(106116001)(305945005)(7846002)(36756003)(105586002)(66066001)(3846002)(7736002)(99286002)(8936002)(6116002)(106356001)(102836003)(86362001)(92566002)(68736007)(82746002)(5001770100001)(97736004)(2501003)(229853002)(5660300001)(83716003)(87936001)(50986999)(33656002)(4326007)(2906002)(76176999)(54356999)(3280700002)(2950100002)(3660700001)(81166006)(101416001)(81156014)(122556002)(189998001)(8676002)(2900100001)(77096005)(104396002)(969003)(989001)(999001)(1009001)(1019001);DIR:OUT;SFP:1101;SCL:1;SRVR:SN1PR0701MB1839;H:SN1PR0701MB1837.namprd07.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;MX:1;A:1;LANG:en; spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="utf-8" Content-ID: <67653AF62BD8AC49999BD6C4C5918AD1@namprd07.prod.outlook.com> MIME-Version: 1.0 X-OriginatorOrg: cavium.com X-MS-Exchange-CrossTenant-originalarrivaltime: 14 Nov 2016 22:07:01.5054 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 711e4ccf-2e9b-4bcf-a551-4094005b6194 X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN1PR0701MB1839 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id uAEMNEcX023388 Content-Length: 2849 Lines: 71 On 11/14/16, 1:26 PM, "Mauricio Faria de Oliveira" wrote: >The previous commit ("qla2xxx: fix invalid DMA access after command >aborts in PCI device remove") introduced a regression during an EEH >recovery, since the change to the qla2x00_abort_all_cmds() function >calls qla2xxx_eh_abort(), which verifies the EEH recovery condition >but handles it heavy-handed. (commit a465537ad1a4 "qla2xxx: Disable >the adapter and skip error recovery in case of register disconnect.") > >This problem warrants a more general/optimistic solution right into >qla2xxx_eh_abort() (eg in case a real command abort arrives during >EEH recovery, or if it takes long enough to trigger command aborts); >but it's still worth to add a check to ensure the code added by the >previous commit is correct and contained within its owner function. > >This commit just adds a 'if (!ha->flags.eeh_busy)' check around it. >(ahem; a trivial fix for this -rc series; sorry for this oversight.) > >With it applied, both PCI device remove and EEH recovery works fine. > >Fixes: 1535aa75a3d8 ("scsi: qla2xxx: fix invalid DMA access after >command aborts in PCI device remove") >Signed-off-by: Mauricio Faria de Oliveira >--- > drivers/scsi/qla2xxx/qla_os.c | 21 +++++++++++++-------- > 1 file changed, 13 insertions(+), 8 deletions(-) > >diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c >index 567fa080e261..56d6142852a5 100644 >--- a/drivers/scsi/qla2xxx/qla_os.c >+++ b/drivers/scsi/qla2xxx/qla_os.c >@@ -1456,15 +1456,20 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha) > for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) { > sp = req->outstanding_cmds[cnt]; > if (sp) { >- /* Get a reference to the sp and drop the lock. >- * The reference ensures this sp->done() call >- * - and not the call in qla2xxx_eh_abort() - >- * ends the SCSI command (with result 'res'). >+ /* Don't abort commands in adapter during EEH >+ * recovery as it's not accessible/responding. > */ >- sp_get(sp); >- spin_unlock_irqrestore(&ha->hardware_lock, flags); >- qla2xxx_eh_abort(GET_CMD_SP(sp)); >- spin_lock_irqsave(&ha->hardware_lock, flags); >+ if (!ha->flags.eeh_busy) { >+ /* Get a reference to the sp and drop the lock. >+ * The reference ensures this sp->done() call >+ * - and not the call in qla2xxx_eh_abort() - >+ * ends the SCSI command (with result 'res'). >+ */ >+ sp_get(sp); >+ spin_unlock_irqrestore(&ha->hardware_lock, flags); >+ qla2xxx_eh_abort(GET_CMD_SP(sp)); >+ spin_lock_irqsave(&ha->hardware_lock, flags); >+ } > req->outstanding_cmds[cnt] = NULL; > sp->done(vha, sp, res); > } >-- >1.8.3.1 > Acked-by: Himanshu Madhani Thanks, Himanshu