From: Mauricio Faria de Oliveira
To: qla2xxx-upstream@qlogic.com
Cc: jejb@linux.vnet.ibm.com, martin.petersen@oracle.com, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 0/2] qla2xxx: fix errors in PCI device remove with ongoing I/O
Date: Mon, 7 Nov 2016 17:53:29 -0200
Message-Id: <1478548411-17932-1-git-send-email-mauricfo@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This patchset addresses a couple of errors that might happen during
PCI device remove (e.g., PCI hotplug, PowerVM DLPAR). They prevent
the successful removal and re-addition of the adapter to the system,
and cause an oops and/or an invalid DMA access (which triggers an
EEH event).

With the patchset applied, several cycles of PCI device add/remove
with ongoing I/O completed successfully without triggering oopses
or EEH events.

Verified on v4.9-rc3.

Test-case:
---

    # lspci
    <...>
    001d:70:00.0 Fibre Channel: QLogic Corp. ISP2532-based ...
    001d:70:00.1 Fibre Channel: QLogic Corp. ISP2532-based ...
    <...>

    # for sd in $(find /sys/bus/pci/devices/001d:70:00.*/ \
        -name 'sd*' -printf "%f\n"); do \
        dd if=/dev/$sd of=/dev/null iflag=nocache & done

    # echo 1 | tee /sys/bus/pci/devices/001d:70:00.*/remove
    (this either works or not)

    # echo 1 > /sys/bus/pci/rescan

Before:
---

    <...>
    EEH: Frozen PHB#1d-PE#700000 detected
    qla2xxx [001d:70:00.1]-8042:2: PCI/Register disconnect, exiting.
    <...>
    EEH: Detected PCI bus error on PHB#29-PE#700000
    <...>

    (and/or)

    Unable to handle kernel paging request for data at address 0x00000138
    <...>
    NIP [d000000004700a40] qla2xxx_queuecommand+0x80/0x3f0 [qla2xxx]
    LR [d000000004700a10] qla2xxx_queuecommand+0x50/0x3f0 [qla2xxx]

    (command does not return; adapter cannot be re-added)

After:
---

    <...>
    qla2xxx [001d:70:00.0]-801c:1: Abort command issued nexus=1:0:0 -- 1 2003.
    <...>
    qla2xxx [001d:70:00.1]-801c:2: Abort command issued nexus=2:3:0 -- 1 2003.
    <...>

    (command does return; adapter can be re-added correctly)

Mauricio Faria de Oliveira (2):
  qla2xxx: do not queue commands when unloading
  qla2xxx: fix invalid DMA access after command aborts in PCI device remove

 drivers/scsi/qla2xxx/qla_os.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

-- 
1.8.3.1