From: Sasha Levin
To: "stable@vger.kernel.org", "linux-kernel@vger.kernel.org"
CC: "Michael J. Ruhl", Dennis Dalessandro, Jason Gunthorpe, Sasha Levin
Subject: [PATCH AUTOSEL for 4.15 129/189] IB/hfi1: Re-order IRQ cleanup to address driver cleanup race
Date: Mon, 9 Apr 2018 00:18:29 +0000
Message-ID: <20180409001637.162453-129-alexander.levin@microsoft.com>
References: <20180409001637.162453-1-alexander.levin@microsoft.com>
In-Reply-To: <20180409001637.162453-1-alexander.levin@microsoft.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: "Michael J. Ruhl"

[ Upstream commit 82a979265638c505e12fbe7ba40980dc0901436d ]

The pci_request_irq() interfaces always adds the IRQF_SHARED bit to
all IRQ requests.

When the kernel is built with CONFIG_DEBUG_SHIRQ config flag, if the
IRQF_SHARED bit is set, a call to the IRQ handler is made from the
__free_irq() function. This is testing a race condition between the
IRQ cleanup and an IRQ racing the cleanup. The HFI driver should be
able to handle this race, but does not.

This race can cause traces that start with this footprint:

BUG: unable to handle kernel NULL pointer dereference at (null)
Call Trace:
 ...
 __free_irq+0x1b3/0x2d0
 free_irq+0x35/0x70
 pci_free_irq+0x1c/0x30
 clean_up_interrupts+0x53/0xf0 [hfi1]
 hfi1_start_cleanup+0x122/0x190 [hfi1]
 postinit_cleanup+0x1d/0x280 [hfi1]
 remove_one+0x233/0x250 [hfi1]
 pci_device_remove+0x39/0xc0

Export IRQ cleanup function so it can be called from other modules.

Using the exported cleanup function:

Re-order the driver cleanup code to clean up IRQ resources before
other resources, eliminating the race.

Re-order error path for init so that the race does not occur.

Reduce severity on spurious error message for SDMA IRQs to info.

Reviewed-by: Alex Estrin
Reviewed-by: Patel Jay P
Reviewed-by: Mike Marciniszyn
Signed-off-by: Michael J. Ruhl
Signed-off-by: Dennis Dalessandro
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin
---
 drivers/infiniband/hw/hfi1/chip.c | 18 ++++++++++++------
 drivers/infiniband/hw/hfi1/hfi.h  |  1 +
 drivers/infiniband/hw/hfi1/init.c |  4 +++-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 4f057e8ffe50..a7a5d19b1fe4 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -8263,8 +8263,8 @@ static irqreturn_t sdma_interrupt(int irq, void *data)
 		/* handle the interrupt(s) */
 		sdma_engine_interrupt(sde, status);
 	} else {
-		dd_dev_err_ratelimited(dd, "SDMA engine %u interrupt, but no status bits set\n",
-				       sde->this_idx);
+		dd_dev_info_ratelimited(dd, "SDMA engine %u interrupt, but no status bits set\n",
+					sde->this_idx);
 	}
 	return IRQ_HANDLED;
 }
@@ -12984,7 +12984,14 @@ static void disable_intx(struct pci_dev *pdev)
 	pci_intx(pdev, 0);
 }
 
-static void clean_up_interrupts(struct hfi1_devdata *dd)
+/**
+ * hfi1_clean_up_interrupts() - Free all IRQ resources
+ * @dd: valid device data data structure
+ *
+ * Free the MSI or INTx IRQs and assoicated PCI resources,
+ * if they have been allocated.
+ */
+void hfi1_clean_up_interrupts(struct hfi1_devdata *dd)
 {
 	int i;
 
@@ -13345,7 +13352,7 @@ static int set_up_interrupts(struct hfi1_devdata *dd)
 	return 0;
 
 fail:
-	clean_up_interrupts(dd);
+	hfi1_clean_up_interrupts(dd);
 	return ret;
 }
 
@@ -14772,7 +14779,6 @@ void hfi1_start_cleanup(struct hfi1_devdata *dd)
 	aspm_exit(dd);
 	free_cntrs(dd);
 	free_rcverr(dd);
-	clean_up_interrupts(dd);
 	finish_chip_resources(dd);
 }
 
@@ -15229,7 +15235,7 @@ bail_free_rcverr:
 bail_free_cntrs:
 	free_cntrs(dd);
 bail_clear_intr:
-	clean_up_interrupts(dd);
+	hfi1_clean_up_interrupts(dd);
 bail_cleanup:
 	hfi1_pcie_ddcleanup(dd);
 bail_free:
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 8ce9118d4a7f..3c3f71d7919d 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1957,6 +1957,7 @@ void hfi1_verbs_unregister_sysfs(struct hfi1_devdata *dd);
 int qsfp_dump(struct hfi1_pportdata *ppd, char *buf, int len);
 
 int hfi1_pcie_init(struct pci_dev *pdev, const struct pci_device_id *ent);
+void hfi1_clean_up_interrupts(struct hfi1_devdata *dd);
 void hfi1_pcie_cleanup(struct pci_dev *pdev);
 int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev);
 void hfi1_pcie_ddcleanup(struct hfi1_devdata *);
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 8e3b3e7d829a..117a74f22670 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1058,8 +1058,9 @@ static void shutdown_device(struct hfi1_devdata *dd)
 	}
 	dd->flags &= ~HFI1_INITTED;
 
-	/* mask interrupts, but not errors */
+	/* mask and clean up interrupts, but not errors */
 	set_intr_state(dd, 0);
+	hfi1_clean_up_interrupts(dd);
 
 	for (pidx = 0; pidx < dd->num_pports; ++pidx) {
 		ppd = dd->pport + pidx;
@@ -1702,6 +1703,7 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		dd_dev_err(dd, "Failed to create /dev devices: %d\n", -j);
 
 	if (initfail || ret) {
+		hfi1_clean_up_interrupts(dd);
 		stop_timers(dd);
 		flush_workqueue(ib_wq);
 		for (pidx = 0; pidx < dd->num_pports; ++pidx) {
-- 
2.15.1
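
Illustration (not part of the patch): the ordering problem described in the
commit message can be sketched in plain userspace C. Every name below
(demo_dev, demo_handler, demo_free_irq and so on) is hypothetical and does
not come from the hfi1 driver or the kernel IRQ API. demo_free_irq() only
imitates the extra handler call that the commit message attributes to
__free_irq() under CONFIG_DEBUG_SHIRQ, and a return value of -1 stands in
for the point where a real handler would dereference a NULL pointer.

```c
#include <stdlib.h>

/* Hypothetical stand-in for per-device state; not taken from hfi1. */
struct demo_dev {
	int *counters;                        /* resource the handler touches */
	int (*handler)(struct demo_dev *dev); /* registered handler, or NULL */
};

/* Returns 0 when handled safely, -1 when the needed state is already gone. */
static int demo_handler(struct demo_dev *dev)
{
	if (!dev->counters)
		return -1; /* a real handler would hit a NULL dereference here */
	dev->counters[0]++;
	return 0;
}

/* Assumption: imitates the one extra handler call made while freeing. */
static int demo_free_irq(struct demo_dev *dev)
{
	int ret = 0;

	if (dev->handler) {
		ret = dev->handler(dev);
		dev->handler = NULL;
	}
	return ret;
}

static void demo_free_counters(struct demo_dev *dev)
{
	free(dev->counters);
	dev->counters = NULL;
}

static int demo_setup(struct demo_dev *dev)
{
	dev->counters = calloc(1, sizeof(*dev->counters));
	if (!dev->counters)
		return -1;
	dev->handler = demo_handler;
	return 0;
}

/* Old ordering: resources are freed while the handler is still registered. */
int teardown_resources_first(void)
{
	struct demo_dev dev;

	if (demo_setup(&dev))
		return 1;
	demo_free_counters(&dev);
	return demo_free_irq(&dev);
}

/* Patched ordering: the IRQ goes first, then what the handler depends on. */
int teardown_irq_first(void)
{
	struct demo_dev dev;
	int ret;

	if (demo_setup(&dev))
		return 1;
	ret = demo_free_irq(&dev);
	demo_free_counters(&dev);
	return ret;
}
```

In this sketch teardown_resources_first() returns -1, because the late
handler call finds its state already freed, while teardown_irq_first()
returns 0. That mirrors, under the stated assumptions, why the patch moves
the IRQ cleanup ahead of the other resource cleanup.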