From: "Brown, Aaron F"
To: Lyude Paul, "intel-wired-lan@lists.osuosl.org"
CC: "Fujinaka, Todd", Stephen Hemminger, "stable@vger.kernel.org", "Kirsher, Jeffrey T", "netdev@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: RE: [PATCH v3] igb: Free IRQs when device is hotplugged
Date: Sat, 30 Dec 2017 05:51:08 +0000
Message-ID: <309B89C4C689E141A5FF6A0C5FB2118B8C71FCAE@ORSMSX103.amr.corp.intel.com>
References: <20171212193130.5971-1-lyude@redhat.com>
In-Reply-To: <20171212193130.5971-1-lyude@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Lyude Paul
> Sent: Tuesday, December 12, 2017 11:32 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Fujinaka, Todd; Stephen Hemminger; stable@vger.kernel.org; Kirsher, Jeffrey T; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH v3] igb: Free IRQs when device is hotplugged
>
> Recently I got a Caldigit TS3 Thunderbolt 3 dock, and noticed that upon
> hotplugging my kernel would immediately crash due to igb:
>
> [ 680.825801] kernel BUG at drivers/pci/msi.c:352!
> [ 680.828388] invalid opcode: 0000 [#1] SMP
> [ 680.829194] Modules linked in: igb(O) thunderbolt i2c_algo_bit joydev vfat fat btusb btrtl btbcm btintel bluetooth ecdh_generic hp_wmi sparse_keymap rfkill wmi_bmof iTCO_wdt intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul snd_pcm rtsx_pci_ms mei_me snd_timer memstick snd pcspkr mei soundcore i2c_i801 tpm_tis psmouse shpchp wmi tpm_tis_core tpm video hp_wireless acpi_pad rtsx_pci_sdmmc mmc_core crc32c_intel serio_raw rtsx_pci mfd_core xhci_pci xhci_hcd i2c_hid i2c_core [last unloaded: igb]
> [ 680.831085] CPU: 1 PID: 78 Comm: kworker/u16:1 Tainted: G O 4.15.0-rc3Lyude-Test+ #6
> [ 680.831596] Hardware name: HP HP ZBook Studio G4/826B, BIOS P71 Ver. 01.03 06/09/2017
> [ 680.832168] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> [ 680.832687] RIP: 0010:free_msi_irqs+0x180/0x1b0
> [ 680.833271] RSP: 0018:ffffc9000030fbf0 EFLAGS: 00010286
> [ 680.833761] RAX: ffff8803405f9c00 RBX: ffff88033e3d2e40 RCX: 000000000000002c
> [ 680.834278] RDX: 0000000000000000 RSI: 00000000000000ac RDI: ffff880340be2178
> [ 680.834832] RBP: 0000000000000000 R08: ffff880340be1ff0 R09: ffff8803405f9c00
> [ 680.835342] R10: 0000000000000000 R11: 0000000000000040 R12: ffff88033d63a298
> [ 680.835822] R13: ffff88033d63a000 R14: 0000000000000060 R15: ffff880341959000
> [ 680.836332] FS: 0000000000000000(0000) GS:ffff88034f440000(0000) knlGS:0000000000000000
> [ 680.836817] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 680.837360] CR2: 000055e64044afdf CR3: 0000000001c09002 CR4: 00000000003606e0
> [ 680.837954] Call Trace:
> [ 680.838853] pci_disable_msix+0xce/0xf0
> [ 680.839616] igb_reset_interrupt_capability+0x5d/0x60 [igb]
> [ 680.840278] igb_remove+0x9d/0x110 [igb]
> [ 680.840764] pci_device_remove+0x36/0xb0
> [ 680.841279] device_release_driver_internal+0x157/0x220
> [ 680.841739] pci_stop_bus_device+0x7d/0xa0
> [ 680.842255] pci_stop_bus_device+0x2b/0xa0
> [ 680.842722] pci_stop_bus_device+0x3d/0xa0
> [ 680.843189] pci_stop_and_remove_bus_device+0xe/0x20
> [ 680.843627] trim_stale_devices+0xf3/0x140
> [ 680.844086] trim_stale_devices+0x94/0x140
> [ 680.844532] trim_stale_devices+0xa6/0x140
> [ 680.845031] ? get_slot_status+0x90/0xc0
> [ 680.845536] acpiphp_check_bridge.part.5+0xfe/0x140
> [ 680.846021] acpiphp_hotplug_notify+0x175/0x200
> [ 680.846581] ? free_bridge+0x100/0x100
> [ 680.847113] acpi_device_hotplug+0x8a/0x490
> [ 680.847535] acpi_hotplug_work_fn+0x1a/0x30
> [ 680.848076] process_one_work+0x182/0x3a0
> [ 680.848543] worker_thread+0x2e/0x380
> [ 680.848963] ? process_one_work+0x3a0/0x3a0
> [ 680.849373] kthread+0x111/0x130
> [ 680.849776] ? kthread_create_worker_on_cpu+0x50/0x50
> [ 680.850188] ret_from_fork+0x1f/0x30
> [ 680.850601] Code: 43 14 85 c0 0f 84 d5 fe ff ff 31 ed eb 0f 83 c5 01 39 6b 14 0f 86 c5 fe ff ff 8b 7b 10 01 ef e8 b7 e4 d2 ff 48 83 78 70 00 74 e3 <0f> 0b 49 8d b5 a0 00 00 00 e8 62 6f d3 ff e9 c7 fe ff ff 48 8b
> [ 680.851497] RIP: free_msi_irqs+0x180/0x1b0 RSP: ffffc9000030fbf0
>
> As it turns out, normally the freeing of IRQs that would fix this is called
> inside of the scope of __igb_close(). However, since the device is
> already gone by the point we try to unregister the netdevice from the
> driver due to a hotplug we end up seeing that the netif isn't present
> and thus, forget to free any of the device IRQs.
>
> So: make sure that if we're in the process of dismantling the netdev, we
> always allow __igb_close() to be called so that IRQs may be freed
> normally. Additionally, only allow igb_close() to be called from
> __igb_close() if it hasn't already been called for the given adapter.
>
> Signed-off-by: Lyude Paul
> Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach")
> Cc: Todd Fujinaka
> Cc: Stephen Hemminger
> Cc: stable@vger.kernel.org
> ---
> Changes since v2:
> - Remove hunk in __igb_close() that was left over by accident, it's
>   not needed
>
> drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Tested-by: Aaron Brown
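
For readers following the thread, the control flow described in the quoted commit message can be sketched as a small userspace model. This is not the patch itself and not driver code. The struct, its fields (including the name "dismantle"), and every function below are hypothetical stand-ins. It also assumes that the BUG in free_msi_irqs() fires because an IRQ is still requested when MSI-X is torn down, which is what the commit message implies but does not state outright.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the netdev/adapter state that matters here. */
struct model_netdev {
    bool present;        /* models "netif is present"; false after unplug */
    bool dismantle;      /* models "netdev is being dismantled" */
    int  irqs_requested; /* IRQs the driver has not yet freed */
};

/* Models the inner close path, where the commit message says IRQs are freed. */
static void model_inner_close(struct model_netdev *nd)
{
    nd->irqs_requested = 0;
}

/*
 * Models the close entry point.
 * honor_dismantle == false: only close when the device is present.
 * honor_dismantle == true:  also close while the netdev is being dismantled.
 */
static void model_close(struct model_netdev *nd, bool honor_dismantle)
{
    assert(nd);
    if (nd->present || (honor_dismantle && nd->dismantle))
        model_inner_close(nd);
}

/*
 * Models MSI-X teardown. Returns false where, under the assumption stated
 * above, the real kernel would hit the BUG: an IRQ is still requested.
 */
static bool model_disable_msix(const struct model_netdev *nd)
{
    return nd->irqs_requested == 0;
}

/* Simulates a hot-unplug: the device is already gone when close runs. */
bool simulate_hot_unplug(bool honor_dismantle)
{
    struct model_netdev nd = {
        .present = false,
        .dismantle = true,
        .irqs_requested = 4, /* arbitrary illustrative count */
    };

    model_close(&nd, honor_dismantle);
    return model_disable_msix(&nd);
}
```

In this model, simulate_hot_unplug(false) reports a failed teardown because the presence check skips the close path and the IRQs stay requested, while simulate_hot_unplug(true) tears down cleanly. The real one-line change in igb_main.c is not shown in this message, so the condition above is only a guess at its shape.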