Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753720Ab3JXFxp (ORCPT ); Thu, 24 Oct 2013 01:53:45 -0400
Received: from mail-ie0-f173.google.com ([209.85.223.173]:58752 "EHLO mail-ie0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751506Ab3JXFxn (ORCPT ); Thu, 24 Oct 2013 01:53:43 -0400
MIME-Version: 1.0
In-Reply-To:
References: <20131015024452.GA31951@srcf.ucam.org> <20131016202123.GA17866@google.com>
Date: Wed, 23 Oct 2013 22:53:42 -0700
X-Google-Sender-Auth: nGA23hAlOMSlvWkzxhM67S0AQv8
Message-ID:
Subject: Re: [3.11.4] Thunderbolt/PCI unplug oops in pci_pme_list_scan
From: Yinghai Lu
To: Bjorn Helgaas
Cc: Andreas Noever, Matthew Garrett, "linux-kernel@vger.kernel.org", "Rafael J. Wysocki", "linux-pci@vger.kernel.org", Mika Westerberg, "Kirill A. Shutemov"
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5489
Lines: 116

On Tue, Oct 22, 2013 at 8:32 PM, Bjorn Helgaas wrote:
> [+cc Yinghai]
>
> On Thu, Oct 17, 2013 at 7:59 AM, Andreas Noever wrote:
>> On Wed, Oct 16, 2013 at 10:21 PM, Bjorn Helgaas wrote:
>>> On Tue, Oct 15, 2013 at 03:44:52AM +0100, Matthew Garrett wrote:
>>>> On Mon, Oct 14, 2013 at 05:50:38PM -0600, Bjorn Helgaas wrote:
>>>> > [+cc Rafael, Mika, Kirill, linux-pci]
>>>> >
>>>> > On Mon, Oct 14, 2013 at 4:47 PM, Andreas Noever wrote:
>>>> > > When I unplug the Thunderbolt ethernet adapter on my MacBookPro Linux
>>>> > > crashes a few seconds later. Using
>>>> > > echo 1 > /sys/bus/pci/devices/0000:08:00.0/remove
>>>> > > to remove a bridge two levels above the device triggers the fault immediately:
>>>> >
>>>> > There have been significant changes in acpiphp related to Thunderbolt
>>>> > since v3.11.
>>>>
>>>> Apple don't expose Thunderbolt via ACPI, so it appears as native PCIe.
>>>> I'd be surprised if acpiphp makes a difference here.
>>>
>>> Yeah, you're right; I wasn't paying attention.
>>>
>>> We save a pci_dev pointer in the pci_pme_list, which of course has a
>>> longer lifetime than the pci_dev itself, but we don't acquire a reference
>>> on it, so I suspect the pci_dev got released before we got around to
>>> doing the pci_pme_list_scan().
>>>
>>> Andreas, can you try the patch below? It's against v3.12-rc2, but it
>>> should apply to v3.11, too.
>>
>> I have tested your patch against 3.11 where it solves the problem. Thanks!
>>
>> Unfortunately I could not reproduce the problem in 3.12-rc5. I only
>> get the following warning (and no crash):
>>
>> tg3 0000:0a:00.0: PME# disabled
>> pcieport 0000:09:00.0: PME# disabled
>> pciehp 0000:09:00.0:pcie24: unloading service driver pciehp
>> pci_bus 0000:0a: dev 00, dec refcount to 0
>> pci_bus 0000:0a: dev 00, released physical slot 9
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 122 at drivers/pci/pci.c:1430
>> pci_disable_device+0x84/0x90()
>> Device pcieport
>> disabling already-disabled device
>> Modules linked in:
>> btusb bluetooth joydev hid_apple bcm5974 nls_utf8 nls_cp437 hfsplus
>> vfat fat snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp
>> coretemp kvm_intel kvm cfg80211 uvcvideo crc32_pclmul crc32c_intel
>> videobuf2_vmalloc ghash_clmulni_intel aesni_intel videobuf2_memops
>> aes_x86_64 glue_helper videobuf2_core tg3 videodev lrw gf128mul
>> ablk_helper iTCO_wdt hid_generic iTCO_vendor_support cryptd media
>> applesmc input_polldev usbhid ptp microcode snd_hda_codec_cirrus hid
>> pps_core libphy rfkill i2c_i801 pcspkr snd_hda_intel apple_gmux
>> lib80211 snd_hda_codec acpi_cpufreq snd_hwdep snd_pcm snd_page_alloc
>> snd_timer mei_me snd mei processor soundcore lpc_ich evdev mfd_core
>> apple_bl ac battery ext4 crc16 mbcache jbd2 sd_mod ahci libahci libata
>> xhci_hcd ehci_pci sdhci_pci ehci_hcd sdhci scsi_mod mmc_core
>> usbcore usb_common nouveau mxm_wmi wmi ttm i915 video button
>> i2c_algo_bit intel_agp
>> intel_gtt drm_kms_helper drm i2c_core
>> CPU: 0 PID: 122 Comm: kworker/u16:5 Not tainted 3.12.0-1-dirty #30
>> Hardware name: Apple Inc. MacBookPro10,1/Mac-C3EC7CD22292981F, BIOS
>> MBP101.88Z.00EE.B03.1212211437 12/21/2012
>> Workqueue: sysfsd sysfs_schedule_callback_work
>> 0000000000000009 ffff88044c021c00 ffffffff814c4288 ffff88044c021c48
>> ffff88044c021c38 ffffffff81061b7d ffff880458a5c000 ffffffff8187c5c0
>> ffff880458a5c000 ffff880458a5b098 0000000000000000 ffff88044c021c98
>> Call Trace:
>> [] dump_stack+0x54/0x8d
>> [] warn_slowpath_common+0x7d/0xa0
>> [] warn_slowpath_fmt+0x4c/0x50
>> [] ? do_pci_disable_device+0x52/0x60
>> [] ? acpi_pci_irq_disable+0x4c/0x8d
>> [] pci_disable_device+0x84/0x90
>> [] pcie_portdrv_remove+0x1a/0x20
>> [] pci_device_remove+0x3b/0xb0
>> [] __device_release_driver+0x7f/0xf0
>> [] device_release_driver+0x23/0x30
>> [] bus_remove_device+0x108/0x180
>> [] device_del+0x135/0x1d0
>> [] pci_stop_bus_device+0x94/0xa0
>> [] pci_stop_bus_device+0x3b/0xa0
>> [] pci_stop_and_remove_bus_device+0x12/0x20
>> [] remove_callback+0x25/0x40
>> [] sysfs_schedule_callback_work+0x14/0x80
>> [] process_one_work+0x178/0x470
>> [] worker_thread+0x121/0x3a0
>> [] ? manage_workers.isra.21+0x2b0/0x2b0
>> [] kthread+0xc0/0xd0
>> [] ? kthread_create_on_node+0x120/0x120
>> [] ret_from_fork+0x7c/0xb0
>> [] ? kthread_create_on_node+0x120/0x120
>> ---[ end trace b39a15fa94fbb2a2 ]---
>>
>>
>> Bisection points to 928bea964827d7824b548c1f8e06eccbbc4d0d7d .
>
> This is "PCI: Delay enabling bridges until they're needed" by Yinghai.

that double disabling should be addressed by:

https://lkml.org/lkml/2013/4/25/608
[PATCH] PCI: Remove duplicate pci_disable_device for pcie port

Thanks

Yinghai