2009-03-09 06:37:14

by Alex Chiang

[permalink] [raw]
Subject: kobj refcounting weirdness

Hi Kay, Greg,

I've been working on this patch series recently that adds
function and device level hotplug into the PCI core:

http://thread.gmane.org/gmane.linux.kernel.pci/3495

For the last two weeks, I've been beating my head against a
refcounting/kobject problem, and was hoping you could give me
some advice, since I seem to have run into a wall.

My test case has been removing device 0000:04:00.0, which should
remove all the devices below it.

+-[0000:03]---00.0-[0000:04-07]----00.0-[0000:05-07]--+-02.0-[0000:06]--+-00.0 Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
| | \-00.1 Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
| \-04.0-[0000:07]--+-00.0 Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
| \-00.1 Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter

I can remove the device and rescan the bus once, and it works
fine. The second removal works fine, and then, unpredictably,
later rescan/remove cycles eventually end up producing a warning
and oops every time. Sometimes I die on the 2nd rescan, sometimes
not until the 4th or 5th remove/rescan cycle.

In this data set, I turned on kobject debugging, and managed to
capture a trace where we die on the 2nd rescan.

In this data set, we:

- create a kobject for 0000:04:00.0 (e00000018cac2920)
- remove the device
- observe '0000:04:00.0' (e00000018cac2920): calling ktype release
- rescan the bus
- discover that e00000018cac2920 is still hanging around!

At this point, I'm not sure what else to instrument or look at to
see what might be going on, and was hoping you could give me a
pointer.

One final data point, if I build my kernel without PCIe support,
the same experiment on the same topology/devices above can
withstand infinite remove/recan cycles (where infinite is defined
as > 20 before I got bored ;).

I'm going to include the kobject debug info below along with some
annotations. There's a lot of data, but it should be
straightforward to follow. It doesn't get interesting until the
2nd remove/rescan.

Any help/advice you have for me would be greatly appreciated.

Thanks!

/ac

[During initial boot]
ACPI: PCI Root Bridge [L003] (0000:03)
kobject: 'pci0000:03' (e0000001809b3090): kobject_add_internal: parent: 'devices', set: 'devices'
kobject: 'pci0000:03' (e0000001809b3090): kobject_uevent_env
kobject: 'pci0000:03' (e0000001809b3090): kobject_uevent_env: filter function caused the event to drop!
kobject: '0000:03' (e0000001801961b8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:03' (e0000001801961b8): kobject_uevent_env
kobject: '0000:03' (e0000001801961b8): fill_kobj_path: path = '/class/pci_bus/0000:03'
PCI: Scanning bus 0000:03
[scan buses 0000:03 -- 0000:07]
kobject: '0000:03:00.0' (e000000181661120): kobject_add_internal: parent: 'pci0000:03', set: 'devices'
kobject: '0000:03:00.0' (e000000181661120): kobject_uevent_env
kobject: '0000:03:00.0' (e000000181661120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0'
kobject: '0000:04:00.0' (e000000181661920): kobject_add_internal: parent: '0000:03:00.0', set: 'devices'
kobject: '0000:04:00.0' (e000000181661920): kobject_uevent_env
kobject: '0000:04:00.0' (e000000181661920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:05:02.0' (e000000181662120): kobject_add_internal: parent: '0000:04:00.0', set: 'devices'
kobject: '0000:05:02.0' (e000000181662120): kobject_uevent_env
kobject: '0000:05:02.0' (e000000181662120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:05:04.0' (e000000181662920): kobject_add_internal: parent: '0000:04:00.0', set: 'devices'
kobject: '0000:05:04.0' (e000000181662920): kobject_uevent_env
kobject: '0000:05:04.0' (e000000181662920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0'
kobject: '0000:06:00.0' (e000000181663120): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'
kobject: '0000:06:00.0' (e000000181663120): kobject_uevent_env
kobject: '0000:06:00.0' (e000000181663120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.0'
kobject: '0000:06:00.1' (e000000181663920): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'
kobject: '0000:06:00.1' (e000000181663920): kobject_uevent_env
kobject: '0000:06:00.1' (e000000181663920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.1'
kobject: '0000:06' (e000000180196db8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:06' (e000000180196db8): kobject_uevent_env
kobject: '0000:06' (e000000180196db8): fill_kobj_path: path = '/class/pci_bus/0000:06'
kobject: '0000:05:02.0' (e000000181662120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:07:00.0' (e000000181664120): kobject_add_internal: parent: '0000:05:04.0', set: 'devices'
kobject: '0000:07:00.0' (e000000181664120): kobject_uevent_env
kobject: '0000:07:00.0' (e000000181664120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:07:00.0'
kobject: '0000:07:00.1' (e000000181664920): kobject_add_internal: parent: '0000:05:04.0', set: 'devices'
kobject: '0000:07:00.1' (e000000181664920): kobject_uevent_env
kobject: '0000:07:00.1' (e000000181664920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:07:00.1'
kobject: '0000:07' (e0000001801971b8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:07' (e0000001801971b8): kobject_uevent_env
kobject: '0000:07' (e0000001801971b8): fill_kobj_path: path = '/class/pci_bus/0000:07'
kobject: '0000:05:04.0' (e000000181662920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0'
kobject: '0000:05' (e0000001801969b8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:05' (e0000001801969b8): kobject_uevent_env
kobject: '0000:05' (e0000001801969b8): fill_kobj_path: path = '/class/pci_bus/0000:05'
kobject: '0000:04:00.0' (e000000181661920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:04' (e0000001801965b8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:04' (e0000001801965b8): kobject_uevent_env
kobject: '0000:04' (e0000001801965b8): fill_kobj_path: path = '/class/pci_bus/0000:04'
kobject: '0000:03:00.0' (e000000181661120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0'
[PCIe port driver gets loaded and claims ports]
kobject: '0000:03:00.0:pcie01' (e0000001830b00b0): kobject_add_internal: parent: '0000:03:00.0', set: 'devices'
kobject: '0000:03:00.0:pcie01' (e0000001830b00b0): kobject_uevent_env
kobject: '0000:03:00.0:pcie01' (e0000001830b00b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:03:00.0:pcie01'
kobject: '0000:03:00.0:pcie02' (e0000001830b02b0): kobject_add_internal: parent: '0000:03:00.0', set: 'devices'
kobject: '0000:03:00.0:pcie02' (e0000001830b02b0): kobject_uevent_env
kobject: '0000:03:00.0:pcie02' (e0000001830b02b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:03:00.0:pcie02'
kobject: '0000:04:00.0:pcie18' (e0000001830b04b0): kobject_add_internal: parent: '0000:04:00.0', set: 'devices'
kobject: '0000:04:00.0:pcie18' (e0000001830b04b0): kobject_uevent_env
kobject: '0000:04:00.0:pcie18' (e0000001830b04b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:04:00.0:pcie18'
kobject: '0000:05:02.0:pcie22' (e0000001830b06b0): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'
kobject: '0000:05:02.0:pcie22' (e0000001830b06b0): kobject_uevent_env
kobject: '0000:05:02.0:pcie22' (e0000001830b06b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie22'
kobject: '0000:05:02.0:pcie28' (e0000001830b08b0): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'
kobject: '0000:05:02.0:pcie28' (e0000001830b08b0): kobject_uevent_env
kobject: '0000:05:02.0:pcie28' (e0000001830b08b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie28'
kobject: '0000:05:04.0:pcie22' (e0000001830b0ab0): kobject_add_internal: parent: '0000:05:04.0', set: 'devices'
kobject: '0000:05:04.0:pcie22' (e0000001830b0ab0): kobject_uevent_env
kobject: '0000:05:04.0:pcie22' (e0000001830b0ab0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:05:04.0:pcie22'
kobject: '0000:05:04.0:pcie28' (e0000001830b0cb0): kobject_add_internal: parent: '0000:05:04.0', set: 'devices'
kobject: '0000:05:04.0:pcie28' (e0000001830b0cb0): kobject_uevent_env
kobject: '0000:05:04.0:pcie28' (e0000001830b0cb0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:05:04.0:pcie28'

[system completes boot, now ready for 1st remove iteration]
[root@tahitifp1 /sys/bus/pci]# echo 1 > devices/0000\:04\:00.0/remove
kobject: '0000:06:00.0' (e000000181663120): kobject_uevent_env
kobject: '0000:06:00.0' (e000000181663120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.0'
kobject: '0000:06:00.0' (e000000181663120): kobject_cleanup
kobject: '0000:06:00.0' (e000000181663120): calling ktype release
kobject: '0000:06:00.0': free name
kobject: '0000:06:00.1' (e000000181663920): kobject_uevent_env
kobject: '0000:06:00.1' (e000000181663920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.1'
kobject: '0000:06:00.1' (e000000181663920): kobject_cleanup
kobject: '0000:06:00.1' (e000000181663920): calling ktype release
kobject: '0000:06:00.1': free name
kobject: '0000:06' (e000000180196db8): kobject_uevent_env
kobject: '0000:06' (e000000180196db8): fill_kobj_path: path = '/class/pci_bus/0000:06'
kobject: '0000:05:02.0' (e000000181662120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:06' (e000000180196db8): kobject_cleanup
kobject: '0000:06' (e000000180196db8): calling ktype release
kobject: '0000:06': free name
aer 0000:05:02.0:pcie22: unloading service driver aer
kobject: '0000:05:02.0:pcie22' (e0000001830b06b0): kobject_uevent_env
kobject: '0000:05:02.0:pcie22' (e0000001830b06b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie22'
kobject: '0000:05:02.0:pcie22' (e0000001830b06b0): kobject_cleanup
kobject: '0000:05:02.0:pcie22' (e0000001830b06b0): calling ktype release
kobject: '0000:05:02.0:pcie22': free name
kobject: '0000:05:02.0:pcie28' (e0000001830b08b0): kobject_uevent_env
kobject: '0000:05:02.0:pcie28' (e0000001830b08b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie28'
kobject: '0000:05:02.0:pcie28' (e0000001830b08b0): kobject_cleanup
kobject: '0000:05:02.0:pcie28' (e0000001830b08b0): calling ktype release
kobject: '0000:05:02.0:pcie28': free name
kobject: '0000:05:02.0' (e000000181662120): kobject_uevent_env
kobject: '0000:05:02.0' (e000000181662120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:05:02.0' (e000000181662120): kobject_cleanup
kobject: '0000:05:02.0' (e000000181662120): calling ktype release
kobject: '0000:05:02.0': free name
kobject: '0000:07:00.0' (e000000181664120): kobject_uevent_env
kobject: '0000:07:00.0' (e000000181664120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:07:00.0'
kobject: '0000:07:00.0' (e000000181664120): kobject_cleanup
kobject: '0000:07:00.0' (e000000181664120): calling ktype release
kobject: '0000:07:00.0': free name
kobject: '0000:07:00.1' (e000000181664920): kobject_uevent_env
kobject: '0000:07:00.1' (e000000181664920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:07:00.1'
kobject: '0000:07:00.1' (e000000181664920): kobject_cleanup
kobject: '0000:07:00.1' (e000000181664920): calling ktype release
kobject: '0000:07:00.1': free name
kobject: '0000:07' (e0000001801971b8): kobject_uevent_env
kobject: '0000:07' (e0000001801971b8): fill_kobj_path: path = '/class/pci_bus/0000:07'
kobject: '0000:05:04.0' (e000000181662920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0'
kobject: '0000:07' (e0000001801971b8): kobject_cleanup
kobject: '0000:07' (e0000001801971b8): calling ktype release
kobject: '0000:07': free name
aer 0000:05:04.0:pcie22: unloading service driver aer
kobject: '0000:05:04.0:pcie22' (e0000001830b0ab0): kobject_uevent_env
kobject: '0000:05:04.0:pcie22' (e0000001830b0ab0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:05:04.0:pcie22'
kobject: '0000:05:04.0:pcie22' (e0000001830b0ab0): kobject_cleanup
kobject: '0000:05:04.0:pcie22' (e0000001830b0ab0): calling ktype release
kobject: '0000:05:04.0:pcie22': free name
kobject: '0000:05:04.0:pcie28' (e0000001830b0cb0): kobject_uevent_env
kobject: '0000:05:04.0:pcie28' (e0000001830b0cb0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:05:04.0:pcie28'
kobject: '0000:05:04.0:pcie28' (e0000001830b0cb0): kobject_cleanup
kobject: '0000:05:04.0:pcie28' (e0000001830b0cb0): calling ktype release
kobject: '0000:05:04.0:pcie28': free name
kobject: '0000:05:04.0' (e000000181662920): kobject_uevent_env
kobject: '0000:05:04.0' (e000000181662920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0'
kobject: '0000:05:04.0' (e000000181662920): kobject_cleanup
kobject: '0000:05:04.0' (e000000181662920): calling ktype release
kobject: '0000:05:04.0': free name
kobject: '0000:05' (e0000001801969b8): kobject_uevent_env
kobject: '0000:05' (e0000001801969b8): fill_kobj_path: path = '/class/pci_bus/0000:05'
kobject: '0000:04:00.0' (e000000181661920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:05' (e0000001801969b8): kobject_cleanup
kobject: '0000:05' (e0000001801969b8): calling ktype release
kobject: '0000:05': free name
kobject: '0000:04:00.0:pcie18' (e0000001830b04b0): kobject_uevent_env
kobject: '0000:04:00.0:pcie18' (e0000001830b04b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:04:00.0:pcie18'
kobject: '0000:04:00.0:pcie18' (e0000001830b04b0): kobject_cleanup
kobject: '0000:04:00.0:pcie18' (e0000001830b04b0): calling ktype release
kobject: '0000:04:00.0:pcie18': free name
kobject: '0000:04:00.0' (e000000181661920): kobject_uevent_env
kobject: '0000:04:00.0' (e000000181661920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:04' (e0000001801965b8): kobject_uevent_env
kobject: '0000:04' (e0000001801965b8): fill_kobj_path: path = '/class/pci_bus/0000:04'
kobject: '0000:03:00.0' (e000000181661120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0'
kobject: '0000:04' (e0000001801965b8): kobject_cleanup
kobject: '0000:04' (e0000001801965b8): calling ktype release
kobject: '0000:04': free name
pci 0000:04:00.0: remove_callback ref: 1
kobject: '0000:04:00.0' (e000000181661920): kobject_cleanup
kobject: '0000:04:00.0' (e000000181661920): calling ktype release
kobject: '0000:04:00.0': free name
pci 0000:04:00.0: remove_callback e000000181661800 ref: 0

[remove succeeded, now rescan the PCI bus. we expect to find 0000:04:00.0
and all devices below it. this rescan is successful. Note kobject
e00000018cac2920 -- this is the device we're interested in.]
[root@tahitifp1 /sys/bus/pci]# echo 1 > rescan
kobject: '0000:04:00.0' (e00000018cac2920): kobject_add_internal: parent: '0000:03:00.0', set: 'devices'
kobject: '0000:04:00.0' (e00000018cac2920): kobject_uevent_env
kobject: '0000:04:00.0' (e00000018cac2920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:04:00.0:pcie18' (e000000184cf2cb0): kobject_add_internal: parent: '0000:04:00.0', set: 'devices'
kobject: '0000:04:00.0:pcie18' (e000000184cf2cb0): kobject_uevent_env
kobject: '0000:04:00.0:pcie18' (e000000184cf2cb0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:04:00.0:pcie18'
kobject: '0000:05:02.0' (e00000018cac6120): kobject_add_internal: parent: '0000:04:00.0', set: 'devices'
kobject: '0000:05:02.0' (e00000018cac6120): kobject_uevent_env
kobject: '0000:05:02.0' (e00000018cac6120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:05:02.0:pcie22' (e000000184cf26b0): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'
kobject: '0000:05:02.0:pcie22' (e000000184cf26b0): kobject_uevent_env
kobject: '0000:05:02.0:pcie22' (e000000184cf26b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie22'
aer 0000:05:02.0:pcie22: service driver aer loaded
kobject: '0000:05:02.0:pcie28' (e000000184cf22b0): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'
kobject: '0000:05:02.0:pcie28' (e000000184cf22b0): kobject_uevent_env
kobject: '0000:05:02.0:pcie28' (e000000184cf22b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie28'
kobject: '0000:05:04.0' (e00000018cac9920): kobject_add_internal: parent: '0000:04:00.0', set: 'devices'
kobject: '0000:05:04.0' (e00000018cac9920): kobject_uevent_env
kobject: '0000:05:04.0' (e00000018cac9920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0'
kobject: '0000:05:04.0:pcie22' (e000000184cf1cb0): kobject_add_internal: parent: '0000:05:04.0', set: 'devices'
kobject: '0000:05:04.0:pcie22' (e000000184cf1cb0): kobject_uevent_env
kobject: '0000:05:04.0:pcie22' (e000000184cf1cb0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:05:04.0:pcie22'
aer 0000:05:04.0:pcie22: service driver aer loaded
kobject: '0000:05:04.0:pcie28' (e000000184cf18b0): kobject_add_internal: parent: '0000:05:04.0', set: 'devices'
kobject: '0000:05:04.0:pcie28' (e000000184cf18b0): kobject_uevent_env
kobject: '0000:05:04.0:pcie28' (e000000184cf18b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:05:04.0:pcie28'
kobject: '0000:06:00.0' (e00000018cacc920): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'
kobject: '0000:06:00.0' (e00000018cacc920): kobject_uevent_env
kobject: '0000:06:00.0' (e00000018cacc920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.0'
kobject: '0000:06:00.1' (e00000018cacf120): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'
kobject: '0000:06:00.1' (e00000018cacf120): kobject_uevent_env
kobject: '0000:06:00.1' (e00000018cacf120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.1'
kobject: '0000:06' (e000000181ded1b8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:06' (e000000181ded1b8): kobject_uevent_env
kobject: '0000:06' (e000000181ded1b8): fill_kobj_path: path = '/class/pci_bus/0000:06'
kobject: '0000:05:02.0' (e00000018cac6120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:07:00.0' (e00000018cacb120): kobject_add_internal: parent: '0000:05:04.0', set: 'devices'
kobject: '0000:07:00.0' (e00000018cacb120): kobject_uevent_env
kobject: '0000:07:00.0' (e00000018cacb120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:07:00.0'
kobject: '0000:07:00.1' (e00000018cacd120): kobject_add_internal: parent: '0000:05:04.0', set: 'devices'
kobject: '0000:07:00.1' (e00000018cacd120): kobject_uevent_env
kobject: '0000:07:00.1' (e00000018cacd120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:07:00.1'
kobject: '0000:07' (e0000001801965b8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:07' (e0000001801965b8): kobject_uevent_env
kobject: '0000:07' (e0000001801965b8): fill_kobj_path: path = '/class/pci_bus/0000:07'
kobject: '0000:05:04.0' (e00000018cac9920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0'
kobject: '0000:05' (e000000181dec1b8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:05' (e000000181dec1b8): kobject_uevent_env
kobject: '0000:05' (e000000181dec1b8): fill_kobj_path: path = '/class/pci_bus/0000:05'
kobject: '0000:04:00.0' (e00000018cac2920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:04' (e000000181defdb8): kobject_add_internal: parent: 'pci_bus', set: 'devices'
kobject: '0000:04' (e000000181defdb8): kobject_uevent_env
kobject: '0000:04' (e000000181defdb8): fill_kobj_path: path = '/class/pci_bus/0000:04'
kobject: '0000:03:00.0' (e000000181661120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0'

[rescan succeeded, time to remove again. again, note kobject e00000018cac2920]
[root@tahitifp1 /sys/bus/pci]# echo 1 > devices/0000\:04\:00.0/remove
kobject: '0000:06:00.0' (e00000018cacc920): kobject_uevent_env
kobject: '0000:06:00.0' (e00000018cacc920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.0'
kobject: '0000:06:00.0' (e00000018cacc920): kobject_cleanup
kobject: '0000:06:00.0' (e00000018cacc920): calling ktype release
kobject: '0000:06:00.0': free name
kobject: '0000:06:00.1' (e00000018cacf120): kobject_uevent_env
kobject: '0000:06:00.1' (e00000018cacf120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.1'
kobject: '0000:06:00.1' (e00000018cacf120): kobject_cleanup
kobject: '0000:06:00.1' (e00000018cacf120): calling ktype release
kobject: '0000:06:00.1': free name
kobject: '0000:06' (e000000181ded1b8): kobject_uevent_env
kobject: '0000:06' (e000000181ded1b8): fill_kobj_path: path = '/class/pci_bus/0000:06'
kobject: '0000:05:02.0' (e00000018cac6120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:06' (e000000181ded1b8): kobject_cleanup
kobject: '0000:06' (e000000181ded1b8): calling ktype release
kobject: '0000:06': free name
aer 0000:05:02.0:pcie22: unloading service driver aer
kobject: '0000:05:02.0:pcie22' (e000000184cf26b0): kobject_uevent_env
kobject: '0000:05:02.0:pcie22' (e000000184cf26b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie22'
kobject: '0000:05:02.0:pcie22' (e000000184cf26b0): kobject_cleanup
kobject: '0000:05:02.0:pcie22' (e000000184cf26b0): calling ktype release
kobject: '0000:05:02.0:pcie22': free name
kobject: '0000:05:02.0:pcie28' (e000000184cf22b0): kobject_uevent_env
kobject: '0000:05:02.0:pcie28' (e000000184cf22b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie28'
kobject: '0000:05:02.0:pcie28' (e000000184cf22b0): kobject_cleanup
kobject: '0000:05:02.0:pcie28' (e000000184cf22b0): calling ktype release
kobject: '0000:05:02.0:pcie28': free name
kobject: '0000:05:02.0' (e00000018cac6120): kobject_uevent_env
kobject: '0000:05:02.0' (e00000018cac6120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:05:02.0' (e00000018cac6120): kobject_cleanup
kobject: '0000:05:02.0' (e00000018cac6120): calling ktype release
kobject: '0000:05:02.0': free name
kobject: '0000:07:00.0' (e00000018cacb120): kobject_uevent_env
kobject: '0000:07:00.0' (e00000018cacb120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:07:00.0'
kobject: '0000:07:00.0' (e00000018cacb120): kobject_cleanup
kobject: '0000:07:00.0' (e00000018cacb120): calling ktype release
kobject: '0000:07:00.0': free name
kobject: '0000:07:00.1' (e00000018cacd120): kobject_uevent_env
kobject: '0000:07:00.1' (e00000018cacd120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:07:00.1'
kobject: '0000:07:00.1' (e00000018cacd120): kobject_cleanup
kobject: '0000:07:00.1' (e00000018cacd120): calling ktype release
kobject: '0000:07:00.1': free name
kobject: '0000:07' (e0000001801965b8): kobject_uevent_env
kobject: '0000:07' (e0000001801965b8): fill_kobj_path: path = '/class/pci_bus/0000:07'
kobject: '0000:05:04.0' (e00000018cac9920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0'
kobject: '0000:07' (e0000001801965b8): kobject_cleanup
kobject: '0000:07' (e0000001801965b8): calling ktype release
kobject: '0000:07': free name
aer 0000:05:04.0:pcie22: unloading service driver aer
kobject: '0000:05:04.0:pcie22' (e000000184cf1cb0): kobject_uevent_env
kobject: '0000:05:04.0:pcie22' (e000000184cf1cb0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:05:04.0:pcie22'
kobject: '0000:05:04.0:pcie22' (e000000184cf1cb0): kobject_cleanup
kobject: '0000:05:04.0:pcie22' (e000000184cf1cb0): calling ktype release
kobject: '0000:05:04.0:pcie22': free name
kobject: '0000:05:04.0:pcie28' (e000000184cf18b0): kobject_uevent_env
kobject: '0000:05:04.0:pcie28' (e000000184cf18b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0/0000:05:04.0:pcie28'
kobject: '0000:05:04.0:pcie28' (e000000184cf18b0): kobject_cleanup
kobject: '0000:05:04.0:pcie28' (e000000184cf18b0): calling ktype release
kobject: '0000:05:04.0:pcie28': free name
kobject: '0000:05:04.0' (e00000018cac9920): kobject_uevent_env
kobject: '0000:05:04.0' (e00000018cac9920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:04.0'
kobject: '0000:05:04.0' (e00000018cac9920): kobject_cleanup
kobject: '0000:05:04.0' (e00000018cac9920): calling ktype release
kobject: '0000:05:04.0': free name
kobject: '0000:05' (e000000181dec1b8): kobject_uevent_env
kobject: '0000:05' (e000000181dec1b8): fill_kobj_path: path = '/class/pci_bus/0000:05'
kobject: '0000:04:00.0' (e00000018cac2920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:05' (e000000181dec1b8): kobject_cleanup
kobject: '0000:05' (e000000181dec1b8): calling ktype release
kobject: '0000:05': free name
kobject: '0000:04:00.0:pcie18' (e000000184cf2cb0): kobject_uevent_env
kobject: '0000:04:00.0:pcie18' (e000000184cf2cb0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:04:00.0:pcie18'
kobject: '0000:04:00.0:pcie18' (e000000184cf2cb0): kobject_cleanup
kobject: '0000:04:00.0:pcie18' (e000000184cf2cb0): calling ktype release
kobject: '0000:04:00.0:pcie18': free name
kobject: '0000:04:00.0' (e00000018cac2920): kobject_uevent_env
kobject: '0000:04:00.0' (e00000018cac2920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:04' (e000000181defdb8): kobject_uevent_env
kobject: '0000:04' (e000000181defdb8): fill_kobj_path: path = '/class/pci_bus/0000:04'
kobject: '0000:03:00.0' (e000000181661120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0'
kobject: '0000:04' (e000000181defdb8): kobject_cleanup
kobject: '0000:04' (e000000181defdb8): calling ktype release
kobject: '0000:04': free name
kobject: '0000:04:00.0' (e00000018cac2920): kobject_cleanup
kobject: '0000:04:00.0' (e00000018cac2920): calling ktype release
kobject: '0000:04:00.0': free name

[2nd remove complete -- note above that we called ktype release for
e00000018cac2920. 2nd rescan, and here's were we blow up. why is
that kobject e00000018cac2920 still hanging around?!?]
[root@tahitifp1 /sys/bus/pci]# echo 1 > rescan
kobject: '0000:04:00.0' (e00000018cac2920): kobject_add_internal: parent: '0000:03:00.0', set: 'devices'
kobject: '0000:04:00.0' (e00000018cac2920): kobject_uevent_env
kobject: '0000:04:00.0' (e00000018cac2920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0'
kobject: '0000:04:00.0:pcie18' (e000000184cf12b0): kobject_add_internal: parent: '0000:04:00.0', set: 'devices'
------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0x110/0x140()
Hardware name: server BL860c
sysfs: cannot create duplicate filename '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:04:00.0:pcie18/power'
Modules linked in: binfmt_misc dm_multipath pci_slot sg shpchp pci_hotplug dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ohci_hcd ehci_hcd usbcore

Call Trace:
[<a0000001000146d0>] show_stack+0x50/0xa0
sp=e00000018756f860 bsp=e000000187561818
[<a000000100014750>] dump_stack+0x30/0x60
sp=e00000018756fa30 bsp=e000000187561800
[<a00000010009a0f0>] warn_slowpath+0x130/0x180
sp=e00000018756fa30 bsp=e000000187561798
[<a000000100230790>] sysfs_add_one+0x110/0x140
sp=e00000018756fb50 bsp=e000000187561760
[<a000000100230900>] create_dir+0x80/0x120
sp=e00000018756fb50 bsp=e000000187561728
[<a0000001002309d0>] sysfs_create_subdir+0x30/0x60
sp=e00000018756fb70 bsp=e0000001875616f8
[<a0000001002333f0>] internal_create_group+0x130/0x400
sp=e00000018756fb70 bsp=e0000001875616a0
[<a0000001002336f0>] sysfs_create_group+0x30/0x60
sp=e00000018756fb80 bsp=e000000187561678
[<a000000100513b50>] dpm_sysfs_add+0x30/0x60
sp=e00000018756fb80 bsp=e000000187561658
[<a0000001005043e0>] device_add+0xa00/0xd80
sp=e00000018756fb80 bsp=e0000001875615e0
[<a000000100504790>] device_register+0x30/0x60
sp=e00000018756fb90 bsp=e0000001875615c0
[<a00000010041e610>] pcie_port_device_register+0xa30/0xbc0
sp=e00000018756fb90 bsp=e000000187561510
[<a000000100912060>] pcie_portdrv_probe+0xe0/0x140
sp=e00000018756fbd0 bsp=e0000001875614f0
[<a000000100413a30>] local_pci_probe+0x70/0xa0
sp=e00000018756fbd0 bsp=e0000001875614d0
[<a0000001004142b0>] pci_device_probe+0x210/0x280
sp=e00000018756fbd0 bsp=e000000187561490
[<a00000010050be00>] driver_probe_device+0x3a0/0x5a0
sp=e00000018756fdf0 bsp=e000000187561450
[<a00000010050c030>] __device_attach+0x30/0x60
sp=e00000018756fdf0 bsp=e000000187561428
[<a0000001005094a0>] bus_for_each_drv+0xa0/0x140
sp=e00000018756fdf0 bsp=e0000001875613e0
[<a00000010050c110>] device_attach+0xb0/0x100
sp=e00000018756fe00 bsp=e0000001875613b0
[<a000000100509bd0>] bus_attach_device+0x70/0x100
sp=e00000018756fe00 bsp=e000000187561380
[<a000000100504460>] device_add+0xa80/0xd80
sp=e00000018756fe00 bsp=e000000187561310
[<a000000100408000>] pci_bus_add_device+0x20/0xa0
sp=e00000018756fe10 bsp=e0000001875612f0
[<a000000100408260>] pci_bus_add_devices+0xe0/0x3a0
sp=e00000018756fe10 bsp=e0000001875612a8
[<a000000100408450>] pci_bus_add_devices+0x2d0/0x3a0
sp=e00000018756fe10 bsp=e000000187561260
[<a0000001009124c0>] pci_do_scan_bus+0x160/0x180
sp=e00000018756fe10 bsp=e000000187561220
[<a000000100417f30>] bus_rescan_store+0xb0/0x120
sp=e00000018756fe10 bsp=e0000001875611e8
[<a000000100508b50>] bus_attr_store+0x70/0xa0
sp=e00000018756fe20 bsp=e0000001875611b0
[<a00000010022da00>] sysfs_write_file+0x240/0x2e0
sp=e00000018756fe20 bsp=e000000187561160
[<a000000100187290>] vfs_write+0x1b0/0x2e0
sp=e00000018756fe20 bsp=e000000187561110
[<a000000100187510>] sys_write+0x70/0xe0
sp=e00000018756fe20 bsp=e000000187561098
[<a00000010000c560>] ia64_ret_from_syscall+0x0/0x20
sp=e00000018756fe30 bsp=e000000187561098
[<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
sp=e000000187570000 bsp=e000000187561098
---[ end trace 60823c890ff5d765 ]---
kobject: '0000:04:00.0:pcie18' (e000000184cf12b0): kobject_uevent_env
kobject: '0000:04:00.0:pcie18' (e000000184cf12b0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:04:00.0:pcie18'
kobject: '0000:05:02.0' (e00000018cac9920): kobject_add_internal: parent: '0000:04:00.0', set: 'devices'
kobject: '0000:05:02.0' (e00000018cac9920): kobject_uevent_env
kobject: '0000:05:02.0' (e00000018cac9920): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:05:02.0:pcie22' (e000000184cf16b0): kobject_add_internal: parent: '0000:05:02.0', set: 'devices'


2009-03-09 15:07:29

by Greg KH

[permalink] [raw]
Subject: Re: kobj refcounting weirdness

On Mon, Mar 09, 2009 at 12:36:54AM -0600, Alex Chiang wrote:
> Hi Kay, Greg,
>
> I've been working on this patch series recently that adds
> function and device level hotplug into the PCI core:
>
> http://thread.gmane.org/gmane.linux.kernel.pci/3495
>
> For the last two weeks, I've been beating my head against a
> refcounting/kobject problem, and was hoping you could give me
> some advice, since I seem to have run into a wall.
>
> My test case has been removing device 0000:04:00.0, which should
> remove all the devices below it.

You are removing the children before the parent device, right? If not,
you have to be _very_ careful (personally, I don't think you should be
allowed to do that, but others, like the scsi developers, like doing
things like this...)

> +-[0000:03]---00.0-[0000:04-07]----00.0-[0000:05-07]--+-02.0-[0000:06]--+-00.0 Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
> | | \-00.1 Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
> | \-04.0-[0000:07]--+-00.0 Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
> | \-00.1 Intel Corporation 82571EB Quad Port Gigabit Mezzanine Adapter
>
> I can remove the device and rescan the bus once, and it works
> fine. The second removal works fine, and then, unpredictably,
> later rescan/remove cycles eventually end up producing a warning
> and oops every time. Sometimes I die on the 2nd rescan, sometimes
> not until the 4th or 5th remove/rescan cycle.

What is the warning and oops?

> In this data set, I turned on kobject debugging, and managed to
> capture a trace where we die on the 2nd rescan.
>
> In this data set, we:
>
> - create a kobject for 0000:04:00.0 (e00000018cac2920)
> - remove the device
> - observe '0000:04:00.0' (e00000018cac2920): calling ktype release
> - rescan the bus
> - discover that e00000018cac2920 is still hanging around!

What do you mean by "rescan"? And sure, if you create a new device, it
could be allocated at the same location, that's what the slab allocators
do, right?

Can you provide the full debug log that shows the problem?

thanks,

greg k-h

2009-03-09 15:07:46

by Greg KH

[permalink] [raw]
Subject: Re: kobj refcounting weirdness

On Mon, Mar 09, 2009 at 08:04:53AM -0700, Greg KH wrote:
> On Mon, Mar 09, 2009 at 12:36:54AM -0600, Alex Chiang wrote:
> > - discover that e00000018cac2920 is still hanging around!
>
> What do you mean by "rescan"? And sure, if you create a new device, it
> could be allocated at the same location, that's what the slab allocators
> do, right?
>
> Can you provide the full debug log that shows the problem?

Oh nevermind, you did that already, sorry, too early this morning...

2009-03-09 16:50:53

by Alex Chiang

[permalink] [raw]
Subject: Re: kobj refcounting weirdness

* Greg KH <[email protected]>:
> On Mon, Mar 09, 2009 at 12:36:54AM -0600, Alex Chiang wrote:
> > Hi Kay, Greg,
> >
> > I've been working on this patch series recently that adds
> > function and device level hotplug into the PCI core:
> >
> > http://thread.gmane.org/gmane.linux.kernel.pci/3495
> >
> > For the last two weeks, I've been beating my head against a
> > refcounting/kobject problem, and was hoping you could give me
> > some advice, since I seem to have run into a wall.
> >
> > My test case has been removing device 0000:04:00.0, which should
> > remove all the devices below it.
>
> You are removing the children before the parent device, right? If not,
> you have to be _very_ careful (personally, I don't think you should be
> allowed to do that, but others, like the scsi developers, like doing
> things like this...)

Yes, I'm removing children before the parent, using the
pci_remove_bus_device() interface.

> > In this data set, I turned on kobject debugging, and managed to
> > capture a trace where we die on the 2nd rescan.
> >
> > In this data set, we:
> >
> > - create a kobject for 0000:04:00.0 (e00000018cac2920)
> > - remove the device
> > - observe '0000:04:00.0' (e00000018cac2920): calling ktype release
> > - rescan the bus
> > - discover that e00000018cac2920 is still hanging around!
>
> What do you mean by "rescan"?

By rescan, I mean we're rescanning the entire PCI bus, looking
for new devices.

for each PCI root bus:
pci_scan_child_bus()
pci_bus_add_devices()

> And sure, if you create a new device, it could be allocated at
> the same location, that's what the slab allocators do, right?

I thought about the allocators returning a pointer to the same
location that maybe has some valid looking data hanging around,
but it's not wise for someone like me to go pointing fingers at
the allocator before I've proven the bug isn't in my code. ;)

I'm just hoping for some advice on what else I could instrument
to try and track this down further.

Thanks.

/ac

2009-03-09 16:58:20

by Matthew Wilcox

[permalink] [raw]
Subject: Re: kobj refcounting weirdness

On Mon, Mar 09, 2009 at 10:50:10AM -0600, Alex Chiang wrote:
> I thought about the allocators returning a pointer to the same
> location that maybe has some valid looking data hanging around,
> but it's not wise for someone like me to go pointing fingers at
> the allocator before I've proven the bug isn't in my code. ;)

Slab poisoning would be the logical next thing to try to decide whether
the allocator is wrong or you're using it wrong ;-)

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2009-03-09 17:21:36

by Alex Chiang

[permalink] [raw]
Subject: Re: kobj refcounting weirdness

* Matthew Wilcox <[email protected]>:
> On Mon, Mar 09, 2009 at 10:50:10AM -0600, Alex Chiang wrote:
> > I thought about the allocators returning a pointer to the same
> > location that maybe has some valid looking data hanging around,
> > but it's not wise for someone like me to go pointing fingers at
> > the allocator before I've proven the bug isn't in my code. ;)
>
> Slab poisoning would be the logical next thing to try to decide whether
> the allocator is wrong or you're using it wrong ;-)

Hey shiny!

Thanks, this is good -- I hopefully muddle my way through and
figure out what's going on.

[output of slab poisoning below]

/ac

[root@tahitifp1 pci]# echo 1 > devices/0000\:04\:00.0/remove
kobject: '0000:06:00.0' (e0000001818153f0): kobject_uevent_env
kobject: '0000:06:00.0' (e0000001818153f0): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.0'
kobject: '0000:06:00.0' (e0000001818153f0): kobject_cleanup
kobject: '0000:06:00.0' (e0000001818153f0): calling ktype release
kobject: '0000:06:00.0': free name
kobject: '0000:06:00.1' (e000000181815c38): kobject_uevent_env
kobject: '0000:06:00.1' (e000000181815c38): fill_kobj_path: path = '/devices/pci
0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.1'
kobject: '0000:06:00.1' (e000000181815c38): kobject_cleanup
kobject: '0000:06:00.1' (e000000181815c38): calling ktype release
kobject: '0000:06:00.1': free name
kobject: '0000:06' (e000000180186430): kobject_uevent_env
kobject: '0000:06' (e000000180186430): fill_kobj_path: path = '/class/pci_bus/00
00:06'
kobject: '0000:05:02.0' (e000000181814360): fill_kobj_path: path = '/devices/pci
0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0'
kobject: '0000:06' (e000000180186430): kobject_cleanup
kobject: '0000:06' (e000000180186430): calling ktype release
kobject: '0000:06': free name
aer 0000:05:02.0:pcie22: unloading service driver aer
kobject: '0000:05:02.0:pcie22' (e000000183350788): kobject_uevent_env
kobject: '0000:05:02.0:pcie22' (e000000183350788): fill_kobj_path: path = '/devi
ces/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie22'
kobject: '0000:05:02.0:pcie22' (e000000183350788): kobject_cleanup
kobject: '0000:05:02.0:pcie22' (e000000183350788): calling ktype release
kobject: '0000:05:02.0:pcie22': free name
kobject: '0000:05:02.0:pcie28' (e0000001833509d0): kobject_uevent_env
kobject: '0000:05:02.0:pcie28' (e0000001833509d0): fill_kobj_path: path = '/devi
ces/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:05:02.0:pcie28'
kobject: '0000:05:02.0:pcie28' (e0000001833509d0): kobject_cleanup
kobject: '0000:05:02.0:pcie28' (e0000001833509d0): calling ktype release
kobject: '0000:05:02.0:pcie28': free name
=============================================================================
BUG kmalloc-8: Object already free
-----------------------------------------------------------------------------

INFO: Allocated in pcie_port_device_register+0x60/0x920 age=56269 cpu=7 pid=1
INFO: Freed in pcie_port_device_remove+0xb0/0xe0 age=0 cpu=0 pid=28
INFO: Slab 0xa07fffffc81cb0f8 objects=819 used=812 fp=0xe000000183291770 flags=0
x10000000000083
INFO: Object 0xe000000183291770 @offset=6000 fp=0xe000000183291860

Bytes b4 0xe000000183291760: 3d ef ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a =�
....ZZZZZZZZ
Object 0xe000000183291770: 6b 6b 6b 6b 6b 6b 6b a5 kk
kkkkk�
Redzone 0xe000000183291778: bb bb bb bb bb bb bb bb ��
������
Padding 0xe0000001832917b8: 5a 5a 5a 5a 5a 5a 5a 5a ZZ
ZZZZZZ

Call Trace:
[<a0000001000146d0>] show_stack+0x50/0xa0
sp=e0000001813efc00 bsp=e0000001813e1308
[<a000000100014750>] dump_stack+0x30/0x60
sp=e0000001813efdd0 bsp=e0000001813e12f0
[<a000000100175fe0>] print_trailer+0x220/0x240
sp=e0000001813efdd0 bsp=e0000001813e12b0
[<a000000100176050>] object_err+0x50/0x80
sp=e0000001813efdd0 bsp=e0000001813e1278
[<a000000100179c70>] __slab_free+0x5b0/0x7a0
sp=e0000001813efdd0 bsp=e0000001813e1230
[<a00000010017b750>] kfree+0x250/0x2a0
sp=e0000001813efdd0 bsp=e0000001813e11e8
[<a0000001004226a0>] pcie_portdrv_remove+0x40/0x60
sp=e0000001813efdd0 bsp=e0000001813e11c8
[<a000000100417b20>] pci_device_remove+0x80/0x100
sp=e0000001813efdd0 bsp=e0000001813e11a0
[<a00000010050f920>] __device_release_driver+0x100/0x160
sp=e0000001813efdd0 bsp=e0000001813e1168
[<a00000010050f9b0>] device_release_driver+0x30/0x60
sp=e0000001813efdd0 bsp=e0000001813e1140
[<a00000010050d350>] bus_remove_device+0x1d0/0x220
sp=e0000001813efdd0 bsp=e0000001813e1100
[<a0000001005080e0>] device_del+0x2c0/0x3a0
sp=e0000001813efdd0 bsp=e0000001813e10c8
[<a000000100508290>] device_unregister+0xd0/0x100
sp=e0000001813efdd0 bsp=e0000001813e10a8
[<a00000010040d990>] pci_stop_dev+0x70/0x100
sp=e0000001813efdd0 bsp=e0000001813e1080
[<a00000010040dc00>] pci_remove_bus_device+0x80/0x180
sp=e0000001813efdd0 bsp=e0000001813e1050
[<a00000010040dd60>] pci_remove_behind_bridge+0x60/0xc0
sp=e0000001813efdd0 bsp=e0000001813e1028
[<a00000010040dbc0>] pci_remove_bus_device+0x40/0x180
sp=e0000001813efdd0 bsp=e0000001813e0ff0
[<a000000100419880>] remove_callback+0x40/0xa0
sp=e0000001813efdd0 bsp=e0000001813e0fc8
[<a000000100232970>] sysfs_schedule_callback_work+0x50/0xc0
sp=e0000001813efdd0 bsp=e0000001813e0fa0
[<a0000001000c12b0>] run_workqueue+0x1f0/0x340
sp=e0000001813efdd0 bsp=e0000001813e0f60
[<a0000001000c1540>] worker_thread+0x140/0x180
sp=e0000001813efdd0 bsp=e0000001813e0f38
[<a0000001000c9d00>] kthread+0xa0/0x120
sp=e0000001813efe30 bsp=e0000001813e0f08
[<a000000100016690>] kernel_thread_helper+0xd0/0x100
sp=e0000001813efe30 bsp=e0000001813e0ee0
[<a00000010000a4c0>] start_kernel_thread+0x20/0x40
sp=e0000001813efe30 bsp=e0000001813e0ee0
FIX kmalloc-8: Object at 0xe000000183291770 not freed

2009-03-09 18:02:45

by Alex Chiang

[permalink] [raw]
Subject: Re: kobj refcounting weirdness

* Matthew Wilcox <[email protected]>:
> On Mon, Mar 09, 2009 at 10:50:10AM -0600, Alex Chiang wrote:
> > I thought about the allocators returning a pointer to the same
> > location that maybe has some valid looking data hanging around,
> > but it's not wise for someone like me to go pointing fingers at
> > the allocator before I've proven the bug isn't in my code. ;)
>
> Slab poisoning would be the logical next thing to try to decide whether
> the allocator is wrong or you're using it wrong ;-)

We had a double free in the PCIe port driver. Removing the double
free seems to fix my issue.

The double free is only in Jesse's linux-next branch, so no need
to worry about it in mainline/stable.

Thanks.

/ac

2009-03-09 18:21:19

by Vegard Nossum

[permalink] [raw]
Subject: Re: kobj refcounting weirdness

2009/3/9 Alex Chiang <[email protected]>:
> * Matthew Wilcox <[email protected]>:
>> On Mon, Mar 09, 2009 at 10:50:10AM -0600, Alex Chiang wrote:
>> > I thought about the allocators returning a pointer to the same
>> > location that maybe has some valid looking data hanging around,
>> > but it's not wise for someone like me to go pointing fingers at
>> > the allocator before I've proven the bug isn't in my code. ;)
>>
>> Slab poisoning would be the logical next thing to try to decide whether
>> the allocator is wrong or you're using it wrong ;-)
>
> We had a double free in the PCIe port driver. Removing the double
> free seems to fix my issue.
>
> The double free is only in Jesse's linux-next branch, so no need
> to worry about it in mainline/stable.

Hi,

Sorry for butting in, just wondering if I could have the patch to fix
the double free? :-)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2009-03-09 18:41:43

by Alex Chiang

[permalink] [raw]
Subject: Re: kobj refcounting weirdness

* Vegard Nossum <[email protected]>:
> 2009/3/9 Alex Chiang <[email protected]>:
> > * Matthew Wilcox <[email protected]>:
> >> On Mon, Mar 09, 2009 at 10:50:10AM -0600, Alex Chiang wrote:
> >> > I thought about the allocators returning a pointer to the same
> >> > location that maybe has some valid looking data hanging around,
> >> > but it's not wise for someone like me to go pointing fingers at
> >> > the allocator before I've proven the bug isn't in my code. ;)
> >>
> >> Slab poisoning would be the logical next thing to try to decide whether
> >> the allocator is wrong or you're using it wrong ;-)
> >
> > We had a double free in the PCIe port driver. Removing the double
> > free seems to fix my issue.
> >
> > The double free is only in Jesse's linux-next branch, so no need
> > to worry about it in mainline/stable.
>
> Sorry for butting in, just wondering if I could have the patch to fix
> the double free? :-)

http://thread.gmane.org/gmane.linux.kernel.pci/3524

Remember, it only applies to Jesse's linux-next, since it has
Rafael's latest port driver cleanups.

If you apply it to mainline, you will leak memory.

Thanks.

/ac