This is v3 of the patch series that introduces function-level hotplug
to the PCI core. It is against Jesse's linux-next branch.
I've been doing lots of testing, and found and fixed several small
buglets, both in my code and elsewhere in PCI. In order to test this
patchset, you will also need these patches:
http://thread.gmane.org/gmane.linux.kernel.pci/3437
http://lkml.org/lkml/2009/3/7/173
There is still one major bug somewhere that shows up only when using
the PCIe portdriver (that is, any time PCIe support is built into
the kernel). You get an oops during multiple remove/rescan cycles,
especially on devices with an internal bridge.
This symptom does not appear on the exact same hardware topology, removing
and rescanning the exact same device, when not using the PCIe port driver.
I'm still not sure what's going on, so if anyone wants to play with
this series, I'd truly appreciate any help.
Thanks.
/ac
v2 -> v3:
- properly remove device with internal bridge
- added Kenji Kaneshige's pci_is_root_bus() interface
- dropped whitespace cleanups for another time
v1 -> v2:
- incorporated lots of Trent Piepho's work
- beefed up pci_do_scan_bus as heavy lifter for rescanning
- small bugfixes folded into earlier patches to get everything working
---
Alex Chiang (7):
PCI Hotplug: schedule fakephp for feature removal
PCI Hotplug: rename legacy_fakephp to fakephp
PCI: Introduce /sys/bus/pci/devices/.../rescan
PCI: Introduce /sys/bus/pci/devices/.../remove
PCI: Introduce /sys/bus/pci/rescan
PCI: beef up pci_do_scan_bus()
PCI: always scan child buses
Kenji Kaneshige (1):
PCI: pci_is_root_bus helper
Trent Piepho (3):
PCI Hotplug: restore fakephp interface with complete reimplementation
PCI: pci_scan_slot() returns newly found devices
PCI: don't scan existing devices
Documentation/ABI/testing/sysfs-bus-pci | 27 ++
Documentation/feature-removal-schedule.txt | 32 ++
Documentation/filesystems/sysfs-pci.txt | 9 +
drivers/pci/hotplug-pci.c | 16 +
drivers/pci/hotplug/fakephp.c | 443 +++++++---------------------
drivers/pci/pci-sysfs.c | 89 ++++++
drivers/pci/probe.c | 74 ++---
include/linux/pci.h | 9 +
8 files changed, 321 insertions(+), 378 deletions(-)
From: Kenji Kaneshige <[email protected]>
Introduce pci_is_root_bus helper function. This will help make code
more consistent, as well as prevent incorrect assumptions (such as
pci_bus->self == NULL on a root bus, which is _not_ always true).
Signed-off-by: Kenji Kaneshige <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
include/linux/pci.h | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7baf2a5..a31f935 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -341,6 +341,15 @@ struct pci_bus {
#define pci_bus_b(n) list_entry(n, struct pci_bus, node)
#define to_pci_bus(n) container_of(n, struct pci_bus, dev)
+/*
+ * Returns true if the pci bus is root (behind host-pci bridge),
+ * false otherwise
+ */
+static inline bool pci_is_root_bus(struct pci_bus *pbus)
+{
+ return !(pbus->parent);
+}
+
#ifdef CONFIG_PCI_MSI
static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev)
{
From: Trent Piepho <[email protected]>
pci_scan_single_device is supposed to add newly discovered
devices to pci_bus->devices, but doesn't check to see if the
device has already been added. This can cause problems if we ever
want to use this interface to rescan the PCI bus.
If the device is already added, just return it.
Signed-off-by: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/probe.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 55ec44a..66b5d1c 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1006,6 +1006,12 @@ struct pci_dev *__ref pci_scan_single_device(struct pci_bus *bus, int devfn)
{
struct pci_dev *dev;
+ dev = pci_get_slot(bus, devfn);
+ if (dev) {
+ pci_dev_put(dev);
+ return dev;
+ }
+
dev = pci_scan_device(bus, devfn);
if (!dev)
return NULL;
From: Trent Piepho <[email protected]>
pci_scan_slot() has been rewritten to be less complex and will now
return the number of *new* devices found.
Existing callers need not worry because they already assume that
they can't call pci_scan_slot() on an already-scanned slot.
Thus, there is no semantic change for existing callers: returning
newly found devices (this patch) is exactly equal to returning all
found devices (before this patch).
This patch adds some more groundwork to allow us to rescan the
PCI bus during runtime to discover newly added devices.
Signed-off-by: Trent Piepho <[email protected]>
Reviewed-by: Alex Chiang <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/probe.c | 35 +++++++++++++----------------------
1 files changed, 13 insertions(+), 22 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 66b5d1c..d392813 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1030,35 +1030,26 @@ EXPORT_SYMBOL(pci_scan_single_device);
* Scan a PCI slot on the specified PCI bus for devices, adding
* discovered devices to the @bus->devices list. New devices
* will not have is_added set.
+ *
+ * Returns the number of new devices found.
*/
int pci_scan_slot(struct pci_bus *bus, int devfn)
{
- int func, nr = 0;
- int scan_all_fns;
-
- scan_all_fns = pcibios_scan_all_fns(bus, devfn);
-
- for (func = 0; func < 8; func++, devfn++) {
- struct pci_dev *dev;
+ int fn, nr = 0;
+ struct pci_dev *dev;
- dev = pci_scan_single_device(bus, devfn);
- if (dev) {
+ if ((dev = pci_scan_single_device(bus, devfn)))
+ if (!dev->is_added) /* new device? */
nr++;
- /*
- * If this is a single function device,
- * don't scan past the first function.
- */
- if (!dev->multifunction) {
- if (func > 0) {
- dev->multifunction = 1;
- } else {
- break;
- }
+ if ((dev && dev->multifunction) ||
+ (!dev && pcibios_scan_all_fns(bus, devfn))) {
+ for (fn = 1; fn < 8; fn++) {
+ if ((dev = pci_scan_single_device(bus, devfn + fn))) {
+ if (!dev->is_added)
+ nr++;
+ dev->multifunction = 1;
}
- } else {
- if (func == 0 && !scan_all_fns)
- break;
}
}
While scanning bridges, we stop our scan if we encounter a bus
that we've seen before, to work around some buggy chipsets. This
is a good idea, but prevents us from fully scanning the PCI bus
at a future time (to find newly hot-added devices, for example).
Change the logic so that we skip _re-adding_ an existing bus
that we've seen before, but also allow the scan to descend to
all child buses.
Now that we're potentially scanning our child buses again, we
also need to be sure not to attempt re-initializing their BARs
so we avoid that.
This patch lays the groundwork to allow the user to issue a
rescan of the PCI bus at any time.
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/probe.c | 33 +++++++++++++++++++--------------
1 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index d392813..a3959ce 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -511,21 +511,21 @@ int __devinit pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
/*
* If we already got to this bus through a different bridge,
- * ignore it. This can happen with the i450NX chipset.
+ * don't re-add it. This can happen with the i450NX chipset.
+ *
+ * However, we continue to descend down the hierarchy and
+ * scan remaining child buses.
*/
- if (pci_find_bus(pci_domain_nr(bus), busnr)) {
- dev_info(&dev->dev, "bus %04x:%02x already known\n",
- pci_domain_nr(bus), busnr);
- goto out;
+ child = pci_find_bus(pci_domain_nr(bus), busnr);
+ if (!child) {
+ child = pci_add_new_bus(bus, dev, busnr);
+ if (!child)
+ goto out;
+ child->primary = buses & 0xFF;
+ child->subordinate = (buses >> 16) & 0xFF;
+ child->bridge_ctl = bctl;
}
- child = pci_add_new_bus(bus, dev, busnr);
- if (!child)
- goto out;
- child->primary = buses & 0xFF;
- child->subordinate = (buses >> 16) & 0xFF;
- child->bridge_ctl = bctl;
-
cmax = pci_scan_child_bus(child);
if (cmax > max)
max = cmax;
@@ -1075,8 +1075,13 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
* After performing arch-dependent fixup of the bus, look behind
* all PCI-to-PCI bridges on this bus.
*/
- pr_debug("PCI: Fixups for bus %04x:%02x\n", pci_domain_nr(bus), bus->number);
- pcibios_fixup_bus(bus);
+ if (!bus->is_added) {
+ pr_debug("PCI: Fixups for bus %04x:%02x\n", pci_domain_nr(bus), bus->number);
+ pcibios_fixup_bus(bus);
+ if (pci_is_root_bus(bus))
+ bus->is_added = 1;
+ }
+
for (pass=0; pass < 2; pass++)
list_for_each_entry(dev, &bus->devices, bus_list) {
if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
We have a nice interface for re-scanning a PCI bus which will
discover newly added devices, add them to the device tree, and
enable them properly.
Ensure that the bridge resources are properly sized and assigned
during the rescan.
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/hotplug-pci.c | 16 ++++++++++++----
1 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/hotplug-pci.c b/drivers/pci/hotplug-pci.c
index 4d4a644..33ab2d2 100644
--- a/drivers/pci/hotplug-pci.c
+++ b/drivers/pci/hotplug-pci.c
@@ -6,13 +6,21 @@
unsigned int __devinit pci_do_scan_bus(struct pci_bus *bus)
{
- unsigned int max;
+ unsigned int max, pass;
+ struct pci_dev *dev;
max = pci_scan_child_bus(bus);
- /*
- * Make the discovered devices available.
- */
+ for (pass=0; pass < 2; pass++)
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
+ dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
+ if (pass && dev->subordinate)
+ pci_bus_size_bridges(dev->subordinate);
+ }
+
+ pci_bus_assign_resources(bus);
+ pci_enable_bridges(bus);
pci_bus_add_devices(bus);
return max;
This interface allows the user to force a rescan of all PCI buses
in system, and rediscover devices that have been removed earlier.
Cc: Trent Piepho <[email protected]>
Cc: [email protected]
Reviewed-by: James Cameron <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-pci | 9 +++++++++
drivers/pci/pci-sysfs.c | 26 ++++++++++++++++++++++++++
2 files changed, 35 insertions(+), 0 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index e638e15..ea4aee2 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -41,6 +41,15 @@ Description:
for the device and attempt to bind to it. For example:
# echo "8086 10f5" > /sys/bus/pci/drivers/foo/new_id
+What: /sys/bus/pci/rescan
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ force a rescan of all PCI buses in the system, and
+ re-discover previously removed devices.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 1c89298..a29fc3b 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -971,12 +971,38 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
}
}
+#ifdef CONFIG_HOTPLUG
+static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf, size_t count)
+{
+ unsigned long val;
+ struct pci_bus *b = NULL;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (val)
+ while ((b = pci_find_next_bus(b)) != NULL)
+ pci_do_scan_bus(b);
+
+ return count;
+}
+static BUS_ATTR(rescan, S_IWUSR, NULL, bus_rescan_store);
+#endif
+
static int __init pci_sysfs_init(void)
{
struct pci_dev *pdev = NULL;
int retval;
sysfs_initialized = 1;
+#ifdef CONFIG_HOTPLUG
+ retval = bus_create_file(&pci_bus_type, &bus_attr_rescan);
+ if (retval)
+ return retval;
+#endif
for_each_pci_dev(pdev) {
retval = pci_create_sysfs_dev_files(pdev);
if (retval) {
This patch adds an attribute named "remove" to a PCI device's sysfs
directory. Writing a non-zero value to this attribute will remove the PCI
device and any children of it.
Trent Piepho wrote the original implementation and documentation.
Cc: [email protected]
Cc: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-pci | 8 ++++++
Documentation/filesystems/sysfs-pci.txt | 9 ++++++
drivers/pci/pci-sysfs.c | 43 +++++++++++++++++++++++++++++++
3 files changed, 60 insertions(+), 0 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index ea4aee2..5b1ddde 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -50,6 +50,14 @@ Description:
re-discover previously removed devices.
Depends on CONFIG_HOTPLUG.
+What: /sys/bus/pci/devices/.../remove
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ hot-remove the PCI device and any of its children.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
index 9f8740c..89c15dc 100644
--- a/Documentation/filesystems/sysfs-pci.txt
+++ b/Documentation/filesystems/sysfs-pci.txt
@@ -12,6 +12,7 @@ that support it. For example, a given bus might look like this:
| |-- enable
| |-- irq
| |-- local_cpus
+ | |-- remove
| |-- resource
| |-- resource0
| |-- resource1
@@ -36,6 +37,7 @@ files, each with their own function.
enable Whether the device is enabled (ascii, rw)
irq IRQ number (ascii, ro)
local_cpus nearby CPU mask (cpumask, ro)
+ remove remove device from kernel's list (ascii, wo)
resource PCI resource host addresses (ascii, ro)
resource0..N PCI resource N, if present (binary, mmap)
resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap)
@@ -46,6 +48,7 @@ files, each with their own function.
ro - read only file
rw - file is readable and writable
+ wo - write only file
mmap - file is mmapable
ascii - file contains ascii text
binary - file contains binary data
@@ -73,6 +76,12 @@ that the device must be enabled for a rom read to return data succesfully.
In the event a driver is not bound to the device, it can be enabled using the
'enable' file, documented above.
+The 'remove' file is used to remove the PCI device, by writing a non-zero
+integer to the file. This does not involve any kind of hot-plug functionality,
+e.g. powering off the device. The device is removed from the kernel's list of
+PCI devices, the sysfs directory for it is removed, and the device will be
+removed from any drivers attached to it.
+
Accessing legacy resources through sysfs
----------------------------------------
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index a29fc3b..2f85383 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -219,6 +219,46 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
return count;
}
+#ifdef CONFIG_HOTPLUG
+static void remove_callback(void *data)
+{
+ int bridge = 0;
+ struct pci_dev *pdev = (struct pci_dev *)data;
+
+ if (pdev->subordinate)
+ bridge = 1;
+
+ pci_remove_bus_device(pdev);
+ if (bridge)
+ pci_remove_bus(pdev->bus);
+
+ pci_dev_put(pdev);
+}
+
+static ssize_t
+remove_store(struct device *dev, struct device_attribute *dummy,
+ const char *buf, size_t count)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ unsigned long val;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (pdev->subordinate && pci_is_root_bus(pdev->bus))
+ return -EBUSY;
+
+ if (val)
+ sysfs_schedule_callback(&dev->kobj, remove_callback, pdev,
+ THIS_MODULE);
+
+ return count;
+}
+#endif
+
struct device_attribute pci_dev_attrs[] = {
__ATTR_RO(resource),
__ATTR_RO(vendor),
@@ -237,6 +277,9 @@ struct device_attribute pci_dev_attrs[] = {
__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
broken_parity_status_show,broken_parity_status_store),
__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
+#ifdef CONFIG_HOTPLUG
+ __ATTR(remove, S_IWUSR, NULL, remove_store),
+#endif
__ATTR_NULL,
};
This interface allows the user to force a rescan of the device's
parent bus and all subordinate buses, and rediscover devices removed
earlier from this part of the device tree.
Cc: Trent Piepho <[email protected]>
Cc: [email protected]
Reviewed-by: James Cameron <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-pci | 10 ++++++++++
drivers/pci/pci-sysfs.c | 20 ++++++++++++++++++++
2 files changed, 30 insertions(+), 0 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 5b1ddde..d33840e 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -58,6 +58,16 @@ Description:
hot-remove the PCI device and any of its children.
Depends on CONFIG_HOTPLUG.
+What: /sys/bus/pci/devices/.../rescan
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ force a rescan of the device's parent bus and all
+ child buses, and re-discover devices removed earlier
+ from this part of the device tree.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 2f85383..be66dbd 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -220,6 +220,25 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
}
#ifdef CONFIG_HOTPLUG
+static ssize_t
+dev_rescan_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ unsigned long val;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (val)
+ pci_do_scan_bus(pdev->bus);
+
+ return count;
+}
+
static void remove_callback(void *data)
{
int bridge = 0;
@@ -279,6 +298,7 @@ struct device_attribute pci_dev_attrs[] = {
__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
#ifdef CONFIG_HOTPLUG
__ATTR(remove, S_IWUSR, NULL, remove_store),
+ __ATTR(rescan, S_IWUSR, NULL, dev_rescan_store),
#endif
__ATTR_NULL,
};
From: Trent Piepho <[email protected]>
A complete re-implementation of fakephp is necessary if it is to
present its former interface (pre-2.6.27, when it broke). The
reason is that PCI hotplug drivers call pci_hp_register(), which
enforces the rule that only one /sys/bus/pci/slots/ file may be
created per physical slot.
The change breaks the old fakephp's assumption that it could
create a file per function. So we re-implement fakephp to avoid
using the standard PCI hotplug API so that we can restore the old
fakephp user interface.
It puts entries in /sys/bus/pci/slots with the names of all PCI
devices/functions, exactly symmetrical to what is shown in
/sys/bus/pci/devices. Each slots/ entry has a "power" attribute,
which works the same way as the fakephp driver's power attribute
has worked.
There are a few improvements over old fakephp, which couldn't handle
PCI devices being added or removed via a means outside of
fakephp's knowledge. If a device was added another way, old fakephp
didn't notice and didn't create the fake slot for it. If a
device was removed another way, old fakephp didn't delete the fake
slot for it (and accessing the stale slot caused an oops).
The new implementation overcomes these limitations. As a
consequence, removing a bridge with other devices behind it now
works as well, which is something else old fakephp couldn't do
previously.
This duplicates a tiny bit of the code in the PCI core that does
this same function. Re-using that code ends up being more
complex than duplicating it, and it makes code in the PCI core
more ugly just to support this legacy fakephp interface
compatibility layer.
[[email protected]: rewrite changelog]
[[email protected]: use pci_do_scan_bus()]
Reviewed-by: James Cameron <[email protected]>
Signed-off-by: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/hotplug/Makefile | 2
drivers/pci/hotplug/fakephp.c | 395 ----------------------------------
drivers/pci/hotplug/legacy_fakephp.c | 162 ++++++++++++++
3 files changed, 163 insertions(+), 396 deletions(-)
delete mode 100644 drivers/pci/hotplug/fakephp.c
create mode 100644 drivers/pci/hotplug/legacy_fakephp.c
diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
index 2aa117c..3314fe8 100644
--- a/drivers/pci/hotplug/Makefile
+++ b/drivers/pci/hotplug/Makefile
@@ -20,7 +20,7 @@ obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR) += rpadlpar_io.o
obj-$(CONFIG_HOTPLUG_PCI_SGI) += sgi_hotplug.o
# Link this last so it doesn't claim devices that have a real hotplug driver
-obj-$(CONFIG_HOTPLUG_PCI_FAKE) += fakephp.o
+obj-$(CONFIG_HOTPLUG_PCI_FAKE) += legacy_fakephp.o
pci_hotplug-objs := pci_hotplug_core.o
diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
deleted file mode 100644
index d8649e1..0000000
--- a/drivers/pci/hotplug/fakephp.c
+++ /dev/null
@@ -1,395 +0,0 @@
-/*
- * Fake PCI Hot Plug Controller Driver
- *
- * Copyright (C) 2003 Greg Kroah-Hartman <[email protected]>
- * Copyright (C) 2003 IBM Corp.
- * Copyright (C) 2003 Rolf Eike Beer <[email protected]>
- *
- * Based on ideas and code from:
- * Vladimir Kondratiev <[email protected]>
- * Rolf Eike Beer <[email protected]>
- *
- * All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, version 2 of the License.
- *
- * Send feedback to <[email protected]>
- */
-
-/*
- *
- * This driver will "emulate" removing PCI devices from the system. If
- * the "power" file is written to with "0" then the specified PCI device
- * will be completely removed from the kernel.
- *
- * WARNING, this does NOT turn off the power to the PCI device. This is
- * a "logical" removal, not a physical or electrical removal.
- *
- * Use this module at your own risk, you have been warned!
- *
- * Enabling PCI devices is left as an exercise for the reader...
- *
- */
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/pci.h>
-#include <linux/pci_hotplug.h>
-#include <linux/init.h>
-#include <linux/string.h>
-#include <linux/slab.h>
-#include <linux/workqueue.h>
-#include "../pci.h"
-
-#if !defined(MODULE)
- #define MY_NAME "fakephp"
-#else
- #define MY_NAME THIS_MODULE->name
-#endif
-
-#define dbg(format, arg...) \
- do { \
- if (debug) \
- printk(KERN_DEBUG "%s: " format, \
- MY_NAME , ## arg); \
- } while (0)
-#define err(format, arg...) printk(KERN_ERR "%s: " format, MY_NAME , ## arg)
-#define info(format, arg...) printk(KERN_INFO "%s: " format, MY_NAME , ## arg)
-
-#define DRIVER_AUTHOR "Greg Kroah-Hartman <[email protected]>"
-#define DRIVER_DESC "Fake PCI Hot Plug Controller Driver"
-
-struct dummy_slot {
- struct list_head node;
- struct hotplug_slot *slot;
- struct pci_dev *dev;
- struct work_struct remove_work;
- unsigned long removed;
-};
-
-static int debug;
-static int dup_slots;
-static LIST_HEAD(slot_list);
-static struct workqueue_struct *dummyphp_wq;
-
-static void pci_rescan_worker(struct work_struct *work);
-static DECLARE_WORK(pci_rescan_work, pci_rescan_worker);
-
-static int enable_slot (struct hotplug_slot *slot);
-static int disable_slot (struct hotplug_slot *slot);
-
-static struct hotplug_slot_ops dummy_hotplug_slot_ops = {
- .owner = THIS_MODULE,
- .enable_slot = enable_slot,
- .disable_slot = disable_slot,
-};
-
-static void dummy_release(struct hotplug_slot *slot)
-{
- struct dummy_slot *dslot = slot->private;
-
- list_del(&dslot->node);
- kfree(dslot->slot->info);
- kfree(dslot->slot);
- pci_dev_put(dslot->dev);
- kfree(dslot);
-}
-
-#define SLOT_NAME_SIZE 8
-
-static int add_slot(struct pci_dev *dev)
-{
- struct dummy_slot *dslot;
- struct hotplug_slot *slot;
- char name[SLOT_NAME_SIZE];
- int retval = -ENOMEM;
- static int count = 1;
-
- slot = kzalloc(sizeof(struct hotplug_slot), GFP_KERNEL);
- if (!slot)
- goto error;
-
- slot->info = kzalloc(sizeof(struct hotplug_slot_info), GFP_KERNEL);
- if (!slot->info)
- goto error_slot;
-
- slot->info->power_status = 1;
- slot->info->max_bus_speed = PCI_SPEED_UNKNOWN;
- slot->info->cur_bus_speed = PCI_SPEED_UNKNOWN;
-
- dslot = kzalloc(sizeof(struct dummy_slot), GFP_KERNEL);
- if (!dslot)
- goto error_info;
-
- if (dup_slots)
- snprintf(name, SLOT_NAME_SIZE, "fake");
- else
- snprintf(name, SLOT_NAME_SIZE, "fake%d", count++);
- dbg("slot->name = %s\n", name);
- slot->ops = &dummy_hotplug_slot_ops;
- slot->release = &dummy_release;
- slot->private = dslot;
-
- retval = pci_hp_register(slot, dev->bus, PCI_SLOT(dev->devfn), name);
- if (retval) {
- err("pci_hp_register failed with error %d\n", retval);
- goto error_dslot;
- }
-
- dbg("slot->name = %s\n", hotplug_slot_name(slot));
- dslot->slot = slot;
- dslot->dev = pci_dev_get(dev);
- list_add (&dslot->node, &slot_list);
- return retval;
-
-error_dslot:
- kfree(dslot);
-error_info:
- kfree(slot->info);
-error_slot:
- kfree(slot);
-error:
- return retval;
-}
-
-static int __init pci_scan_buses(void)
-{
- struct pci_dev *dev = NULL;
- int lastslot = 0;
-
- while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
- if (PCI_FUNC(dev->devfn) > 0 &&
- lastslot == PCI_SLOT(dev->devfn))
- continue;
- lastslot = PCI_SLOT(dev->devfn);
- add_slot(dev);
- }
-
- return 0;
-}
-
-static void remove_slot(struct dummy_slot *dslot)
-{
- int retval;
-
- dbg("removing slot %s\n", hotplug_slot_name(dslot->slot));
- retval = pci_hp_deregister(dslot->slot);
- if (retval)
- err("Problem unregistering a slot %s\n",
- hotplug_slot_name(dslot->slot));
-}
-
-/* called from the single-threaded workqueue handler to remove a slot */
-static void remove_slot_worker(struct work_struct *work)
-{
- struct dummy_slot *dslot =
- container_of(work, struct dummy_slot, remove_work);
- remove_slot(dslot);
-}
-
-/**
- * pci_rescan_slot - Rescan slot
- * @temp: Device template. Should be set: bus and devfn.
- *
- * Tries hard not to re-enable already existing devices;
- * also handles scanning of subfunctions.
- */
-static int pci_rescan_slot(struct pci_dev *temp)
-{
- struct pci_bus *bus = temp->bus;
- struct pci_dev *dev;
- int func;
- u8 hdr_type;
- int count = 0;
-
- if (!pci_read_config_byte(temp, PCI_HEADER_TYPE, &hdr_type)) {
- temp->hdr_type = hdr_type & 0x7f;
- if ((dev = pci_get_slot(bus, temp->devfn)) != NULL)
- pci_dev_put(dev);
- else {
- dev = pci_scan_single_device(bus, temp->devfn);
- if (dev) {
- dbg("New device on %s function %x:%x\n",
- bus->name, temp->devfn >> 3,
- temp->devfn & 7);
- count++;
- }
- }
- /* multifunction device? */
- if (!(hdr_type & 0x80))
- return count;
-
- /* continue scanning for other functions */
- for (func = 1, temp->devfn++; func < 8; func++, temp->devfn++) {
- if (pci_read_config_byte(temp, PCI_HEADER_TYPE, &hdr_type))
- continue;
- temp->hdr_type = hdr_type & 0x7f;
-
- if ((dev = pci_get_slot(bus, temp->devfn)) != NULL)
- pci_dev_put(dev);
- else {
- dev = pci_scan_single_device(bus, temp->devfn);
- if (dev) {
- dbg("New device on %s function %x:%x\n",
- bus->name, temp->devfn >> 3,
- temp->devfn & 7);
- count++;
- }
- }
- }
- }
-
- return count;
-}
-
-
-/**
- * pci_rescan_bus - Rescan PCI bus
- * @bus: the PCI bus to rescan
- *
- * Call pci_rescan_slot for each possible function of the bus.
- */
-static void pci_rescan_bus(const struct pci_bus *bus)
-{
- unsigned int devfn;
- struct pci_dev *dev;
- int retval;
- int found = 0;
- dev = alloc_pci_dev();
- if (!dev)
- return;
-
- dev->bus = (struct pci_bus*)bus;
- dev->sysdata = bus->sysdata;
- for (devfn = 0; devfn < 0x100; devfn += 8) {
- dev->devfn = devfn;
- found += pci_rescan_slot(dev);
- }
-
- if (found) {
- pci_bus_assign_resources(bus);
- list_for_each_entry(dev, &bus->devices, bus_list) {
- /* Skip already-added devices */
- if (dev->is_added)
- continue;
- retval = pci_bus_add_device(dev);
- if (retval)
- dev_err(&dev->dev,
- "Error adding device, continuing\n");
- else
- add_slot(dev);
- }
- pci_bus_add_devices(bus);
- }
- kfree(dev);
-}
-
-/* recursively scan all buses */
-static void pci_rescan_buses(const struct list_head *list)
-{
- const struct list_head *l;
- list_for_each(l,list) {
- const struct pci_bus *b = pci_bus_b(l);
- pci_rescan_bus(b);
- pci_rescan_buses(&b->children);
- }
-}
-
-/* initiate rescan of all pci buses */
-static inline void pci_rescan(void) {
- pci_rescan_buses(&pci_root_buses);
-}
-
-/* called from the single-threaded workqueue handler to rescan all pci buses */
-static void pci_rescan_worker(struct work_struct *work)
-{
- pci_rescan();
-}
-
-static int enable_slot(struct hotplug_slot *hotplug_slot)
-{
- /* mis-use enable_slot for rescanning of the pci bus */
- cancel_work_sync(&pci_rescan_work);
- queue_work(dummyphp_wq, &pci_rescan_work);
- return 0;
-}
-
-static int disable_slot(struct hotplug_slot *slot)
-{
- struct dummy_slot *dslot;
- struct pci_dev *dev;
- int func;
-
- if (!slot)
- return -ENODEV;
- dslot = slot->private;
-
- dbg("%s - physical_slot = %s\n", __func__, hotplug_slot_name(slot));
-
- for (func = 7; func >= 0; func--) {
- dev = pci_get_slot(dslot->dev->bus, dslot->dev->devfn + func);
- if (!dev)
- continue;
-
- if (test_and_set_bit(0, &dslot->removed)) {
- dbg("Slot already scheduled for removal\n");
- pci_dev_put(dev);
- return -ENODEV;
- }
-
- /* remove the device from the pci core */
- pci_remove_bus_device(dev);
-
- /* queue work item to blow away this sysfs entry and other
- * parts.
- */
- INIT_WORK(&dslot->remove_work, remove_slot_worker);
- queue_work(dummyphp_wq, &dslot->remove_work);
-
- pci_dev_put(dev);
- }
- return 0;
-}
-
-static void cleanup_slots (void)
-{
- struct list_head *tmp;
- struct list_head *next;
- struct dummy_slot *dslot;
-
- destroy_workqueue(dummyphp_wq);
- list_for_each_safe (tmp, next, &slot_list) {
- dslot = list_entry (tmp, struct dummy_slot, node);
- remove_slot(dslot);
- }
-
-}
-
-static int __init dummyphp_init(void)
-{
- info(DRIVER_DESC "\n");
-
- dummyphp_wq = create_singlethread_workqueue(MY_NAME);
- if (!dummyphp_wq)
- return -ENOMEM;
-
- return pci_scan_buses();
-}
-
-
-static void __exit dummyphp_exit(void)
-{
- cleanup_slots();
-}
-
-module_init(dummyphp_init);
-module_exit(dummyphp_exit);
-
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
-MODULE_LICENSE("GPL");
-module_param(debug, bool, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(debug, "Debugging mode enabled or not");
-module_param(dup_slots, bool, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(dup_slots, "Force duplicate slot names for debugging");
diff --git a/drivers/pci/hotplug/legacy_fakephp.c b/drivers/pci/hotplug/legacy_fakephp.c
new file mode 100644
index 0000000..e75af9e
--- /dev/null
+++ b/drivers/pci/hotplug/legacy_fakephp.c
@@ -0,0 +1,162 @@
+/* Works like the fakephp driver used to, except a little better.
+ *
+ * - It's possible to remove devices with subordinate busses.
+ * - New PCI devices that appear via any method, not just a fakephp triggered
+ * rescan, will be noticed.
+ * - Devices that are removed via any method, not just a fakephp triggered
+ * removal, will also be noticed.
+ *
+ * Uses nothing from the pci-hotplug subsystem.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include "../pci.h"
+
+struct legacy_slot {
+ struct kobject kobj;
+ struct pci_dev *dev;
+ struct list_head list;
+};
+
+static LIST_HEAD(legacy_list);
+
+static ssize_t legacy_show(struct kobject *kobj, struct attribute *attr,
+ char *buf)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+ strcpy(buf, "1\n");
+ return 2;
+}
+
+static void remove_callback(void *data)
+{
+ pci_remove_bus_device((struct pci_dev *)data);
+}
+
+static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
+ const char *buf, size_t len)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+ unsigned long val;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (val)
+ pci_do_scan_bus(slot->dev->bus);
+ else
+ sysfs_schedule_callback(&slot->dev->dev.kobj, remove_callback,
+ slot->dev, THIS_MODULE);
+ return len;
+}
+
+static struct attribute *legacy_attrs[] = {
+ &(struct attribute){ .name = "power", .mode = 0644 },
+ NULL,
+};
+
+static void legacy_release(struct kobject *kobj)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+
+ pci_dev_put(slot->dev);
+ kfree(slot);
+}
+
+static struct kobj_type legacy_ktype = {
+ .sysfs_ops = &(struct sysfs_ops){
+ .store = legacy_store, .show = legacy_show
+ },
+ .release = &legacy_release,
+ .default_attrs = legacy_attrs,
+};
+
+static int legacy_add_slot(struct pci_dev *pdev)
+{
+ struct legacy_slot *slot = kzalloc(sizeof(*slot), GFP_KERNEL);
+
+ if (!slot)
+ return -ENOMEM;
+
+ if (kobject_init_and_add(&slot->kobj, &legacy_ktype,
+ &pci_slots_kset->kobj, "%s",
+ pdev->dev.bus_id)) {
+ dev_warn(&pdev->dev, "Failed to created legacy fake slot\n");
+ return -EINVAL;
+ }
+ slot->dev = pci_dev_get(pdev);
+
+ list_add(&slot->list, &legacy_list);
+
+ return 0;
+}
+
+static int legacy_notify(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct pci_dev *pdev = to_pci_dev(data);
+
+ if (action == BUS_NOTIFY_ADD_DEVICE) {
+ legacy_add_slot(pdev);
+ } else if (action == BUS_NOTIFY_DEL_DEVICE) {
+ struct legacy_slot *slot;
+
+ list_for_each_entry(slot, &legacy_list, list)
+ if (slot->dev == pdev)
+ goto found;
+
+ dev_warn(&pdev->dev, "Missing legacy fake slot?");
+ return -ENODEV;
+found:
+ kobject_del(&slot->kobj);
+ list_del(&slot->list);
+ kobject_put(&slot->kobj);
+ }
+
+ return 0;
+}
+
+static struct notifier_block legacy_notifier = {
+ .notifier_call = legacy_notify
+};
+
+static int __init init_legacy(void)
+{
+ struct pci_dev *pdev = NULL;
+
+ /* Add existing devices */
+ while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev)))
+ legacy_add_slot(pdev);
+
+ /* Be alerted of any new ones */
+ bus_register_notifier(&pci_bus_type, &legacy_notifier);
+ return 0;
+}
+module_init(init_legacy);
+
+static void __exit remove_legacy(void)
+{
+ struct legacy_slot *slot, *tmp;
+
+ bus_unregister_notifier(&pci_bus_type, &legacy_notifier);
+
+ list_for_each_entry_safe(slot, tmp, &legacy_list, list) {
+ list_del(&slot->list);
+ kobject_del(&slot->kobj);
+ kobject_put(&slot->kobj);
+ }
+}
+module_exit(remove_legacy);
+
+
+MODULE_AUTHOR("Trent Piepho <[email protected]>");
+MODULE_DESCRIPTION("Legacy version of the fakephp interface");
+MODULE_LICENSE("GPL");
We wanted to replace fakephp wholesale, so rename legacy_fakephp back
to fakephp. Yes, this is a silly commit, but it produces a much easier
patch to read and review.
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/hotplug/Makefile | 2
drivers/pci/hotplug/fakephp.c | 162 ++++++++++++++++++++++++++++++++++
drivers/pci/hotplug/legacy_fakephp.c | 162 ----------------------------------
3 files changed, 163 insertions(+), 163 deletions(-)
create mode 100644 drivers/pci/hotplug/fakephp.c
delete mode 100644 drivers/pci/hotplug/legacy_fakephp.c
diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
index 3314fe8..2aa117c 100644
--- a/drivers/pci/hotplug/Makefile
+++ b/drivers/pci/hotplug/Makefile
@@ -20,7 +20,7 @@ obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR) += rpadlpar_io.o
obj-$(CONFIG_HOTPLUG_PCI_SGI) += sgi_hotplug.o
# Link this last so it doesn't claim devices that have a real hotplug driver
-obj-$(CONFIG_HOTPLUG_PCI_FAKE) += legacy_fakephp.o
+obj-$(CONFIG_HOTPLUG_PCI_FAKE) += fakephp.o
pci_hotplug-objs := pci_hotplug_core.o
diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
new file mode 100644
index 0000000..e75af9e
--- /dev/null
+++ b/drivers/pci/hotplug/fakephp.c
@@ -0,0 +1,162 @@
+/* Works like the fakephp driver used to, except a little better.
+ *
+ * - It's possible to remove devices with subordinate busses.
+ * - New PCI devices that appear via any method, not just a fakephp triggered
+ * rescan, will be noticed.
+ * - Devices that are removed via any method, not just a fakephp triggered
+ * removal, will also be noticed.
+ *
+ * Uses nothing from the pci-hotplug subsystem.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include "../pci.h"
+
+struct legacy_slot {
+ struct kobject kobj;
+ struct pci_dev *dev;
+ struct list_head list;
+};
+
+static LIST_HEAD(legacy_list);
+
+static ssize_t legacy_show(struct kobject *kobj, struct attribute *attr,
+ char *buf)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+ strcpy(buf, "1\n");
+ return 2;
+}
+
+static void remove_callback(void *data)
+{
+ pci_remove_bus_device((struct pci_dev *)data);
+}
+
+static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
+ const char *buf, size_t len)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+ unsigned long val;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (val)
+ pci_do_scan_bus(slot->dev->bus);
+ else
+ sysfs_schedule_callback(&slot->dev->dev.kobj, remove_callback,
+ slot->dev, THIS_MODULE);
+ return len;
+}
+
+static struct attribute *legacy_attrs[] = {
+ &(struct attribute){ .name = "power", .mode = 0644 },
+ NULL,
+};
+
+static void legacy_release(struct kobject *kobj)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+
+ pci_dev_put(slot->dev);
+ kfree(slot);
+}
+
+static struct kobj_type legacy_ktype = {
+ .sysfs_ops = &(struct sysfs_ops){
+ .store = legacy_store, .show = legacy_show
+ },
+ .release = &legacy_release,
+ .default_attrs = legacy_attrs,
+};
+
+static int legacy_add_slot(struct pci_dev *pdev)
+{
+ struct legacy_slot *slot = kzalloc(sizeof(*slot), GFP_KERNEL);
+
+ if (!slot)
+ return -ENOMEM;
+
+ if (kobject_init_and_add(&slot->kobj, &legacy_ktype,
+ &pci_slots_kset->kobj, "%s",
+ pdev->dev.bus_id)) {
+ dev_warn(&pdev->dev, "Failed to created legacy fake slot\n");
+ return -EINVAL;
+ }
+ slot->dev = pci_dev_get(pdev);
+
+ list_add(&slot->list, &legacy_list);
+
+ return 0;
+}
+
+static int legacy_notify(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct pci_dev *pdev = to_pci_dev(data);
+
+ if (action == BUS_NOTIFY_ADD_DEVICE) {
+ legacy_add_slot(pdev);
+ } else if (action == BUS_NOTIFY_DEL_DEVICE) {
+ struct legacy_slot *slot;
+
+ list_for_each_entry(slot, &legacy_list, list)
+ if (slot->dev == pdev)
+ goto found;
+
+ dev_warn(&pdev->dev, "Missing legacy fake slot?");
+ return -ENODEV;
+found:
+ kobject_del(&slot->kobj);
+ list_del(&slot->list);
+ kobject_put(&slot->kobj);
+ }
+
+ return 0;
+}
+
+static struct notifier_block legacy_notifier = {
+ .notifier_call = legacy_notify
+};
+
+static int __init init_legacy(void)
+{
+ struct pci_dev *pdev = NULL;
+
+ /* Add existing devices */
+ while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev)))
+ legacy_add_slot(pdev);
+
+ /* Be alerted of any new ones */
+ bus_register_notifier(&pci_bus_type, &legacy_notifier);
+ return 0;
+}
+module_init(init_legacy);
+
+static void __exit remove_legacy(void)
+{
+ struct legacy_slot *slot, *tmp;
+
+ bus_unregister_notifier(&pci_bus_type, &legacy_notifier);
+
+ list_for_each_entry_safe(slot, tmp, &legacy_list, list) {
+ list_del(&slot->list);
+ kobject_del(&slot->kobj);
+ kobject_put(&slot->kobj);
+ }
+}
+module_exit(remove_legacy);
+
+
+MODULE_AUTHOR("Trent Piepho <[email protected]>");
+MODULE_DESCRIPTION("Legacy version of the fakephp interface");
+MODULE_LICENSE("GPL");
diff --git a/drivers/pci/hotplug/legacy_fakephp.c b/drivers/pci/hotplug/legacy_fakephp.c
deleted file mode 100644
index e75af9e..0000000
--- a/drivers/pci/hotplug/legacy_fakephp.c
+++ /dev/null
@@ -1,162 +0,0 @@
-/* Works like the fakephp driver used to, except a little better.
- *
- * - It's possible to remove devices with subordinate busses.
- * - New PCI devices that appear via any method, not just a fakephp triggered
- * rescan, will be noticed.
- * - Devices that are removed via any method, not just a fakephp triggered
- * removal, will also be noticed.
- *
- * Uses nothing from the pci-hotplug subsystem.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/list.h>
-#include <linux/kobject.h>
-#include <linux/sysfs.h>
-#include <linux/init.h>
-#include <linux/pci.h>
-#include "../pci.h"
-
-struct legacy_slot {
- struct kobject kobj;
- struct pci_dev *dev;
- struct list_head list;
-};
-
-static LIST_HEAD(legacy_list);
-
-static ssize_t legacy_show(struct kobject *kobj, struct attribute *attr,
- char *buf)
-{
- struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
- strcpy(buf, "1\n");
- return 2;
-}
-
-static void remove_callback(void *data)
-{
- pci_remove_bus_device((struct pci_dev *)data);
-}
-
-static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
- const char *buf, size_t len)
-{
- struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
- unsigned long val;
-
- if (strict_strtoul(buf, 0, &val) < 0)
- return -EINVAL;
-
- if (val)
- pci_do_scan_bus(slot->dev->bus);
- else
- sysfs_schedule_callback(&slot->dev->dev.kobj, remove_callback,
- slot->dev, THIS_MODULE);
- return len;
-}
-
-static struct attribute *legacy_attrs[] = {
- &(struct attribute){ .name = "power", .mode = 0644 },
- NULL,
-};
-
-static void legacy_release(struct kobject *kobj)
-{
- struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
-
- pci_dev_put(slot->dev);
- kfree(slot);
-}
-
-static struct kobj_type legacy_ktype = {
- .sysfs_ops = &(struct sysfs_ops){
- .store = legacy_store, .show = legacy_show
- },
- .release = &legacy_release,
- .default_attrs = legacy_attrs,
-};
-
-static int legacy_add_slot(struct pci_dev *pdev)
-{
- struct legacy_slot *slot = kzalloc(sizeof(*slot), GFP_KERNEL);
-
- if (!slot)
- return -ENOMEM;
-
- if (kobject_init_and_add(&slot->kobj, &legacy_ktype,
- &pci_slots_kset->kobj, "%s",
- pdev->dev.bus_id)) {
- dev_warn(&pdev->dev, "Failed to created legacy fake slot\n");
- return -EINVAL;
- }
- slot->dev = pci_dev_get(pdev);
-
- list_add(&slot->list, &legacy_list);
-
- return 0;
-}
-
-static int legacy_notify(struct notifier_block *nb,
- unsigned long action, void *data)
-{
- struct pci_dev *pdev = to_pci_dev(data);
-
- if (action == BUS_NOTIFY_ADD_DEVICE) {
- legacy_add_slot(pdev);
- } else if (action == BUS_NOTIFY_DEL_DEVICE) {
- struct legacy_slot *slot;
-
- list_for_each_entry(slot, &legacy_list, list)
- if (slot->dev == pdev)
- goto found;
-
- dev_warn(&pdev->dev, "Missing legacy fake slot?");
- return -ENODEV;
-found:
- kobject_del(&slot->kobj);
- list_del(&slot->list);
- kobject_put(&slot->kobj);
- }
-
- return 0;
-}
-
-static struct notifier_block legacy_notifier = {
- .notifier_call = legacy_notify
-};
-
-static int __init init_legacy(void)
-{
- struct pci_dev *pdev = NULL;
-
- /* Add existing devices */
- while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev)))
- legacy_add_slot(pdev);
-
- /* Be alerted of any new ones */
- bus_register_notifier(&pci_bus_type, &legacy_notifier);
- return 0;
-}
-module_init(init_legacy);
-
-static void __exit remove_legacy(void)
-{
- struct legacy_slot *slot, *tmp;
-
- bus_unregister_notifier(&pci_bus_type, &legacy_notifier);
-
- list_for_each_entry_safe(slot, tmp, &legacy_list, list) {
- list_del(&slot->list);
- kobject_del(&slot->kobj);
- kobject_put(&slot->kobj);
- }
-}
-module_exit(remove_legacy);
-
-
-MODULE_AUTHOR("Trent Piepho <[email protected]>");
-MODULE_DESCRIPTION("Legacy version of the fakephp interface");
-MODULE_LICENSE("GPL");
Now that the PCI core is capable of function-level remove and rescan
as well as bus-level rescan, there's no functional need to keep fakephp
anymore.
We keep it around for userspace compatibility reasons, schedule removal
in three years.
Signed-off-by: Alex Chiang <[email protected]>
---
Documentation/feature-removal-schedule.txt | 32 ++++++++++++++++++++++++++++
1 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 5ddbe35..fc3505b 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -335,3 +335,35 @@ Why: In 2.6.18 the Secmark concept was introduced to replace the "compat_net"
Secmark, it is time to deprecate the older mechanism and start the
process of removing the old code.
Who: Paul Moore <[email protected]>
+
+---------------------------
+
+What: fakephp and associated sysfs files in /sys/bus/pci/slots/
+When: 2011
+Why: In 2.6.27, the semantics of /sys/bus/pci/slots was redefined to
+ represent a machine's physical PCI slots. The change in semantics
+ had userspace implications, as the hotplug core no longer allowed
+ drivers to create multiple sysfs files per physical slot (required
+ for multi-function devices, e.g.). fakephp was seen as a developer's
+ tool only, and its interface changed. Too late, we learned that
+ there were some users of the fakephp interface.
+
+ In 2.6.30, the original fakephp interface was restored. At the same
+ time, the PCI core gained the ability that fakephp provided, namely
+ function-level hot-remove and hot-add.
+
+ Since the PCI core now provides the same functionality, exposed in:
+
+ /sys/bus/pci/rescan
+ /sys/bus/pci/devices/.../remove
+ /sys/bus/pci/devices/.../rescan
+
+ there is no functional reason to maintain fakephp as well.
+
+ We will keep the existing module so that 'modprobe fakephp' will
+ present the old /sys/bus/pci/slots/... interface for compatibility,
+ but users are urged to migrate their applications to the API above.
+
+ After a reasonable transition period, we will remove the legacy
+ fakephp interface.
+Who: Alex Chiang <[email protected]>
* Alex Chiang <[email protected]>:
>
> There is still one major bug somewhere that shows up only when using
> the PCIe portdriver (that is, any time PCIe support is built into
> the kernel). You get an oops during multiple remove/rescan cycles,
> especially on devices with an internal bridge.
Got it, we had a double-free in the PCIe port driver which was
causing all sorts of problems.
I fixed that and now this patch series is stable enough for
others to actually apply and test. As of now, there are no known
bugs.
Of course, I'm going to keep testing and try to find some more
bugs. :)
As a reminder, if you want to play with this series, you'll also
need these two patches:
> http://thread.gmane.org/gmane.linux.kernel.pci/3437
> http://lkml.org/lkml/2009/3/7/173
And now this third patch:
http://thread.gmane.org/gmane.linux.kernel.pci/3524
Finally, patch 07/11 needs to be updated. I'll post a reply to
that mail with the updated patch.
Thanks.
/ac
This is an updated version of this patch. It fixes a bug where we
would remove a bridge even if it still had children.
Now we check for children before attempting to remove the bridge.
/ac
commit 3e9eed2b7f4dd6f8f0df98d6bfae71c7461dbdf2
Author: Alex Chiang <[email protected]>
Date: Mon Mar 9 12:43:10 2009 -0600
PCI: Introduce /sys/bus/pci/devices/.../remove
This patch adds an attribute named "remove" to a PCI device's sysfs
directory. Writing a non-zero value to this attribute will remove the PCI
device and any children of it.
Trent Piepho wrote the original implementation and documentation.
Cc: [email protected]
Cc: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index ea4aee2..5b1ddde 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -50,6 +50,14 @@ Description:
re-discover previously removed devices.
Depends on CONFIG_HOTPLUG.
+What: /sys/bus/pci/devices/.../remove
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ hot-remove the PCI device and any of its children.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
index 9f8740c..89c15dc 100644
--- a/Documentation/filesystems/sysfs-pci.txt
+++ b/Documentation/filesystems/sysfs-pci.txt
@@ -12,6 +12,7 @@ that support it. For example, a given bus might look like this:
| |-- enable
| |-- irq
| |-- local_cpus
+ | |-- remove
| |-- resource
| |-- resource0
| |-- resource1
@@ -36,6 +37,7 @@ files, each with their own function.
enable Whether the device is enabled (ascii, rw)
irq IRQ number (ascii, ro)
local_cpus nearby CPU mask (cpumask, ro)
+ remove remove device from kernel's list (ascii, wo)
resource PCI resource host addresses (ascii, ro)
resource0..N PCI resource N, if present (binary, mmap)
resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap)
@@ -46,6 +48,7 @@ files, each with their own function.
ro - read only file
rw - file is readable and writable
+ wo - write only file
mmap - file is mmapable
ascii - file contains ascii text
binary - file contains binary data
@@ -73,6 +76,12 @@ that the device must be enabled for a rom read to return data succesfully.
In the event a driver is not bound to the device, it can be enabled using the
'enable' file, documented above.
+The 'remove' file is used to remove the PCI device, by writing a non-zero
+integer to the file. This does not involve any kind of hot-plug functionality,
+e.g. powering off the device. The device is removed from the kernel's list of
+PCI devices, the sysfs directory for it is removed, and the device will be
+removed from any drivers attached to it.
+
Accessing legacy resources through sysfs
----------------------------------------
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index a29fc3b..b6a985b 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -219,6 +219,46 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
return count;
}
+#ifdef CONFIG_HOTPLUG
+static void remove_callback(void *data)
+{
+ int bridge = 0;
+ struct pci_dev *pdev = (struct pci_dev *)data;
+
+ if (pdev->subordinate)
+ bridge = 1;
+
+ pci_remove_bus_device(pdev);
+ if (bridge && list_empty(&pdev->bus->devices))
+ pci_remove_bus(pdev->bus);
+
+ pci_dev_put(pdev);
+}
+
+static ssize_t
+remove_store(struct device *dev, struct device_attribute *dummy,
+ const char *buf, size_t count)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ unsigned long val;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (pdev->subordinate && pci_is_root_bus(pdev->bus))
+ return -EBUSY;
+
+ if (val)
+ sysfs_schedule_callback(&dev->kobj, remove_callback, pdev,
+ THIS_MODULE);
+
+ return count;
+}
+#endif
+
struct device_attribute pci_dev_attrs[] = {
__ATTR_RO(resource),
__ATTR_RO(vendor),
@@ -237,6 +277,9 @@ struct device_attribute pci_dev_attrs[] = {
__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
broken_parity_status_show,broken_parity_status_store),
__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
+#ifdef CONFIG_HOTPLUG
+ __ATTR(remove, S_IWUSR, NULL, remove_store),
+#endif
__ATTR_NULL,
};
2009/3/9 Alex Chiang <[email protected]>:
> * Alex Chiang <[email protected]>:
>>
>> There is still one major bug somewhere that shows up only when using
>> the PCIe portdriver (that is, any time PCIe support is built into
>> the kernel). You get an oops during multiple remove/rescan cycles,
>> especially on devices with an internal bridge.
>
> Got it, we had a double-free in the PCIe port driver which was
> causing all sorts of problems.
>
> I fixed that and now this patch series is stable enough for
> others to actually apply and test. As of now, there are no known
> bugs.
>
> Of course, I'm going to keep testing and try to find some more
> bugs. :)
>
> As a reminder, if you want to play with this series, you'll also
> need these two patches:
>
>> Â Â Â http://thread.gmane.org/gmane.linux.kernel.pci/3437
>> Â Â Â http://lkml.org/lkml/2009/3/7/173
>
> And now this third patch:
>
> Â Â Â Â http://thread.gmane.org/gmane.linux.kernel.pci/3524
>
> Finally, patch 07/11 needs to be updated. I'll post a reply to
> that mail with the updated patch.
Hi,
I got this crash:
[ 279.029673] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000008
[ 279.030011] IP: [<ffffffff811fce96>] pci_remove_bus_device+0x56/0xe0
[ 279.030011] PGD 3e47e067 PUD 3e4d1067 PMD 0
[ 279.030011] Oops: 0002 [#1] SMP
[ 279.030011] last sysfs file: /sys/devices/pci0000:00/0000:00:00.0/remove
[ 279.030011] CPU 0
[ 279.030011] Pid: 6, comm: events/0 Not tainted 2.6.29-rc6 #361 945P-A
[ 279.030011] RIP: 0010:[<ffffffff811fce96>] [<ffffffff811fce96>]
pci_remove_bus_device+0x56/0xe0
[ 279.030011] RSP: 0018:ffff88003f8bde30 EFLAGS: 00010286
[ 279.030011] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff817ab9b8
[ 279.030011] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff817ab9b0
[ 279.030011] RBP: ffff88003f8bde50 R08: 00000000002ec000 R09: 0000000000000000
[ 279.030011] R10: ffff88003d9fd7c0 R11: 0000000000000040 R12: ffff88003d929800
[ 279.030011] R13: ffff88003d929800 R14: ffff88003f80a908 R15: ffff88003f8adf00
[ 279.030011] FS: 0000000000000000(0000) GS:ffff8800019f1000(0000)
knlGS:0000000000000000
[ 279.030011] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 279.030011] CR2: ffff88003e4d1000 CR3: 000000003e452000 CR4: 00000000000006a0
[ 279.030011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 279.030011] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[ 279.030011] Process events/0 (pid: 6, threadinfo ffff88003f8bc000,
task ffff88003f8a2350)
[ 279.030011] Stack:
[ 279.030011] ffffffffffffffff ffff88003d929800 ffff88003d9de800
ffff88003f80a908
[ 279.030011] ffff88003f8bde70 ffffffff81202f7d 0000000000000010
ffff88003d9de820
[ 279.030011] ffff88003f8bde90 ffffffff8112503f ffff88003f80a900
ffffffff81125020
[ 279.030011] Call Trace:
[ 279.030011] [<ffffffff81202f7d>] remove_callback+0x3d/0x60
[ 279.030011] [<ffffffff8112503f>] sysfs_schedule_callback_work+0x1f/0x40
[ 279.030011] [<ffffffff81125020>] ? sysfs_schedule_callback_work+0x0/0x40
[ 279.030011] [<ffffffff81055510>] run_workqueue+0x70/0x130
[ 279.030011] [<ffffffff81055677>] worker_thread+0xa7/0x120
[ 279.030011] [<ffffffff810597f0>] ? autoremove_wake_function+0x0/0x40
[ 279.030011] [<ffffffff810555d0>] ? worker_thread+0x0/0x120
[ 279.030011] [<ffffffff810593d9>] kthread+0x49/0x90
[ 279.030011] [<ffffffff8100d45a>] child_rip+0xa/0x20
[ 279.030011] [<ffffffff81059390>] ? kthread+0x0/0x90
[ 279.030011] [<ffffffff8100d450>] ? child_rip+0x0/0x20
[ 279.030011] Code: 00 00 00 4c 89 ef 4d 89 ec 31 db e8 75 fe ff ff
48 c7 c7 b0 b9 7a 81 e8 f9 f8 3a 00 49 8b 55 00 49 8b
45 08 48 c7 c7 b0 b9 7a 81 <48> 89 42 08 48 89 10 49 c7 45 08 00 00
00 00 49 c7 45 00 00 00
[ 279.030011] RIP [<ffffffff811fce96>] pci_remove_bus_device+0x56/0xe0
[ 279.030011] RSP <ffff88003f8bde30>
[ 279.030011] CR2: 0000000000000008
[ 279.291933] ---[ end trace 4ba18f2857f89768 ]---
It was with this patch queue on top of pci/linux-next
(487e348b0ff23e061f60010477a664ea378c1b30):
PCIe: portdrv: call pci_disable_device during remove
PCIe: AER: during disable, check subordinate before walking
PCIe portdrv: eliminate double kfree in remove path
PCI Hotplug: schedule fakephp for feature removal
PCI Hotplug: rename legacy_fakephp to fakephp
PCI Hotplug: restore fakephp interface with complete reimplementation
PCI: Introduce /sys/bus/pci/devices/.../rescan
PCI: Introduce /sys/bus/pci/devices/.../remove (new version)
PCI: Introduce /sys/bus/pci/rescan
PCI: beef up pci_do_scan_bus()
PCI: always scan child buses
PCI: pci_scan_slot() returns newly found devices
PCI: don't scan existing devices
PCI: pci_is_root_bus helper
It reproduces reliably if I do this:
$ while true; do echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/remove; done
Line numbers:
$ addr2line -e vmlinux -i ffffffff811fce96
include/linux/list.h:92
include/linux/list.h:105
drivers/pci/remove.c:40
drivers/pci/remove.c:106
And this is my drivers/pci/remove.c:
33 static void pci_destroy_dev(struct pci_dev *dev)
34 {
35 pci_stop_dev(dev);
36
37 /* Remove the device from the device lists, and prevent any further
38 * list accesses from this device */
39 down_write(&pci_bus_sem);
40 list_del(&dev->bus_list);
41 dev->bus_list.next = dev->bus_list.prev = NULL;
42 up_write(&pci_bus_sem);
43
44 pci_free_resources(dev);
45 pci_dev_put(dev);
46 }
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
Hi Vegard,
First, thanks for testing!
* Vegard Nossum <[email protected]>:
>
> I got this crash:
>
> [ 279.029673] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000008
[...]
> It reproduces reliably if I do this:
>
> $ while true; do echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/remove; done
I was going to ask for lspci -v output so that I could see what
device 0000:00:00.0 might be, but I was able to reproduce
something similar on my machine.
Can I ask why you're doing the above with a while loop? Just to
torture the code? Or something else?
I haven't started investigating too closely yet, but it looks
like I need some locking in there.
Thanks.
/ac
[root@tahitifp1 pci]# while true ; do echo 1 > devices/0000\:04\:00.0/remove ; done
kobject: '0000:06:00.0' (e000000181703120): kobject_uevent_env
------------[ cut here ]------------
WARNING: at fs/sysfs/group.c:138 sysfs_remove_group+0x80/0x180()
Hardware name: server BL860c
sysfs group a000000101432580 not found for kobject '0000:06:00.0'
Modules linked in: binfmt_misc dm_multipath pci_slot sg shpchp pci_hotplug dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ohci_hcd ehci_hcd usbcore
Call Trace:
[<a0000001000146d0>] show_stack+0x50/0xa0
sp=e00000018131fae0 bsp=e000000181311228
[<a000000100014750>] dump_stack+0x30/0x60
sp=e00000018131fcb0 bsp=e000000181311210
[<a00000010009a0f0>] warn_slowpath+0x130/0x180
sp=e00000018131fcb0 bsp=e0000001813111a8
[<a0000001002336a0>] sysfs_remove_group+0x80/0x180
sp=e00000018131fdd0 bsp=e000000181311178
[<a000000100513410>] dpm_sysfs_remove+0x30/0x60
sp=e00000018131fdd0 bsp=e000000181311158
[<a0000001005041d0>] device_del+0x70/0x3a0
sp=e00000018131fdd0 bsp=e000000181311120
[<a0000001005045d0>] device_unregister+0xd0/0x100
sp=e00000018131fdd0 bsp=e000000181311100
[<a000000100409cd0>] pci_stop_dev+0x70/0x100
sp=e00000018131fdd0 bsp=e0000001813110d8
[<a000000100409f40>] pci_remove_bus_device+0x80/0x180
sp=e00000018131fdd0 bsp=e0000001813110a8
[<a00000010040a0a0>] pci_remove_behind_bridge+0x60/0xc0
sp=e00000018131fdd0 bsp=e000000181311080
[<a000000100409f00>] pci_remove_bus_device+0x40/0x180
sp=e00000018131fdd0 bsp=e000000181311050
[<a00000010040a0a0>] pci_remove_behind_bridge+0x60/0xc0
sp=e00000018131fdd0 bsp=e000000181311028
[<a000000100409f00>] pci_remove_bus_device+0x40/0x180
sp=e00000018131fdd0 bsp=e000000181310ff0
[<a000000100415bc0>] remove_callback+0x40/0xc0
sp=e00000018131fdd0 bsp=e000000181310fc8
[<a00000010022edb0>] sysfs_schedule_callback_work+0x50/0xc0
sp=e00000018131fdd0 bsp=e000000181310fa0
[<a0000001000c1150>] run_workqueue+0x1f0/0x340
sp=e00000018131fdd0 bsp=e000000181310f60
[<a0000001000c13e0>] worker_thread+0x140/0x180
sp=e00000018131fdd0 bsp=e000000181310f38
[<a0000001000c9ba0>] kthread+0xa0/0x120
sp=e00000018131fe30 bsp=e000000181310f08
[<a000000100016690>] kernel_thread_helper+0xd0/0x100
sp=e00000018131fe30 bsp=e000000181310ee0
[<a00000010000a4c0>] start_kernel_thread+0x20/0x40
sp=e00000018131fe30 bsp=e000000181310ee0
---[ end trace 9397c0de832fd5ba ]---
Unable to handle kernel NULL pointer dereference (address 0000000000000020)
events/4[32]: Oops 8813272891392 [1]
Modules linked in: binfmt_misc dm_multipath pci_slot sg shpchp pci_hotplug dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ohci_hcd ehci_hcd usbcore
Pid: 32, CPU 4, comm: events/4
psr : 00001010085a6010 ifs : 800000000000038a ip : [<a0000001009059a0>] Tainted: G W (2.6.29-rc4)
ip is at klist_put+0x40/0x160
unat: 0000000000000000 pfs : 000000000000038a rsc : 0000000000000003
rnat: e00000018131fcb0 bsps: 9397c0de832fd5ba pr : 0000000000005a41
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a000000100905990 b6 : a000000100044250 b7 : a00000010000cd10
f6 : 000000000000000000000 f7 : 1003e9e3779b97f4a7c16
f8 : 1003e0a00000010000a4c f9 : 1003e0000000000000056
f10 : 1003e000000000000025a f11 : 1003e6db6db6db6db6db7
r1 : a0000001016d1360 r2 : a0000001014f5f50 r3 : a0000001014f5188
r8 : 0000000000000000 r9 : 0000000000000000 r10 : 0000000000000200
r11 : 0000000000000000 r12 : e00000018131fdd0 r13 : e000000181310000
r14 : 0000000000000020 r15 : 0000000000000009 r16 : 0000000000000000
r17 : a0000001014f5f50 r18 : 0000000000000200 r19 : 000000000007aca1
r20 : fffffffffff7aca1 r21 : a0000001014f5f90 r22 : a0000001014f5f94
r23 : 000000000000000a r24 : 000000000000000a r25 : a000000102b76a98
r26 : 000000000007aca0 r27 : 00000010085a2010 r28 : 00000000000fffff
r29 : a0000001014f4e58 r30 : 0000000000000000 r31 : a000000101409818
Call Trace:
[<a0000001000146d0>] show_stack+0x50/0xa0
sp=e00000018131f9a0 bsp=e0000001813112c0
[<a000000100014fb0>] show_regs+0x830/0x860
sp=e00000018131fb70 bsp=e000000181311278
[<a000000100039bc0>] die+0x1c0/0x2c0
sp=e00000018131fb70 bsp=e000000181311230
[<a000000100063e50>] ia64_do_page_fault+0x830/0x960
sp=e00000018131fb70 bsp=e0000001813111c8
[<a00000010000c700>] ia64_native_leave_kernel+0x0/0x270
sp=e00000018131fc00 bsp=e0000001813111c8
[<a0000001009059a0>] klist_put+0x40/0x160
sp=e00000018131fdd0 bsp=e000000181311178
[<a000000100905af0>] klist_del+0x30/0x60
sp=e00000018131fdd0 bsp=e000000181311158
[<a0000001005041f0>] device_del+0x90/0x3a0
sp=e00000018131fdd0 bsp=e000000181311120
[<a0000001005045d0>] device_unregister+0xd0/0x100
sp=e00000018131fdd0 bsp=e000000181311100
[<a000000100409cd0>] pci_stop_dev+0x70/0x100
sp=e00000018131fdd0 bsp=e0000001813110d8
[<a000000100409f40>] pci_remove_bus_device+0x80/0x180
sp=e00000018131fdd0 bsp=e0000001813110a8
[<a00000010040a0a0>] pci_remove_behind_bridge+0x60/0xc0
sp=e00000018131fdd0 bsp=e000000181311080
[<a000000100409f00>] pci_remove_bus_device+0x40/0x180
sp=e00000018131fdd0 bsp=e000000181311050
[<a00000010040a0a0>] pci_remove_behind_bridge+0x60/0xc0
sp=e00000018131fdd0 bsp=e000000181311028
[<a000000100409f00>] pci_remove_bus_device+0x40/0x180
sp=e00000018131fdd0 bsp=e000000181310ff0
[<a000000100415bc0>] remove_callback+0x40/0xc0
sp=e00000018131fdd0 bsp=e000000181310fc8
[<a00000010022edb0>] sysfs_schedule_callback_work+0x50/0xc0
sp=e00000018131fdd0 bsp=e000000181310fa0
[<a0000001000c1150>] run_workqueue+0x1f0/0x340
sp=e00000018131fdd0 bsp=e000000181310f60
[<a0000001000c13e0>] worker_thread+0x140/0x180
sp=e00000018131fdd0 bsp=e000000181310f38
[<a0000001000c9ba0>] kthread+0xa0/0x120
sp=e00000018131fe30 bsp=e000000181310f08
[<a000000100016690>] kernel_thread_helper+0xd0/0x100
sp=e00000018131fe30 bsp=e000000181310ee0
[<a00000010000a4c0>] start_kernel_thread+0x20/0x40
sp=e00000018131fe30 bsp=e000000181310ee0
kobject: '0000:06:00.0' (e000000181703120): fill_kobj_path: path = '/devices/pci0000:03/0000:03:00.0/0000:04:00.0/0000:05:02.0/0000:06:00.0'
2009/3/9 Alex Chiang <[email protected]>:
>> It reproduces reliably if I do this:
>>
>> $ while true; do echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/remove; done
>
> I was going to ask for lspci -v output so that I could see what
> device 0000:00:00.0 might be, but I was able to reproduce
> something similar on my machine.
>
> Can I ask why you're doing the above with a while loop? Just to
> torture the code? Or something else?
Yes, purely for the purpose of torture ;-)
I also found one more use-after-free error using kmemcheck:
[ 263.258025] WARNING: kmemcheck: Caught 8-bit read from freed memory
(ffff88003d8f315c)
[ 263.266131] 80e1803f0088ffff20d67b81ffffffff0000000000000000000000000d000000
[ 263.275104] f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f f
[ 263.284053] ^
[ 263.290696]
[ 263.292303] Pid: 881, comm: udevd Not tainted 2.6.29-rc6 #361 945P-A
[ 263.298770] RIP: 0010:[<ffffffff811eb501>] [<ffffffff811eb501>]
kobject_put+0x11/0x60
[ 263.306938] RSP: 0018:ffff88003f8bde60 EFLAGS: 00010282
[ 263.312367] RAX: 0000000000000001 RBX: ffff88003d8f3120 RCX: 0000000000000000
[ 263.319616] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff88003d8f3120
[ 263.326865] RBP: ffff88003f8bde70 R08: 00000000002ec000 R09: 0000000000000000
[ 263.334114] R10: ffff88003d95fdc0 R11: 0000000000000010 R12: ffff88003d946ac0
[ 263.341362] R13: ffff88003f80a908 R14: ffff88003f80a908 R15: ffff88003f8adf00
[ 263.348613] FS: 0000000000000000(0000) GS:ffff8800019f1000(0063)
knlGS:00000000f7d6c700
[ 263.356884] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[ 263.362747] CR2: ffff88003f806ea0 CR3: 000000003e44a000 CR4: 00000000000006a0
[ 263.369995] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 263.377246] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[ 263.384494] [<ffffffff81125048>] sysfs_schedule_callback_work+0x28/0x40
[ 263.391378] [<ffffffff81055510>] run_workqueue+0x70/0x130
[ 263.397049] [<ffffffff81055677>] worker_thread+0xa7/0x120
[ 263.402720] [<ffffffff810593d9>] kthread+0x49/0x90
[ 263.407784] [<ffffffff8100d45a>] child_rip+0xa/0x20
[ 263.412935] [<ffffffffffffffff>] 0xffffffffffffffff
[ 281.464381] NOHZ: local_softirq_pending 01
This is how I triggered it:
# echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/remove
# echo 1 > /sys/bus/pci/rescan
# echo 1 > /sys/bus/pci/rescan
# echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/remove
The line numbers:
$ addr2line -e vmlinux -i ffffffff811eb501 ffffffff81125048
lib/kobject.c:589
fs/sysfs/file.c:677
586 void kobject_put(struct kobject *kobj)
587 {
588 if (kobj) {
589 if (!kobj->state_initialized)
590 WARN(1, KERN_WARNING "kobject: '%s' (%p): is not "
591 "initialized, yet kobject_put() is being "
592 "called.\n", kobject_name(kobj), kobj);
593 kref_put(&kobj->kref, kobject_release);
594 }
595 }
669 static void sysfs_schedule_callback_work(struct work_struct *work)
670 {
671 struct sysfs_schedule_callback_struct *ss = container_of(work,
672 struct sysfs_schedule_callback_struct, work);
673
674 (ss->func)(ss->data);
675 kobject_put(ss->kobj);
676 module_put(ss->owner);
677 kfree(ss);
678 }
(The short story: the ss->kobj was already freed when this function was called.)
Hope this helps :-)
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
* Vegard Nossum <[email protected]>:
> 2009/3/9 Alex Chiang <[email protected]>:
> >> It reproduces reliably if I do this:
> >>
> >> $ while true; do echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/remove; done
> >
> > I was going to ask for lspci -v output so that I could see what
> > device 0000:00:00.0 might be, but I was able to reproduce
> > something similar on my machine.
> >
> > Can I ask why you're doing the above with a while loop? Just to
> > torture the code? Or something else?
>
> Yes, purely for the purpose of torture ;-)
Ok, just checking. :)
> I also found one more use-after-free error using kmemcheck:
>
> Hope this helps :-)
Yes, thanks a lot!
I'll work on these. Thanks for the excellent bug reports.
/ac
* Alex Chiang <[email protected]>:
> This is an updated version of this patch. It fixes a bug where we
> would remove a bridge even if it still had children.
>
> Now we check for children before attempting to remove the bridge.
This is another updated version of the patch.
It adds a mutex for the sysfs callback function.
This mutex, along with another sysfs patch that I plan on posting
soon, should help prevent an oops that a crazy user might induce
with the following loop:
# while true ; do echo 1 > /sys/bus/pci/devices/.../remove ; done
I'm posting this updated version so that Vegard can drop it into
his test setup that he had.
When I post v4 of this entire patch series, I'll have a full
changelog, etc.
Thanks.
/ac
commit 53cb428d7a26be7b3ea38f6d7f30cd33cc1e6b4a
Author: Alex Chiang <[email protected]>
Date: Tue Mar 10 16:32:45 2009 -0600
PCI: Introduce /sys/bus/pci/devices/.../remove
This patch adds an attribute named "remove" to a PCI device's sysfs
directory. Writing a non-zero value to this attribute will remove the PCI
device and any children of it.
Trent Piepho wrote the original implementation and documentation.
Cc: [email protected]
Cc: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index ea4aee2..5b1ddde 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -50,6 +50,14 @@ Description:
re-discover previously removed devices.
Depends on CONFIG_HOTPLUG.
+What: /sys/bus/pci/devices/.../remove
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ hot-remove the PCI device and any of its children.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
index 9f8740c..26e4b8b 100644
--- a/Documentation/filesystems/sysfs-pci.txt
+++ b/Documentation/filesystems/sysfs-pci.txt
@@ -12,6 +12,7 @@ that support it. For example, a given bus might look like this:
| |-- enable
| |-- irq
| |-- local_cpus
+ | |-- remove
| |-- resource
| |-- resource0
| |-- resource1
@@ -36,6 +37,7 @@ files, each with their own function.
enable Whether the device is enabled (ascii, rw)
irq IRQ number (ascii, ro)
local_cpus nearby CPU mask (cpumask, ro)
+ remove remove device from kernel's list (ascii, wo)
resource PCI resource host addresses (ascii, ro)
resource0..N PCI resource N, if present (binary, mmap)
resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap)
@@ -46,6 +48,7 @@ files, each with their own function.
ro - read only file
rw - file is readable and writable
+ wo - write only file
mmap - file is mmapable
ascii - file contains ascii text
binary - file contains binary data
@@ -73,6 +76,13 @@ that the device must be enabled for a rom read to return data succesfully.
In the event a driver is not bound to the device, it can be enabled using the
'enable' file, documented above.
+The 'remove' file is used to remove the PCI device, by writing a non-zero
+integer to the file. This does not involve any kind of hot-plug functionality,
+e.g. powering off the device. The device is removed from the kernel's list of
+PCI devices, the sysfs directory for it is removed, and the device will be
+removed from any drivers attached to it. Removal of PCI root buses is
+disallowed.
+
Accessing legacy resources through sysfs
----------------------------------------
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index a29fc3b..11bc9b1 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -219,6 +219,51 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
return count;
}
+#ifdef CONFIG_HOTPLUG
+static DEFINE_MUTEX(pci_remove_mutex);
+static void remove_callback(struct device *dev)
+{
+ int bridge = 0;
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ mutex_lock(&pci_remove_mutex);
+
+ if (pdev->subordinate)
+ bridge = 1;
+
+ pci_remove_bus_device(pdev);
+ if (bridge && list_empty(&pdev->bus->devices))
+ pci_remove_bus(pdev->bus);
+ pci_dev_put(pdev);
+
+ mutex_unlock(&pci_remove_mutex);
+}
+
+static ssize_t
+remove_store(struct device *dev, struct device_attribute *dummy,
+ const char *buf, size_t count)
+{
+ int ret = 0;
+ unsigned long val;
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (pdev->subordinate && pci_is_root_bus(pdev->bus))
+ return -EBUSY;
+
+ if (val)
+ ret = device_schedule_callback(dev, remove_callback);
+ if (ret)
+ count = ret;
+ return count;
+}
+#endif
+
struct device_attribute pci_dev_attrs[] = {
__ATTR_RO(resource),
__ATTR_RO(vendor),
@@ -237,6 +282,9 @@ struct device_attribute pci_dev_attrs[] = {
__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
broken_parity_status_show,broken_parity_status_store),
__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
+#ifdef CONFIG_HOTPLUG
+ __ATTR(remove, S_IWUSR, NULL, remove_store),
+#endif
__ATTR_NULL,
};
* Alex Chiang <[email protected]>:
> * Alex Chiang <[email protected]>:
> > This is an updated version of this patch. It fixes a bug where we
> > would remove a bridge even if it still had children.
> >
> > Now we check for children before attempting to remove the bridge.
>
> This is another updated version of the patch.
>
> It adds a mutex for the sysfs callback function.
Yet another updated version. Removes an extra pci_dev_put that
was causing us to reference the pci_dev after it was freed
(which showed up when slab poisoning debugging was turned on).
This iteration plus the sysfs rfc patch I posted earlier today
survives Vegard's test cases.
http://lkml.org/lkml/2009/3/10/476
Need to try and test the entire series under under kmemcheck once
I figure out how to do that.
Thanks.
/ac
commit 22222eefb1535eca972e60f60e6a44bbada6bd35
Author: Alex Chiang <[email protected]>
Date: Tue Mar 10 22:01:32 2009 -0600
PCI: Introduce /sys/bus/pci/devices/.../remove
This patch adds an attribute named "remove" to a PCI device's sysfs
directory. Writing a non-zero value to this attribute will remove the PCI
device and any children of it.
Trent Piepho wrote the original implementation and documentation.
Cc: [email protected]
Cc: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index ea4aee2..5b1ddde 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -50,6 +50,14 @@ Description:
re-discover previously removed devices.
Depends on CONFIG_HOTPLUG.
+What: /sys/bus/pci/devices/.../remove
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ hot-remove the PCI device and any of its children.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
index 9f8740c..26e4b8b 100644
--- a/Documentation/filesystems/sysfs-pci.txt
+++ b/Documentation/filesystems/sysfs-pci.txt
@@ -12,6 +12,7 @@ that support it. For example, a given bus might look like this:
| |-- enable
| |-- irq
| |-- local_cpus
+ | |-- remove
| |-- resource
| |-- resource0
| |-- resource1
@@ -36,6 +37,7 @@ files, each with their own function.
enable Whether the device is enabled (ascii, rw)
irq IRQ number (ascii, ro)
local_cpus nearby CPU mask (cpumask, ro)
+ remove remove device from kernel's list (ascii, wo)
resource PCI resource host addresses (ascii, ro)
resource0..N PCI resource N, if present (binary, mmap)
resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap)
@@ -46,6 +48,7 @@ files, each with their own function.
ro - read only file
rw - file is readable and writable
+ wo - write only file
mmap - file is mmapable
ascii - file contains ascii text
binary - file contains binary data
@@ -73,6 +76,13 @@ that the device must be enabled for a rom read to return data succesfully.
In the event a driver is not bound to the device, it can be enabled using the
'enable' file, documented above.
+The 'remove' file is used to remove the PCI device, by writing a non-zero
+integer to the file. This does not involve any kind of hot-plug functionality,
+e.g. powering off the device. The device is removed from the kernel's list of
+PCI devices, the sysfs directory for it is removed, and the device will be
+removed from any drivers attached to it. Removal of PCI root buses is
+disallowed.
+
Accessing legacy resources through sysfs
----------------------------------------
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index a29fc3b..b276fa2 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -219,6 +219,50 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
return count;
}
+#ifdef CONFIG_HOTPLUG
+static DEFINE_MUTEX(pci_remove_mutex);
+static void remove_callback(struct device *dev)
+{
+ int bridge = 0;
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ mutex_lock(&pci_remove_mutex);
+
+ if (pdev->subordinate)
+ bridge = 1;
+
+ pci_remove_bus_device(pdev);
+ if (bridge && list_empty(&pdev->bus->devices))
+ pci_remove_bus(pdev->bus);
+
+ mutex_unlock(&pci_remove_mutex);
+}
+
+static ssize_t
+remove_store(struct device *dev, struct device_attribute *dummy,
+ const char *buf, size_t count)
+{
+ int ret = 0;
+ unsigned long val;
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (pdev->subordinate && pci_is_root_bus(pdev->bus))
+ return -EBUSY;
+
+ if (val)
+ ret = device_schedule_callback(dev, remove_callback);
+ if (ret)
+ count = ret;
+ return count;
+}
+#endif
+
struct device_attribute pci_dev_attrs[] = {
__ATTR_RO(resource),
__ATTR_RO(vendor),
@@ -237,6 +281,9 @@ struct device_attribute pci_dev_attrs[] = {
__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
broken_parity_status_show,broken_parity_status_store),
__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
+#ifdef CONFIG_HOTPLUG
+ __ATTR(remove, S_IWUSR, NULL, remove_store),
+#endif
__ATTR_NULL,
};
Alex Chiang wrote:
> We have a nice interface for re-scanning a PCI bus which will
> discover newly added devices, add them to the device tree, and
> enable them properly.
>
> Ensure that the bridge resources are properly sized and assigned
> during the rescan.
>
> Signed-off-by: Alex Chiang <[email protected]>
> ---
>
> drivers/pci/hotplug-pci.c | 16 ++++++++++++----
> 1 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/pci/hotplug-pci.c b/drivers/pci/hotplug-pci.c
> index 4d4a644..33ab2d2 100644
> --- a/drivers/pci/hotplug-pci.c
> +++ b/drivers/pci/hotplug-pci.c
> @@ -6,13 +6,21 @@
>
> unsigned int __devinit pci_do_scan_bus(struct pci_bus *bus)
> {
> - unsigned int max;
> + unsigned int max, pass;
> + struct pci_dev *dev;
>
> max = pci_scan_child_bus(bus);
>
> - /*
> - * Make the discovered devices available.
> - */
> + for (pass=0; pass < 2; pass++)
> + list_for_each_entry(dev, &bus->devices, bus_list) {
> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
> + dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
> + if (pass && dev->subordinate)
> + pci_bus_size_bridges(dev->subordinate);
> + }
> +
> + pci_bus_assign_resources(bus);
> + pci_enable_bridges(bus);
> pci_bus_add_devices(bus);
The "for (pass=0; pass <2; pass++)" loop doesn't look necessary.
And I'm worrying that your change have some bad effect to the
existing user of pci_do_scan_bus(). Did you confirm that?
Thanks,
Kenji Kaneshige
Hello Kenji-san,
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
> > We have a nice interface for re-scanning a PCI bus which will
> > discover newly added devices, add them to the device tree, and
> > enable them properly.
> >
> > Ensure that the bridge resources are properly sized and assigned
> > during the rescan.
> >
> > Signed-off-by: Alex Chiang <[email protected]>
> > ---
> >
> > drivers/pci/hotplug-pci.c | 16 ++++++++++++----
> > 1 files changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug-pci.c b/drivers/pci/hotplug-pci.c
> > index 4d4a644..33ab2d2 100644
> > --- a/drivers/pci/hotplug-pci.c
> > +++ b/drivers/pci/hotplug-pci.c
> > @@ -6,13 +6,21 @@
> >
> > unsigned int __devinit pci_do_scan_bus(struct pci_bus *bus)
> > {
> > - unsigned int max;
> > + unsigned int max, pass;
> > + struct pci_dev *dev;
> >
> > max = pci_scan_child_bus(bus);
> >
> > - /*
> > - * Make the discovered devices available.
> > - */
> > + for (pass=0; pass < 2; pass++)
> > + list_for_each_entry(dev, &bus->devices, bus_list) {
> > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
> > + dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
> > + if (pass && dev->subordinate)
> > + pci_bus_size_bridges(dev->subordinate);
> > + }
> > +
> > + pci_bus_assign_resources(bus);
> > + pci_enable_bridges(bus);
> > pci_bus_add_devices(bus);
>
> The "for (pass=0; pass <2; pass++)" loop doesn't look necessary.
Yes, I don't know why I had it in there originally. I can remove
that.
> And I'm worrying that your change have some bad effect to the
> existing user of pci_do_scan_bus(). Did you confirm that?
I hadn't gotten around to verifying/fixing existing callers of
pci_do_scan_bus yet. I was focusing on the core first.
There aren't too many callers, but unfortunately, I don't have
any hardware that actually uses the existing drivers.
I seem to recall that your machines support shpchp. Would you
mind testing this patch and telling me if your machine still
behaves properly?
I looked at shpchp_configure_device and I think that simply
scanning the device's parent bus should work.
Thanks.
/ac
diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c
index aa315e5..7e8457b 100644
--- a/drivers/pci/hotplug/shpchp_pci.c
+++ b/drivers/pci/hotplug/shpchp_pci.c
@@ -110,11 +110,7 @@ int __ref shpchp_configure_device(struct slot *p_slot)
return -EINVAL;
}
- num = pci_scan_slot(parent, PCI_DEVFN(p_slot->device, 0));
- if (num == 0) {
- ctrl_err(ctrl, "No new device found\n");
- return -ENODEV;
- }
+ pci_do_scan_bus(parent);
for (fn = 0; fn < 8; fn++) {
dev = pci_get_slot(parent, PCI_DEVFN(p_slot->device, fn));
@@ -126,40 +122,10 @@ int __ref shpchp_configure_device(struct slot *p_slot)
pci_dev_put(dev);
continue;
}
- if ((dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) ||
- (dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)) {
- /* Find an unused bus number for the new bridge */
- struct pci_bus *child;
- unsigned char busnr, start = parent->secondary;
- unsigned char end = parent->subordinate;
- for (busnr = start; busnr <= end; busnr++) {
- if (!pci_find_bus(pci_domain_nr(parent),
- busnr))
- break;
- }
- if (busnr > end) {
- ctrl_err(ctrl,
- "No free bus for hot-added bridge\n");
- pci_dev_put(dev);
- continue;
- }
- child = pci_add_new_bus(parent, dev, busnr);
- if (!child) {
- ctrl_err(ctrl, "Cannot add new bus for %s\n",
- pci_name(dev));
- pci_dev_put(dev);
- continue;
- }
- child->subordinate = pci_do_scan_bus(child);
- pci_bus_size_bridges(child);
- }
program_fw_provided_values(dev);
pci_dev_put(dev);
}
- pci_bus_assign_resources(parent);
- pci_bus_add_devices(parent);
- pci_enable_bridges(parent);
return 0;
}
Alex Chiang wrote:
> Hello Kenji-san,
>
> * Kenji Kaneshige <[email protected]>:
>> Alex Chiang wrote:
>>> We have a nice interface for re-scanning a PCI bus which will
>>> discover newly added devices, add them to the device tree, and
>>> enable them properly.
>>>
>>> Ensure that the bridge resources are properly sized and assigned
>>> during the rescan.
>>>
>>> Signed-off-by: Alex Chiang <[email protected]>
>>> ---
>>>
>>> drivers/pci/hotplug-pci.c | 16 ++++++++++++----
>>> 1 files changed, 12 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/pci/hotplug-pci.c b/drivers/pci/hotplug-pci.c
>>> index 4d4a644..33ab2d2 100644
>>> --- a/drivers/pci/hotplug-pci.c
>>> +++ b/drivers/pci/hotplug-pci.c
>>> @@ -6,13 +6,21 @@
>>>
>>> unsigned int __devinit pci_do_scan_bus(struct pci_bus *bus)
>>> {
>>> - unsigned int max;
>>> + unsigned int max, pass;
>>> + struct pci_dev *dev;
>>>
>>> max = pci_scan_child_bus(bus);
>>>
>>> - /*
>>> - * Make the discovered devices available.
>>> - */
>>> + for (pass=0; pass < 2; pass++)
>>> + list_for_each_entry(dev, &bus->devices, bus_list) {
>>> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
>>> + dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
>>> + if (pass && dev->subordinate)
>>> + pci_bus_size_bridges(dev->subordinate);
>>> + }
>>> +
>>> + pci_bus_assign_resources(bus);
>>> + pci_enable_bridges(bus);
>>> pci_bus_add_devices(bus);
>> The "for (pass=0; pass <2; pass++)" loop doesn't look necessary.
>
> Yes, I don't know why I had it in there originally. I can remove
> that.
>
>> And I'm worrying that your change have some bad effect to the
>> existing user of pci_do_scan_bus(). Did you confirm that?
>
> I hadn't gotten around to verifying/fixing existing callers of
> pci_do_scan_bus yet. I was focusing on the core first.
>
> There aren't too many callers, but unfortunately, I don't have
> any hardware that actually uses the existing drivers.
>
> I seem to recall that your machines support shpchp. Would you
> mind testing this patch and telling me if your machine still
> behaves properly?
>
I have machines that support shpchp. But unfortunately I don't
have any adapter card that contains bridge, which is needed to
test your change.
> I looked at shpchp_configure_device and I think that simply
> scanning the device's parent bus should work.
>
Ok, I'll try it.
Thanks,
Kenji Kaneshige
> Thanks.
>
> /ac
>
> diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c
> index aa315e5..7e8457b 100644
> --- a/drivers/pci/hotplug/shpchp_pci.c
> +++ b/drivers/pci/hotplug/shpchp_pci.c
> @@ -110,11 +110,7 @@ int __ref shpchp_configure_device(struct slot *p_slot)
> return -EINVAL;
> }
>
> - num = pci_scan_slot(parent, PCI_DEVFN(p_slot->device, 0));
> - if (num == 0) {
> - ctrl_err(ctrl, "No new device found\n");
> - return -ENODEV;
> - }
> + pci_do_scan_bus(parent);
>
> for (fn = 0; fn < 8; fn++) {
> dev = pci_get_slot(parent, PCI_DEVFN(p_slot->device, fn));
> @@ -126,40 +122,10 @@ int __ref shpchp_configure_device(struct slot *p_slot)
> pci_dev_put(dev);
> continue;
> }
> - if ((dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) ||
> - (dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)) {
> - /* Find an unused bus number for the new bridge */
> - struct pci_bus *child;
> - unsigned char busnr, start = parent->secondary;
> - unsigned char end = parent->subordinate;
> - for (busnr = start; busnr <= end; busnr++) {
> - if (!pci_find_bus(pci_domain_nr(parent),
> - busnr))
> - break;
> - }
> - if (busnr > end) {
> - ctrl_err(ctrl,
> - "No free bus for hot-added bridge\n");
> - pci_dev_put(dev);
> - continue;
> - }
> - child = pci_add_new_bus(parent, dev, busnr);
> - if (!child) {
> - ctrl_err(ctrl, "Cannot add new bus for %s\n",
> - pci_name(dev));
> - pci_dev_put(dev);
> - continue;
> - }
> - child->subordinate = pci_do_scan_bus(child);
> - pci_bus_size_bridges(child);
> - }
> program_fw_provided_values(dev);
> pci_dev_put(dev);
> }
>
> - pci_bus_assign_resources(parent);
> - pci_bus_add_devices(parent);
> - pci_enable_bridges(parent);
> return 0;
> }
>
>
Hello Kenji-san,
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
> >
> > I hadn't gotten around to verifying/fixing existing callers of
> > pci_do_scan_bus yet. I was focusing on the core first.
> >
> > There aren't too many callers, but unfortunately, I don't have
> > any hardware that actually uses the existing drivers.
> >
> > I seem to recall that your machines support shpchp. Would you
> > mind testing this patch and telling me if your machine still
> > behaves properly?
> >
>
> I have machines that support shpchp. But unfortunately I don't
> have any adapter card that contains bridge, which is needed to
> test your change.
You're right.
The more I think about it though, the more I think that even
without the below patch to clean up the callers of
pci_do_scan_bus, we should be ok, because:
- all the old code (which I removed below) existed
because the old PCI core would refuse to scan PCI buses
that had already been discovered
- that meant that it would never descend past a known
bridge to try and find new child bridges
- that meant that hotplug drivers had to manually
discover new bridges and add them, essentially
duplicating functionality in pci_scan_bridge
This patch series allows the PCI core to scan existing bridges
and descend down into the children every time, looking for new
bridges and devices, so all the code in shpchp, cpcihp, and other
callers of pci_do_scan_bus shouldn't be necessary anymore.
Also, if we do add new bridges once manually in shpchp, and then
call the new pci_do_scan_bus again, we will _not_ add devices
twice because the core should check each bridge and device for
struct pci_dev.is_added.
So anyway, I think that cleaning up the callers of
pci_do_scan_bus is a good idea, but multiple calls to the
interface definitely should not result in problems. If they do,
then that's a bug in my patch series.
> > I looked at shpchp_configure_device and I think that simply
> > scanning the device's parent bus should work.
> >
>
> Ok, I'll try it.
I set up a git tree to make it easier to test:
git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
The 'test-20090313' branch contains all the latest fixes to
enable this patch series. It does not contain the patch below, so
you will have to apply it by hand.
Thanks for testing!
/ac
>
> > Thanks.
> >
> > /ac
> >
> > diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c
> > index aa315e5..7e8457b 100644
> > --- a/drivers/pci/hotplug/shpchp_pci.c
> > +++ b/drivers/pci/hotplug/shpchp_pci.c
> > @@ -110,11 +110,7 @@ int __ref shpchp_configure_device(struct slot *p_slot)
> > return -EINVAL;
> > }
> >
> > - num = pci_scan_slot(parent, PCI_DEVFN(p_slot->device, 0));
> > - if (num == 0) {
> > - ctrl_err(ctrl, "No new device found\n");
> > - return -ENODEV;
> > - }
> > + pci_do_scan_bus(parent);
> >
> > for (fn = 0; fn < 8; fn++) {
> > dev = pci_get_slot(parent, PCI_DEVFN(p_slot->device, fn));
> > @@ -126,40 +122,10 @@ int __ref shpchp_configure_device(struct slot *p_slot)
> > pci_dev_put(dev);
> > continue;
> > }
> > - if ((dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) ||
> > - (dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)) {
> > - /* Find an unused bus number for the new bridge */
> > - struct pci_bus *child;
> > - unsigned char busnr, start = parent->secondary;
> > - unsigned char end = parent->subordinate;
> > - for (busnr = start; busnr <= end; busnr++) {
> > - if (!pci_find_bus(pci_domain_nr(parent),
> > - busnr))
> > - break;
> > - }
> > - if (busnr > end) {
> > - ctrl_err(ctrl,
> > - "No free bus for hot-added bridge\n");
> > - pci_dev_put(dev);
> > - continue;
> > - }
> > - child = pci_add_new_bus(parent, dev, busnr);
> > - if (!child) {
> > - ctrl_err(ctrl, "Cannot add new bus for %s\n",
> > - pci_name(dev));
> > - pci_dev_put(dev);
> > - continue;
> > - }
> > - child->subordinate = pci_do_scan_bus(child);
> > - pci_bus_size_bridges(child);
> > - }
> > program_fw_provided_values(dev);
> > pci_dev_put(dev);
> > }
> >
> > - pci_bus_assign_resources(parent);
> > - pci_bus_add_devices(parent);
> > - pci_enable_bridges(parent);
> > return 0;
> > }
> >
> >
>
Alex Chiang wrote:
> Hello Kenji-san,
>
> * Kenji Kaneshige <[email protected]>:
>> Alex Chiang wrote:
>>> I hadn't gotten around to verifying/fixing existing callers of
>>> pci_do_scan_bus yet. I was focusing on the core first.
>>>
>>> There aren't too many callers, but unfortunately, I don't have
>>> any hardware that actually uses the existing drivers.
>>>
>>> I seem to recall that your machines support shpchp. Would you
>>> mind testing this patch and telling me if your machine still
>>> behaves properly?
>>>
>> I have machines that support shpchp. But unfortunately I don't
>> have any adapter card that contains bridge, which is needed to
>> test your change.
>
> You're right.
>
> The more I think about it though, the more I think that even
> without the below patch to clean up the callers of
> pci_do_scan_bus, we should be ok, because:
>
> - all the old code (which I removed below) existed
> because the old PCI core would refuse to scan PCI buses
> that had already been discovered
>
> - that meant that it would never descend past a known
> bridge to try and find new child bridges
>
> - that meant that hotplug drivers had to manually
> discover new bridges and add them, essentially
> duplicating functionality in pci_scan_bridge
>
> This patch series allows the PCI core to scan existing bridges
> and descend down into the children every time, looking for new
> bridges and devices, so all the code in shpchp, cpcihp, and other
> callers of pci_do_scan_bus shouldn't be necessary anymore.
>
> Also, if we do add new bridges once manually in shpchp, and then
> call the new pci_do_scan_bus again, we will _not_ add devices
> twice because the core should check each bridge and device for
> struct pci_dev.is_added.
>
> So anyway, I think that cleaning up the callers of
> pci_do_scan_bus is a good idea, but multiple calls to the
> interface definitely should not result in problems. If they do,
> then that's a bug in my patch series.
>
I'm sorry, but I didn't have enough time to try your patch on
my environment. So I'm still just looking at the code.
I looked at shpchp_configure_device() from the view point of
bridge hot-add. I think it is broken regardless of your change
because it calls pci_bus_add_devices() (through pci_do_scan_bus)
before assigning resources. So I think it must be changed
regardless of your change. But it's a little difficult for me
because I don't have any test environment as I mentioned before.
But I'm still worrying about your change against pci_do_scan_bus().
Without your change, pci_do_scan_bus() scans child buses and add
devices without assigning resources. I guess that it means existing
callers of pci_do_scan_bus() have some mechanism to assign resource
by theirselves and they don't expect pci_do_scan_bus() assigns
resources.
By the way, I have one question about rescan. Please suppose that
we enable the bridge(B) and its children using rescan interface
in the picture below.
|
-------------------------------------- parent bus
| |
bridge(A) bridge(B)
(working) (Not working)
| |
------------- -------------
| | | |
dev dev dev dev
(working) (working) (Not working)
In this case, your rescan mechanism calls pci_do_scan_bus() for
parent bus, and pci_do_scan_bus() calls pci_bus_assign_resources()
for parent bus. My question is, does pci_bus_assign_resources() do
nothing against bridge(A) that is currently working? I guess
pci_bus_assign_resources() would update some registers of bridge(A)
and it would breaks currently working devices.
Thanks,
Kenji Kaneshige
>>> I looked at shpchp_configure_device and I think that simply
>>> scanning the device's parent bus should work.
>>>
>> Ok, I'll try it.
>
> I set up a git tree to make it easier to test:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
>
> The 'test-20090313' branch contains all the latest fixes to
> enable this patch series. It does not contain the patch below, so
> you will have to apply it by hand.
>
> Thanks for testing!
>
> /ac
>
>>> Thanks.
>>>
>>> /ac
>>>
>>> diff --git a/drivers/pci/hotplug/shpchp_pci.c b/drivers/pci/hotplug/shpchp_pci.c
>>> index aa315e5..7e8457b 100644
>>> --- a/drivers/pci/hotplug/shpchp_pci.c
>>> +++ b/drivers/pci/hotplug/shpchp_pci.c
>>> @@ -110,11 +110,7 @@ int __ref shpchp_configure_device(struct slot *p_slot)
>>> return -EINVAL;
>>> }
>>>
>>> - num = pci_scan_slot(parent, PCI_DEVFN(p_slot->device, 0));
>>> - if (num == 0) {
>>> - ctrl_err(ctrl, "No new device found\n");
>>> - return -ENODEV;
>>> - }
>>> + pci_do_scan_bus(parent);
>>>
>>> for (fn = 0; fn < 8; fn++) {
>>> dev = pci_get_slot(parent, PCI_DEVFN(p_slot->device, fn));
>>> @@ -126,40 +122,10 @@ int __ref shpchp_configure_device(struct slot *p_slot)
>>> pci_dev_put(dev);
>>> continue;
>>> }
>>> - if ((dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) ||
>>> - (dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)) {
>>> - /* Find an unused bus number for the new bridge */
>>> - struct pci_bus *child;
>>> - unsigned char busnr, start = parent->secondary;
>>> - unsigned char end = parent->subordinate;
>>> - for (busnr = start; busnr <= end; busnr++) {
>>> - if (!pci_find_bus(pci_domain_nr(parent),
>>> - busnr))
>>> - break;
>>> - }
>>> - if (busnr > end) {
>>> - ctrl_err(ctrl,
>>> - "No free bus for hot-added bridge\n");
>>> - pci_dev_put(dev);
>>> - continue;
>>> - }
>>> - child = pci_add_new_bus(parent, dev, busnr);
>>> - if (!child) {
>>> - ctrl_err(ctrl, "Cannot add new bus for %s\n",
>>> - pci_name(dev));
>>> - pci_dev_put(dev);
>>> - continue;
>>> - }
>>> - child->subordinate = pci_do_scan_bus(child);
>>> - pci_bus_size_bridges(child);
>>> - }
>>> program_fw_provided_values(dev);
>>> pci_dev_put(dev);
>>> }
>>>
>>> - pci_bus_assign_resources(parent);
>>> - pci_bus_add_devices(parent);
>>> - pci_enable_bridges(parent);
>>> return 0;
>>> }
>>>
>>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
>> The more I think about it though, the more I think that even
>> without the below patch to clean up the callers of
>> pci_do_scan_bus, we should be ok, because:
>>
>> - all the old code (which I removed below) existed
>> because the old PCI core would refuse to scan PCI buses
>> that had already been discovered
>>
>> - that meant that it would never descend past a known
>> bridge to try and find new child bridges
>>
>> - that meant that hotplug drivers had to manually
>> discover new bridges and add them, essentially
>> duplicating functionality in pci_scan_bridge
>>
>> This patch series allows the PCI core to scan existing bridges
>> and descend down into the children every time, looking for new
>> bridges and devices, so all the code in shpchp, cpcihp, and other
>> callers of pci_do_scan_bus shouldn't be necessary anymore.
>>
>> Also, if we do add new bridges once manually in shpchp, and then
>> call the new pci_do_scan_bus again, we will _not_ add devices
>> twice because the core should check each bridge and device for
>> struct pci_dev.is_added.
>>
>> So anyway, I think that cleaning up the callers of
>> pci_do_scan_bus is a good idea, but multiple calls to the
>> interface definitely should not result in problems. If they do,
>> then that's a bug in my patch series.
>>
>
> I'm sorry, but I didn't have enough time to try your patch on
> my environment. So I'm still just looking at the code.
Ok.
> I looked at shpchp_configure_device() from the view point of
> bridge hot-add. I think it is broken regardless of your change
> because it calls pci_bus_add_devices() (through pci_do_scan_bus)
> before assigning resources. So I think it must be changed
> regardless of your change. But it's a little difficult for me
> because I don't have any test environment as I mentioned before.
Hm, what you say makes sense.
I managed to find a very old machine supported by cpqphp, and
also found a card with a bridge.
cpqhp_configure_device() follows a similar algorithm to
shpchp_configure_device(). I'm just starting my testing now, and
there is good news and bad news.
The bad news is that although cpqphp loads successfully, and we
can successfully offline a card, we cannot online it again
afterwards due to BAR collisions. This failure occurs even
without my changes (2.6.27 kernel), and I haven't had time to
track the regression down yet.
We do discover the bridge on the device correctly and it is added
back into the device tree correctly, but we can't use it because
it's not programmed correctly.
The good news is, after rewriting cpqphp_configure_device() to
resemble the shpchp patch I gave you, we still discover the
bridge correctly and add it back into the device tree in the
proper place. We no longer get BAR collisions, but we fail in a
slightly different way.
At least I'm not introducing a new regression in cpqphp, and I
suspect shpchp will be similar.
> But I'm still worrying about your change against pci_do_scan_bus().
> Without your change, pci_do_scan_bus() scans child buses and add
> devices without assigning resources. I guess that it means existing
> callers of pci_do_scan_bus() have some mechanism to assign resource
> by theirselves and they don't expect pci_do_scan_bus() assigns
> resources.
I looked through shpchp and couldn't find this assumption. Is it
stored in the struct controller, under mmio_base and mmio_size?
I am motivated to get this patch series into 2.6.30 for several
reasons, so I think for now, I will not change pci_do_scan_bus().
Instead, I'll create a new interface that only the PCI core will
use, and leave the drivers alone.
Over time, we can migrate the drivers to the PCI core interface.
> By the way, I have one question about rescan. Please suppose that
> we enable the bridge(B) and its children using rescan interface
> in the picture below.
>
> |
> -------------------------------------- parent bus
> | |
> bridge(A) bridge(B)
> (working) (Not working)
> | |
> ------------- -------------
> | | | |
> dev dev dev dev
> (working) (working) (Not working)
>
> In this case, your rescan mechanism calls pci_do_scan_bus() for
> parent bus, and pci_do_scan_bus() calls pci_bus_assign_resources()
> for parent bus. My question is, does pci_bus_assign_resources() do
> nothing against bridge(A) that is currently working? I guess
> pci_bus_assign_resources() would update some registers of bridge(A)
> and it would breaks currently working devices.
This is a very good catch, thank you.
I added another patch to prevent this situation. We now check to
see if the bridge is already added inside of pci_setup_bridge().
Thanks.
/ac
Alex Chiang wrote:
> * Kenji Kaneshige <[email protected]>:
>> Alex Chiang wrote:
>>> The more I think about it though, the more I think that even
>>> without the below patch to clean up the callers of
>>> pci_do_scan_bus, we should be ok, because:
>>>
>>> - all the old code (which I removed below) existed
>>> because the old PCI core would refuse to scan PCI buses
>>> that had already been discovered
>>>
>>> - that meant that it would never descend past a known
>>> bridge to try and find new child bridges
>>>
>>> - that meant that hotplug drivers had to manually
>>> discover new bridges and add them, essentially
>>> duplicating functionality in pci_scan_bridge
>>>
>>> This patch series allows the PCI core to scan existing bridges
>>> and descend down into the children every time, looking for new
>>> bridges and devices, so all the code in shpchp, cpcihp, and other
>>> callers of pci_do_scan_bus shouldn't be necessary anymore.
>>>
>>> Also, if we do add new bridges once manually in shpchp, and then
>>> call the new pci_do_scan_bus again, we will _not_ add devices
>>> twice because the core should check each bridge and device for
>>> struct pci_dev.is_added.
>>>
>>> So anyway, I think that cleaning up the callers of
>>> pci_do_scan_bus is a good idea, but multiple calls to the
>>> interface definitely should not result in problems. If they do,
>>> then that's a bug in my patch series.
>>>
>> I'm sorry, but I didn't have enough time to try your patch on
>> my environment. So I'm still just looking at the code.
>
> Ok.
>
>> I looked at shpchp_configure_device() from the view point of
>> bridge hot-add. I think it is broken regardless of your change
>> because it calls pci_bus_add_devices() (through pci_do_scan_bus)
>> before assigning resources. So I think it must be changed
>> regardless of your change. But it's a little difficult for me
>> because I don't have any test environment as I mentioned before.
>
> Hm, what you say makes sense.
>
> I managed to find a very old machine supported by cpqphp, and
> also found a card with a bridge.
>
> cpqhp_configure_device() follows a similar algorithm to
> shpchp_configure_device(). I'm just starting my testing now, and
> there is good news and bad news.
>
> The bad news is that although cpqphp loads successfully, and we
> can successfully offline a card, we cannot online it again
> afterwards due to BAR collisions. This failure occurs even
> without my changes (2.6.27 kernel), and I haven't had time to
> track the regression down yet.
>
> We do discover the bridge on the device correctly and it is added
> back into the device tree correctly, but we can't use it because
> it's not programmed correctly.
>
> The good news is, after rewriting cpqphp_configure_device() to
> resemble the shpchp patch I gave you, we still discover the
> bridge correctly and add it back into the device tree in the
> proper place. We no longer get BAR collisions, but we fail in a
> slightly different way.
>
> At least I'm not introducing a new regression in cpqphp, and I
> suspect shpchp will be similar.
>
>> But I'm still worrying about your change against pci_do_scan_bus().
>> Without your change, pci_do_scan_bus() scans child buses and add
>> devices without assigning resources. I guess that it means existing
>> callers of pci_do_scan_bus() have some mechanism to assign resource
>> by theirselves and they don't expect pci_do_scan_bus() assigns
>> resources.
>
> I looked through shpchp and couldn't find this assumption. Is it
> stored in the struct controller, under mmio_base and mmio_size?
>
Yes, shpchp doesn't have this assumption. As I mentioned in
the previous e-mail, I think shpchp's bridge hot-add code is
broken even without your change. I'm worrying about the other
hotplug drivers such as cpqphp, cpcihp, rpaphp and ibmphp, though
I don't have any knowledge about those hotplug drivers.
> I am motivated to get this patch series into 2.6.30 for several
> reasons, so I think for now, I will not change pci_do_scan_bus().
> Instead, I'll create a new interface that only the PCI core will
> use, and leave the drivers alone.
>
> Over time, we can migrate the drivers to the PCI core interface.
>
I think it's much safer way.
>> By the way, I have one question about rescan. Please suppose that
>> we enable the bridge(B) and its children using rescan interface
>> in the picture below.
>>
>> |
>> -------------------------------------- parent bus
>> | |
>> bridge(A) bridge(B)
>> (working) (Not working)
>> | |
>> ------------- -------------
>> | | | |
>> dev dev dev dev
>> (working) (working) (Not working)
>>
>> In this case, your rescan mechanism calls pci_do_scan_bus() for
>> parent bus, and pci_do_scan_bus() calls pci_bus_assign_resources()
>> for parent bus. My question is, does pci_bus_assign_resources() do
>> nothing against bridge(A) that is currently working? I guess
>> pci_bus_assign_resources() would update some registers of bridge(A)
>> and it would breaks currently working devices.
>
> This is a very good catch, thank you.
>
> I added another patch to prevent this situation. We now check to
> see if the bridge is already added inside of pci_setup_bridge().
>
Sounds good to me.
Thanks,
Kenji Kaneshige
> Thanks.
>
> /ac
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>