We're getting close to the new merge window, and I _think_ this
patch series is ready for consideration. Notably:
- multiple rescans/removes of devices with and without bridges
- verified that resource allocation after multiple remove/rescan
cycles is the same as what we had during initial boot
- fixes the complete suckage of fakephp (that I created)
- doesn't affect existing hotplug drivers
- tested on x86 and ia64 platforms
Please review, and consider testing. For testing ease, you can pull
from my git branch:
git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
branch 'test-20090318' is what you want
Note, this test branch does contain the small assorted patches in AER
and the PCIe portdriver that I fixed along the way, but does not contain
the sysfs callback mutex that I introduced to protect myself from
Vegard Nossum. ;) If you want to hammer away at the sysfs interface,
please apply this patch on top:
http://thread.gmane.org/gmane.linux.kernel/806648
What I'm continuing to do:
- investigate converting existing hotplug drivers to use
new pci_rescan_bus() interface
Thanks.
/ac
v3 -> v4:
- protect sysfs interfaces with mutex
- undo changes in pci_do_scan_bus
- introduce pci_rescan_bus instead
- do not initialize bridges more than once
v2 -> v3:
- properly remove device with internal bridge
- added Kenji Kaneshige's pci_is_root_bus() interface
- dropped whitespace cleanups for another time
v1 -> v2:
- incorporated lots of Trent Piepho's work
- beefed up pci_do_scan_bus as heavy lifter for rescanning
- small bugfixes folded into earlier patches to get everything working
---
Alex Chiang (8):
PCI Hotplug: schedule fakephp for feature removal
PCI Hotplug: rename legacy_fakephp to fakephp
PCI: Introduce /sys/bus/pci/devices/.../rescan
PCI: Introduce /sys/bus/pci/devices/.../remove
PCI: Introduce /sys/bus/pci/rescan
PCI: Introduce pci_rescan_bus()
PCI: do not initialize bridges more than once
PCI: always scan child buses
Kenji Kaneshige (1):
PCI: pci_is_root_bus helper
Trent Piepho (3):
PCI Hotplug: restore fakephp interface with complete reimplementation
PCI: pci_scan_slot() returns newly found devices
PCI: don't scan existing devices
Documentation/ABI/testing/sysfs-bus-pci | 27 ++
Documentation/feature-removal-schedule.txt | 32 ++
Documentation/filesystems/sysfs-pci.txt | 10 +
drivers/pci/hotplug/fakephp.c | 443 +++++++---------------------
drivers/pci/pci-driver.c | 1
drivers/pci/pci-sysfs.c | 95 ++++++
drivers/pci/pci.h | 6
drivers/pci/probe.c | 103 ++++---
drivers/pci/setup-bus.c | 3
include/linux/pci.h | 12 +
10 files changed, 358 insertions(+), 374 deletions(-)
From: Kenji Kaneshige <[email protected]>
Introduce pci_is_root_bus helper function. This will help make code
more consistent, as well as prevent incorrect assumptions (such as
pci_bus->self == NULL on a root bus, which is not always true).
Signed-off-by: Kenji Kaneshige <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
include/linux/pci.h | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7baf2a5..a31f935 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -341,6 +341,15 @@ struct pci_bus {
#define pci_bus_b(n) list_entry(n, struct pci_bus, node)
#define to_pci_bus(n) container_of(n, struct pci_bus, dev)
+/*
+ * Returns true if the pci bus is root (behind host-pci bridge),
+ * false otherwise
+ */
+static inline bool pci_is_root_bus(struct pci_bus *pbus)
+{
+ return !(pbus->parent);
+}
+
#ifdef CONFIG_PCI_MSI
static inline bool pci_dev_msi_enabled(struct pci_dev *pci_dev)
{
From: Trent Piepho <[email protected]>
pci_scan_slot() has been rewritten to be less complex and will now
return the number of *new* devices found.
Existing callers need not worry because they already assume that
they can't call pci_scan_slot() on an already-scanned slot.
Thus, there is no semantic change for existing callers: returning
newly found devices (this patch) is exactly equal to returning all
found devices (before this patch).
This patch adds some more groundwork to allow us to rescan the
PCI bus during runtime to discover newly added devices.
Signed-off-by: Trent Piepho <[email protected]>
Reviewed-by: Alex Chiang <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/probe.c | 35 +++++++++++++----------------------
1 files changed, 13 insertions(+), 22 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 66b5d1c..d392813 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1030,35 +1030,26 @@ EXPORT_SYMBOL(pci_scan_single_device);
* Scan a PCI slot on the specified PCI bus for devices, adding
* discovered devices to the @bus->devices list. New devices
* will not have is_added set.
+ *
+ * Returns the number of new devices found.
*/
int pci_scan_slot(struct pci_bus *bus, int devfn)
{
- int func, nr = 0;
- int scan_all_fns;
-
- scan_all_fns = pcibios_scan_all_fns(bus, devfn);
-
- for (func = 0; func < 8; func++, devfn++) {
- struct pci_dev *dev;
+ int fn, nr = 0;
+ struct pci_dev *dev;
- dev = pci_scan_single_device(bus, devfn);
- if (dev) {
+ if ((dev = pci_scan_single_device(bus, devfn)))
+ if (!dev->is_added) /* new device? */
nr++;
- /*
- * If this is a single function device,
- * don't scan past the first function.
- */
- if (!dev->multifunction) {
- if (func > 0) {
- dev->multifunction = 1;
- } else {
- break;
- }
+ if ((dev && dev->multifunction) ||
+ (!dev && pcibios_scan_all_fns(bus, devfn))) {
+ for (fn = 1; fn < 8; fn++) {
+ if ((dev = pci_scan_single_device(bus, devfn + fn))) {
+ if (!dev->is_added)
+ nr++;
+ dev->multifunction = 1;
}
- } else {
- if (func == 0 && !scan_all_fns)
- break;
}
}
From: Trent Piepho <[email protected]>
pci_scan_single_device is supposed to add newly discovered
devices to pci_bus->devices, but doesn't check to see if the
device has already been added. This can cause problems if we ever
want to use this interface to rescan the PCI bus.
If the device is already added, just return it.
Signed-off-by: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/probe.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 55ec44a..66b5d1c 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1006,6 +1006,12 @@ struct pci_dev *__ref pci_scan_single_device(struct pci_bus *bus, int devfn)
{
struct pci_dev *dev;
+ dev = pci_get_slot(bus, devfn);
+ if (dev) {
+ pci_dev_put(dev);
+ return dev;
+ }
+
dev = pci_scan_device(bus, devfn);
if (!dev)
return NULL;
While scanning bridges, we stop our scan if we encounter a bus
that we've seen before, to work around some buggy chipsets. This
is a good idea, but prevents us from fully scanning the PCI bus
at a future time (to find newly hot-added devices, for example).
Change the logic so that we skip _re-adding_ an existing bus
that we've seen before, but also allow the scan to descend to
all child buses.
Now that we're potentially scanning our child buses again, we
also need to be sure not to attempt re-initializing their BARs
so we avoid that.
This patch lays the groundwork to allow the user to issue a
rescan of the PCI bus at any time.
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/probe.c | 33 +++++++++++++++++++--------------
1 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index d392813..a3959ce 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -511,21 +511,21 @@ int __devinit pci_scan_bridge(struct pci_bus *bus, struct pci_dev *dev, int max,
/*
* If we already got to this bus through a different bridge,
- * ignore it. This can happen with the i450NX chipset.
+ * don't re-add it. This can happen with the i450NX chipset.
+ *
+ * However, we continue to descend down the hierarchy and
+ * scan remaining child buses.
*/
- if (pci_find_bus(pci_domain_nr(bus), busnr)) {
- dev_info(&dev->dev, "bus %04x:%02x already known\n",
- pci_domain_nr(bus), busnr);
- goto out;
+ child = pci_find_bus(pci_domain_nr(bus), busnr);
+ if (!child) {
+ child = pci_add_new_bus(bus, dev, busnr);
+ if (!child)
+ goto out;
+ child->primary = buses & 0xFF;
+ child->subordinate = (buses >> 16) & 0xFF;
+ child->bridge_ctl = bctl;
}
- child = pci_add_new_bus(bus, dev, busnr);
- if (!child)
- goto out;
- child->primary = buses & 0xFF;
- child->subordinate = (buses >> 16) & 0xFF;
- child->bridge_ctl = bctl;
-
cmax = pci_scan_child_bus(child);
if (cmax > max)
max = cmax;
@@ -1075,8 +1075,13 @@ unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
* After performing arch-dependent fixup of the bus, look behind
* all PCI-to-PCI bridges on this bus.
*/
- pr_debug("PCI: Fixups for bus %04x:%02x\n", pci_domain_nr(bus), bus->number);
- pcibios_fixup_bus(bus);
+ if (!bus->is_added) {
+ pr_debug("PCI: Fixups for bus %04x:%02x\n", pci_domain_nr(bus), bus->number);
+ pcibios_fixup_bus(bus);
+ if (pci_is_root_bus(bus))
+ bus->is_added = 1;
+ }
+
for (pass=0; pass < 2; pass++)
list_for_each_entry(dev, &bus->devices, bus_list) {
if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
This API is used by the PCI core to rescan a bus and rediscover
newly added devices.
Over time, it is expected that the various PCI hotplug drivers
will migrate to this interface and away from the old
pci_do_scan_bus() interface.
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/hotplug/fakephp.c | 6 +++---
drivers/pci/probe.c | 29 +++++++++++++++++++++++++++++
include/linux/pci.h | 3 +++
3 files changed, 35 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
index d8649e1..1606374 100644
--- a/drivers/pci/hotplug/fakephp.c
+++ b/drivers/pci/hotplug/fakephp.c
@@ -245,12 +245,12 @@ static int pci_rescan_slot(struct pci_dev *temp)
/**
- * pci_rescan_bus - Rescan PCI bus
+ * pci_rescan_bus_local - fakephp version of rescan PCI bus
* @bus: the PCI bus to rescan
*
* Call pci_rescan_slot for each possible function of the bus.
*/
-static void pci_rescan_bus(const struct pci_bus *bus)
+static void pci_rescan_bus_local(const struct pci_bus *bus)
{
unsigned int devfn;
struct pci_dev *dev;
@@ -291,7 +291,7 @@ static void pci_rescan_buses(const struct list_head *list)
const struct list_head *l;
list_for_each(l,list) {
const struct pci_bus *b = pci_bus_b(l);
- pci_rescan_bus(b);
+ pci_rescan_bus_local(b);
pci_rescan_buses(&b->children);
}
}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index a3959ce..452a1cc 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1195,10 +1195,39 @@ struct pci_bus * __devinit pci_scan_bus_parented(struct device *parent,
EXPORT_SYMBOL(pci_scan_bus_parented);
#ifdef CONFIG_HOTPLUG
+/**
+ * pci_rescan_bus - scan a PCI bus for devices.
+ * @bus: PCI bus to scan
+ *
+ * Scan a PCI bus and child buses for new devices, adds them,
+ * and enables them.
+ *
+ * Returns the max number of subordinate bus discovered.
+ */
+unsigned int __devinit pci_rescan_bus(struct pci_bus *bus)
+{
+ unsigned int max;
+ struct pci_dev *dev;
+
+ max = pci_scan_child_bus(bus);
+ list_for_each_entry(dev, &bus->devices, bus_list)
+ if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
+ dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
+ if (dev->subordinate)
+ pci_bus_size_bridges(dev->subordinate);
+
+ pci_bus_assign_resources(bus);
+ pci_enable_bridges(bus);
+ pci_bus_add_devices(bus);
+
+ return max;
+}
+
EXPORT_SYMBOL(pci_add_new_bus);
EXPORT_SYMBOL(pci_scan_slot);
EXPORT_SYMBOL(pci_scan_bridge);
EXPORT_SYMBOL_GPL(pci_scan_child_bus);
+EXPORT_SYMBOL_GPL(pci_rescan_bus);
#endif
static int __init pci_sort_bf_cmp(const struct device *d_a, const struct device *d_b)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a31f935..08f57fc 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -710,6 +710,9 @@ int pci_back_from_sleep(struct pci_dev *dev);
/* Functions for PCI Hotplug drivers to use */
int pci_bus_find_capability(struct pci_bus *bus, unsigned int devfn, int cap);
+#ifdef CONFIG_HOTPLUG
+unsigned int pci_rescan_bus(struct pci_bus *bus);
+#endif
/* Vital product data routines */
ssize_t pci_read_vpd(struct pci_dev *dev, loff_t pos, size_t count, void *buf);
This interface allows the user to force a rescan of all PCI buses
in system, and rediscover devices that have been removed earlier.
pci_bus_attrs implementation from Trent Piepho.
Thanks to Vegard Nossum for discovering locking issues with the
sysfs interface.
Cc: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-pci | 9 +++++++++
drivers/pci/pci-driver.c | 1 +
drivers/pci/pci-sysfs.c | 29 +++++++++++++++++++++++++++++
drivers/pci/pci.h | 6 ++++++
4 files changed, 45 insertions(+), 0 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index e638e15..ea4aee2 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -41,6 +41,15 @@ Description:
for the device and attempt to bind to it. For example:
# echo "8086 10f5" > /sys/bus/pci/drivers/foo/new_id
+What: /sys/bus/pci/rescan
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ force a rescan of all PCI buses in the system, and
+ re-discover previously removed devices.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index a5f11ad..62ed864 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -973,6 +973,7 @@ struct bus_type pci_bus_type = {
.remove = pci_device_remove,
.shutdown = pci_device_shutdown,
.dev_attrs = pci_dev_attrs,
+ .bus_attrs = pci_bus_attrs,
.pm = PCI_PM_OPS_PTR,
};
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 1c89298..22dbc65 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -219,6 +219,35 @@ msi_bus_store(struct device *dev, struct device_attribute *attr,
return count;
}
+#ifdef CONFIG_HOTPLUG
+static DEFINE_MUTEX(pci_remove_rescan_mutex);
+static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
+ size_t count)
+{
+ unsigned long val;
+ struct pci_bus *b = NULL;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (val) {
+ mutex_lock(&pci_remove_rescan_mutex);
+ while ((b = pci_find_next_bus(b)) != NULL)
+ pci_rescan_bus(b);
+ mutex_unlock(&pci_remove_rescan_mutex);
+ }
+ return count;
+}
+
+struct bus_attribute pci_bus_attrs[] = {
+ __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
+ __ATTR_NULL
+};
+#endif
+
struct device_attribute pci_dev_attrs[] = {
__ATTR_RO(resource),
__ATTR_RO(vendor),
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 07c0aa5..6dea35f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -136,6 +136,12 @@ extern int pcie_mch_quirk;
extern struct device_attribute pci_dev_attrs[];
extern struct device_attribute dev_attr_cpuaffinity;
extern struct device_attribute dev_attr_cpulistaffinity;
+#ifdef CONFIG_HOTPLUG
+extern struct bus_attribute pci_bus_attrs[];
+#else
+#define pci_bus_attrs NULL
+#endif
+
/**
* pci_match_one_device - Tell if a PCI device structure has a matching
This patch adds an attribute named "remove" to a PCI device's sysfs
directory. Writing a non-zero value to this attribute will remove the PCI
device and any children of it.
Trent Piepho wrote the original implementation and documentation.
Thanks to Vegard Nossum for testing under kmemcheck and finding locking
issues with the sysfs interface.
Cc: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-pci | 8 ++++++
Documentation/filesystems/sysfs-pci.txt | 10 +++++++
drivers/pci/pci-sysfs.c | 44 +++++++++++++++++++++++++++++++
3 files changed, 62 insertions(+), 0 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index ea4aee2..5b1ddde 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -50,6 +50,14 @@ Description:
re-discover previously removed devices.
Depends on CONFIG_HOTPLUG.
+What: /sys/bus/pci/devices/.../remove
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ hot-remove the PCI device and any of its children.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
index 9f8740c..26e4b8b 100644
--- a/Documentation/filesystems/sysfs-pci.txt
+++ b/Documentation/filesystems/sysfs-pci.txt
@@ -12,6 +12,7 @@ that support it. For example, a given bus might look like this:
| |-- enable
| |-- irq
| |-- local_cpus
+ | |-- remove
| |-- resource
| |-- resource0
| |-- resource1
@@ -36,6 +37,7 @@ files, each with their own function.
enable Whether the device is enabled (ascii, rw)
irq IRQ number (ascii, ro)
local_cpus nearby CPU mask (cpumask, ro)
+ remove remove device from kernel's list (ascii, wo)
resource PCI resource host addresses (ascii, ro)
resource0..N PCI resource N, if present (binary, mmap)
resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap)
@@ -46,6 +48,7 @@ files, each with their own function.
ro - read only file
rw - file is readable and writable
+ wo - write only file
mmap - file is mmapable
ascii - file contains ascii text
binary - file contains binary data
@@ -73,6 +76,13 @@ that the device must be enabled for a rom read to return data succesfully.
In the event a driver is not bound to the device, it can be enabled using the
'enable' file, documented above.
+The 'remove' file is used to remove the PCI device, by writing a non-zero
+integer to the file. This does not involve any kind of hot-plug functionality,
+e.g. powering off the device. The device is removed from the kernel's list of
+PCI devices, the sysfs directory for it is removed, and the device will be
+removed from any drivers attached to it. Removal of PCI root buses is
+disallowed.
+
Accessing legacy resources through sysfs
----------------------------------------
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 22dbc65..6e2b1fd 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -246,6 +246,47 @@ struct bus_attribute pci_bus_attrs[] = {
__ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
__ATTR_NULL
};
+
+static void remove_callback(struct device *dev)
+{
+ int bridge = 0;
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ mutex_lock(&pci_remove_rescan_mutex);
+
+ if (pdev->subordinate)
+ bridge = 1;
+
+ pci_remove_bus_device(pdev);
+ if (bridge && list_empty(&pdev->bus->devices))
+ pci_remove_bus(pdev->bus);
+
+ mutex_unlock(&pci_remove_rescan_mutex);
+}
+
+static ssize_t
+remove_store(struct device *dev, struct device_attribute *dummy,
+ const char *buf, size_t count)
+{
+ int ret = 0;
+ unsigned long val;
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (pdev->subordinate && pci_is_root_bus(pdev->bus))
+ return -EBUSY;
+
+ if (val)
+ ret = device_schedule_callback(dev, remove_callback);
+ if (ret)
+ count = ret;
+ return count;
+}
#endif
struct device_attribute pci_dev_attrs[] = {
@@ -266,6 +307,9 @@ struct device_attribute pci_dev_attrs[] = {
__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
broken_parity_status_show,broken_parity_status_store),
__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
+#ifdef CONFIG_HOTPLUG
+ __ATTR(remove, S_IWUSR, NULL, remove_store),
+#endif
__ATTR_NULL,
};
In preparation for PCI core hotplug, we need to ensure that we do
not attempt to re-initialize bridges that have already been initialized.
We only need to worry about non-root buses, since we will not allow
root bus removal.
Reported-by: Kenji Kaneshige <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/setup-bus.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 170a3ed..334285a 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -144,6 +144,9 @@ static void pci_setup_bridge(struct pci_bus *bus)
struct pci_bus_region region;
u32 l, bu, lu, io_upper16;
+ if (!pci_is_root_bus(bus) && bus->is_added)
+ return;
+
dev_info(&bridge->dev, "PCI bridge, secondary bus %04x:%02x\n",
pci_domain_nr(bus), bus->number);
This interface allows the user to force a rescan of the device's
parent bus and all subordinate buses, and rediscover devices removed
earlier from this part of the device tree.
Cc: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-pci | 10 ++++++++++
drivers/pci/pci-sysfs.c | 22 ++++++++++++++++++++++
2 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 5b1ddde..d33840e 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -58,6 +58,16 @@ Description:
hot-remove the PCI device and any of its children.
Depends on CONFIG_HOTPLUG.
+What: /sys/bus/pci/devices/.../rescan
+Date: January 2009
+Contact: Linux PCI developers <[email protected]>
+Description:
+ Writing a non-zero value to this attribute will
+ force a rescan of the device's parent bus and all
+ child buses, and re-discover devices removed earlier
+ from this part of the device tree.
+ Depends on CONFIG_HOTPLUG.
+
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <[email protected]>
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 6e2b1fd..92e7fd9 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -247,6 +247,27 @@ struct bus_attribute pci_bus_attrs[] = {
__ATTR_NULL
};
+static ssize_t
+dev_rescan_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ unsigned long val;
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (val) {
+ mutex_lock(&pci_remove_rescan_mutex);
+ pci_rescan_bus(pdev->bus);
+ mutex_unlock(&pci_remove_rescan_mutex);
+ }
+ return count;
+}
+
static void remove_callback(struct device *dev)
{
int bridge = 0;
@@ -309,6 +330,7 @@ struct device_attribute pci_dev_attrs[] = {
__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
#ifdef CONFIG_HOTPLUG
__ATTR(remove, S_IWUSR, NULL, remove_store),
+ __ATTR(rescan, S_IWUSR, NULL, dev_rescan_store),
#endif
__ATTR_NULL,
};
We wanted to replace fakephp wholesale, so rename legacy_fakephp back
to fakephp. Yes, this is a silly commit, but it produces a much easier
patch to read and review.
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/hotplug/Makefile | 2
drivers/pci/hotplug/fakephp.c | 162 ++++++++++++++++++++++++++++++++++
drivers/pci/hotplug/legacy_fakephp.c | 162 ----------------------------------
3 files changed, 163 insertions(+), 163 deletions(-)
create mode 100644 drivers/pci/hotplug/fakephp.c
delete mode 100644 drivers/pci/hotplug/legacy_fakephp.c
diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
index 3314fe8..2aa117c 100644
--- a/drivers/pci/hotplug/Makefile
+++ b/drivers/pci/hotplug/Makefile
@@ -20,7 +20,7 @@ obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR) += rpadlpar_io.o
obj-$(CONFIG_HOTPLUG_PCI_SGI) += sgi_hotplug.o
# Link this last so it doesn't claim devices that have a real hotplug driver
-obj-$(CONFIG_HOTPLUG_PCI_FAKE) += legacy_fakephp.o
+obj-$(CONFIG_HOTPLUG_PCI_FAKE) += fakephp.o
pci_hotplug-objs := pci_hotplug_core.o
diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
new file mode 100644
index 0000000..58a9bfa
--- /dev/null
+++ b/drivers/pci/hotplug/fakephp.c
@@ -0,0 +1,162 @@
+/* Works like the fakephp driver used to, except a little better.
+ *
+ * - It's possible to remove devices with subordinate busses.
+ * - New PCI devices that appear via any method, not just a fakephp triggered
+ * rescan, will be noticed.
+ * - Devices that are removed via any method, not just a fakephp triggered
+ * removal, will also be noticed.
+ *
+ * Uses nothing from the pci-hotplug subsystem.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include "../pci.h"
+
+struct legacy_slot {
+ struct kobject kobj;
+ struct pci_dev *dev;
+ struct list_head list;
+};
+
+static LIST_HEAD(legacy_list);
+
+static ssize_t legacy_show(struct kobject *kobj, struct attribute *attr,
+ char *buf)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+ strcpy(buf, "1\n");
+ return 2;
+}
+
+static void remove_callback(void *data)
+{
+ pci_remove_bus_device((struct pci_dev *)data);
+}
+
+static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
+ const char *buf, size_t len)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+ unsigned long val;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (val)
+ pci_rescan_bus(slot->dev->bus);
+ else
+ sysfs_schedule_callback(&slot->dev->dev.kobj, remove_callback,
+ slot->dev, THIS_MODULE);
+ return len;
+}
+
+static struct attribute *legacy_attrs[] = {
+ &(struct attribute){ .name = "power", .mode = 0644 },
+ NULL,
+};
+
+static void legacy_release(struct kobject *kobj)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+
+ pci_dev_put(slot->dev);
+ kfree(slot);
+}
+
+static struct kobj_type legacy_ktype = {
+ .sysfs_ops = &(struct sysfs_ops){
+ .store = legacy_store, .show = legacy_show
+ },
+ .release = &legacy_release,
+ .default_attrs = legacy_attrs,
+};
+
+static int legacy_add_slot(struct pci_dev *pdev)
+{
+ struct legacy_slot *slot = kzalloc(sizeof(*slot), GFP_KERNEL);
+
+ if (!slot)
+ return -ENOMEM;
+
+ if (kobject_init_and_add(&slot->kobj, &legacy_ktype,
+ &pci_slots_kset->kobj, "%s",
+ pdev->dev.bus_id)) {
+ dev_warn(&pdev->dev, "Failed to created legacy fake slot\n");
+ return -EINVAL;
+ }
+ slot->dev = pci_dev_get(pdev);
+
+ list_add(&slot->list, &legacy_list);
+
+ return 0;
+}
+
+static int legacy_notify(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct pci_dev *pdev = to_pci_dev(data);
+
+ if (action == BUS_NOTIFY_ADD_DEVICE) {
+ legacy_add_slot(pdev);
+ } else if (action == BUS_NOTIFY_DEL_DEVICE) {
+ struct legacy_slot *slot;
+
+ list_for_each_entry(slot, &legacy_list, list)
+ if (slot->dev == pdev)
+ goto found;
+
+ dev_warn(&pdev->dev, "Missing legacy fake slot?");
+ return -ENODEV;
+found:
+ kobject_del(&slot->kobj);
+ list_del(&slot->list);
+ kobject_put(&slot->kobj);
+ }
+
+ return 0;
+}
+
+static struct notifier_block legacy_notifier = {
+ .notifier_call = legacy_notify
+};
+
+static int __init init_legacy(void)
+{
+ struct pci_dev *pdev = NULL;
+
+ /* Add existing devices */
+ while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev)))
+ legacy_add_slot(pdev);
+
+ /* Be alerted of any new ones */
+ bus_register_notifier(&pci_bus_type, &legacy_notifier);
+ return 0;
+}
+module_init(init_legacy);
+
+static void __exit remove_legacy(void)
+{
+ struct legacy_slot *slot, *tmp;
+
+ bus_unregister_notifier(&pci_bus_type, &legacy_notifier);
+
+ list_for_each_entry_safe(slot, tmp, &legacy_list, list) {
+ list_del(&slot->list);
+ kobject_del(&slot->kobj);
+ kobject_put(&slot->kobj);
+ }
+}
+module_exit(remove_legacy);
+
+
+MODULE_AUTHOR("Trent Piepho <[email protected]>");
+MODULE_DESCRIPTION("Legacy version of the fakephp interface");
+MODULE_LICENSE("GPL");
diff --git a/drivers/pci/hotplug/legacy_fakephp.c b/drivers/pci/hotplug/legacy_fakephp.c
deleted file mode 100644
index 58a9bfa..0000000
--- a/drivers/pci/hotplug/legacy_fakephp.c
+++ /dev/null
@@ -1,162 +0,0 @@
-/* Works like the fakephp driver used to, except a little better.
- *
- * - It's possible to remove devices with subordinate busses.
- * - New PCI devices that appear via any method, not just a fakephp triggered
- * rescan, will be noticed.
- * - Devices that are removed via any method, not just a fakephp triggered
- * removal, will also be noticed.
- *
- * Uses nothing from the pci-hotplug subsystem.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/types.h>
-#include <linux/list.h>
-#include <linux/kobject.h>
-#include <linux/sysfs.h>
-#include <linux/init.h>
-#include <linux/pci.h>
-#include "../pci.h"
-
-struct legacy_slot {
- struct kobject kobj;
- struct pci_dev *dev;
- struct list_head list;
-};
-
-static LIST_HEAD(legacy_list);
-
-static ssize_t legacy_show(struct kobject *kobj, struct attribute *attr,
- char *buf)
-{
- struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
- strcpy(buf, "1\n");
- return 2;
-}
-
-static void remove_callback(void *data)
-{
- pci_remove_bus_device((struct pci_dev *)data);
-}
-
-static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
- const char *buf, size_t len)
-{
- struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
- unsigned long val;
-
- if (strict_strtoul(buf, 0, &val) < 0)
- return -EINVAL;
-
- if (val)
- pci_rescan_bus(slot->dev->bus);
- else
- sysfs_schedule_callback(&slot->dev->dev.kobj, remove_callback,
- slot->dev, THIS_MODULE);
- return len;
-}
-
-static struct attribute *legacy_attrs[] = {
- &(struct attribute){ .name = "power", .mode = 0644 },
- NULL,
-};
-
-static void legacy_release(struct kobject *kobj)
-{
- struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
-
- pci_dev_put(slot->dev);
- kfree(slot);
-}
-
-static struct kobj_type legacy_ktype = {
- .sysfs_ops = &(struct sysfs_ops){
- .store = legacy_store, .show = legacy_show
- },
- .release = &legacy_release,
- .default_attrs = legacy_attrs,
-};
-
-static int legacy_add_slot(struct pci_dev *pdev)
-{
- struct legacy_slot *slot = kzalloc(sizeof(*slot), GFP_KERNEL);
-
- if (!slot)
- return -ENOMEM;
-
- if (kobject_init_and_add(&slot->kobj, &legacy_ktype,
- &pci_slots_kset->kobj, "%s",
- pdev->dev.bus_id)) {
- dev_warn(&pdev->dev, "Failed to created legacy fake slot\n");
- return -EINVAL;
- }
- slot->dev = pci_dev_get(pdev);
-
- list_add(&slot->list, &legacy_list);
-
- return 0;
-}
-
-static int legacy_notify(struct notifier_block *nb,
- unsigned long action, void *data)
-{
- struct pci_dev *pdev = to_pci_dev(data);
-
- if (action == BUS_NOTIFY_ADD_DEVICE) {
- legacy_add_slot(pdev);
- } else if (action == BUS_NOTIFY_DEL_DEVICE) {
- struct legacy_slot *slot;
-
- list_for_each_entry(slot, &legacy_list, list)
- if (slot->dev == pdev)
- goto found;
-
- dev_warn(&pdev->dev, "Missing legacy fake slot?");
- return -ENODEV;
-found:
- kobject_del(&slot->kobj);
- list_del(&slot->list);
- kobject_put(&slot->kobj);
- }
-
- return 0;
-}
-
-static struct notifier_block legacy_notifier = {
- .notifier_call = legacy_notify
-};
-
-static int __init init_legacy(void)
-{
- struct pci_dev *pdev = NULL;
-
- /* Add existing devices */
- while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev)))
- legacy_add_slot(pdev);
-
- /* Be alerted of any new ones */
- bus_register_notifier(&pci_bus_type, &legacy_notifier);
- return 0;
-}
-module_init(init_legacy);
-
-static void __exit remove_legacy(void)
-{
- struct legacy_slot *slot, *tmp;
-
- bus_unregister_notifier(&pci_bus_type, &legacy_notifier);
-
- list_for_each_entry_safe(slot, tmp, &legacy_list, list) {
- list_del(&slot->list);
- kobject_del(&slot->kobj);
- kobject_put(&slot->kobj);
- }
-}
-module_exit(remove_legacy);
-
-
-MODULE_AUTHOR("Trent Piepho <[email protected]>");
-MODULE_DESCRIPTION("Legacy version of the fakephp interface");
-MODULE_LICENSE("GPL");
Now that the PCI core is capable of function-level remove and rescan
as well as bus-level rescan, there's no functional need to keep fakephp
anymore.
We keep it around for userspace compatibility reasons, schedule removal
in three years.
Signed-off-by: Alex Chiang <[email protected]>
---
Documentation/feature-removal-schedule.txt | 32 ++++++++++++++++++++++++++++
1 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index 5ddbe35..fc3505b 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -335,3 +335,35 @@ Why: In 2.6.18 the Secmark concept was introduced to replace the "compat_net"
Secmark, it is time to deprecate the older mechanism and start the
process of removing the old code.
Who: Paul Moore <[email protected]>
+
+---------------------------
+
+What: fakephp and associated sysfs files in /sys/bus/pci/slots/
+When: 2011
+Why: In 2.6.27, the semantics of /sys/bus/pci/slots was redefined to
+ represent a machine's physical PCI slots. The change in semantics
+ had userspace implications, as the hotplug core no longer allowed
+ drivers to create multiple sysfs files per physical slot (required
+ for multi-function devices, e.g.). fakephp was seen as a developer's
+ tool only, and its interface changed. Too late, we learned that
+ there were some users of the fakephp interface.
+
+ In 2.6.30, the original fakephp interface was restored. At the same
+ time, the PCI core gained the ability that fakephp provided, namely
+ function-level hot-remove and hot-add.
+
+ Since the PCI core now provides the same functionality, exposed in:
+
+ /sys/bus/pci/rescan
+ /sys/bus/pci/devices/.../remove
+ /sys/bus/pci/devices/.../rescan
+
+ there is no functional reason to maintain fakephp as well.
+
+ We will keep the existing module so that 'modprobe fakephp' will
+ present the old /sys/bus/pci/slots/... interface for compatibility,
+ but users are urged to migrate their applications to the API above.
+
+ After a reasonable transition period, we will remove the legacy
+ fakephp interface.
+Who: Alex Chiang <[email protected]>
From: Trent Piepho <[email protected]>
A complete re-implementation of fakephp is necessary if it is to
present its former interface (pre-2.6.27, when it broke). The
reason is that PCI hotplug drivers call pci_hp_register(), which
enforces the rule that only one /sys/bus/pci/slots/ file may be
created per physical slot.
The change breaks the old fakephp's assumption that it could
create a file per function. So we re-implement fakephp to avoid
using the standard PCI hotplug API so that we can restore the old
fakephp user interface.
It puts entries in /sys/bus/pci/slots with the names of all PCI
devices/functions, exactly symmetrical to what is shown in
/sys/bus/pci/devices. Each slots/ entry has a "power" attribute,
which works the same way as the fakephp driver's power attribute
has worked.
There are a few improvements over old fakephp, which couldn't handle
PCI devices being added or removed via a means outside of
fakephp's knowledge. If a device was added another way, old fakephp
didn't notice and didn't create the fake slot for it. If a
device was removed another way, old fakephp didn't delete the fake
slot for it (and accessing the stale slot caused an oops).
The new implementation overcomes these limitations. As a
consequence, removing a bridge with other devices behind it now
works as well, which is something else old fakephp couldn't do
previously.
This duplicates a tiny bit of the code in the PCI core that does
this same function. Re-using that code ends up being more
complex than duplicating it, and it makes code in the PCI core
more ugly just to support this legacy fakephp interface
compatibility layer.
[[email protected]: rewrite changelog]
[[email protected]: use pci_rescan_bus()]
Reviewed-by: James Cameron <[email protected]>
Signed-off-by: Trent Piepho <[email protected]>
Signed-off-by: Alex Chiang <[email protected]>
---
drivers/pci/hotplug/Makefile | 2
drivers/pci/hotplug/fakephp.c | 395 ----------------------------------
drivers/pci/hotplug/legacy_fakephp.c | 162 ++++++++++++++
3 files changed, 163 insertions(+), 396 deletions(-)
delete mode 100644 drivers/pci/hotplug/fakephp.c
create mode 100644 drivers/pci/hotplug/legacy_fakephp.c
diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
index 2aa117c..3314fe8 100644
--- a/drivers/pci/hotplug/Makefile
+++ b/drivers/pci/hotplug/Makefile
@@ -20,7 +20,7 @@ obj-$(CONFIG_HOTPLUG_PCI_RPA_DLPAR) += rpadlpar_io.o
obj-$(CONFIG_HOTPLUG_PCI_SGI) += sgi_hotplug.o
# Link this last so it doesn't claim devices that have a real hotplug driver
-obj-$(CONFIG_HOTPLUG_PCI_FAKE) += fakephp.o
+obj-$(CONFIG_HOTPLUG_PCI_FAKE) += legacy_fakephp.o
pci_hotplug-objs := pci_hotplug_core.o
diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
deleted file mode 100644
index 1606374..0000000
--- a/drivers/pci/hotplug/fakephp.c
+++ /dev/null
@@ -1,395 +0,0 @@
-/*
- * Fake PCI Hot Plug Controller Driver
- *
- * Copyright (C) 2003 Greg Kroah-Hartman <[email protected]>
- * Copyright (C) 2003 IBM Corp.
- * Copyright (C) 2003 Rolf Eike Beer <[email protected]>
- *
- * Based on ideas and code from:
- * Vladimir Kondratiev <[email protected]>
- * Rolf Eike Beer <[email protected]>
- *
- * All rights reserved.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation, version 2 of the License.
- *
- * Send feedback to <[email protected]>
- */
-
-/*
- *
- * This driver will "emulate" removing PCI devices from the system. If
- * the "power" file is written to with "0" then the specified PCI device
- * will be completely removed from the kernel.
- *
- * WARNING, this does NOT turn off the power to the PCI device. This is
- * a "logical" removal, not a physical or electrical removal.
- *
- * Use this module at your own risk, you have been warned!
- *
- * Enabling PCI devices is left as an exercise for the reader...
- *
- */
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/pci.h>
-#include <linux/pci_hotplug.h>
-#include <linux/init.h>
-#include <linux/string.h>
-#include <linux/slab.h>
-#include <linux/workqueue.h>
-#include "../pci.h"
-
-#if !defined(MODULE)
- #define MY_NAME "fakephp"
-#else
- #define MY_NAME THIS_MODULE->name
-#endif
-
-#define dbg(format, arg...) \
- do { \
- if (debug) \
- printk(KERN_DEBUG "%s: " format, \
- MY_NAME , ## arg); \
- } while (0)
-#define err(format, arg...) printk(KERN_ERR "%s: " format, MY_NAME , ## arg)
-#define info(format, arg...) printk(KERN_INFO "%s: " format, MY_NAME , ## arg)
-
-#define DRIVER_AUTHOR "Greg Kroah-Hartman <[email protected]>"
-#define DRIVER_DESC "Fake PCI Hot Plug Controller Driver"
-
-struct dummy_slot {
- struct list_head node;
- struct hotplug_slot *slot;
- struct pci_dev *dev;
- struct work_struct remove_work;
- unsigned long removed;
-};
-
-static int debug;
-static int dup_slots;
-static LIST_HEAD(slot_list);
-static struct workqueue_struct *dummyphp_wq;
-
-static void pci_rescan_worker(struct work_struct *work);
-static DECLARE_WORK(pci_rescan_work, pci_rescan_worker);
-
-static int enable_slot (struct hotplug_slot *slot);
-static int disable_slot (struct hotplug_slot *slot);
-
-static struct hotplug_slot_ops dummy_hotplug_slot_ops = {
- .owner = THIS_MODULE,
- .enable_slot = enable_slot,
- .disable_slot = disable_slot,
-};
-
-static void dummy_release(struct hotplug_slot *slot)
-{
- struct dummy_slot *dslot = slot->private;
-
- list_del(&dslot->node);
- kfree(dslot->slot->info);
- kfree(dslot->slot);
- pci_dev_put(dslot->dev);
- kfree(dslot);
-}
-
-#define SLOT_NAME_SIZE 8
-
-static int add_slot(struct pci_dev *dev)
-{
- struct dummy_slot *dslot;
- struct hotplug_slot *slot;
- char name[SLOT_NAME_SIZE];
- int retval = -ENOMEM;
- static int count = 1;
-
- slot = kzalloc(sizeof(struct hotplug_slot), GFP_KERNEL);
- if (!slot)
- goto error;
-
- slot->info = kzalloc(sizeof(struct hotplug_slot_info), GFP_KERNEL);
- if (!slot->info)
- goto error_slot;
-
- slot->info->power_status = 1;
- slot->info->max_bus_speed = PCI_SPEED_UNKNOWN;
- slot->info->cur_bus_speed = PCI_SPEED_UNKNOWN;
-
- dslot = kzalloc(sizeof(struct dummy_slot), GFP_KERNEL);
- if (!dslot)
- goto error_info;
-
- if (dup_slots)
- snprintf(name, SLOT_NAME_SIZE, "fake");
- else
- snprintf(name, SLOT_NAME_SIZE, "fake%d", count++);
- dbg("slot->name = %s\n", name);
- slot->ops = &dummy_hotplug_slot_ops;
- slot->release = &dummy_release;
- slot->private = dslot;
-
- retval = pci_hp_register(slot, dev->bus, PCI_SLOT(dev->devfn), name);
- if (retval) {
- err("pci_hp_register failed with error %d\n", retval);
- goto error_dslot;
- }
-
- dbg("slot->name = %s\n", hotplug_slot_name(slot));
- dslot->slot = slot;
- dslot->dev = pci_dev_get(dev);
- list_add (&dslot->node, &slot_list);
- return retval;
-
-error_dslot:
- kfree(dslot);
-error_info:
- kfree(slot->info);
-error_slot:
- kfree(slot);
-error:
- return retval;
-}
-
-static int __init pci_scan_buses(void)
-{
- struct pci_dev *dev = NULL;
- int lastslot = 0;
-
- while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
- if (PCI_FUNC(dev->devfn) > 0 &&
- lastslot == PCI_SLOT(dev->devfn))
- continue;
- lastslot = PCI_SLOT(dev->devfn);
- add_slot(dev);
- }
-
- return 0;
-}
-
-static void remove_slot(struct dummy_slot *dslot)
-{
- int retval;
-
- dbg("removing slot %s\n", hotplug_slot_name(dslot->slot));
- retval = pci_hp_deregister(dslot->slot);
- if (retval)
- err("Problem unregistering a slot %s\n",
- hotplug_slot_name(dslot->slot));
-}
-
-/* called from the single-threaded workqueue handler to remove a slot */
-static void remove_slot_worker(struct work_struct *work)
-{
- struct dummy_slot *dslot =
- container_of(work, struct dummy_slot, remove_work);
- remove_slot(dslot);
-}
-
-/**
- * pci_rescan_slot - Rescan slot
- * @temp: Device template. Should be set: bus and devfn.
- *
- * Tries hard not to re-enable already existing devices;
- * also handles scanning of subfunctions.
- */
-static int pci_rescan_slot(struct pci_dev *temp)
-{
- struct pci_bus *bus = temp->bus;
- struct pci_dev *dev;
- int func;
- u8 hdr_type;
- int count = 0;
-
- if (!pci_read_config_byte(temp, PCI_HEADER_TYPE, &hdr_type)) {
- temp->hdr_type = hdr_type & 0x7f;
- if ((dev = pci_get_slot(bus, temp->devfn)) != NULL)
- pci_dev_put(dev);
- else {
- dev = pci_scan_single_device(bus, temp->devfn);
- if (dev) {
- dbg("New device on %s function %x:%x\n",
- bus->name, temp->devfn >> 3,
- temp->devfn & 7);
- count++;
- }
- }
- /* multifunction device? */
- if (!(hdr_type & 0x80))
- return count;
-
- /* continue scanning for other functions */
- for (func = 1, temp->devfn++; func < 8; func++, temp->devfn++) {
- if (pci_read_config_byte(temp, PCI_HEADER_TYPE, &hdr_type))
- continue;
- temp->hdr_type = hdr_type & 0x7f;
-
- if ((dev = pci_get_slot(bus, temp->devfn)) != NULL)
- pci_dev_put(dev);
- else {
- dev = pci_scan_single_device(bus, temp->devfn);
- if (dev) {
- dbg("New device on %s function %x:%x\n",
- bus->name, temp->devfn >> 3,
- temp->devfn & 7);
- count++;
- }
- }
- }
- }
-
- return count;
-}
-
-
-/**
- * pci_rescan_bus_local - fakephp version of rescan PCI bus
- * @bus: the PCI bus to rescan
- *
- * Call pci_rescan_slot for each possible function of the bus.
- */
-static void pci_rescan_bus_local(const struct pci_bus *bus)
-{
- unsigned int devfn;
- struct pci_dev *dev;
- int retval;
- int found = 0;
- dev = alloc_pci_dev();
- if (!dev)
- return;
-
- dev->bus = (struct pci_bus*)bus;
- dev->sysdata = bus->sysdata;
- for (devfn = 0; devfn < 0x100; devfn += 8) {
- dev->devfn = devfn;
- found += pci_rescan_slot(dev);
- }
-
- if (found) {
- pci_bus_assign_resources(bus);
- list_for_each_entry(dev, &bus->devices, bus_list) {
- /* Skip already-added devices */
- if (dev->is_added)
- continue;
- retval = pci_bus_add_device(dev);
- if (retval)
- dev_err(&dev->dev,
- "Error adding device, continuing\n");
- else
- add_slot(dev);
- }
- pci_bus_add_devices(bus);
- }
- kfree(dev);
-}
-
-/* recursively scan all buses */
-static void pci_rescan_buses(const struct list_head *list)
-{
- const struct list_head *l;
- list_for_each(l,list) {
- const struct pci_bus *b = pci_bus_b(l);
- pci_rescan_bus_local(b);
- pci_rescan_buses(&b->children);
- }
-}
-
-/* initiate rescan of all pci buses */
-static inline void pci_rescan(void) {
- pci_rescan_buses(&pci_root_buses);
-}
-
-/* called from the single-threaded workqueue handler to rescan all pci buses */
-static void pci_rescan_worker(struct work_struct *work)
-{
- pci_rescan();
-}
-
-static int enable_slot(struct hotplug_slot *hotplug_slot)
-{
- /* mis-use enable_slot for rescanning of the pci bus */
- cancel_work_sync(&pci_rescan_work);
- queue_work(dummyphp_wq, &pci_rescan_work);
- return 0;
-}
-
-static int disable_slot(struct hotplug_slot *slot)
-{
- struct dummy_slot *dslot;
- struct pci_dev *dev;
- int func;
-
- if (!slot)
- return -ENODEV;
- dslot = slot->private;
-
- dbg("%s - physical_slot = %s\n", __func__, hotplug_slot_name(slot));
-
- for (func = 7; func >= 0; func--) {
- dev = pci_get_slot(dslot->dev->bus, dslot->dev->devfn + func);
- if (!dev)
- continue;
-
- if (test_and_set_bit(0, &dslot->removed)) {
- dbg("Slot already scheduled for removal\n");
- pci_dev_put(dev);
- return -ENODEV;
- }
-
- /* remove the device from the pci core */
- pci_remove_bus_device(dev);
-
- /* queue work item to blow away this sysfs entry and other
- * parts.
- */
- INIT_WORK(&dslot->remove_work, remove_slot_worker);
- queue_work(dummyphp_wq, &dslot->remove_work);
-
- pci_dev_put(dev);
- }
- return 0;
-}
-
-static void cleanup_slots (void)
-{
- struct list_head *tmp;
- struct list_head *next;
- struct dummy_slot *dslot;
-
- destroy_workqueue(dummyphp_wq);
- list_for_each_safe (tmp, next, &slot_list) {
- dslot = list_entry (tmp, struct dummy_slot, node);
- remove_slot(dslot);
- }
-
-}
-
-static int __init dummyphp_init(void)
-{
- info(DRIVER_DESC "\n");
-
- dummyphp_wq = create_singlethread_workqueue(MY_NAME);
- if (!dummyphp_wq)
- return -ENOMEM;
-
- return pci_scan_buses();
-}
-
-
-static void __exit dummyphp_exit(void)
-{
- cleanup_slots();
-}
-
-module_init(dummyphp_init);
-module_exit(dummyphp_exit);
-
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
-MODULE_LICENSE("GPL");
-module_param(debug, bool, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(debug, "Debugging mode enabled or not");
-module_param(dup_slots, bool, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(dup_slots, "Force duplicate slot names for debugging");
diff --git a/drivers/pci/hotplug/legacy_fakephp.c b/drivers/pci/hotplug/legacy_fakephp.c
new file mode 100644
index 0000000..58a9bfa
--- /dev/null
+++ b/drivers/pci/hotplug/legacy_fakephp.c
@@ -0,0 +1,162 @@
+/* Works like the fakephp driver used to, except a little better.
+ *
+ * - It's possible to remove devices with subordinate busses.
+ * - New PCI devices that appear via any method, not just a fakephp triggered
+ * rescan, will be noticed.
+ * - Devices that are removed via any method, not just a fakephp triggered
+ * removal, will also be noticed.
+ *
+ * Uses nothing from the pci-hotplug subsystem.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/list.h>
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
+#include <linux/init.h>
+#include <linux/pci.h>
+#include "../pci.h"
+
+struct legacy_slot {
+ struct kobject kobj;
+ struct pci_dev *dev;
+ struct list_head list;
+};
+
+static LIST_HEAD(legacy_list);
+
+static ssize_t legacy_show(struct kobject *kobj, struct attribute *attr,
+ char *buf)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+ strcpy(buf, "1\n");
+ return 2;
+}
+
+static void remove_callback(void *data)
+{
+ pci_remove_bus_device((struct pci_dev *)data);
+}
+
+static ssize_t legacy_store(struct kobject *kobj, struct attribute *attr,
+ const char *buf, size_t len)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+ unsigned long val;
+
+ if (strict_strtoul(buf, 0, &val) < 0)
+ return -EINVAL;
+
+ if (val)
+ pci_rescan_bus(slot->dev->bus);
+ else
+ sysfs_schedule_callback(&slot->dev->dev.kobj, remove_callback,
+ slot->dev, THIS_MODULE);
+ return len;
+}
+
+static struct attribute *legacy_attrs[] = {
+ &(struct attribute){ .name = "power", .mode = 0644 },
+ NULL,
+};
+
+static void legacy_release(struct kobject *kobj)
+{
+ struct legacy_slot *slot = container_of(kobj, typeof(*slot), kobj);
+
+ pci_dev_put(slot->dev);
+ kfree(slot);
+}
+
+static struct kobj_type legacy_ktype = {
+ .sysfs_ops = &(struct sysfs_ops){
+ .store = legacy_store, .show = legacy_show
+ },
+ .release = &legacy_release,
+ .default_attrs = legacy_attrs,
+};
+
+static int legacy_add_slot(struct pci_dev *pdev)
+{
+ struct legacy_slot *slot = kzalloc(sizeof(*slot), GFP_KERNEL);
+
+ if (!slot)
+ return -ENOMEM;
+
+ if (kobject_init_and_add(&slot->kobj, &legacy_ktype,
+ &pci_slots_kset->kobj, "%s",
+ pdev->dev.bus_id)) {
+ dev_warn(&pdev->dev, "Failed to created legacy fake slot\n");
+ return -EINVAL;
+ }
+ slot->dev = pci_dev_get(pdev);
+
+ list_add(&slot->list, &legacy_list);
+
+ return 0;
+}
+
+static int legacy_notify(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ struct pci_dev *pdev = to_pci_dev(data);
+
+ if (action == BUS_NOTIFY_ADD_DEVICE) {
+ legacy_add_slot(pdev);
+ } else if (action == BUS_NOTIFY_DEL_DEVICE) {
+ struct legacy_slot *slot;
+
+ list_for_each_entry(slot, &legacy_list, list)
+ if (slot->dev == pdev)
+ goto found;
+
+ dev_warn(&pdev->dev, "Missing legacy fake slot?");
+ return -ENODEV;
+found:
+ kobject_del(&slot->kobj);
+ list_del(&slot->list);
+ kobject_put(&slot->kobj);
+ }
+
+ return 0;
+}
+
+static struct notifier_block legacy_notifier = {
+ .notifier_call = legacy_notify
+};
+
+static int __init init_legacy(void)
+{
+ struct pci_dev *pdev = NULL;
+
+ /* Add existing devices */
+ while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev)))
+ legacy_add_slot(pdev);
+
+ /* Be alerted of any new ones */
+ bus_register_notifier(&pci_bus_type, &legacy_notifier);
+ return 0;
+}
+module_init(init_legacy);
+
+static void __exit remove_legacy(void)
+{
+ struct legacy_slot *slot, *tmp;
+
+ bus_unregister_notifier(&pci_bus_type, &legacy_notifier);
+
+ list_for_each_entry_safe(slot, tmp, &legacy_list, list) {
+ list_del(&slot->list);
+ kobject_del(&slot->kobj);
+ kobject_put(&slot->kobj);
+ }
+}
+module_exit(remove_legacy);
+
+
+MODULE_AUTHOR("Trent Piepho <[email protected]>");
+MODULE_DESCRIPTION("Legacy version of the fakephp interface");
+MODULE_LICENSE("GPL");
Alex Chiang wrote:
> We're getting close to the new merge window, and I _think_ this
> patch series is ready for consideration. Notably:
>
> - multiple rescans/removes of devices with and without bridges
> - verified that resource allocation after multiple remove/rescan
> cycles is the same as what we had during initial boot
> - fixes the complete suckage of fakephp (that I created)
> - doesn't affect existing hotplug drivers
> - tested on x86 and ia64 platforms
>
> Please review, and consider testing. For testing ease, you can pull
> from my git branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
> branch 'test-20090318' is what you want
>
I got the following oops when I did
# echo 1 > remove
on the bridge device.
[ 6639.239164] =============================================
[ 6639.239191] [ INFO: possible recursive locking detected ]
[ 6639.239212] 2.6.29-rc8-kk #1
[ 6639.239227] ---------------------------------------------
[ 6639.239243] events/8/60 is trying to acquire lock:
[ 6639.239252] (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
[ 6639.239252]
[ 6639.239252] but task is already holding lock:
[ 6639.239252] (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
[ 6639.239252]
[ 6639.239252] other info that might help us debug this:
[ 6639.239252] 3 locks held by events/8/60:
[ 6639.239252] #0: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
[ 6639.239252] #1: (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
[ 6639.239252] #2: (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c05b9>] remove_callback+0x29/0x80
[ 6639.239252]
[ 6639.239252] stack backtrace:
[ 6639.239252] Pid: 60, comm: events/8 Not tainted 2.6.29-rc8-kk #1
[ 6639.239252] Call Trace:
[ 6639.239252] [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
[ 6639.239252] [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
[ 6639.239252] [<ffffffff8026f148>] lock_acquire+0x58/0x80
[ 6639.239252] [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
[ 6639.239252] [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
[ 6639.239252] [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
[ 6639.239252] [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
[ 6639.239252] [<ffffffffa0158065>] e1000_remove+0x55/0xfe [e1000e]
[ 6639.239252] [<ffffffff8033ee60>] ? sysfs_schedule_callback_work+0x0/0x50
[ 6639.239252] [<ffffffff803bf7a2>] pci_device_remove+0x32/0x70
[ 6639.239252] [<ffffffff80441409>] __device_release_driver+0x59/0x90
[ 6639.241006] [<ffffffff8044153b>] device_release_driver+0x2b/0x40
[ 6639.241006] [<ffffffff80441036>] bus_remove_device+0xa6/0x120
[ 6639.241006] [<ffffffff8043dacb>] device_del+0x12b/0x190
[ 6639.241006] [<ffffffff8043db56>] device_unregister+0x26/0x70
[ 6639.241006] [<ffffffff803ba529>] pci_stop_dev+0x49/0x60
[ 6639.241006] [<ffffffff803ba670>] pci_remove_bus_device+0x40/0xc0
[ 6639.241006] [<ffffffff803ba71d>] pci_remove_behind_bridge+0x2d/0x50
[ 6639.241006] [<ffffffff803ba64e>] pci_remove_bus_device+0x1e/0xc0
[ 6639.241006] [<ffffffff803c05c8>] remove_callback+0x38/0x80
[ 6639.241006] [<ffffffff8033ee7f>] sysfs_schedule_callback_work+0x1f/0x50
[ 6639.241006] [<ffffffff8025769a>] run_workqueue+0x15a/0x230
[ 6639.241006] [<ffffffff80257648>] ? run_workqueue+0x108/0x230
[ 6639.241006] [<ffffffff8025846f>] worker_thread+0x9f/0x100
[ 6639.241006] [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
[ 6639.241006] [<ffffffff802583d0>] ? worker_thread+0x0/0x100
[ 6639.241006] [<ffffffff8025b89d>] kthread+0x4d/0x80
[ 6639.241006] [<ffffffff8020d4ba>] child_rip+0xa/0x20
[ 6639.241006] [<ffffffff8020cebc>] ? restore_args+0x0/0x30
[ 6639.241006] [<ffffffff8025b850>] ? kthread+0x0/0x80
[ 6639.241006] [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
[ 6639.283330] e1000e 0000:40:00.0: PCI INT A disabled
[ 6639.324332] e1000e 0000:40:00.1: PCI INT B disabled
[ 6639.325031] aer 0000:2f:04.0:pcie22: unloading service driver aer
Thanks,
Kenji Kaneshige
> Note, this test branch does contain the small assorted patches in AER
> and the PCIe portdriver that I fixed along the way, but does not contain
> the sysfs callback mutex that I introduced to protect myself from
> Vegard Nossum. ;) If you want to hammer away at the sysfs interface,
> please apply this patch on top:
>
> http://thread.gmane.org/gmane.linux.kernel/806648
>
> What I'm continuing to do:
>
> - investigate converting existing hotplug drivers to use
> new pci_rescan_bus() interface
>
> Thanks.
>
> /ac
>
> v3 -> v4:
> - protect sysfs interfaces with mutex
> - undo changes in pci_do_scan_bus
> - introduce pci_rescan_bus instead
> - do not initialize bridges more than once
>
> v2 -> v3:
> - properly remove device with internal bridge
> - added Kenji Kaneshige's pci_is_root_bus() interface
> - dropped whitespace cleanups for another time
>
> v1 -> v2:
> - incorporated lots of Trent Piepho's work
> - beefed up pci_do_scan_bus as heavy lifter for rescanning
> - small bugfixes folded into earlier patches to get everything working
>
> ---
>
> Alex Chiang (8):
> PCI Hotplug: schedule fakephp for feature removal
> PCI Hotplug: rename legacy_fakephp to fakephp
> PCI: Introduce /sys/bus/pci/devices/.../rescan
> PCI: Introduce /sys/bus/pci/devices/.../remove
> PCI: Introduce /sys/bus/pci/rescan
> PCI: Introduce pci_rescan_bus()
> PCI: do not initialize bridges more than once
> PCI: always scan child buses
>
> Kenji Kaneshige (1):
> PCI: pci_is_root_bus helper
>
> Trent Piepho (3):
> PCI Hotplug: restore fakephp interface with complete reimplementation
> PCI: pci_scan_slot() returns newly found devices
> PCI: don't scan existing devices
>
>
> Documentation/ABI/testing/sysfs-bus-pci | 27 ++
> Documentation/feature-removal-schedule.txt | 32 ++
> Documentation/filesystems/sysfs-pci.txt | 10 +
> drivers/pci/hotplug/fakephp.c | 443 +++++++---------------------
> drivers/pci/pci-driver.c | 1
> drivers/pci/pci-sysfs.c | 95 ++++++
> drivers/pci/pci.h | 6
> drivers/pci/probe.c | 103 ++++---
> drivers/pci/setup-bus.c | 3
> include/linux/pci.h | 12 +
> 10 files changed, 358 insertions(+), 374 deletions(-)
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
On Wed, 18 Mar 2009 16:39:56 -0600 Alex Chiang <[email protected]> wrote:
> This API is used by the PCI core to rescan a bus and rediscover
> newly added devices.
>
> Over time, it is expected that the various PCI hotplug drivers
> will migrate to this interface and away from the old
> pci_do_scan_bus() interface.
>
> ...
>
> @@ -1195,10 +1195,39 @@ struct pci_bus * __devinit pci_scan_bus_parented(struct device *parent,
> EXPORT_SYMBOL(pci_scan_bus_parented);
>
> #ifdef CONFIG_HOTPLUG
> +/**
> + * pci_rescan_bus - scan a PCI bus for devices.
> + * @bus: PCI bus to scan
> + *
> + * Scan a PCI bus and child buses for new devices, adds them,
> + * and enables them.
> + *
> + * Returns the max number of subordinate bus discovered.
> + */
> +unsigned int __devinit pci_rescan_bus(struct pci_bus *bus)
> +{
> + unsigned int max;
> + struct pci_dev *dev;
> +
> + max = pci_scan_child_bus(bus);
> + list_for_each_entry(dev, &bus->devices, bus_list)
> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
> + dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
> + if (dev->subordinate)
> + pci_bus_size_bridges(dev->subordinate);
> +
> + pci_bus_assign_resources(bus);
> + pci_enable_bridges(bus);
> + pci_bus_add_devices(bus);
> +
> + return max;
> +}
> +
> EXPORT_SYMBOL(pci_add_new_bus);
> EXPORT_SYMBOL(pci_scan_slot);
> EXPORT_SYMBOL(pci_scan_bridge);
> EXPORT_SYMBOL_GPL(pci_scan_child_bus);
> +EXPORT_SYMBOL_GPL(pci_rescan_bus);
uhm, is there any rationale for the seemingly-random mixup of export types
in this interface?
On Wed, 18 Mar 2009 16:39:56 -0600 Alex Chiang <[email protected]> wrote:
> +/**
> + * pci_rescan_bus - scan a PCI bus for devices.
> + * @bus: PCI bus to scan
> + *
> + * Scan a PCI bus and child buses for new devices, adds them,
> + * and enables them.
> + *
> + * Returns the max number of subordinate bus discovered.
> + */
> +unsigned int __devinit pci_rescan_bus(struct pci_bus *bus)
> +{
> + unsigned int max;
> + struct pci_dev *dev;
> +
> + max = pci_scan_child_bus(bus);
> + list_for_each_entry(dev, &bus->devices, bus_list)
> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
> + dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
> + if (dev->subordinate)
> + pci_bus_size_bridges(dev->subordinate);
> +
> + pci_bus_assign_resources(bus);
> + pci_enable_bridges(bus);
> + pci_bus_add_devices(bus);
> +
> + return max;
> +}
The conspicuous lack of locking for the list walk requires either
a) documenting or
b) fixing
IMO.
On Wed, 18 Mar 2009 16:40:01 -0600 Alex Chiang <[email protected]> wrote:
> +#ifdef CONFIG_HOTPLUG
> +static DEFINE_MUTEX(pci_remove_rescan_mutex);
> +static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
> + size_t count)
> +{
> + unsigned long val;
> + struct pci_bus *b = NULL;
> +
> + if (strict_strtoul(buf, 0, &val) < 0)
> + return -EINVAL;
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
> +
> + if (val) {
> + mutex_lock(&pci_remove_rescan_mutex);
> + while ((b = pci_find_next_bus(b)) != NULL)
> + pci_rescan_bus(b);
> + mutex_unlock(&pci_remove_rescan_mutex);
> + }
> + return count;
> +}
> +
> +struct bus_attribute pci_bus_attrs[] = {
> + __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
> + __ATTR_NULL
> +};
> +#endif
Why CONFIG_HOTPLUG rather than CONFIG_HOTPLUG_PCI (or similar)?
On Wed, 18 Mar 2009 16:40:06 -0600 Alex Chiang <[email protected]> wrote:
> This patch adds an attribute named "remove" to a PCI device's sysfs
> directory. Writing a non-zero value to this attribute will remove the PCI
> device and any children of it.
>
> Trent Piepho wrote the original implementation and documentation.
>
> Thanks to Vegard Nossum for testing under kmemcheck and finding locking
> issues with the sysfs interface.
>
> ...
>
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -246,6 +246,47 @@ struct bus_attribute pci_bus_attrs[] = {
> __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
> __ATTR_NULL
> };
> +
> +static void remove_callback(struct device *dev)
> +{
> + int bridge = 0;
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + mutex_lock(&pci_remove_rescan_mutex);
> +
> + if (pdev->subordinate)
> + bridge = 1;
> +
> + pci_remove_bus_device(pdev);
> + if (bridge && list_empty(&pdev->bus->devices))
> + pci_remove_bus(pdev->bus);
> +
> + mutex_unlock(&pci_remove_rescan_mutex);
> +}
> +
> +static ssize_t
> +remove_store(struct device *dev, struct device_attribute *dummy,
> + const char *buf, size_t count)
> +{
> + int ret = 0;
> + unsigned long val;
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + if (strict_strtoul(buf, 0, &val) < 0)
> + return -EINVAL;
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
> +
> + if (pdev->subordinate && pci_is_root_bus(pdev->bus))
> + return -EBUSY;
> +
> + if (val)
> + ret = device_schedule_callback(dev, remove_callback);
> + if (ret)
> + count = ret;
> + return count;
> +}
> #endif
It is very hard for the reader (this one at least) to work out why
device_schedule_callback() is used here, instead of simply doing the work
directly.
The way to solve that problem is to add a code comment.
Given that we're in a sysfs write() handler where no relevant locks at all
are held, it seems rather weird that we cannot perform this operation
synchronously, but no doubt the comment will explain all of this.
Do we need the CAP_SYS_ADMIN check if the sysfs file permissions are
correct? (I keep on asking this then forgetting the answer).
The device_schedule_callback() thing exposes us to (I assume) a pile of
races, the most obvious of which is "what locking or refcounting keeps
*dev alive?". It would be nice to see an analysis/description of the
lifetime issues here. Perhaps in the changelog, preferably in code
comments.
On Wed, 18 Mar 2009 16:40:16 -0600 Alex Chiang <[email protected]> wrote:
> A complete re-implementation of fakephp is necessary if it is to
> present its former interface (pre-2.6.27, when it broke).
We broke an existing interface? wtf?
If it's been broken for this long then do we actually need to
resurrect it?
If so, do we need to resurrect it in its old form? This would appear
to be an opportunity to improve that interface, unless the old one
was perfect?
Alex Chiang wrote:
> We're getting close to the new merge window, and I _think_ this
> patch series is ready for consideration. Notably:
>
> - multiple rescans/removes of devices with and without bridges
> - verified that resource allocation after multiple remove/rescan
> cycles is the same as what we had during initial boot
> - fixes the complete suckage of fakephp (that I created)
> - doesn't affect existing hotplug drivers
> - tested on x86 and ia64 platforms
>
> Please review, and consider testing. For testing ease, you can pull
> from my git branch:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
> branch 'test-20090318' is what you want
>
When I rescan the bus even without remove anything, I got the
following messages. And enable count of the bridge is incremented
every time.
[17240.309096] pci 0000:2d:00.0: BAR 6: bogus alignment [0x0-0x0] flags 0x2
[17240.309135] pcieport-driver 0000:00:01.0: setting latency timer to 64
[17240.309143] pcieport-driver 0000:00:02.0: setting latency timer to 64
[17240.309150] pcieport-driver 0000:14:00.0: setting latency timer to 64
[17240.309159] pcieport-driver 0000:15:00.0: setting latency timer to 64
[17240.309166] pcieport-driver 0000:15:01.0: setting latency timer to 64
[17240.309174] pcieport-driver 0000:15:02.0: setting latency timer to 64
[17240.309182] pci 0000:14:00.3: setting latency timer to 64
[17240.309189] pcieport-driver 0000:00:03.0: setting latency timer to 64
[17240.309198] pcieport-driver 0000:2a:00.0: setting latency timer to 64
[17240.309207] pcieport-driver 0000:2b:02.0: setting latency timer to 64
[17240.309215] pcieport-driver 0000:2b:04.0: setting latency timer to 64
[17240.309223] pcieport-driver 0000:00:04.0: setting latency timer to 64
[17240.309231] pcieport-driver 0000:2e:00.0: setting latency timer to 64
[17240.309240] pcieport-driver 0000:2f:02.0: setting latency timer to 64
[17240.309248] pcieport-driver 0000:2f:04.0: setting latency timer to 64
[17240.309256] pcieport-driver 0000:00:06.0: setting latency timer to 64
[17240.309264] pcieport-driver 0000:51:00.0: setting latency timer to 64
[17240.309273] pcieport-driver 0000:52:02.0: setting latency timer to 64
[17240.309281] pcieport-driver 0000:52:04.0: setting latency timer to 64
[17240.309291] pci 0000:64:00.0: setting latency timer to 64
[17240.309300] pcieport-driver 0000:00:1c.0: setting latency timer to 64
[17240.309308] pci 0000:00:1e.0: setting latency timer to 64
[root@localhost pci]# echo 1 > rescan
[root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
6
[root@localhost pci]# echo 1 > rescan
[root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
7
[root@localhost pci]# echo 1 > rescan
[root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
8
[root@localhost pci]# echo 1 > rescan
[root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
9
Thanks,
Kenji Kaneshige
Alex Chiang wrote:
> This patch adds an attribute named "remove" to a PCI device's sysfs
> directory. Writing a non-zero value to this attribute will remove the PCI
> device and any children of it.
>
> Trent Piepho wrote the original implementation and documentation.
>
> Thanks to Vegard Nossum for testing under kmemcheck and finding locking
> issues with the sysfs interface.
>
> Cc: Trent Piepho <[email protected]>
> Signed-off-by: Alex Chiang <[email protected]>
> ---
>
> Documentation/ABI/testing/sysfs-bus-pci | 8 ++++++
> Documentation/filesystems/sysfs-pci.txt | 10 +++++++
> drivers/pci/pci-sysfs.c | 44 +++++++++++++++++++++++++++++++
> 3 files changed, 62 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index ea4aee2..5b1ddde 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -50,6 +50,14 @@ Description:
> re-discover previously removed devices.
> Depends on CONFIG_HOTPLUG.
>
> +What: /sys/bus/pci/devices/.../remove
> +Date: January 2009
> +Contact: Linux PCI developers <[email protected]>
> +Description:
> + Writing a non-zero value to this attribute will
> + hot-remove the PCI device and any of its children.
> + Depends on CONFIG_HOTPLUG.
> +
> What: /sys/bus/pci/devices/.../vpd
> Date: February 2008
> Contact: Ben Hutchings <[email protected]>
> diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
> index 9f8740c..26e4b8b 100644
> --- a/Documentation/filesystems/sysfs-pci.txt
> +++ b/Documentation/filesystems/sysfs-pci.txt
> @@ -12,6 +12,7 @@ that support it. For example, a given bus might look like this:
> | |-- enable
> | |-- irq
> | |-- local_cpus
> + | |-- remove
> | |-- resource
> | |-- resource0
> | |-- resource1
> @@ -36,6 +37,7 @@ files, each with their own function.
> enable Whether the device is enabled (ascii, rw)
> irq IRQ number (ascii, ro)
> local_cpus nearby CPU mask (cpumask, ro)
> + remove remove device from kernel's list (ascii, wo)
> resource PCI resource host addresses (ascii, ro)
> resource0..N PCI resource N, if present (binary, mmap)
> resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap)
> @@ -46,6 +48,7 @@ files, each with their own function.
>
> ro - read only file
> rw - file is readable and writable
> + wo - write only file
> mmap - file is mmapable
> ascii - file contains ascii text
> binary - file contains binary data
> @@ -73,6 +76,13 @@ that the device must be enabled for a rom read to return data succesfully.
> In the event a driver is not bound to the device, it can be enabled using the
> 'enable' file, documented above.
>
> +The 'remove' file is used to remove the PCI device, by writing a non-zero
> +integer to the file. This does not involve any kind of hot-plug functionality,
> +e.g. powering off the device. The device is removed from the kernel's list of
> +PCI devices, the sysfs directory for it is removed, and the device will be
> +removed from any drivers attached to it. Removal of PCI root buses is
> +disallowed.
> +
> Accessing legacy resources through sysfs
> ----------------------------------------
>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 22dbc65..6e2b1fd 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -246,6 +246,47 @@ struct bus_attribute pci_bus_attrs[] = {
> __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
> __ATTR_NULL
> };
> +
> +static void remove_callback(struct device *dev)
> +{
> + int bridge = 0;
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + mutex_lock(&pci_remove_rescan_mutex);
> +
> + if (pdev->subordinate)
> + bridge = 1;
> +
> + pci_remove_bus_device(pdev);
> + if (bridge && list_empty(&pdev->bus->devices))
> + pci_remove_bus(pdev->bus);
I cannot understand the above two lines. Could you explain
what it intend?
Thanks,
Kenji Kaneshige
> +
> + mutex_unlock(&pci_remove_rescan_mutex);
> +}
> +
> +static ssize_t
> +remove_store(struct device *dev, struct device_attribute *dummy,
> + const char *buf, size_t count)
> +{
> + int ret = 0;
> + unsigned long val;
> + struct pci_dev *pdev = to_pci_dev(dev);
> +
> + if (strict_strtoul(buf, 0, &val) < 0)
> + return -EINVAL;
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
> +
> + if (pdev->subordinate && pci_is_root_bus(pdev->bus))
> + return -EBUSY;
> +
> + if (val)
> + ret = device_schedule_callback(dev, remove_callback);
> + if (ret)
> + count = ret;
> + return count;
> +}
> #endif
>
> struct device_attribute pci_dev_attrs[] = {
> @@ -266,6 +307,9 @@ struct device_attribute pci_dev_attrs[] = {
> __ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
> broken_parity_status_show,broken_parity_status_store),
> __ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
> +#ifdef CONFIG_HOTPLUG
> + __ATTR(remove, S_IWUSR, NULL, remove_store),
> +#endif
> __ATTR_NULL,
> };
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
On Thu, Mar 19, 2009 at 02:43:34AM -0700, Andrew Morton wrote:
> On Wed, 18 Mar 2009 16:40:06 -0600 Alex Chiang <[email protected]> wrote:
>
> > This patch adds an attribute named "remove" to a PCI device's sysfs
> > directory. Writing a non-zero value to this attribute will remove the PCI
> > device and any children of it.
> >
> > Trent Piepho wrote the original implementation and documentation.
> >
> > Thanks to Vegard Nossum for testing under kmemcheck and finding locking
> > issues with the sysfs interface.
> >
> > ...
> >
> > --- a/drivers/pci/pci-sysfs.c
> > +++ b/drivers/pci/pci-sysfs.c
> > @@ -246,6 +246,47 @@ struct bus_attribute pci_bus_attrs[] = {
> > __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
> > __ATTR_NULL
> > };
> > +
> > +static void remove_callback(struct device *dev)
> > +{
> > + int bridge = 0;
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > + mutex_lock(&pci_remove_rescan_mutex);
> > +
> > + if (pdev->subordinate)
> > + bridge = 1;
> > +
> > + pci_remove_bus_device(pdev);
> > + if (bridge && list_empty(&pdev->bus->devices))
> > + pci_remove_bus(pdev->bus);
> > +
> > + mutex_unlock(&pci_remove_rescan_mutex);
> > +}
> > +
> > +static ssize_t
> > +remove_store(struct device *dev, struct device_attribute *dummy,
> > + const char *buf, size_t count)
> > +{
> > + int ret = 0;
> > + unsigned long val;
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > + if (strict_strtoul(buf, 0, &val) < 0)
> > + return -EINVAL;
> > +
> > + if (!capable(CAP_SYS_ADMIN))
> > + return -EPERM;
> > +
> > + if (pdev->subordinate && pci_is_root_bus(pdev->bus))
> > + return -EBUSY;
> > +
> > + if (val)
> > + ret = device_schedule_callback(dev, remove_callback);
> > + if (ret)
> > + count = ret;
> > + return count;
> > +}
> > #endif
>
> It is very hard for the reader (this one at least) to work out why
> device_schedule_callback() is used here, instead of simply doing the work
> directly.
>
> The way to solve that problem is to add a code comment.
>
> Given that we're in a sysfs write() handler where no relevant locks at all
> are held, it seems rather weird that we cannot perform this operation
> synchronously, but no doubt the comment will explain all of this.
>
> Do we need the CAP_SYS_ADMIN check if the sysfs file permissions are
> correct? (I keep on asking this then forgetting the answer).
You can do this kind of check if you want to be paranoid. You can
specify the mode of the file when you create it, but if a properly
permissive user (like root) changes the mode, it sticks, so that a
"normal" use could then write to the file.
Usually I wouldn't recommend checking, as it's overkill if you set the
mode properly in the code.
thanks,
greg k-h
* Andrew Morton <[email protected]>:
> On Wed, 18 Mar 2009 16:40:06 -0600 Alex Chiang <[email protected]> wrote:
>
> > This patch adds an attribute named "remove" to a PCI device's sysfs
> > directory. Writing a non-zero value to this attribute will remove the PCI
> > device and any children of it.
> >
> > Trent Piepho wrote the original implementation and documentation.
> >
> > Thanks to Vegard Nossum for testing under kmemcheck and finding locking
> > issues with the sysfs interface.
> >
> > ...
> >
> > --- a/drivers/pci/pci-sysfs.c
> > +++ b/drivers/pci/pci-sysfs.c
> > @@ -246,6 +246,47 @@ struct bus_attribute pci_bus_attrs[] = {
> > __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
> > __ATTR_NULL
> > };
> > +
> > +static void remove_callback(struct device *dev)
> > +{
> > + int bridge = 0;
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > + mutex_lock(&pci_remove_rescan_mutex);
> > +
> > + if (pdev->subordinate)
> > + bridge = 1;
> > +
> > + pci_remove_bus_device(pdev);
> > + if (bridge && list_empty(&pdev->bus->devices))
> > + pci_remove_bus(pdev->bus);
> > +
> > + mutex_unlock(&pci_remove_rescan_mutex);
> > +}
> > +
> > +static ssize_t
> > +remove_store(struct device *dev, struct device_attribute *dummy,
> > + const char *buf, size_t count)
> > +{
> > + int ret = 0;
> > + unsigned long val;
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > + if (strict_strtoul(buf, 0, &val) < 0)
> > + return -EINVAL;
> > +
> > + if (!capable(CAP_SYS_ADMIN))
> > + return -EPERM;
> > +
> > + if (pdev->subordinate && pci_is_root_bus(pdev->bus))
> > + return -EBUSY;
> > +
> > + if (val)
> > + ret = device_schedule_callback(dev, remove_callback);
> > + if (ret)
> > + count = ret;
> > + return count;
> > +}
> > #endif
>
> It is very hard for the reader (this one at least) to work out why
> device_schedule_callback() is used here, instead of simply doing the work
> directly.
>
> The way to solve that problem is to add a code comment.
Hm, I thought it was well-known that a sysfs attribute cannot
remove itself without deadlocking. Thus, we need to use this
callback mechanism.
This thread has the most recent discussion:
http://thread.gmane.org/gmane.linux.kernel/805033
I will add a comment to the code.
> Given that we're in a sysfs write() handler where no relevant locks at all
> are held, it seems rather weird that we cannot perform this operation
> synchronously, but no doubt the comment will explain all of this.
>
> Do we need the CAP_SYS_ADMIN check if the sysfs file permissions are
> correct? (I keep on asking this then forgetting the answer).
I will remove this (as per Greg's advice in a later mail).
> The device_schedule_callback() thing exposes us to (I assume) a pile of
> races, the most obvious of which is "what locking or refcounting keeps
> *dev alive?". It would be nice to see an analysis/description of the
> lifetime issues here. Perhaps in the changelog, preferably in code
> comments.
No races; device_schedule_callback() takes a ref on dev, pinning
it until the callback handler returns, after which it releases
the ref. You can see this in sysfs_schedule_callback.
Thanks for the review.
/ac
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
>> This patch adds an attribute named "remove" to a PCI device's sysfs
>> directory. Writing a non-zero value to this attribute will remove the PCI
>> device and any children of it.
>>
>> Trent Piepho wrote the original implementation and documentation.
>>
>> Thanks to Vegard Nossum for testing under kmemcheck and finding locking
>> issues with the sysfs interface.
>>
>> Cc: Trent Piepho <[email protected]>
>> Signed-off-by: Alex Chiang <[email protected]>
>> ---
>>
>> Documentation/ABI/testing/sysfs-bus-pci | 8 ++++++
>> Documentation/filesystems/sysfs-pci.txt | 10 +++++++
>> drivers/pci/pci-sysfs.c | 44 +++++++++++++++++++++++++++++++
>> 3 files changed, 62 insertions(+), 0 deletions(-)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
>> index ea4aee2..5b1ddde 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-pci
>> +++ b/Documentation/ABI/testing/sysfs-bus-pci
>> @@ -50,6 +50,14 @@ Description:
>> re-discover previously removed devices.
>> Depends on CONFIG_HOTPLUG.
>> +What: /sys/bus/pci/devices/.../remove
>> +Date: January 2009
>> +Contact: Linux PCI developers <[email protected]>
>> +Description:
>> + Writing a non-zero value to this attribute will
>> + hot-remove the PCI device and any of its children.
>> + Depends on CONFIG_HOTPLUG.
>> +
>> What: /sys/bus/pci/devices/.../vpd
>> Date: February 2008
>> Contact: Ben Hutchings <[email protected]>
>> diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
>> index 9f8740c..26e4b8b 100644
>> --- a/Documentation/filesystems/sysfs-pci.txt
>> +++ b/Documentation/filesystems/sysfs-pci.txt
>> @@ -12,6 +12,7 @@ that support it. For example, a given bus might look like this:
>> | |-- enable
>> | |-- irq
>> | |-- local_cpus
>> + | |-- remove
>> | |-- resource
>> | |-- resource0
>> | |-- resource1
>> @@ -36,6 +37,7 @@ files, each with their own function.
>> enable Whether the device is enabled (ascii, rw)
>> irq IRQ number (ascii, ro)
>> local_cpus nearby CPU mask (cpumask, ro)
>> + remove remove device from kernel's list (ascii, wo)
>> resource PCI resource host addresses (ascii, ro)
>> resource0..N PCI resource N, if present (binary, mmap)
>> resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap)
>> @@ -46,6 +48,7 @@ files, each with their own function.
>> ro - read only file
>> rw - file is readable and writable
>> + wo - write only file
>> mmap - file is mmapable
>> ascii - file contains ascii text
>> binary - file contains binary data
>> @@ -73,6 +76,13 @@ that the device must be enabled for a rom read to return data succesfully.
>> In the event a driver is not bound to the device, it can be enabled using the
>> 'enable' file, documented above.
>> +The 'remove' file is used to remove the PCI device, by writing a
>> non-zero
>> +integer to the file. This does not involve any kind of hot-plug functionality,
>> +e.g. powering off the device. The device is removed from the kernel's list of
>> +PCI devices, the sysfs directory for it is removed, and the device will be
>> +removed from any drivers attached to it. Removal of PCI root buses is
>> +disallowed.
>> +
>> Accessing legacy resources through sysfs
>> ----------------------------------------
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 22dbc65..6e2b1fd 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -246,6 +246,47 @@ struct bus_attribute pci_bus_attrs[] = {
>> __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
>> __ATTR_NULL
>> };
>> +
>> +static void remove_callback(struct device *dev)
>> +{
>> + int bridge = 0;
>> + struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> + mutex_lock(&pci_remove_rescan_mutex);
>> +
>> + if (pdev->subordinate)
>> + bridge = 1;
>> +
>> + pci_remove_bus_device(pdev);
>> + if (bridge && list_empty(&pdev->bus->devices))
>> + pci_remove_bus(pdev->bus);
>
> I cannot understand the above two lines. Could you explain
> what it intend?
If the user says:
echo 1 > /sys/bus/pci/devices/.../remove
And that device is a bridge, then we need to specifically call
pci_remove_bus as well, to actually remove the bus itself.
Without it, pci_bus_remove_device() will remove all of its
children (and subordinate buses) in a depth-first manner, but we
will never actually remove the bus that the user specified.
In other words, without it, we will still see the bus in:
/sys/class/pci_bus/...
We only want to remove the bus if it has no children left. I
think the check for list_empty(&pdev->bus->devices) might be
overkill... I can try taking that bit out and testing again.
Thanks for the review.
/ac
> Thanks,
> Kenji Kaneshige
>
>
>> +
>> + mutex_unlock(&pci_remove_rescan_mutex);
>> +}
>> +
>> +static ssize_t
>> +remove_store(struct device *dev, struct device_attribute *dummy,
>> + const char *buf, size_t count)
>> +{
>> + int ret = 0;
>> + unsigned long val;
>> + struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> + if (strict_strtoul(buf, 0, &val) < 0)
>> + return -EINVAL;
>> +
>> + if (!capable(CAP_SYS_ADMIN))
>> + return -EPERM;
>> +
>> + if (pdev->subordinate && pci_is_root_bus(pdev->bus))
>> + return -EBUSY;
>> +
>> + if (val)
>> + ret = device_schedule_callback(dev, remove_callback);
>> + if (ret)
>> + count = ret;
>> + return count;
>> +}
>> #endif
>> struct device_attribute pci_dev_attrs[] = {
>> @@ -266,6 +307,9 @@ struct device_attribute pci_dev_attrs[] = {
>> __ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
>> broken_parity_status_show,broken_parity_status_store),
>> __ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
>> +#ifdef CONFIG_HOTPLUG
>> + __ATTR(remove, S_IWUSR, NULL, remove_store),
>> +#endif
>> __ATTR_NULL,
>> };
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
>
>
* Andrew Morton <[email protected]>:
> On Wed, 18 Mar 2009 16:40:16 -0600 Alex Chiang <[email protected]> wrote:
>
> > A complete re-implementation of fakephp is necessary if it is to
> > present its former interface (pre-2.6.27, when it broke).
>
> We broke an existing interface? wtf?
When I introduced physical PCI slots in 2.6.27, one of the
changes was that the interface we presented in
/sys/bus/pci/slots/ was supposed to represent _physical_ slots
only.
Before the change, fakephp "abused" that directory by presenting
all the PCI devices, independent of any notions of physical
slots.
No one thought much of changing what fakephp did during review
because we thought it was a developer/debug feature only. After
the change, instead of presenting a PCI bus address in sysfs,
fakephp present something like "fake%d" where %d is the order
that we discovered the device.
The thought process was that, "this is a debug feature only, so
let's make it more obvious".
This was an incorrect assumption, it turns out. :(
People like embedded were using it for various things, and it was
also the only way to remove (and rediscover) PCI devices (if they
weren't supported by platform hotplug).
You can read the thread here:
http://thread.gmane.org/gmane.linux.kernel/761944
The interesting stuff starts in the middle.
> If it's been broken for this long then do we actually need to
> resurrect it?
This entire patch series is about fixing the mess that I helped
create. At the time, it was indicated that there was existing
software that depended on finding 2.6.26-style (and older)
fakephp stuff in /sys/bus/pci/slots/
After some discussion, we thought it would be better for the
functionality to live in the new sysfs interfaces I'm introducing
in this series.
The legacy fakephp stuff is to help transition existing software.
Since the only use case of existing software that I know of is in
embedded, I think their workaround was to not move to 2.6.27.
> If so, do we need to resurrect it in its old form? This would appear
> to be an opportunity to improve that interface, unless the old one
> was perfect?
Well, the goal is to get people away from fakephp completely and
move them to these new interfaces. The legacy fakephp stuff is
just a transition aid.
Thanks.
/ac
* Andrew Morton <[email protected]>:
> On Wed, 18 Mar 2009 16:39:56 -0600 Alex Chiang <[email protected]> wrote:
>
> > This API is used by the PCI core to rescan a bus and rediscover
> > newly added devices.
> >
> > Over time, it is expected that the various PCI hotplug drivers
> > will migrate to this interface and away from the old
> > pci_do_scan_bus() interface.
> >
> > ...
> >
> > @@ -1195,10 +1195,39 @@ struct pci_bus * __devinit pci_scan_bus_parented(struct device *parent,
> > EXPORT_SYMBOL(pci_scan_bus_parented);
> >
> > #ifdef CONFIG_HOTPLUG
> > +/**
> > + * pci_rescan_bus - scan a PCI bus for devices.
> > + * @bus: PCI bus to scan
> > + *
> > + * Scan a PCI bus and child buses for new devices, adds them,
> > + * and enables them.
> > + *
> > + * Returns the max number of subordinate bus discovered.
> > + */
> > +unsigned int __devinit pci_rescan_bus(struct pci_bus *bus)
> > +{
> > + unsigned int max;
> > + struct pci_dev *dev;
> > +
> > + max = pci_scan_child_bus(bus);
> > + list_for_each_entry(dev, &bus->devices, bus_list)
> > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
> > + dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
> > + if (dev->subordinate)
> > + pci_bus_size_bridges(dev->subordinate);
> > +
> > + pci_bus_assign_resources(bus);
> > + pci_enable_bridges(bus);
> > + pci_bus_add_devices(bus);
> > +
> > + return max;
> > +}
> > +
> > EXPORT_SYMBOL(pci_add_new_bus);
> > EXPORT_SYMBOL(pci_scan_slot);
> > EXPORT_SYMBOL(pci_scan_bridge);
> > EXPORT_SYMBOL_GPL(pci_scan_child_bus);
> > +EXPORT_SYMBOL_GPL(pci_rescan_bus);
>
> uhm, is there any rationale for the seemingly-random mixup of export types
> in this interface?
History?
I've no clue why we're mixing EXPORT_SYMBOL and EXPORT_SYMBOL_GPL.
git-blame says that the mishmash existed pre-git.
For my new interface, I thought that EXPORT_SYMBOL_GPL would be
most appropriate, but maybe not?
What would you like me to do? Make them all the same?
Thanks.
/ac
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
>> We're getting close to the new merge window, and I _think_ this
>> patch series is ready for consideration. Notably:
>>
>> - multiple rescans/removes of devices with and without bridges
>> - verified that resource allocation after multiple remove/rescan
>> cycles is the same as what we had during initial boot
>> - fixes the complete suckage of fakephp (that I created)
>> - doesn't affect existing hotplug drivers
>> - tested on x86 and ia64 platforms
>>
>> Please review, and consider testing. For testing ease, you can pull
>> from my git branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
>> branch 'test-20090318' is what you want
>>
>
> I got the following oops when I did
>
> # echo 1 > remove
>
> on the bridge device.
Hm, this is odd; I did test under lockdep.
I'll try again, thanks for testing and the bug report.
/ac
>
> [ 6639.239164] =============================================
> [ 6639.239191] [ INFO: possible recursive locking detected ]
> [ 6639.239212] 2.6.29-rc8-kk #1
> [ 6639.239227] ---------------------------------------------
> [ 6639.239243] events/8/60 is trying to acquire lock:
> [ 6639.239252] (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
> [ 6639.239252]
> [ 6639.239252] but task is already holding lock:
> [ 6639.239252] (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> [ 6639.239252]
> [ 6639.239252] other info that might help us debug this:
> [ 6639.239252] 3 locks held by events/8/60:
> [ 6639.239252] #0: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> [ 6639.239252] #1: (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> [ 6639.239252] #2: (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c05b9>] remove_callback+0x29/0x80
> [ 6639.239252]
> [ 6639.239252] stack backtrace:
> [ 6639.239252] Pid: 60, comm: events/8 Not tainted 2.6.29-rc8-kk #1
> [ 6639.239252] Call Trace:
> [ 6639.239252] [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
> [ 6639.239252] [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
> [ 6639.239252] [<ffffffff8026f148>] lock_acquire+0x58/0x80
> [ 6639.239252] [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> [ 6639.239252] [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
> [ 6639.239252] [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> [ 6639.239252] [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
> [ 6639.239252] [<ffffffffa0158065>] e1000_remove+0x55/0xfe [e1000e]
> [ 6639.239252] [<ffffffff8033ee60>] ? sysfs_schedule_callback_work+0x0/0x50
> [ 6639.239252] [<ffffffff803bf7a2>] pci_device_remove+0x32/0x70
> [ 6639.239252] [<ffffffff80441409>] __device_release_driver+0x59/0x90
> [ 6639.241006] [<ffffffff8044153b>] device_release_driver+0x2b/0x40
> [ 6639.241006] [<ffffffff80441036>] bus_remove_device+0xa6/0x120
> [ 6639.241006] [<ffffffff8043dacb>] device_del+0x12b/0x190
> [ 6639.241006] [<ffffffff8043db56>] device_unregister+0x26/0x70
> [ 6639.241006] [<ffffffff803ba529>] pci_stop_dev+0x49/0x60
> [ 6639.241006] [<ffffffff803ba670>] pci_remove_bus_device+0x40/0xc0
> [ 6639.241006] [<ffffffff803ba71d>] pci_remove_behind_bridge+0x2d/0x50
> [ 6639.241006] [<ffffffff803ba64e>] pci_remove_bus_device+0x1e/0xc0
> [ 6639.241006] [<ffffffff803c05c8>] remove_callback+0x38/0x80
> [ 6639.241006] [<ffffffff8033ee7f>] sysfs_schedule_callback_work+0x1f/0x50
> [ 6639.241006] [<ffffffff8025769a>] run_workqueue+0x15a/0x230
> [ 6639.241006] [<ffffffff80257648>] ? run_workqueue+0x108/0x230
> [ 6639.241006] [<ffffffff8025846f>] worker_thread+0x9f/0x100
> [ 6639.241006] [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
> [ 6639.241006] [<ffffffff802583d0>] ? worker_thread+0x0/0x100
> [ 6639.241006] [<ffffffff8025b89d>] kthread+0x4d/0x80
> [ 6639.241006] [<ffffffff8020d4ba>] child_rip+0xa/0x20
> [ 6639.241006] [<ffffffff8020cebc>] ? restore_args+0x0/0x30
> [ 6639.241006] [<ffffffff8025b850>] ? kthread+0x0/0x80
> [ 6639.241006] [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
> [ 6639.283330] e1000e 0000:40:00.0: PCI INT A disabled
> [ 6639.324332] e1000e 0000:40:00.1: PCI INT B disabled
> [ 6639.325031] aer 0000:2f:04.0:pcie22: unloading service driver aer
>
> Thanks,
> Kenji Kaneshige
>
>
>> Note, this test branch does contain the small assorted patches in AER
>> and the PCIe portdriver that I fixed along the way, but does not contain
>> the sysfs callback mutex that I introduced to protect myself from
>> Vegard Nossum. ;) If you want to hammer away at the sysfs interface,
>> please apply this patch on top:
>>
>> http://thread.gmane.org/gmane.linux.kernel/806648
>>
>> What I'm continuing to do:
>>
>> - investigate converting existing hotplug drivers to use
>> new pci_rescan_bus() interface
>>
>> Thanks.
>>
>> /ac
>>
>> v3 -> v4:
>> - protect sysfs interfaces with mutex
>> - undo changes in pci_do_scan_bus
>> - introduce pci_rescan_bus instead
>> - do not initialize bridges more than once
>>
>> v2 -> v3:
>> - properly remove device with internal bridge
>> - added Kenji Kaneshige's pci_is_root_bus() interface
>> - dropped whitespace cleanups for another time
>>
>> v1 -> v2:
>> - incorporated lots of Trent Piepho's work
>> - beefed up pci_do_scan_bus as heavy lifter for rescanning
>> - small bugfixes folded into earlier patches to get everything working
>>
>> ---
>>
>> Alex Chiang (8):
>> PCI Hotplug: schedule fakephp for feature removal
>> PCI Hotplug: rename legacy_fakephp to fakephp
>> PCI: Introduce /sys/bus/pci/devices/.../rescan
>> PCI: Introduce /sys/bus/pci/devices/.../remove
>> PCI: Introduce /sys/bus/pci/rescan
>> PCI: Introduce pci_rescan_bus()
>> PCI: do not initialize bridges more than once
>> PCI: always scan child buses
>>
>> Kenji Kaneshige (1):
>> PCI: pci_is_root_bus helper
>>
>> Trent Piepho (3):
>> PCI Hotplug: restore fakephp interface with complete reimplementation
>> PCI: pci_scan_slot() returns newly found devices
>> PCI: don't scan existing devices
>>
>>
>> Documentation/ABI/testing/sysfs-bus-pci | 27 ++
>> Documentation/feature-removal-schedule.txt | 32 ++
>> Documentation/filesystems/sysfs-pci.txt | 10 +
>> drivers/pci/hotplug/fakephp.c | 443 +++++++---------------------
>> drivers/pci/pci-driver.c | 1
>> drivers/pci/pci-sysfs.c | 95 ++++++
>> drivers/pci/pci.h | 6 drivers/pci/probe.c
>> | 103 ++++---
>> drivers/pci/setup-bus.c | 3 include/linux/pci.h
>> | 12 +
>> 10 files changed, 358 insertions(+), 374 deletions(-)
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
>
>
* Andrew Morton <[email protected]>:
> On Wed, 18 Mar 2009 16:40:01 -0600 Alex Chiang <[email protected]> wrote:
>
> > +#ifdef CONFIG_HOTPLUG
> > +static DEFINE_MUTEX(pci_remove_rescan_mutex);
> > +static ssize_t bus_rescan_store(struct bus_type *bus, const char *buf,
> > + size_t count)
> > +{
> > + unsigned long val;
> > + struct pci_bus *b = NULL;
> > +
> > + if (strict_strtoul(buf, 0, &val) < 0)
> > + return -EINVAL;
> > +
> > + if (!capable(CAP_SYS_ADMIN))
> > + return -EPERM;
> > +
> > + if (val) {
> > + mutex_lock(&pci_remove_rescan_mutex);
> > + while ((b = pci_find_next_bus(b)) != NULL)
> > + pci_rescan_bus(b);
> > + mutex_unlock(&pci_remove_rescan_mutex);
> > + }
> > + return count;
> > +}
> > +
> > +struct bus_attribute pci_bus_attrs[] = {
> > + __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
> > + __ATTR_NULL
> > +};
> > +#endif
>
> Why CONFIG_HOTPLUG rather than CONFIG_HOTPLUG_PCI (or similar)?
I first started out with CONFIG_HOTPLUG_PCI but then all that
stuff got compiled out, and I couldn't figure out why at the
time.
I'll go back and try and figure out what's really going on. :-/
Thanks.
/ac
* Andrew Morton <[email protected]>:
> On Wed, 18 Mar 2009 16:39:56 -0600 Alex Chiang <[email protected]> wrote:
>
> > +/**
> > + * pci_rescan_bus - scan a PCI bus for devices.
> > + * @bus: PCI bus to scan
> > + *
> > + * Scan a PCI bus and child buses for new devices, adds them,
> > + * and enables them.
> > + *
> > + * Returns the max number of subordinate bus discovered.
> > + */
> > +unsigned int __devinit pci_rescan_bus(struct pci_bus *bus)
> > +{
> > + unsigned int max;
> > + struct pci_dev *dev;
> > +
> > + max = pci_scan_child_bus(bus);
> > + list_for_each_entry(dev, &bus->devices, bus_list)
> > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
> > + dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
> > + if (dev->subordinate)
> > + pci_bus_size_bridges(dev->subordinate);
> > +
> > + pci_bus_assign_resources(bus);
> > + pci_enable_bridges(bus);
> > + pci_bus_add_devices(bus);
> > +
> > + return max;
> > +}
>
> The conspicuous lack of locking for the list walk requires either
>
> a) documenting or
> b) fixing
Ok, will figure out (a) or (b).
Thanks.
/ac
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
>> We're getting close to the new merge window, and I _think_ this
>> patch series is ready for consideration. Notably:
>>
>> - multiple rescans/removes of devices with and without bridges
>> - verified that resource allocation after multiple remove/rescan
>> cycles is the same as what we had during initial boot
>> - fixes the complete suckage of fakephp (that I created)
>> - doesn't affect existing hotplug drivers
>> - tested on x86 and ia64 platforms
>>
>> Please review, and consider testing. For testing ease, you can pull
>> from my git branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
>> branch 'test-20090318' is what you want
>>
>
> When I rescan the bus even without remove anything, I got the
> following messages. And enable count of the bridge is incremented
> every time.
>
> [17240.309096] pci 0000:2d:00.0: BAR 6: bogus alignment [0x0-0x0] flags 0x2
> [17240.309135] pcieport-driver 0000:00:01.0: setting latency timer to 64
> [17240.309143] pcieport-driver 0000:00:02.0: setting latency timer to 64
> [17240.309150] pcieport-driver 0000:14:00.0: setting latency timer to 64
> [17240.309159] pcieport-driver 0000:15:00.0: setting latency timer to 64
> [17240.309166] pcieport-driver 0000:15:01.0: setting latency timer to 64
> [17240.309174] pcieport-driver 0000:15:02.0: setting latency timer to 64
> [17240.309182] pci 0000:14:00.3: setting latency timer to 64
> [17240.309189] pcieport-driver 0000:00:03.0: setting latency timer to 64
> [17240.309198] pcieport-driver 0000:2a:00.0: setting latency timer to 64
> [17240.309207] pcieport-driver 0000:2b:02.0: setting latency timer to 64
> [17240.309215] pcieport-driver 0000:2b:04.0: setting latency timer to 64
> [17240.309223] pcieport-driver 0000:00:04.0: setting latency timer to 64
> [17240.309231] pcieport-driver 0000:2e:00.0: setting latency timer to 64
> [17240.309240] pcieport-driver 0000:2f:02.0: setting latency timer to 64
> [17240.309248] pcieport-driver 0000:2f:04.0: setting latency timer to 64
> [17240.309256] pcieport-driver 0000:00:06.0: setting latency timer to 64
> [17240.309264] pcieport-driver 0000:51:00.0: setting latency timer to 64
> [17240.309273] pcieport-driver 0000:52:02.0: setting latency timer to 64
> [17240.309281] pcieport-driver 0000:52:04.0: setting latency timer to 64
> [17240.309291] pci 0000:64:00.0: setting latency timer to 64
> [17240.309300] pcieport-driver 0000:00:1c.0: setting latency timer to 64
> [17240.309308] pci 0000:00:1e.0: setting latency timer to 64
I will investigate these.
> [root@localhost pci]# echo 1 > rescan
> [root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
> 6
> [root@localhost pci]# echo 1 > rescan
> [root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
> 7
> [root@localhost pci]# echo 1 > rescan
> [root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
> 8
> [root@localhost pci]# echo 1 > rescan
> [root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
> 9
Hm, that's because I'm calling pci_enable_bridges()
unconditionally every time even if we didn't add any new bridges,
which is obviously wrong.
I will try and figure out a fix for this.
Thanks for testing.
/ac
On Thu, Mar 19, 2009 at 11:05:48AM -0600, Alex Chiang wrote:
> * Andrew Morton <[email protected]>:
> > On Wed, 18 Mar 2009 16:39:56 -0600 Alex Chiang <[email protected]> wrote:
> >
> > > This API is used by the PCI core to rescan a bus and rediscover
> > > newly added devices.
> > >
> > > Over time, it is expected that the various PCI hotplug drivers
> > > will migrate to this interface and away from the old
> > > pci_do_scan_bus() interface.
> > >
> > > ...
> > >
> > > @@ -1195,10 +1195,39 @@ struct pci_bus * __devinit pci_scan_bus_parented(struct device *parent,
> > > EXPORT_SYMBOL(pci_scan_bus_parented);
> > >
> > > #ifdef CONFIG_HOTPLUG
> > > +/**
> > > + * pci_rescan_bus - scan a PCI bus for devices.
> > > + * @bus: PCI bus to scan
> > > + *
> > > + * Scan a PCI bus and child buses for new devices, adds them,
> > > + * and enables them.
> > > + *
> > > + * Returns the max number of subordinate bus discovered.
> > > + */
> > > +unsigned int __devinit pci_rescan_bus(struct pci_bus *bus)
> > > +{
> > > + unsigned int max;
> > > + struct pci_dev *dev;
> > > +
> > > + max = pci_scan_child_bus(bus);
> > > + list_for_each_entry(dev, &bus->devices, bus_list)
> > > + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
> > > + dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)
> > > + if (dev->subordinate)
> > > + pci_bus_size_bridges(dev->subordinate);
> > > +
> > > + pci_bus_assign_resources(bus);
> > > + pci_enable_bridges(bus);
> > > + pci_bus_add_devices(bus);
> > > +
> > > + return max;
> > > +}
> > > +
> > > EXPORT_SYMBOL(pci_add_new_bus);
> > > EXPORT_SYMBOL(pci_scan_slot);
> > > EXPORT_SYMBOL(pci_scan_bridge);
> > > EXPORT_SYMBOL_GPL(pci_scan_child_bus);
> > > +EXPORT_SYMBOL_GPL(pci_rescan_bus);
> >
> > uhm, is there any rationale for the seemingly-random mixup of export types
> > in this interface?
>
> History?
>
> I've no clue why we're mixing EXPORT_SYMBOL and EXPORT_SYMBOL_GPL.
I do :)
New PCI core exports were added with _GPL, it's the older ones that were
left at the "normal" level.
> git-blame says that the mishmash existed pre-git.
>
> For my new interface, I thought that EXPORT_SYMBOL_GPL would be
> most appropriate, but maybe not?
Yes it is, pci hotplug exports has always been EXPORT_SYMBOL_GPL(), in
fact I think it was the first in-tree user of this marking.
> What would you like me to do? Make them all the same?
New ones should be _GPL() please.
But please put the export in the proper place, checkpatch.pl will
complain if you do not. Which means you didn't run it on your patches
:)
thanks,
greg k-h
* Greg KH <[email protected]>:
> On Thu, Mar 19, 2009 at 11:05:48AM -0600, Alex Chiang wrote:
> > * Andrew Morton <[email protected]>:
> > > On Wed, 18 Mar 2009 16:39:56 -0600 Alex Chiang <[email protected]> wrote:
> > >
> > > > EXPORT_SYMBOL(pci_add_new_bus);
> > > > EXPORT_SYMBOL(pci_scan_slot);
> > > > EXPORT_SYMBOL(pci_scan_bridge);
> > > > EXPORT_SYMBOL_GPL(pci_scan_child_bus);
> > > > +EXPORT_SYMBOL_GPL(pci_rescan_bus);
> > >
> > > uhm, is there any rationale for the seemingly-random mixup of export types
> > > in this interface?
> >
> > History?
> >
> > I've no clue why we're mixing EXPORT_SYMBOL and EXPORT_SYMBOL_GPL.
>
> I do :)
>
> New PCI core exports were added with _GPL, it's the older ones that were
> left at the "normal" level.
>
> > git-blame says that the mishmash existed pre-git.
> >
> > For my new interface, I thought that EXPORT_SYMBOL_GPL would be
> > most appropriate, but maybe not?
>
> Yes it is, pci hotplug exports has always been EXPORT_SYMBOL_GPL(), in
> fact I think it was the first in-tree user of this marking.
>
> > What would you like me to do? Make them all the same?
>
> New ones should be _GPL() please.
>
> But please put the export in the proper place, checkpatch.pl will
> complain if you do not. Which means you didn't run it on your patches
> :)
Andrew already yelled at me privately about checkpatch.
I went over and ran it on all my patches, and that was indeed one
of the complaints.
But when it comes to file consistency vs "letter of the law" I
like to keep things consistent (even if broken) in the file.
I could go back and move all those EXPORT_SYMBOL declarations to
the "proper" place. Is that too much noise though?
Thanks.
/ac
On Thu, Mar 19, 2009 at 11:49:38AM -0600, Alex Chiang wrote:
> I went over and ran it on all my patches, and that was indeed one
> of the complaints.
>
> But when it comes to file consistency vs "letter of the law" I
> like to keep things consistent (even if broken) in the file.
>
> I could go back and move all those EXPORT_SYMBOL declarations to
> the "proper" place. Is that too much noise though?
The accepted way to do this is to add a prep patch which just fixes the
checkpatch warnings (ie moves the EXPORT_SYMBOLs). Then your patch is
all shiny and fresh.
--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
On Thu, Mar 19, 2009 at 11:49:38AM -0600, Alex Chiang wrote:
> * Greg KH <[email protected]>:
> > On Thu, Mar 19, 2009 at 11:05:48AM -0600, Alex Chiang wrote:
> > > * Andrew Morton <[email protected]>:
> > > > On Wed, 18 Mar 2009 16:39:56 -0600 Alex Chiang <[email protected]> wrote:
> > > >
> > > > > EXPORT_SYMBOL(pci_add_new_bus);
> > > > > EXPORT_SYMBOL(pci_scan_slot);
> > > > > EXPORT_SYMBOL(pci_scan_bridge);
> > > > > EXPORT_SYMBOL_GPL(pci_scan_child_bus);
> > > > > +EXPORT_SYMBOL_GPL(pci_rescan_bus);
> > > >
> > > > uhm, is there any rationale for the seemingly-random mixup of export types
> > > > in this interface?
> > >
> > > History?
> > >
> > > I've no clue why we're mixing EXPORT_SYMBOL and EXPORT_SYMBOL_GPL.
> >
> > I do :)
> >
> > New PCI core exports were added with _GPL, it's the older ones that were
> > left at the "normal" level.
> >
> > > git-blame says that the mishmash existed pre-git.
> > >
> > > For my new interface, I thought that EXPORT_SYMBOL_GPL would be
> > > most appropriate, but maybe not?
> >
> > Yes it is, pci hotplug exports has always been EXPORT_SYMBOL_GPL(), in
> > fact I think it was the first in-tree user of this marking.
> >
> > > What would you like me to do? Make them all the same?
> >
> > New ones should be _GPL() please.
> >
> > But please put the export in the proper place, checkpatch.pl will
> > complain if you do not. Which means you didn't run it on your patches
> > :)
>
> Andrew already yelled at me privately about checkpatch.
>
> I went over and ran it on all my patches, and that was indeed one
> of the complaints.
>
> But when it comes to file consistency vs "letter of the law" I
> like to keep things consistent (even if broken) in the file.
>
> I could go back and move all those EXPORT_SYMBOL declarations to
> the "proper" place. Is that too much noise though?
Too much noise for this patch, yes.
Put things in the right place for new patches/changes.
If you want to go back and just make a "fix this file to be checkpatch
clean" patch, that's fine as well, but don't merge it with other changes
at the same time, as that makes it pretty unreviewable.
Hope this helps,
greg k-h
On Thu, 19 Mar 2009, Andrew Morton wrote:
> On Wed, 18 Mar 2009 16:40:16 -0600 Alex Chiang <[email protected]> wrote:
>
> > A complete re-implementation of fakephp is necessary if it is to
> > present its former interface (pre-2.6.27, when it broke).
>
> We broke an existing interface? wtf?
>
> If it's been broken for this long then do we actually need to
> resurrect it?
I found out it had been broken several months ago and made some of these
patches to fix it, but then lost my motivation for getting them into the
kernel.
There are people who use the old fakephp interface. I was one of them.
They probably don't update their kernels in production systems and embedded
devices very often and haven't discovered the breakage in 2.6.27.
> If so, do we need to resurrect it in its old form? This would appear
> to be an opportunity to improve that interface, unless the old one
> was perfect?
That's what I did in my initial patches. I created a new interface with
the 'remove' file in the pci device's sysfs directory and the 'rescan' file
in /sys/bus/pci. This is the new better interface. legacy_fakephp is a
module that provides a compatible interface to the pre-2.6.27 one, though
it fixes some bugs with the old one and is overall much less code.
On Thu, 19 Mar 2009, Alex Chiang wrote:
> * Kenji Kaneshige <[email protected]>:
> >> + pci_remove_bus_device(pdev);
> >> + if (bridge && list_empty(&pdev->bus->devices))
> >> + pci_remove_bus(pdev->bus);
> >
> > I cannot understand the above two lines. Could you explain
> > what it intend?
>
> If the user says:
>
> echo 1 > /sys/bus/pci/devices/.../remove
>
> And that device is a bridge, then we need to specifically call
> pci_remove_bus as well, to actually remove the bus itself.
> Without it, pci_bus_remove_device() will remove all of its
> children (and subordinate buses) in a depth-first manner, but we
> will never actually remove the bus that the user specified.
Did this end up being the source of the pci resource assignment warnings
that were produced when re-adding a bridge that was removed?
* Trent Piepho <[email protected]>:
> On Thu, 19 Mar 2009, Alex Chiang wrote:
> > * Kenji Kaneshige <[email protected]>:
> > >> + pci_remove_bus_device(pdev);
> > >> + if (bridge && list_empty(&pdev->bus->devices))
> > >> + pci_remove_bus(pdev->bus);
> > >
> > > I cannot understand the above two lines. Could you explain
> > > what it intend?
> >
> > If the user says:
> >
> > echo 1 > /sys/bus/pci/devices/.../remove
> >
> > And that device is a bridge, then we need to specifically call
> > pci_remove_bus as well, to actually remove the bus itself.
> > Without it, pci_bus_remove_device() will remove all of its
> > children (and subordinate buses) in a depth-first manner, but we
> > will never actually remove the bus that the user specified.
>
> Did this end up being the source of the pci resource assignment warnings
> that were produced when re-adding a bridge that was removed?
I think that was one reason. I found that it was necessary during
multiple remove/rescan cycles; after the 2nd rescan, we had a
stale pci_bus hanging around that was manifesting itself as a
sysfs issue.
The other thing that Kenji found was in pci_setup_bridge(), where
we don't want to initialize the BARs if the bridge has already
been added.
Thanks.
/ac
On Thu, 19 Mar 2009, Alex Chiang wrote:
> * Andrew Morton <[email protected]>:
> > > +
> > > +struct bus_attribute pci_bus_attrs[] = {
> > > + __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
> > > + __ATTR_NULL
> > > +};
> > > +#endif
> >
> > Why CONFIG_HOTPLUG rather than CONFIG_HOTPLUG_PCI (or similar)?
>
> I first started out with CONFIG_HOTPLUG_PCI but then all that
> stuff got compiled out, and I couldn't figure out why at the
> time.
HOTPLUG_PCI is a subset of HOTPLUG. It gives you support for actual
hotplug hardware. The rescan/remove functions don't do anything to hotplug
hardware and don't use any code from the pci hotplug core (which is
designed to support hotplug hardware). Rescan and remove just need the
ability to unbind a driver from device, delete a device and create a new
device. Which HOTPLUG gives us.
So, we use HOTPLUG because that's all we need. The idea is to make the
sysfs interface for PCI have the most features it can support.
If we wanted to make the interface not have remove/rescan unless needed,
then using HOTPLUG_PCI wouldn't make much sense. It would be better to
have a new option "PCI remove/rescan sysfs interface" or something that
turned it on. But it isn't much code, so I don't think it's necessary to
do that.
Alex Chiang wrote:
> * Kenji Kaneshige <[email protected]>:
>> Alex Chiang wrote:
>>> This patch adds an attribute named "remove" to a PCI device's sysfs
>>> directory. Writing a non-zero value to this attribute will remove the PCI
>>> device and any children of it.
>>>
>>> Trent Piepho wrote the original implementation and documentation.
>>>
>>> Thanks to Vegard Nossum for testing under kmemcheck and finding locking
>>> issues with the sysfs interface.
>>>
>>> Cc: Trent Piepho <[email protected]>
>>> Signed-off-by: Alex Chiang <[email protected]>
>>> ---
>>>
>>> Documentation/ABI/testing/sysfs-bus-pci | 8 ++++++
>>> Documentation/filesystems/sysfs-pci.txt | 10 +++++++
>>> drivers/pci/pci-sysfs.c | 44 +++++++++++++++++++++++++++++++
>>> 3 files changed, 62 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
>>> index ea4aee2..5b1ddde 100644
>>> --- a/Documentation/ABI/testing/sysfs-bus-pci
>>> +++ b/Documentation/ABI/testing/sysfs-bus-pci
>>> @@ -50,6 +50,14 @@ Description:
>>> re-discover previously removed devices.
>>> Depends on CONFIG_HOTPLUG.
>>> +What: /sys/bus/pci/devices/.../remove
>>> +Date: January 2009
>>> +Contact: Linux PCI developers <[email protected]>
>>> +Description:
>>> + Writing a non-zero value to this attribute will
>>> + hot-remove the PCI device and any of its children.
>>> + Depends on CONFIG_HOTPLUG.
>>> +
>>> What: /sys/bus/pci/devices/.../vpd
>>> Date: February 2008
>>> Contact: Ben Hutchings <[email protected]>
>>> diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
>>> index 9f8740c..26e4b8b 100644
>>> --- a/Documentation/filesystems/sysfs-pci.txt
>>> +++ b/Documentation/filesystems/sysfs-pci.txt
>>> @@ -12,6 +12,7 @@ that support it. For example, a given bus might look like this:
>>> | |-- enable
>>> | |-- irq
>>> | |-- local_cpus
>>> + | |-- remove
>>> | |-- resource
>>> | |-- resource0
>>> | |-- resource1
>>> @@ -36,6 +37,7 @@ files, each with their own function.
>>> enable Whether the device is enabled (ascii, rw)
>>> irq IRQ number (ascii, ro)
>>> local_cpus nearby CPU mask (cpumask, ro)
>>> + remove remove device from kernel's list (ascii, wo)
>>> resource PCI resource host addresses (ascii, ro)
>>> resource0..N PCI resource N, if present (binary, mmap)
>>> resource0_wc..N_wc PCI WC map resource N, if prefetchable (binary, mmap)
>>> @@ -46,6 +48,7 @@ files, each with their own function.
>>> ro - read only file
>>> rw - file is readable and writable
>>> + wo - write only file
>>> mmap - file is mmapable
>>> ascii - file contains ascii text
>>> binary - file contains binary data
>>> @@ -73,6 +76,13 @@ that the device must be enabled for a rom read to return data succesfully.
>>> In the event a driver is not bound to the device, it can be enabled using the
>>> 'enable' file, documented above.
>>> +The 'remove' file is used to remove the PCI device, by writing a
>>> non-zero
>>> +integer to the file. This does not involve any kind of hot-plug functionality,
>>> +e.g. powering off the device. The device is removed from the kernel's list of
>>> +PCI devices, the sysfs directory for it is removed, and the device will be
>>> +removed from any drivers attached to it. Removal of PCI root buses is
>>> +disallowed.
>>> +
>>> Accessing legacy resources through sysfs
>>> ----------------------------------------
>>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>>> index 22dbc65..6e2b1fd 100644
>>> --- a/drivers/pci/pci-sysfs.c
>>> +++ b/drivers/pci/pci-sysfs.c
>>> @@ -246,6 +246,47 @@ struct bus_attribute pci_bus_attrs[] = {
>>> __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store),
>>> __ATTR_NULL
>>> };
>>> +
>>> +static void remove_callback(struct device *dev)
>>> +{
>>> + int bridge = 0;
>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>> +
>>> + mutex_lock(&pci_remove_rescan_mutex);
>>> +
>>> + if (pdev->subordinate)
>>> + bridge = 1;
>>> +
>>> + pci_remove_bus_device(pdev);
>>> + if (bridge && list_empty(&pdev->bus->devices))
>>> + pci_remove_bus(pdev->bus);
>> I cannot understand the above two lines. Could you explain
>> what it intend?
>
> If the user says:
>
> echo 1 > /sys/bus/pci/devices/.../remove
>
> And that device is a bridge, then we need to specifically call
> pci_remove_bus as well, to actually remove the bus itself.
> Without it, pci_bus_remove_device() will remove all of its
> children (and subordinate buses) in a depth-first manner, but we
> will never actually remove the bus that the user specified.
>
Do you mean user removes bridge device to remove its *primary*
bus? It is very strange. I think the bus should be removed
when its parent bridge is removed.
> In other words, without it, we will still see the bus in:
>
> /sys/class/pci_bus/...
>
What is the problem?
> We only want to remove the bus if it has no children left. I
> think the check for list_empty(&pdev->bus->devices) might be
> overkill... I can try taking that bit out and testing again.
>
I think we don't need the two lines. But if you do that, you
need list_empty(&pdev->bus->devices), doesn't it? On the other
hand, we must not check 'bridge' in the if statement. Or bus
will never be removed when non-bridge device is removed last
on the bus.
Again, I think we don't need the two lines. But am I
misunderstanding something?
Thanks,
Kenji Kaneshige
> Thanks for the review.
>
> /ac
>
>> Thanks,
>> Kenji Kaneshige
>>
>>
>>> +
>>> + mutex_unlock(&pci_remove_rescan_mutex);
>>> +}
>>> +
>>> +static ssize_t
>>> +remove_store(struct device *dev, struct device_attribute *dummy,
>>> + const char *buf, size_t count)
>>> +{
>>> + int ret = 0;
>>> + unsigned long val;
>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>> +
>>> + if (strict_strtoul(buf, 0, &val) < 0)
>>> + return -EINVAL;
>>> +
>>> + if (!capable(CAP_SYS_ADMIN))
>>> + return -EPERM;
>>> +
>>> + if (pdev->subordinate && pci_is_root_bus(pdev->bus))
>>> + return -EBUSY;
>>> +
>>> + if (val)
>>> + ret = device_schedule_callback(dev, remove_callback);
>>> + if (ret)
>>> + count = ret;
>>> + return count;
>>> +}
>>> #endif
>>> struct device_attribute pci_dev_attrs[] = {
>>> @@ -266,6 +307,9 @@ struct device_attribute pci_dev_attrs[] = {
>>> __ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
>>> broken_parity_status_show,broken_parity_status_store),
>>> __ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
>>> +#ifdef CONFIG_HOTPLUG
>>> + __ATTR(remove, S_IWUSR, NULL, remove_store),
>>> +#endif
>>> __ATTR_NULL,
>>> };
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
* Trent Piepho <[email protected]>:
> On Thu, 19 Mar 2009, Alex Chiang wrote:
> > * Andrew Morton <[email protected]>:
> > > > + +struct bus_attribute pci_bus_attrs[] = { +
> > > > __ATTR(rescan, S_IWUSR, NULL, bus_rescan_store), +
> > > > __ATTR_NULL +}; +#endif
> > >
> > > Why CONFIG_HOTPLUG rather than CONFIG_HOTPLUG_PCI (or
> > > similar)?
> >
> > I first started out with CONFIG_HOTPLUG_PCI but then all that
> > stuff got compiled out, and I couldn't figure out why at the
> > time.
>
> HOTPLUG_PCI is a subset of HOTPLUG. It gives you support for
> actual hotplug hardware. The rescan/remove functions don't do
> anything to hotplug hardware and don't use any code from the
> pci hotplug core (which is designed to support hotplug
> hardware). Rescan and remove just need the ability to unbind a
> driver from device, delete a device and create a new device.
> Which HOTPLUG gives us.
>
> So, we use HOTPLUG because that's all we need. The idea is to
> make the sysfs interface for PCI have the most features it can
> support.
>
> If we wanted to make the interface not have remove/rescan
> unless needed, then using HOTPLUG_PCI wouldn't make much sense.
> It would be better to have a new option "PCI remove/rescan
> sysfs interface" or something that turned it on. But it isn't
> much code, so I don't think it's necessary to do that.
Well, I don't know what I did wrong last time, but I no longer
have technical difficulty putting it under CONFIG_HOTPLUG_PCI.
Looking at Documentation/PCI/pci.txt, and the various archs, it
seems they all assume CONFIG_HOTPLUG means PCI.
This could be cleaned up, but it's certainly non-trivial, and a
bit more than I'd want to bite off for this particular patchset.
Plus, Trent is right. Looks like we use HOTPLUG_PCI to really
mean physically supported hotplug, not the logical stuff that
this patchset is adding.
Thanks.
/ac
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
>> * Kenji Kaneshige <[email protected]>:
>>> Alex Chiang wrote:
>>>> +
>>>> +static void remove_callback(struct device *dev)
>>>> +{
>>>> + int bridge = 0;
>>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>>> +
>>>> + mutex_lock(&pci_remove_rescan_mutex);
>>>> +
>>>> + if (pdev->subordinate)
>>>> + bridge = 1;
>>>> +
>>>> + pci_remove_bus_device(pdev);
>>>> + if (bridge && list_empty(&pdev->bus->devices))
>>>> + pci_remove_bus(pdev->bus);
>>> I cannot understand the above two lines. Could you explain
>>> what it intend?
>>
>> If the user says:
>>
>> echo 1 > /sys/bus/pci/devices/.../remove
>>
>> And that device is a bridge, then we need to specifically call
>> pci_remove_bus as well, to actually remove the bus itself.
>> Without it, pci_bus_remove_device() will remove all of its
>> children (and subordinate buses) in a depth-first manner, but we
>> will never actually remove the bus that the user specified.
>>
>
> Do you mean user removes bridge device to remove its *primary*
> bus? It is very strange. I think the bus should be removed
> when its parent bridge is removed.
You are correct.
>> In other words, without it, we will still see the bus in:
>>
>> /sys/class/pci_bus/...
>>
>
> What is the problem?
>
>> We only want to remove the bus if it has no children left. I
>> think the check for list_empty(&pdev->bus->devices) might be
>> overkill... I can try taking that bit out and testing again.
>>
>
> I think we don't need the two lines. But if you do that, you
> need list_empty(&pdev->bus->devices), doesn't it? On the other
> hand, we must not check 'bridge' in the if statement. Or bus
> will never be removed when non-bridge device is removed last
> on the bus.
>
> Again, I think we don't need the two lines. But am I
> misunderstanding something?
No, you are correct.
I think what was happening was that I inserted that code before I
discovered the double-free in the PCIe port driver, and that
extra call to pci_remove_bus() helped mask the double-free.
I re-tested again tonight with the port driver fix, and also
removing the two lines you mention, and it is behaving correctly.
As always, thanks for your review.
/ac
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
>> We're getting close to the new merge window, and I _think_ this
>> patch series is ready for consideration. Notably:
>>
>> - multiple rescans/removes of devices with and without bridges
>> - verified that resource allocation after multiple remove/rescan
>> cycles is the same as what we had during initial boot
>> - fixes the complete suckage of fakephp (that I created)
>> - doesn't affect existing hotplug drivers
>> - tested on x86 and ia64 platforms
>>
>> Please review, and consider testing. For testing ease, you can pull
>> from my git branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
>> branch 'test-20090318' is what you want
>>
>
> I got the following oops when I did
>
> # echo 1 > remove
>
> on the bridge device.
>
> [ 6639.239164] =============================================
> [ 6639.239191] [ INFO: possible recursive locking detected ]
> [ 6639.239212] 2.6.29-rc8-kk #1
> [ 6639.239227] ---------------------------------------------
> [ 6639.239243] events/8/60 is trying to acquire lock:
> [ 6639.239252] (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
> [ 6639.239252]
> [ 6639.239252] but task is already holding lock:
> [ 6639.239252] (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> [ 6639.239252]
> [ 6639.239252] other info that might help us debug this:
> [ 6639.239252] 3 locks held by events/8/60:
> [ 6639.239252] #0: (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> [ 6639.239252] #1: (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> [ 6639.239252] #2: (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c05b9>] remove_callback+0x29/0x80
> [ 6639.239252]
> [ 6639.239252] stack backtrace:
> [ 6639.239252] Pid: 60, comm: events/8 Not tainted 2.6.29-rc8-kk #1
> [ 6639.239252] Call Trace:
> [ 6639.239252] [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
> [ 6639.239252] [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
> [ 6639.239252] [<ffffffff8026f148>] lock_acquire+0x58/0x80
> [ 6639.239252] [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> [ 6639.239252] [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
> [ 6639.239252] [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
> [ 6639.239252] [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
> [ 6639.239252] [<ffffffffa0158065>] e1000_remove+0x55/0xfe [e1000e]
> [ 6639.239252] [<ffffffff8033ee60>] ? sysfs_schedule_callback_work+0x0/0x50
> [ 6639.239252] [<ffffffff803bf7a2>] pci_device_remove+0x32/0x70
> [ 6639.239252] [<ffffffff80441409>] __device_release_driver+0x59/0x90
> [ 6639.241006] [<ffffffff8044153b>] device_release_driver+0x2b/0x40
> [ 6639.241006] [<ffffffff80441036>] bus_remove_device+0xa6/0x120
> [ 6639.241006] [<ffffffff8043dacb>] device_del+0x12b/0x190
> [ 6639.241006] [<ffffffff8043db56>] device_unregister+0x26/0x70
> [ 6639.241006] [<ffffffff803ba529>] pci_stop_dev+0x49/0x60
> [ 6639.241006] [<ffffffff803ba670>] pci_remove_bus_device+0x40/0xc0
> [ 6639.241006] [<ffffffff803ba71d>] pci_remove_behind_bridge+0x2d/0x50
> [ 6639.241006] [<ffffffff803ba64e>] pci_remove_bus_device+0x1e/0xc0
> [ 6639.241006] [<ffffffff803c05c8>] remove_callback+0x38/0x80
> [ 6639.241006] [<ffffffff8033ee7f>] sysfs_schedule_callback_work+0x1f/0x50
> [ 6639.241006] [<ffffffff8025769a>] run_workqueue+0x15a/0x230
> [ 6639.241006] [<ffffffff80257648>] ? run_workqueue+0x108/0x230
> [ 6639.241006] [<ffffffff8025846f>] worker_thread+0x9f/0x100
> [ 6639.241006] [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
> [ 6639.241006] [<ffffffff802583d0>] ? worker_thread+0x0/0x100
> [ 6639.241006] [<ffffffff8025b89d>] kthread+0x4d/0x80
> [ 6639.241006] [<ffffffff8020d4ba>] child_rip+0xa/0x20
> [ 6639.241006] [<ffffffff8020cebc>] ? restore_args+0x0/0x30
> [ 6639.241006] [<ffffffff8025b850>] ? kthread+0x0/0x80
> [ 6639.241006] [<ffffffff8020d4b0>] ? child_rip+0x0/0x20
> [ 6639.283330] e1000e 0000:40:00.0: PCI INT A disabled
> [ 6639.324332] e1000e 0000:40:00.1: PCI INT B disabled
> [ 6639.325031] aer 0000:2f:04.0:pcie22: unloading service driver aer
pci_device_remove() calls the driver's ->remove method directly;
it shouldn't be messing around with that sysfs callback.
Does this only happen with e1000e devices? Or does it happen on
any bridge device?
I am stumped; I've never seen this locking issue before...
Will continue to look, thanks.
/ac
* Kenji Kaneshige <[email protected]>:
> Alex Chiang wrote:
>> We're getting close to the new merge window, and I _think_ this
>> patch series is ready for consideration. Notably:
>>
>> - multiple rescans/removes of devices with and without bridges
>> - verified that resource allocation after multiple remove/rescan
>> cycles is the same as what we had during initial boot
>> - fixes the complete suckage of fakephp (that I created)
>> - doesn't affect existing hotplug drivers
>> - tested on x86 and ia64 platforms
>>
>> Please review, and consider testing. For testing ease, you can pull
>> from my git branch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git
>> branch 'test-20090318' is what you want
>>
>
> When I rescan the bus even without remove anything, I got the
> following messages. And enable count of the bridge is incremented
> every time.
>
> [17240.309096] pci 0000:2d:00.0: BAR 6: bogus alignment [0x0-0x0] flags 0x2
I'm not sure where this is coming from.
> [17240.309135] pcieport-driver 0000:00:01.0: setting latency timer to 64
> [17240.309143] pcieport-driver 0000:00:02.0: setting latency timer to 64
> [17240.309150] pcieport-driver 0000:14:00.0: setting latency timer to 64
> [17240.309159] pcieport-driver 0000:15:00.0: setting latency timer to 64
> [17240.309166] pcieport-driver 0000:15:01.0: setting latency timer to 64
> [17240.309174] pcieport-driver 0000:15:02.0: setting latency timer to 64
> [17240.309182] pci 0000:14:00.3: setting latency timer to 64
> [17240.309189] pcieport-driver 0000:00:03.0: setting latency timer to 64
> [17240.309198] pcieport-driver 0000:2a:00.0: setting latency timer to 64
> [17240.309207] pcieport-driver 0000:2b:02.0: setting latency timer to 64
> [17240.309215] pcieport-driver 0000:2b:04.0: setting latency timer to 64
> [17240.309223] pcieport-driver 0000:00:04.0: setting latency timer to 64
> [17240.309231] pcieport-driver 0000:2e:00.0: setting latency timer to 64
> [17240.309240] pcieport-driver 0000:2f:02.0: setting latency timer to 64
> [17240.309248] pcieport-driver 0000:2f:04.0: setting latency timer to 64
> [17240.309256] pcieport-driver 0000:00:06.0: setting latency timer to 64
> [17240.309264] pcieport-driver 0000:51:00.0: setting latency timer to 64
> [17240.309273] pcieport-driver 0000:52:02.0: setting latency timer to 64
> [17240.309281] pcieport-driver 0000:52:04.0: setting latency timer to 64
> [17240.309291] pci 0000:64:00.0: setting latency timer to 64
> [17240.309300] pcieport-driver 0000:00:1c.0: setting latency timer to 64
> [17240.309308] pci 0000:00:1e.0: setting latency timer to 64
>
> [root@localhost pci]# echo 1 > rescan
> [root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
> 6
> [root@localhost pci]# echo 1 > rescan
> [root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
> 7
> [root@localhost pci]# echo 1 > rescan
> [root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
> 8
> [root@localhost pci]# echo 1 > rescan
> [root@localhost pci]# cat /sys/bus/pci/devices/0000\:2f\:04.0/enable
> 9
But the above should be fixed with a new patch that I added
today.
Would you mind testing again?
git://git.kernel.org/pub/scm/linux/kernel/git/achiang/pci-hotplug.git test-20090319
I still haven't figured out why you're seeing that recursive lock
warning.
Thanks.
/ac