this patchset is trying to update pci bridge BAR when that BAR is big enough.
1. boot time:
BIOS separate IO range between several IOHs, and on some slots, BIOS assign
the resource to the bridge, but stop assigning resource to the device under
that bridge, because the device need big resource.
so patches (first 8) are trying to
a. pci assign unassign and record the failed device resource.
b. clear the BIOS assigned resource of the parent bridge of fail device
c. go back and call pci assign unsigned
d. if it still fail, will go up more bridges. and clear and try again.
v2: Jesse doesn't like it is in find_free_bus_resource...
try to move out of pci_bus_size_bridges loop.
v3: add pci_setup_bridge calling after pci_bridge_release_not_used_res.
only clear release those res for x86.
v4: Bjorn want to release use dev instead of bus.
v5: Kenji pointed out it will have problem with several level bridge.
so let only handle leaf bridge.
v6: address Kenji's request (new pci_bus_release...). and change applying order
move back release to pci_assign_unassigned_resource
v7: change functions name pci_bus_release_unused_bridge_res according to Jesse
v8: address Eric's concern, only overwrite leaf bridge resource that is not big
enough need to do it in two steps, and first step recore the failed res,
and don't touch bridge res that programmed by firmware. second step will
try to release bridge resource that is too small at first.
v9: refresh to be applied after bjorn's patch, and remove trick about save
size and restore resource second try.
v11:add pci=try=5, about more try to change more bridge
v12:not shrink pci bridge resource
2. hotplug:
BIOS separate IO range between several IOHs, and on some slots, BIOS assign
the resource to every bridge. (8M) but when insert one card that big resource,
the card can not get resource. because kernel will not touch the bridge
resource.
so patches (9-11) are trying to
a. assign resource to devices with that slot. and record fail devices
b. if there is some failed, will clear sepcifically io port of bridge, or mmio of bridge, or mmio pref of bridge.
c. try to assign the parent bridge of the slot.
v2: address Alex's concern about pci remove/rescan feature about
pci_setup_bridge changes.
v3: Kenji pointed out that pci_config_slot need to be called before
pci_bus_add_devices()
v4: move out pci_is_enabled checkout of pci_setup_bridge()
v5: change the applying sequence.
v6: change the functions name according to Jesse
v8: address Eric's concern, only overwrite leaf bridge resource that is not
big enough
v9: refresh to be applied after bjorn's patch, and remove trick about save
size and restore resource second try.
v10:alex found need to have export for pci_assign_unassigned_bridge_resources
v11: pass check_leaf with pci_bus_release_unused_bridge_res
patches (12-13)
will try to shrink, the pci hotplug related pcie port, so could save some for
others.
-v13: change resource_list to resource_list_x, to save size and flags aside,
otherwise grandchild res will get confused with son's res as could be used
-v14: default it is disabled. could use pci=try=2 to enable it.
-v15: seperate release_child_resources aside.
Thanks
Yinghai
prepare to use those small functions according to resource type later
-v2: remove pref_mem64 typo, it should be removed
Signed-off-by: Yinghai Lu <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
---
drivers/pci/setup-bus.c | 66 +++++++++++++++++++++++++++++++++++-----------
1 files changed, 50 insertions(+), 16 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index c48cd37..1bd41ac 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -134,18 +134,12 @@ EXPORT_SYMBOL(pci_setup_cardbus);
config space writes, so it's quite possible that an I/O window of
the bridge will have some undesirable address (e.g. 0) after the
first write. Ditto 64-bit prefetchable MMIO. */
-static void pci_setup_bridge(struct pci_bus *bus)
+static void pci_setup_bridge_io(struct pci_bus *bus)
{
struct pci_dev *bridge = bus->self;
struct resource *res;
struct pci_bus_region region;
- u32 l, bu, lu, io_upper16;
-
- if (pci_is_enabled(bridge))
- return;
-
- dev_info(&bridge->dev, "PCI bridge to [bus %02x-%02x]\n",
- bus->secondary, bus->subordinate);
+ u32 l, io_upper16;
/* Set up the top and bottom of the PCI I/O segment for this bus. */
res = bus->resource[0];
@@ -158,8 +152,7 @@ static void pci_setup_bridge(struct pci_bus *bus)
/* Set up upper 16 bits of I/O base/limit. */
io_upper16 = (region.end & 0xffff0000) | (region.start >> 16);
dev_info(&bridge->dev, " bridge window %pR\n", res);
- }
- else {
+ } else {
/* Clear upper 16 bits of I/O base/limit. */
io_upper16 = 0;
l = 0x00f0;
@@ -171,21 +164,35 @@ static void pci_setup_bridge(struct pci_bus *bus)
pci_write_config_dword(bridge, PCI_IO_BASE, l);
/* Update upper 16 bits of I/O base/limit. */
pci_write_config_dword(bridge, PCI_IO_BASE_UPPER16, io_upper16);
+}
+
+static void pci_setup_bridge_mmio(struct pci_bus *bus)
+{
+ struct pci_dev *bridge = bus->self;
+ struct resource *res;
+ struct pci_bus_region region;
+ u32 l;
- /* Set up the top and bottom of the PCI Memory segment
- for this bus. */
+ /* Set up the top and bottom of the PCI Memory segment for this bus. */
res = bus->resource[1];
pcibios_resource_to_bus(bridge, ®ion, res);
if (res->flags & IORESOURCE_MEM) {
l = (region.start >> 16) & 0xfff0;
l |= region.end & 0xfff00000;
dev_info(&bridge->dev, " bridge window %pR\n", res);
- }
- else {
+ } else {
l = 0x0000fff0;
dev_info(&bridge->dev, " bridge window [mem disabled]\n");
}
pci_write_config_dword(bridge, PCI_MEMORY_BASE, l);
+}
+
+static void pci_setup_bridge_mmio_pref(struct pci_bus *bus)
+{
+ struct pci_dev *bridge = bus->self;
+ struct resource *res;
+ struct pci_bus_region region;
+ u32 l, bu, lu;
/* Clear out the upper 32 bits of PREF limit.
If PCI_PREF_BASE_UPPER32 was non-zero, this temporarily
@@ -204,8 +211,7 @@ static void pci_setup_bridge(struct pci_bus *bus)
lu = upper_32_bits(region.end);
}
dev_info(&bridge->dev, " bridge window %pR\n", res);
- }
- else {
+ } else {
l = 0x0000fff0;
dev_info(&bridge->dev, " bridge window [mem pref disabled]\n");
}
@@ -214,10 +220,38 @@ static void pci_setup_bridge(struct pci_bus *bus)
/* Set the upper 32 bits of PREF base & limit. */
pci_write_config_dword(bridge, PCI_PREF_BASE_UPPER32, bu);
pci_write_config_dword(bridge, PCI_PREF_LIMIT_UPPER32, lu);
+}
+
+static void __pci_setup_bridge(struct pci_bus *bus, unsigned long type)
+{
+ struct pci_dev *bridge = bus->self;
+
+ if (pci_is_enabled(bridge))
+ return;
+
+ dev_info(&bridge->dev, "PCI bridge to [bus %02x-%02x]\n",
+ bus->secondary, bus->subordinate);
+
+ if (type & IORESOURCE_IO)
+ pci_setup_bridge_io(bus);
+
+ if (type & IORESOURCE_MEM)
+ pci_setup_bridge_mmio(bus);
+
+ if (type & IORESOURCE_PREFETCH)
+ pci_setup_bridge_mmio_pref(bus);
pci_write_config_word(bridge, PCI_BRIDGE_CONTROL, bus->bridge_ctl);
}
+static void pci_setup_bridge(struct pci_bus *bus)
+{
+ unsigned long type = IORESOURCE_IO | IORESOURCE_MEM |
+ IORESOURCE_PREFETCH;
+
+ __pci_setup_bridge(bus, type);
+}
+
/* Check whether the bridge supports optional I/O and
prefetchable memory ranges. If not, the respective
base/limit registers must be read-only and read as 0. */
--
1.6.0.2
-v2: change name to release_child_resources according to Jesse
-v3: according to Linus, move release_child_resources to resource.c
also need to put the lock around them all to avoid recursive deep.
(my test case only have one level that need to be released)
-v4: seperate to another patch
Signed-off-by: Yinghai Lu <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
---
include/linux/ioport.h | 1 +
kernel/resource.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 31 insertions(+), 0 deletions(-)
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 83aa812..23a5fd1 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -112,6 +112,7 @@ extern struct resource iomem_resource;
extern int request_resource(struct resource *root, struct resource *new);
extern int release_resource(struct resource *new);
+void release_child_resources(struct resource *new);
extern void reserve_region_with_split(struct resource *root,
resource_size_t start, resource_size_t end,
const char *name);
diff --git a/kernel/resource.c b/kernel/resource.c
index dc15686..5051617 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -188,6 +188,36 @@ static int __release_resource(struct resource *old)
return -EINVAL;
}
+static void __release_child_resources(struct resource *r)
+{
+ struct resource *tmp, *p;
+ resource_size_t size;
+
+ p = r->child;
+ r->child = NULL;
+ while (p) {
+ tmp = p;
+ p = p->sibling;
+
+ tmp->parent = NULL;
+ tmp->sibling = NULL;
+ __release_child_resources(tmp);
+
+ printk(KERN_DEBUG "release child resource %pR\n", tmp);
+ /* need to restore size, and keep flags */
+ size = resource_size(tmp);
+ tmp->start = 0;
+ tmp->end = size - 1;
+ }
+}
+
+void release_child_resources(struct resource *r)
+{
+ write_lock(&resource_lock);
+ __release_child_resources(r);
+ write_unlock(&resource_lock);
+}
+
/**
* request_resource - request and reserve an I/O or memory resource
* @root: root resource descriptor
--
1.6.0.2
so later we could use it to release small resource before pci assign unassign res
-v2: change name to release_child_resources according to Jesse
-v3: according to Linus, move release_child_resources to resource.c
also need to put the lock around them all to avoid recursive deep.
(my test case only have one level that need to be released)
-v4: seperate release_child_resources to another patch
according to Bjorn, change hard code 3 to PCI_BRIDGE_RESOURCE_END
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 93 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 1bd41ac..7d6d393 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -604,6 +604,99 @@ void __ref pci_bus_assign_resources(const struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_assign_resources);
+static void pci_bridge_release_unused_res(struct pci_bus *bus,
+ unsigned long type)
+{
+ int idx;
+ bool changed = false;
+ struct pci_dev *dev;
+ struct resource *r;
+ unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+ IORESOURCE_PREFETCH;
+
+ dev = bus->self;
+ for (idx = PCI_BRIDGE_RESOURCES; idx <= PCI_BRIDGE_RESOURCE_END;
+ idx++) {
+ r = &dev->resource[idx];
+ if ((r->flags & type_mask) != type)
+ continue;
+ if (!r->parent)
+ continue;
+ /*
+ * if there are children under that, we should release them
+ * all
+ */
+ release_child_resources(r);
+ if (!release_resource(r)) {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ "resource %d %pR released\n", idx, r);
+ /* keep the old size */
+ r->end = resource_size(r) - 1;
+ r->start = 0;
+ r->flags = 0;
+ changed = true;
+ }
+ }
+
+ if (changed) {
+ if (type & IORESOURCE_PREFETCH) {
+ /* avoiding touch the one without PREF */
+ type = IORESOURCE_PREFETCH;
+ }
+ __pci_setup_bridge(bus, type);
+ }
+}
+
+/*
+ * try to release pci bridge resources that is from leaf bridge,
+ * so we can allocate big new one later
+ * check:
+ * 0: only release the bridge and only the bridge is leaf
+ * 1: release all down side bridge for third shoot
+ */
+static void __ref pci_bus_release_unused_bridge_res(struct pci_bus *bus,
+ unsigned long type,
+ int check_leaf)
+{
+ struct pci_dev *dev;
+ bool is_leaf_bridge = true;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ struct pci_bus *b = dev->subordinate;
+ if (!b)
+ continue;
+
+ switch (dev->class >> 8) {
+ case PCI_CLASS_BRIDGE_CARDBUS:
+ is_leaf_bridge = false;
+ break;
+
+ case PCI_CLASS_BRIDGE_PCI:
+ default:
+ is_leaf_bridge = false;
+ if (!check_leaf)
+ pci_bus_release_unused_bridge_res(b, type,
+ check_leaf);
+ break;
+ }
+ }
+
+ /* The root bus? */
+ if (!bus->self)
+ return;
+
+ switch (bus->self->class >> 8) {
+ case PCI_CLASS_BRIDGE_CARDBUS:
+ break;
+
+ case PCI_CLASS_BRIDGE_PCI:
+ default:
+ if ((check_leaf && is_leaf_bridge) || !check_leaf)
+ pci_bridge_release_unused_res(bus, type);
+ break;
+ }
+}
+
static void pci_bus_dump_res(struct pci_bus *bus)
{
int i;
--
1.6.0.2
mean it is not used, so skip it.
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 7d6d393..e538fa1 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -703,7 +703,8 @@ static void pci_bus_dump_res(struct pci_bus *bus)
for (i = 0; i < PCI_BUS_NUM_RESOURCES; i++) {
struct resource *res = bus->resource[i];
- if (!res || !res->end)
+
+ if (!res || !res->end || !res->flags)
continue;
dev_printk(KERN_DEBUG, &bus->dev, "resource %d %pR\n", i, res);
--
1.6.0.2
so later we can do sth with those failed one
-v2: store start, end, flags aside. so could keep res cleared when assign
failed. and make following assignment of its children do not go wild
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 66 ++++++++++++++++++++++++++++++++++++++++++++---
1 files changed, 62 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index e538fa1..9bb4435 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -27,7 +27,52 @@
#include <linux/slab.h>
#include "pci.h"
-static void pbus_assign_resources_sorted(const struct pci_bus *bus)
+struct resource_list_x {
+ struct resource_list_x *next;
+ struct resource *res;
+ struct pci_dev *dev;
+ resource_size_t start;
+ resource_size_t end;
+ unsigned long flags;
+};
+
+static void add_to_failed_list(struct resource_list_x *head,
+ struct pci_dev *dev, struct resource *res)
+{
+ struct resource_list_x *list = head;
+ struct resource_list_x *ln = list->next;
+ struct resource_list_x *tmp;
+
+ tmp = kmalloc(sizeof(*tmp), GFP_KERNEL);
+ if (!tmp) {
+ pr_warning("add_to_failed_list: kmalloc() failed!\n");
+ return;
+ }
+
+ tmp->next = ln;
+ tmp->res = res;
+ tmp->dev = dev;
+ tmp->start = res->start;
+ tmp->end = res->end;
+ tmp->flags = res->flags;
+ list->next = tmp;
+}
+
+static void free_failed_list(struct resource_list_x *head)
+{
+ struct resource_list_x *list, *tmp;
+
+ for (list = head->next; list;) {
+ tmp = list;
+ list = list->next;
+ kfree(tmp);
+ }
+
+ head->next = NULL;
+}
+
+static void pbus_assign_resources_sorted(const struct pci_bus *bus,
+ struct resource_list_x *fail_head)
{
struct pci_dev *dev;
struct resource *res;
@@ -58,6 +103,13 @@ static void pbus_assign_resources_sorted(const struct pci_bus *bus)
res = list->res;
idx = res - &list->dev->resource[0];
if (pci_assign_resource(list->dev, idx)) {
+ if (fail_head && !pci_is_root_bus(list->dev->bus)) {
+ /*
+ * device need to keep flags and size
+ * for next try
+ */
+ add_to_failed_list(fail_head, list->dev, res);
+ }
res->start = 0;
res->end = 0;
res->flags = 0;
@@ -572,19 +624,20 @@ void __ref pci_bus_size_bridges(struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_size_bridges);
-void __ref pci_bus_assign_resources(const struct pci_bus *bus)
+static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,
+ struct resource_list_x *fail_head)
{
struct pci_bus *b;
struct pci_dev *dev;
- pbus_assign_resources_sorted(bus);
+ pbus_assign_resources_sorted(bus, fail_head);
list_for_each_entry(dev, &bus->devices, bus_list) {
b = dev->subordinate;
if (!b)
continue;
- pci_bus_assign_resources(b);
+ __pci_bus_assign_resources(b, fail_head);
switch (dev->class >> 8) {
case PCI_CLASS_BRIDGE_PCI:
@@ -602,6 +655,11 @@ void __ref pci_bus_assign_resources(const struct pci_bus *bus)
}
}
}
+
+void __ref pci_bus_assign_resources(const struct pci_bus *bus)
+{
+ __pci_bus_assign_resources(bus, NULL);
+}
EXPORT_SYMBOL(pci_bus_assign_resources);
static void pci_bridge_release_unused_res(struct pci_bus *bus,
--
1.6.0.2
that is wrong.
exposed by that patch that doesn's shrink pci bridge res.
-v2: change to "bar reading" to "reg reading"
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/probe.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 98ffb2d..4b9f56f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -316,13 +316,17 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
limit |= (io_limit_hi << 16);
}
- if (base <= limit) {
+ if (base <= limit && base) {
res->flags = (io_base_lo & PCI_IO_RANGE_TYPE_MASK) | IORESOURCE_IO;
if (!res->start)
res->start = base;
if (!res->end)
res->end = limit + 0xfff;
dev_printk(KERN_DEBUG, &dev->dev, " bridge window %pR\n", res);
+ } else {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ " bridge window [io %04lx - %04lx] reg reading\n",
+ base, limit);
}
res = child->resource[1];
@@ -330,11 +334,15 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
pci_read_config_word(dev, PCI_MEMORY_LIMIT, &mem_limit_lo);
base = (mem_base_lo & PCI_MEMORY_RANGE_MASK) << 16;
limit = (mem_limit_lo & PCI_MEMORY_RANGE_MASK) << 16;
- if (base <= limit) {
+ if (base <= limit && base) {
res->flags = (mem_base_lo & PCI_MEMORY_RANGE_TYPE_MASK) | IORESOURCE_MEM;
res->start = base;
res->end = limit + 0xfffff;
dev_printk(KERN_DEBUG, &dev->dev, " bridge window %pR\n", res);
+ } else {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ " bridge window [mem 0x%08lx - 0x%08lx] reg reading\n",
+ base, limit + 0xfffff);
}
res = child->resource[2];
@@ -366,7 +374,7 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
#endif
}
}
- if (base <= limit) {
+ if (base <= limit && base) {
res->flags = (mem_base_lo & PCI_PREF_RANGE_TYPE_MASK) |
IORESOURCE_MEM | IORESOURCE_PREFETCH;
if (res->flags & PCI_PREF_RANGE_TYPE_64)
@@ -374,6 +382,10 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
res->start = base;
res->end = limit + 0xfffff;
dev_printk(KERN_DEBUG, &dev->dev, " bridge window %pR\n", res);
+ } else {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ " bridge window [mem 0x%08lx - %08lx pref] reg reading\n",
+ base, limit + 0xfffff);
}
}
--
1.6.0.2
when we are clearing leaf bridge resource and try to get big one, we could
shrink the bridge if there is no resource under it.
let check with old resource size and make sure we are trying to get big one.
-v2: keep disable window print out, still could happen on non pci hotplug system
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 14 ++++++++++++--
1 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 9bb4435..d53b42e 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -387,7 +387,7 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size)
{
struct pci_dev *dev;
struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO);
- unsigned long size = 0, size1 = 0;
+ unsigned long size = 0, size1 = 0, old_size;
if (!b_res)
return;
@@ -412,12 +412,17 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size)
}
if (size < min_size)
size = min_size;
+ old_size = resource_size(b_res);
+ if (old_size == 1)
+ old_size = 0;
/* To be fixed in 2.5: we should have sort of HAVE_ISA
flag in the struct pci_bus. */
#if defined(CONFIG_ISA) || defined(CONFIG_EISA)
size = (size & 0xff) + ((size & ~0xffUL) << 2);
#endif
size = ALIGN(size + size1, 4096);
+ if (size < old_size)
+ size = old_size;
if (!size) {
if (b_res->start || b_res->end)
dev_info(&bus->self->dev, "disabling bridge window "
@@ -438,7 +443,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
unsigned long type, resource_size_t min_size)
{
struct pci_dev *dev;
- resource_size_t min_align, align, size;
+ resource_size_t min_align, align, size, old_size;
resource_size_t aligns[12]; /* Alignments from 1Mb to 2Gb */
int order, max_order;
struct resource *b_res = find_free_bus_resource(bus, type);
@@ -488,6 +493,11 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
}
if (size < min_size)
size = min_size;
+ old_size = resource_size(b_res);
+ if (old_size == 1)
+ old_size = 0;
+ if (size < old_size)
+ size = old_size;
align = 0;
min_align = 0;
--
1.6.0.2
BIOS separate IO range between several IOHs, and on some slots, BIOS assign the
resource to the bridge, but stop assigning resource to the device under that
bridge, because the device need big resource.
1. pci assign unassign and record the failed device resource.
2. clear the BIOS assigned resource of the parent bridge of fail device
3. go back and call pci assign unsigned
4. if it still fail, will go up more bridges. and clear and try again.
use pci_try_num to control back track bridge levels.
-v2: update it with resource_list_x
-v3: make pci_try_num default to 1. and when pci_try_num is set to more than 1
will check it with max_depth, and adjust that to make sure it is bigger
enough
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/pci.c | 5 ++
drivers/pci/pci.h | 2 +
drivers/pci/setup-bus.c | 117 ++++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 123 insertions(+), 1 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 864e703..143bce0 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2816,6 +2816,11 @@ static int __init pci_setup(char *str)
pci_no_aer();
} else if (!strcmp(str, "nodomains")) {
pci_no_domains();
+ } else if (!strncmp(str, "try=", 4)) {
+ int try_num = memparse(str + 4, &str);
+
+ if (try_num > 0 && try_num < 16)
+ pci_try_num = try_num;
} else if (!strncmp(str, "cbiosize=", 9)) {
pci_cardbus_io_size = memparse(str + 9, &str);
} else if (!strncmp(str, "cbmemsize=", 10)) {
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 709eaa4..2b2330a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -203,6 +203,8 @@ static inline int pci_ari_enabled(struct pci_bus *bus)
return bus->self && bus->self->ari_enabled;
}
+extern int pci_try_num;
+
#ifdef CONFIG_PCI_QUIRKS
extern int pci_is_reassigndev(struct pci_dev *dev);
resource_size_t pci_specified_resource_alignment(struct pci_dev *dev);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index d53b42e..6b8491d 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -796,11 +796,69 @@ static void pci_bus_dump_resources(struct pci_bus *bus)
}
}
+static int __init pci_bus_get_depth(struct pci_bus *bus)
+{
+ int depth = 0;
+ struct pci_dev *dev;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ int ret;
+ struct pci_bus *b = dev->subordinate;
+ if (!b)
+ continue;
+
+ ret = pci_bus_get_depth(b);
+ if (ret + 1 > depth)
+ depth = ret + 1;
+ }
+
+ return depth;
+}
+static int __init pci_get_max_depth(void)
+{
+ int depth = 0;
+ struct pci_bus *bus;
+
+ list_for_each_entry(bus, &pci_root_buses, node) {
+ int ret;
+
+ ret = pci_bus_get_depth(bus);
+ if (ret > depth)
+ depth = ret;
+ }
+
+ return depth;
+}
+
+/*
+ * first try will not touch pci bridge res
+ * second try will clear small leaf bridge res
+ * third try will clear related bridge: some aggressive
+ */
+int pci_try_num = 1;
void __init
pci_assign_unassigned_resources(void)
{
struct pci_bus *bus;
+ int tried_times = 0;
+ int check_leaf = 1;
+ struct resource_list_x head, *list;
+ unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+ IORESOURCE_PREFETCH;
+ unsigned long failed_type;
+ int max_depth = pci_get_max_depth();
+
+ head.next = NULL;
+
+ if (pci_try_num > 1) {
+ if (max_depth + 1 > pci_try_num)
+ pci_try_num = max_depth + 1;
+ }
+
+ printk(KERN_DEBUG "PCI: max depth: %d pci_try_num: %d\n",
+ max_depth, pci_try_num);
+again:
/* Depth first, calculate sizes and alignments of all
subordinate buses. */
list_for_each_entry(bus, &pci_root_buses, node) {
@@ -808,7 +866,64 @@ pci_assign_unassigned_resources(void)
}
/* Depth last, allocate resources and update the hardware. */
list_for_each_entry(bus, &pci_root_buses, node) {
- pci_bus_assign_resources(bus);
+ __pci_bus_assign_resources(bus, &head);
+ }
+ tried_times++;
+
+ /* any device complain? */
+ if (!head.next)
+ goto enable_and_dump;
+ failed_type = 0;
+ for (list = head.next; list;) {
+ failed_type |= list->flags;
+ list = list->next;
+ }
+ /*
+ * io port are tight, don't try extra
+ * or if reach the limit, don't want to try more
+ */
+ failed_type &= type_mask;
+ if ((failed_type == IORESOURCE_IO) || (tried_times >= pci_try_num)) {
+ free_failed_list(&head);
+ goto enable_and_dump;
+ }
+
+ printk(KERN_DEBUG "PCI: No. %d try to assign unassigned res\n",
+ tried_times + 1);
+
+ /* third times and later will not check if it is leaf */
+ if ((tried_times + 1) > 2)
+ check_leaf = 0;
+
+ /*
+ * Try to release leaf bridge's resources that doesn't fit resource of
+ * child device under that bridge
+ */
+ for (list = head.next; list;) {
+ bus = list->dev->bus;
+ pci_bus_release_unused_bridge_res(bus, list->flags & type_mask,
+ check_leaf);
+ list = list->next;
+ }
+ /* retore size and flags */
+ for (list = head.next; list;) {
+ struct resource *res = list->res;
+
+ res->start = list->start;
+ res->end = list->end;
+ res->flags = list->flags;
+ if (list->dev->subordinate)
+ res->flags = 0;
+
+ list = list->next;
+ }
+ free_failed_list(&head);
+
+ goto again;
+
+enable_and_dump:
+ /* Depth last, update the hardware. */
+ list_for_each_entry(bus, &pci_root_buses, node) {
pci_enable_bridges(bus);
}
--
1.6.0.2
for pciehp to use it later
pci_setup_bridge() will not check enabled for the slot bridge, otherwise
update res is not updated to bridge BAR. that is bridge is enabled already for
port service.
-v2: update it with resource_list_x
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 93 +++++++++++++++++++++++++++++++++++++++++++++--
include/linux/pci.h | 1 +
2 files changed, 90 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 6b8491d..056b98d 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -71,6 +71,50 @@ static void free_failed_list(struct resource_list_x *head)
head->next = NULL;
}
+static void pdev_assign_resources_sorted(struct pci_dev *dev,
+ struct resource_list_x *fail_head)
+{
+ struct resource *res;
+ struct resource_list head, *list, *tmp;
+ int idx;
+ u16 class = dev->class >> 8;
+
+ head.next = NULL;
+
+ /* Don't touch classless devices or host bridges or ioapics. */
+ if (class == PCI_CLASS_NOT_DEFINED || class == PCI_CLASS_BRIDGE_HOST)
+ return;
+
+ /* Don't touch ioapic devices already enabled by firmware */
+ if (class == PCI_CLASS_SYSTEM_PIC) {
+ u16 command;
+ pci_read_config_word(dev, PCI_COMMAND, &command);
+ if (command & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY))
+ return;
+ }
+
+ pdev_sort_resources(dev, &head);
+
+ for (list = head.next; list;) {
+ res = list->res;
+ idx = res - &list->dev->resource[0];
+ if (pci_assign_resource(list->dev, idx)) {
+ if (fail_head && !pci_is_root_bus(list->dev->bus)) {
+ /*
+ * device need to keep flags and size
+ * for second try
+ */
+ add_to_failed_list(fail_head, list->dev, res);
+ }
+ res->start = 0;
+ res->end = 0;
+ res->flags = 0;
+ }
+ tmp = list;
+ list = list->next;
+ kfree(tmp);
+ }
+}
static void pbus_assign_resources_sorted(const struct pci_bus *bus,
struct resource_list_x *fail_head)
{
@@ -278,9 +322,6 @@ static void __pci_setup_bridge(struct pci_bus *bus, unsigned long type)
{
struct pci_dev *bridge = bus->self;
- if (pci_is_enabled(bridge))
- return;
-
dev_info(&bridge->dev, "PCI bridge to [bus %02x-%02x]\n",
bus->secondary, bus->subordinate);
@@ -651,7 +692,8 @@ static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,
switch (dev->class >> 8) {
case PCI_CLASS_BRIDGE_PCI:
- pci_setup_bridge(b);
+ if (!pci_is_enabled(dev))
+ pci_setup_bridge(b);
break;
case PCI_CLASS_BRIDGE_CARDBUS:
@@ -672,6 +714,34 @@ void __ref pci_bus_assign_resources(const struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_assign_resources);
+static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
+ struct resource_list_x *fail_head)
+{
+ struct pci_bus *b;
+
+ pdev_assign_resources_sorted((struct pci_dev *)bridge, fail_head);
+
+ b = bridge->subordinate;
+ if (!b)
+ return;
+
+ __pci_bus_assign_resources(b, fail_head);
+
+ switch (bridge->class >> 8) {
+ case PCI_CLASS_BRIDGE_PCI:
+ pci_setup_bridge(b);
+ break;
+
+ case PCI_CLASS_BRIDGE_CARDBUS:
+ pci_setup_cardbus(b);
+ break;
+
+ default:
+ dev_info(&bridge->dev, "not setting up bridge for bus "
+ "%04x:%02x\n", pci_domain_nr(b), b->number);
+ break;
+ }
+}
static void pci_bridge_release_unused_res(struct pci_bus *bus,
unsigned long type)
{
@@ -932,3 +1002,18 @@ enable_and_dump:
pci_bus_dump_resources(bus);
}
}
+
+void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
+{
+ struct pci_bus *bus;
+ struct pci_bus *parent = bridge->subordinate;
+ int retval;
+
+ pci_bus_size_bridges(parent);
+ pci_clear_master(bridge);
+ __pci_bridge_assign_resources(bridge, NULL);
+ retval = pci_reenable_device(bridge);
+ pci_set_master(bridge);
+ pci_enable_bridges(parent);
+}
+EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5da0690..31fec6f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -771,6 +771,7 @@ void pci_bus_assign_resources(const struct pci_bus *bus);
void pci_bus_size_bridges(struct pci_bus *bus);
int pci_claim_resource(struct pci_dev *, int);
void pci_assign_unassigned_resources(void);
+void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
void pdev_enable_device(struct pci_dev *);
void pdev_sort_resources(struct pci_dev *, struct resource_list *);
int pci_enable_resources(struct pci_dev *, int mask);
--
1.6.0.2
move out bus_size_bridges and assign resources out of pciehp_add_bridge()
and at last do them all together one time including slot bridge, to avoid to
call assign resources several times, when there are several bridges under the
slot bridge.
use pci_assign_nassigned_bridge_resources
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/hotplug/pciehp_pci.c | 23 +++++++++++++++++------
1 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 2173310..0a16444 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -53,17 +53,15 @@ static int __ref pciehp_add_bridge(struct pci_dev *dev)
busnr = pci_scan_bridge(parent, dev, busnr, pass);
if (!dev->subordinate)
return -1;
- pci_bus_size_bridges(dev->subordinate);
- pci_bus_assign_resources(parent);
- pci_enable_bridges(parent);
- pci_bus_add_devices(parent);
+
return 0;
}
int pciehp_configure_device(struct slot *p_slot)
{
struct pci_dev *dev;
- struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
+ struct pci_dev *bridge = p_slot->ctrl->pcie->port;
+ struct pci_bus *parent = bridge->subordinate;
int num, fn;
struct controller *ctrl = p_slot->ctrl;
@@ -96,12 +94,25 @@ int pciehp_configure_device(struct slot *p_slot)
(dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)) {
pciehp_add_bridge(dev);
}
+ pci_dev_put(dev);
+ }
+
+ pci_assign_unassigned_bridge_resources(bridge);
+
+ for (fn = 0; fn < 8; fn++) {
+ dev = pci_get_slot(parent, PCI_DEVFN(0, fn));
+ if (!dev)
+ continue;
+ if ((dev->class >> 16) == PCI_BASE_CLASS_DISPLAY) {
+ pci_dev_put(dev);
+ continue;
+ }
pci_configure_slot(dev);
pci_dev_put(dev);
}
- pci_bus_assign_resources(parent);
pci_bus_add_devices(parent);
+
return 0;
}
--
1.6.0.2
handle the case the slot bridge that doesn't get pre-allocated big enough res
from FW.
for example pcie devices need 256M, but the bridge only get preallocated 2M...
-v2: use resource_list_x
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 48 insertions(+), 1 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 056b98d..cc3474e 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1007,13 +1007,60 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
{
struct pci_bus *bus;
struct pci_bus *parent = bridge->subordinate;
+ bool second_tried = false;
+ struct resource_list_x head, *list;
int retval;
+ unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+ IORESOURCE_PREFETCH;
+
+ head.next = NULL;
+again:
pci_bus_size_bridges(parent);
pci_clear_master(bridge);
- __pci_bridge_assign_resources(bridge, NULL);
+ __pci_bridge_assign_resources(bridge, &head);
retval = pci_reenable_device(bridge);
pci_set_master(bridge);
pci_enable_bridges(parent);
+
+ /* any device complain? */
+ if (!head.next)
+ return;
+
+ if (second_tried) {
+ /* still fail, don't need to try more */
+ free_failed_list(&head);
+ return;
+ }
+
+ second_tried = true;
+ printk(KERN_DEBUG "PCI: second try to assign unassigned res\n");
+
+ /*
+ * Try to release leaf bridge's resources that doesn't fit resource of
+ * child device under that bridge
+ */
+ for (list = head.next; list;) {
+ unsigned long flags = list->flags;
+
+ bus = list->dev->bus;
+ pci_bus_release_unused_bridge_res(bus, flags & type_mask, 0);
+ list = list->next;
+ }
+ /* retore size and flags */
+ for (list = head.next; list;) {
+ struct resource *res = list->res;
+
+ res->start = list->start;
+ res->end = list->end;
+ res->flags = list->flags;
+ if (list->dev->subordinate)
+ res->flags = 0;
+
+ list = list->next;
+ }
+ free_failed_list(&head);
+
+ goto again;
}
EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
--
1.6.0.2
prepare for pciehp_realloc
it will clear the resource size for bridge
-v2: patrick Keller pointed out need to export it...
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 43 +++++++++++++++++++++++++++++++++++++++++++
include/linux/pci.h | 1 +
2 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index cc3474e..4c61b8c 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -742,6 +742,49 @@ static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
break;
}
}
+
+void pci_bridge_release_res(struct pci_bus *bus)
+{
+ int idx;
+ bool changed = false;
+ struct pci_dev *dev;
+ struct resource *r;
+
+ /* The root bus? */
+ if (!bus->self)
+ return;
+
+ /* for pci bridges res only */
+ dev = bus->self;
+ if ((dev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
+ return;
+
+ for (idx = PCI_BRIDGE_RESOURCES; idx < PCI_BRIDGE_RESOURCES + 3;
+ idx++) {
+ r = &dev->resource[idx];
+ if (!r->parent)
+ continue;
+
+ /* if there are children under that, we should not release it */
+ if (r->child)
+ continue;
+
+ if (!release_resource(r)) {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ "resource %d %pR released\n", idx, r);
+ /* old size is not kept */
+ r->start = 0;
+ r->end = 0;
+ r->flags = 0;
+ changed = true;
+ }
+ }
+
+ if (changed)
+ pci_setup_bridge(bus);
+}
+EXPORT_SYMBOL_GPL(pci_bridge_release_res);
+
static void pci_bridge_release_unused_res(struct pci_bus *bus,
unsigned long type)
{
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 31fec6f..4b8d35a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -770,6 +770,7 @@ int pci_vpd_truncate(struct pci_dev *dev, size_t size);
void pci_bus_assign_resources(const struct pci_bus *bus);
void pci_bus_size_bridges(struct pci_bus *bus);
int pci_claim_resource(struct pci_dev *, int);
+void pci_bridge_release_res(struct pci_bus *bus);
void pci_assign_unassigned_resources(void);
void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
void pdev_enable_device(struct pci_dev *);
--
1.6.0.2
From: Kenji Kaneshige <[email protected]>
With this patch, pciehp driver try to clear PCI bridge resources to
parent bridge (root port or switch downstream port) of the slot
so we can shrink pci bridge resource for those port
This feature is enabled when 'pciehp_realloc' option is specified.
-v2: make it could be appiled after Yinghai patchset that touch pci bridge
resource also remove poweron check, because pci_bridge_release_res will
check child at first
Signed-off-by: Kenji Kaneshige <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/hotplug/pciehp.h | 1 +
drivers/pci/hotplug/pciehp_core.c | 7 +++++++
drivers/pci/hotplug/pciehp_pci.c | 4 ++++
3 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 4ed76b4..78cf0f4 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -43,6 +43,7 @@ extern int pciehp_poll_mode;
extern int pciehp_poll_time;
extern int pciehp_debug;
extern int pciehp_force;
+extern int pciehp_realloc;
extern struct workqueue_struct *pciehp_wq;
#define dbg(format, arg...) \
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 5674b20..cc3fc5d 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -41,6 +41,7 @@ int pciehp_debug;
int pciehp_poll_mode;
int pciehp_poll_time;
int pciehp_force;
+int pciehp_realloc;
struct workqueue_struct *pciehp_wq;
#define DRIVER_VERSION "0.4"
@@ -55,10 +56,12 @@ module_param(pciehp_debug, bool, 0644);
module_param(pciehp_poll_mode, bool, 0644);
module_param(pciehp_poll_time, int, 0644);
module_param(pciehp_force, bool, 0644);
+module_param(pciehp_realloc, bool, 0644);
MODULE_PARM_DESC(pciehp_debug, "Debugging mode enabled or not");
MODULE_PARM_DESC(pciehp_poll_mode, "Using polling mechanism for hot-plug events or not");
MODULE_PARM_DESC(pciehp_poll_time, "Polling mechanism frequency, in seconds");
MODULE_PARM_DESC(pciehp_force, "Force pciehp, even if _OSC and OSHP are missing");
+MODULE_PARM_DESC(pciehp_realloc, "Realloc resources for slot's parent bridge");
#define PCIE_MODULE_NAME "pciehp"
@@ -297,6 +300,10 @@ static int pciehp_probe(struct pcie_device *dev)
if (!occupied && poweron && POWER_CTRL(ctrl))
pciehp_power_off_slot(slot);
+ /* Release I/O window of the slots's parent bridge */
+ if (pciehp_realloc)
+ pci_bridge_release_res(dev->port->subordinate);
+
return 0;
err_out_free_ctrl_slot:
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 0a16444..f6119e4 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -166,5 +166,9 @@ int pciehp_unconfigure_device(struct slot *p_slot)
pci_dev_put(temp);
}
+ /* Release I/O window of the slots's parent bridge */
+ if (pciehp_realloc)
+ pci_bridge_release_res(parent);
+
return rc;
}
--
1.6.0.2
make pci_bridge_check_ranges() to store the PCI_PREF_RANGE_TYPE_64 in addition
to IORESOURCE_MEM_64. just like pci_read_bridge_bases()
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 4c61b8c..02ff894 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -380,8 +380,10 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
}
if (pmem) {
b_res[2].flags |= IORESOURCE_MEM | IORESOURCE_PREFETCH;
- if ((pmem & PCI_PREF_RANGE_TYPE_MASK) == PCI_PREF_RANGE_TYPE_64)
+ if ((pmem & PCI_PREF_RANGE_TYPE_MASK) == PCI_PREF_RANGE_TYPE_64){
b_res[2].flags |= IORESOURCE_MEM_64;
+ b_res[2].flags |= PCI_PREF_RANGE_TYPE_64;
+ }
}
/* double check if bridge does support 64 bit pref */
--
1.6.0.2
On Tue, 2009-12-22 at 23:02 +0000, Yinghai Lu wrote:
> this patchset is trying to update pci bridge BAR when that BAR is big enough.
>
> 1. boot time:
>
> BIOS separate IO range between several IOHs, and on some slots, BIOS assign
> the resource to the bridge, but stop assigning resource to the device under
> that bridge, because the device need big resource.
>
> so patches (first 8) are trying to
> a. pci assign unassign and record the failed device resource.
> b. clear the BIOS assigned resource of the parent bridge of fail device
> c. go back and call pci assign unsigned
> d. if it still fail, will go up more bridges. and clear and try again.
>
> v2: Jesse doesn't like it is in find_free_bus_resource...
> try to move out of pci_bus_size_bridges loop.
> v3: add pci_setup_bridge calling after pci_bridge_release_not_used_res.
> only clear release those res for x86.
> v4: Bjorn want to release use dev instead of bus.
> v5: Kenji pointed out it will have problem with several level bridge.
> so let only handle leaf bridge.
> v6: address Kenji's request (new pci_bus_release...). and change applying order
> move back release to pci_assign_unassigned_resource
> v7: change functions name pci_bus_release_unused_bridge_res according to Jesse
> v8: address Eric's concern, only overwrite leaf bridge resource that is not big
> enough need to do it in two steps, and first step recore the failed res,
> and don't touch bridge res that programmed by firmware. second step will
> try to release bridge resource that is too small at first.
> v9: refresh to be applied after bjorn's patch, and remove trick about save
> size and restore resource second try.
> v11:add pci=try=5, about more try to change more bridge
> v12:not shrink pci bridge resource
>
> 2. hotplug:
> BIOS separate IO range between several IOHs, and on some slots, BIOS assign
> the resource to every bridge. (8M) but when insert one card that big resource,
> the card can not get resource. because kernel will not touch the bridge
> resource.
>
> so patches (9-11) are trying to
> a. assign resource to devices with that slot. and record fail devices
> b. if there is some failed, will clear sepcifically io port of bridge, or mmio of bridge, or mmio pref of bridge.
> c. try to assign the parent bridge of the slot.
>
> v2: address Alex's concern about pci remove/rescan feature about
> pci_setup_bridge changes.
> v3: Kenji pointed out that pci_config_slot need to be called before
> pci_bus_add_devices()
> v4: move out pci_is_enabled checkout of pci_setup_bridge()
> v5: change the applying sequence.
> v6: change the functions name according to Jesse
> v8: address Eric's concern, only overwrite leaf bridge resource that is not
> big enough
> v9: refresh to be applied after bjorn's patch, and remove trick about save
> size and restore resource second try.
> v10:alex found need to have export for pci_assign_unassigned_bridge_resources
> v11: pass check_leaf with pci_bus_release_unused_bridge_res
>
> patches (12-13)
> will try to shrink, the pci hotplug related pcie port, so could save some for
> others.
>
> -v13: change resource_list to resource_list_x, to save size and flags aside,
> otherwise grandchild res will get confused with son's res as could be used
>
> -v14: default it is disabled. could use pci=try=2 to enable it.
> -v15: seperate release_child_resources aside.
>
> Thanks
>
> Yinghai
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
I'm going to work on testing this patch series today with a focus on
hotplug and fakephp functionality.
Pat
On Tue, 2009-12-22 at 23:02 +0000, Yinghai Lu wrote:
> 2. hotplug:
> BIOS separate IO range between several IOHs, and on some slots, BIOS assign
> the resource to every bridge. (8M) but when insert one card that big resource,
> the card can not get resource. because kernel will not touch the bridge
> resource.
> Thanks
>
> Yinghai
>
These patches appear to fix logical hotplug so that it is functional
again. I think we're good.
Tested-by: Patrick Keller <[email protected]>
On Mon, 11 Jan 2010 14:57:00 -0700
Patrick Keller <[email protected]> wrote:
>
>
> On Tue, 2009-12-22 at 23:02 +0000, Yinghai Lu wrote:
>
> > 2. hotplug:
> > BIOS separate IO range between several IOHs, and on some slots,
> > BIOS assign the resource to every bridge. (8M) but when insert one
> > card that big resource, the card can not get resource. because
> > kernel will not touch the bridge resource.
>
> > Thanks
> >
> > Yinghai
> >
>
> These patches appear to fix logical hotplug so that it is functional
> again. I think we're good.
>
> Tested-by: Patrick Keller <[email protected]>
Thanks Patrick, I'll check them out and start applying.
--
Jesse Barnes, Intel Open Source Technology Center
Yinghai Lu wrote:
> for pciehp to use it later
>
> pci_setup_bridge() will not check enabled for the slot bridge, otherwise
> update res is not updated to bridge BAR. that is bridge is enabled already for
> port service.
>
> -v2: update it with resource_list_x
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/setup-bus.c | 93 +++++++++++++++++++++++++++++++++++++++++++++--
> include/linux/pci.h | 1 +
> 2 files changed, 90 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 6b8491d..056b98d 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -71,6 +71,50 @@ static void free_failed_list(struct resource_list_x *head)
> head->next = NULL;
> }
>
> +static void pdev_assign_resources_sorted(struct pci_dev *dev,
> + struct resource_list_x *fail_head)
> +{
> + struct resource *res;
> + struct resource_list head, *list, *tmp;
> + int idx;
> + u16 class = dev->class >> 8;
> +
> + head.next = NULL;
> +
> + /* Don't touch classless devices or host bridges or ioapics. */
> + if (class == PCI_CLASS_NOT_DEFINED || class == PCI_CLASS_BRIDGE_HOST)
> + return;
> +
> + /* Don't touch ioapic devices already enabled by firmware */
> + if (class == PCI_CLASS_SYSTEM_PIC) {
> + u16 command;
> + pci_read_config_word(dev, PCI_COMMAND, &command);
> + if (command & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY))
> + return;
> + }
> +
> + pdev_sort_resources(dev, &head);
> +
> + for (list = head.next; list;) {
> + res = list->res;
> + idx = res - &list->dev->resource[0];
> + if (pci_assign_resource(list->dev, idx)) {
> + if (fail_head && !pci_is_root_bus(list->dev->bus)) {
> + /*
> + * device need to keep flags and size
> + * for second try
> + */
> + add_to_failed_list(fail_head, list->dev, res);
> + }
> + res->start = 0;
> + res->end = 0;
> + res->flags = 0;
> + }
> + tmp = list;
> + list = list->next;
> + kfree(tmp);
> + }
> +}
> static void pbus_assign_resources_sorted(const struct pci_bus *bus,
> struct resource_list_x *fail_head)
> {
> @@ -278,9 +322,6 @@ static void __pci_setup_bridge(struct pci_bus *bus, unsigned long type)
> {
> struct pci_dev *bridge = bus->self;
>
> - if (pci_is_enabled(bridge))
> - return;
> -
> dev_info(&bridge->dev, "PCI bridge to [bus %02x-%02x]\n",
> bus->secondary, bus->subordinate);
>
> @@ -651,7 +692,8 @@ static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,
>
> switch (dev->class >> 8) {
> case PCI_CLASS_BRIDGE_PCI:
> - pci_setup_bridge(b);
> + if (!pci_is_enabled(dev))
> + pci_setup_bridge(b);
> break;
>
> case PCI_CLASS_BRIDGE_CARDBUS:
> @@ -672,6 +714,34 @@ void __ref pci_bus_assign_resources(const struct pci_bus *bus)
> }
> EXPORT_SYMBOL(pci_bus_assign_resources);
>
> +static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
> + struct resource_list_x *fail_head)
> +{
> + struct pci_bus *b;
> +
> + pdev_assign_resources_sorted((struct pci_dev *)bridge, fail_head);
> +
> + b = bridge->subordinate;
> + if (!b)
> + return;
> +
> + __pci_bus_assign_resources(b, fail_head);
> +
> + switch (bridge->class >> 8) {
> + case PCI_CLASS_BRIDGE_PCI:
> + pci_setup_bridge(b);
> + break;
> +
> + case PCI_CLASS_BRIDGE_CARDBUS:
> + pci_setup_cardbus(b);
> + break;
> +
> + default:
> + dev_info(&bridge->dev, "not setting up bridge for bus "
> + "%04x:%02x\n", pci_domain_nr(b), b->number);
> + break;
> + }
> +}
> static void pci_bridge_release_unused_res(struct pci_bus *bus,
> unsigned long type)
> {
> @@ -932,3 +1002,18 @@ enable_and_dump:
> pci_bus_dump_resources(bus);
> }
> }
> +
> +void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
> +{
> + struct pci_bus *bus;
> + struct pci_bus *parent = bridge->subordinate;
> + int retval;
> +
> + pci_bus_size_bridges(parent);
> + pci_clear_master(bridge);
I have a concern about clearing bus master enable bit here, though
I'm not sure about it. I'm wondering if clearing bus master enable
bit might have some bad effect for the port services to work. For
example, does MSI interrupt work without enabling bus mastering?
Thanks,
Kenji Kaneshige
> + __pci_bridge_assign_resources(bridge, NULL);
> + retval = pci_reenable_device(bridge);
> + pci_set_master(bridge);
> + pci_enable_bridges(parent);
> +}
> +EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 5da0690..31fec6f 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -771,6 +771,7 @@ void pci_bus_assign_resources(const struct pci_bus *bus);
> void pci_bus_size_bridges(struct pci_bus *bus);
> int pci_claim_resource(struct pci_dev *, int);
> void pci_assign_unassigned_resources(void);
> +void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
> void pdev_enable_device(struct pci_dev *);
> void pdev_sort_resources(struct pci_dev *, struct resource_list *);
> int pci_enable_resources(struct pci_dev *, int mask);
On 01/12/2010 04:50 PM, Kenji Kaneshige wrote:
> Yinghai Lu wrote:
>> for pciehp to use it later
>>
>> pci_setup_bridge() will not check enabled for the slot bridge, otherwise
>> update res is not updated to bridge BAR. that is bridge is enabled already for
>> port service.
>>
>> -v2: update it with resource_list_x
>>
>> Signed-off-by: Yinghai Lu <[email protected]>
...
>> +
>> +void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
>> +{
>> + struct pci_bus *bus;
>> + struct pci_bus *parent = bridge->subordinate;
>> + int retval;
>> +
>> + pci_bus_size_bridges(parent);
>> + pci_clear_master(bridge);
>
> I have a concern about clearing bus master enable bit here, though
> I'm not sure about it. I'm wondering if clearing bus master enable
> bit might have some bad effect for the port services to work. For
> example, does MSI interrupt work without enabling bus mastering?
but we set that pci_set_master right away after we assign the new resource
>
>
>> + __pci_bridge_assign_resources(bridge, NULL);
>> + retval = pci_reenable_device(bridge);
>> + pci_set_master(bridge);
>> + pci_enable_bridges(parent);
YH
Yinghai Lu wrote:
> On 01/12/2010 04:50 PM, Kenji Kaneshige wrote:
>> Yinghai Lu wrote:
>>> for pciehp to use it later
>>>
>>> pci_setup_bridge() will not check enabled for the slot bridge, otherwise
>>> update res is not updated to bridge BAR. that is bridge is enabled already for
>>> port service.
>>>
>>> -v2: update it with resource_list_x
>>>
>>> Signed-off-by: Yinghai Lu <[email protected]>
> ...
>
>>> +
>>> +void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
>>> +{
>>> + struct pci_bus *bus;
>>> + struct pci_bus *parent = bridge->subordinate;
>>> + int retval;
>>> +
>>> + pci_bus_size_bridges(parent);
>>> + pci_clear_master(bridge);
>> I have a concern about clearing bus master enable bit here, though
>> I'm not sure about it. I'm wondering if clearing bus master enable
>> bit might have some bad effect for the port services to work. For
>> example, does MSI interrupt work without enabling bus mastering?
> but we set that pci_set_master right away after we assign the new resource
Yes, I know you set bus master enable bit again.
In my understanding, bus master enable bit of the bridge is
cleared temporary while its port service driver is working.
I'm worrying about this temporary operation. For example,
I'm worrying about whether the MSI interrupt works, if some
port service generates interrupts when bus master enable bit
is cleared temporary.
Thanks,
Kenji Kaneshige
On 01/12/2010 11:31 PM, Kenji Kaneshige wrote:
> Yinghai Lu wrote:
>> On 01/12/2010 04:50 PM, Kenji Kaneshige wrote:
>>> Yinghai Lu wrote:
>>>> for pciehp to use it later
>>>>
>>>> pci_setup_bridge() will not check enabled for the slot bridge,
>>>> otherwise
>>>> update res is not updated to bridge BAR. that is bridge is enabled
>>>> already for
>>>> port service.
>>>>
>>>> -v2: update it with resource_list_x
>>>>
>>>> Signed-off-by: Yinghai Lu <[email protected]>
>> ...
>>
>>>> +
>>>> +void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
>>>> +{
>>>> + struct pci_bus *bus;
>>>> + struct pci_bus *parent = bridge->subordinate;
>>>> + int retval;
>>>> +
>>>> + pci_bus_size_bridges(parent);
>>>> + pci_clear_master(bridge);
>>> I have a concern about clearing bus master enable bit here, though
>>> I'm not sure about it. I'm wondering if clearing bus master enable
>>> bit might have some bad effect for the port services to work. For
>>> example, does MSI interrupt work without enabling bus mastering?
>> but we set that pci_set_master right away after we assign the new
>> resource
>
> Yes, I know you set bus master enable bit again.
>
> In my understanding, bus master enable bit of the bridge is
> cleared temporary while its port service driver is working.
> I'm worrying about this temporary operation. For example,
> I'm worrying about whether the MSI interrupt works, if some
> port service generates interrupts when bus master enable bit
> is cleared temporary.
>
ok, will remove clear and set for busmaster, in following patch.
[PATCH] pci: don't clear and set busmaster when assign unsigned bridge for pciehp.
Kenji pointed out that could have some pcie port service send out msi
at that point.
So don't clear the bus master.
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 2 --
1 file changed, 2 deletions(-)
Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -1062,10 +1062,8 @@ void pci_assign_unassigned_bridge_resour
again:
pci_bus_size_bridges(parent);
- pci_clear_master(bridge);
__pci_bridge_assign_resources(bridge, &head);
retval = pci_reenable_device(bridge);
- pci_set_master(bridge);
pci_enable_bridges(parent);
/* any device complain? */
On Tue, 22 Dec 2009 15:02:23 -0800
Yinghai Lu <[email protected]> wrote:
> +static void pci_bridge_release_unused_res(struct pci_bus *bus,
> + unsigned long type)
> +{
> + int idx;
> + bool changed = false;
> + struct pci_dev *dev;
> + struct resource *r;
> + unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
> + IORESOURCE_PREFETCH;
> +
> + dev = bus->self;
> + for (idx = PCI_BRIDGE_RESOURCES; idx <=
> PCI_BRIDGE_RESOURCE_END;
> + idx++) {
> + r = &dev->resource[idx];
> + if ((r->flags & type_mask) != type)
> + continue;
> + if (!r->parent)
> + continue;
> + /*
> + * if there are children under that, we should
> release them
> + * all
> + */
> + release_child_resources(r);
> + if (!release_resource(r)) {
> + dev_printk(KERN_DEBUG, &dev->dev,
> + "resource %d %pR released\n", idx,
> r);
> + /* keep the old size */
> + r->end = resource_size(r) - 1;
> + r->start = 0;
> + r->flags = 0;
> + changed = true;
> + }
> + }
> +
> + if (changed) {
> + if (type & IORESOURCE_PREFETCH) {
> + /* avoiding touch the one without PREF */
> + type = IORESOURCE_PREFETCH;
> + }
> + __pci_setup_bridge(bus, type);
> + }
> +}
Isn't this freeing resources regardless of whether there are children?
If so, shouldn't it just be called pci_bridge_release_resources?
> +
> +/*
> + * try to release pci bridge resources that is from leaf bridge,
> + * so we can allocate big new one later
> + * check:
> + * 0: only release the bridge and only the bridge is leaf
> + * 1: release all down side bridge for third shoot
> + */
> +static void __ref pci_bus_release_unused_bridge_res(struct pci_bus
> *bus,
> + unsigned long
> type,
> + int check_leaf)
> +{
> + struct pci_dev *dev;
> + bool is_leaf_bridge = true;
> +
> + list_for_each_entry(dev, &bus->devices, bus_list) {
> + struct pci_bus *b = dev->subordinate;
> + if (!b)
> + continue;
> +
> + switch (dev->class >> 8) {
> + case PCI_CLASS_BRIDGE_CARDBUS:
> + is_leaf_bridge = false;
> + break;
> +
> + case PCI_CLASS_BRIDGE_PCI:
> + default:
> + is_leaf_bridge = false;
> + if (!check_leaf)
> + pci_bus_release_unused_bridge_res(b,
> type,
> + check_leaf);
> + break;
> + }
> + }
> +
> + /* The root bus? */
> + if (!bus->self)
> + return;
> +
> + switch (bus->self->class >> 8) {
> + case PCI_CLASS_BRIDGE_CARDBUS:
> + break;
> +
> + case PCI_CLASS_BRIDGE_PCI:
> + default:
> + if ((check_leaf && is_leaf_bridge) || !check_leaf)
> + pci_bridge_release_unused_res(bus, type);
> + break;
> + }
> +}
Naming comment applies here too. I'd also rather see the "check_leaf"
flag be an enum, that makes the callers more self documenting. The
enums should probably be called "leaf_only" and "whole_subtree" or
similar , since the function will only release the resources of a leaf
bridge when the former is passed, while the whole bridge and its
subtree will be released in the latter case.
This is starting to look a bit easier to follow though, thanks for your
patience so far.
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 22 Dec 2009 15:02:21 -0800
Yinghai Lu <[email protected]> wrote:
> prepare to use those small functions according to resource type later
>
> -v2: remove pref_mem64 typo, it should be removed
>
> Signed-off-by: Yinghai Lu <[email protected]>
> Acked-by: Linus Torvalds <[email protected]>
> ---
Applied to my linux-next branch, thanks.
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 22 Dec 2009 15:02:22 -0800
Yinghai Lu <[email protected]> wrote:
> -v2: change name to release_child_resources according to Jesse
> -v3: according to Linus, move release_child_resources to resource.c
> also need to put the lock around them all to avoid recursive
> deep. (my test case only have one level that need to be released)
> -v4: seperate to another patch
>
> Signed-off-by: Yinghai Lu <[email protected]>
> Acked-by: Linus Torvalds <[email protected]>
> ---
Applied to linux-next, thanks.
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 22 Dec 2009 15:02:24 -0800
Yinghai Lu <[email protected]> wrote:
> mean it is not used, so skip it.
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/setup-bus.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 7d6d393..e538fa1 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -703,7 +703,8 @@ static void pci_bus_dump_res(struct pci_bus *bus)
>
> for (i = 0; i < PCI_BUS_NUM_RESOURCES; i++) {
> struct resource *res = bus->resource[i];
> - if (!res || !res->end)
> +
> + if (!res || !res->end || !res->flags)
> continue;
>
> dev_printk(KERN_DEBUG, &bus->dev, "resource %d
> %pR\n", i, res);
Applied, thanks.
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 22 Dec 2009 15:02:25 -0800
Yinghai Lu <[email protected]> wrote:
> so later we can do sth with those failed one
>
> -v2: store start, end, flags aside. so could keep res cleared when
> assign failed. and make following assignment of its children do not
> go wild
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/setup-bus.c | 66
> ++++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 62
> insertions(+), 4 deletions(-)
Bjorn's comments still stand here, this patch doesn't make any sense by
itself. Either the failed list functions and struct def should stand
alone, or the whole thing should be rolled into the patch that actually
makes use of these functions.
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 22 Dec 2009 15:02:27 -0800
Yinghai Lu <[email protected]> wrote:
> when we are clearing leaf bridge resource and try to get big one, we
> could shrink the bridge if there is no resource under it.
>
> let check with old resource size and make sure we are trying to get
> big one.
>
> -v2: keep disable window print out, still could happen on non pci
> hotplug system
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/setup-bus.c | 14 ++++++++++++--
> 1 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 9bb4435..d53b42e 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -387,7 +387,7 @@ static void pbus_size_io(struct pci_bus *bus,
> resource_size_t min_size) {
> struct pci_dev *dev;
> struct resource *b_res = find_free_bus_resource(bus,
> IORESOURCE_IO);
> - unsigned long size = 0, size1 = 0;
> + unsigned long size = 0, size1 = 0, old_size;
>
> if (!b_res)
> return;
> @@ -412,12 +412,17 @@ static void pbus_size_io(struct pci_bus *bus,
> resource_size_t min_size) }
> if (size < min_size)
> size = min_size;
> + old_size = resource_size(b_res);
> + if (old_size == 1)
> + old_size = 0;
Do we even need these == 1 checks? If old_size really was 1, it means
we had a very small decode range. Might make more sense to do...
> /* To be fixed in 2.5: we should have sort of HAVE_ISA
> flag in the struct pci_bus. */
> #if defined(CONFIG_ISA) || defined(CONFIG_EISA)
> size = (size & 0xff) + ((size & ~0xffUL) << 2);
> #endif
> size = ALIGN(size + size1, 4096);
> + if (size < old_size)
> + size = old_size;
if (size && size < old_size)
here instead? Just a nit because the if (old_size == 1) seemed a bit
magical at first. Same thing applies later on.
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 22 Dec 2009 15:02:28 -0800
Yinghai Lu <[email protected]> wrote:
> BIOS separate IO range between several IOHs, and on some slots, BIOS
> assign the resource to the bridge, but stop assigning resource to the
> device under that bridge, because the device need big resource.
>
> 1. pci assign unassign and record the failed device resource.
> 2. clear the BIOS assigned resource of the parent bridge of fail
> device 3. go back and call pci assign unsigned
> 4. if it still fail, will go up more bridges. and clear and try again.
>
> use pci_try_num to control back track bridge levels.
>
> -v2: update it with resource_list_x
> -v3: make pci_try_num default to 1. and when pci_try_num is set to
> more than 1 will check it with max_depth, and adjust that to make
> sure it is bigger enough
I really don't like the 'try' argument. Either we can assign the
resource or not; 'try=' just makes the whole thing scarier, as if we
expect problems if we release too many resources. If that's the case,
then the whole approach must be flawed, since it means we're not taking
into account some resources, or we're missing something about the
system configuration.
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 22 Dec 2009 15:02:30 -0800
Yinghai Lu <[email protected]> wrote:
> move out bus_size_bridges and assign resources out of
> pciehp_add_bridge() and at last do them all together one time
> including slot bridge, to avoid to call assign resources several
> times, when there are several bridges under the slot bridge.
> use pci_assign_nassigned_bridge_resources
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/hotplug/pciehp_pci.c | 23 +++++++++++++++++------
> 1 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/hotplug/pciehp_pci.c
> b/drivers/pci/hotplug/pciehp_pci.c index 2173310..0a16444 100644
> --- a/drivers/pci/hotplug/pciehp_pci.c
> +++ b/drivers/pci/hotplug/pciehp_pci.c
> @@ -53,17 +53,15 @@ static int __ref pciehp_add_bridge(struct pci_dev
> *dev) busnr = pci_scan_bridge(parent, dev, busnr, pass);
> if (!dev->subordinate)
> return -1;
> - pci_bus_size_bridges(dev->subordinate);
> - pci_bus_assign_resources(parent);
> - pci_enable_bridges(parent);
> - pci_bus_add_devices(parent);
> +
> return 0;
> }
>
> int pciehp_configure_device(struct slot *p_slot)
> {
> struct pci_dev *dev;
> - struct pci_bus *parent =
> p_slot->ctrl->pcie->port->subordinate;
> + struct pci_dev *bridge = p_slot->ctrl->pcie->port;
> + struct pci_bus *parent = bridge->subordinate;
> int num, fn;
> struct controller *ctrl = p_slot->ctrl;
>
> @@ -96,12 +94,25 @@ int pciehp_configure_device(struct slot *p_slot)
> (dev->hdr_type ==
> PCI_HEADER_TYPE_CARDBUS)) { pciehp_add_bridge(dev);
> }
> + pci_dev_put(dev);
> + }
> +
> + pci_assign_unassigned_bridge_resources(bridge);
> +
> + for (fn = 0; fn < 8; fn++) {
> + dev = pci_get_slot(parent, PCI_DEVFN(0, fn));
> + if (!dev)
> + continue;
> + if ((dev->class >> 16) == PCI_BASE_CLASS_DISPLAY) {
> + pci_dev_put(dev);
> + continue;
> + }
> pci_configure_slot(dev);
> pci_dev_put(dev);
> }
>
> - pci_bus_assign_resources(parent);
> pci_bus_add_devices(parent);
> +
> return 0;
> }
This seems like a nice cleanup, can you rebase on the merged patches
from 9/14 (including the master bit change Kenji-san wanted)?
Thanks,
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 22 Dec 2009 15:02:26 -0800
Yinghai Lu <[email protected]> wrote:
> that is wrong.
>
> exposed by that patch that doesn's shrink pci bridge res.
>
> -v2: change to "bar reading" to "reg reading"
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
Is this really necessary? Won't we conflict with the reserved zero
page anyway?
--
Jesse Barnes, Intel Open Source Technology Center
On 01/15/2010 11:04 AM, Jesse Barnes wrote:
> On Tue, 22 Dec 2009 15:02:27 -0800
> Yinghai Lu <[email protected]> wrote:
>
>> when we are clearing leaf bridge resource and try to get big one, we
>> could shrink the bridge if there is no resource under it.
>>
>> let check with old resource size and make sure we are trying to get
>> big one.
>>
>> -v2: keep disable window print out, still could happen on non pci
>> hotplug system
>>
>> Signed-off-by: Yinghai Lu <[email protected]>
>> ---
>> drivers/pci/setup-bus.c | 14 ++++++++++++--
>> 1 files changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
>> index 9bb4435..d53b42e 100644
>> --- a/drivers/pci/setup-bus.c
>> +++ b/drivers/pci/setup-bus.c
>> @@ -387,7 +387,7 @@ static void pbus_size_io(struct pci_bus *bus,
>> resource_size_t min_size) {
>> struct pci_dev *dev;
>> struct resource *b_res = find_free_bus_resource(bus,
>> IORESOURCE_IO);
>> - unsigned long size = 0, size1 = 0;
>> + unsigned long size = 0, size1 = 0, old_size;
>>
>> if (!b_res)
>> return;
>> @@ -412,12 +412,17 @@ static void pbus_size_io(struct pci_bus *bus,
>> resource_size_t min_size) }
>> if (size < min_size)
>> size = min_size;
>> + old_size = resource_size(b_res);
>> + if (old_size == 1)
>> + old_size = 0;
>
> Do we even need these == 1 checks? If old_size really was 1, it means
> we had a very small decode range. Might make more sense to do...
when start=0 and end =0, will get old_size = 1
>
>> /* To be fixed in 2.5: we should have sort of HAVE_ISA
>> flag in the struct pci_bus. */
>> #if defined(CONFIG_ISA) || defined(CONFIG_EISA)
>> size = (size & 0xff) + ((size & ~0xffUL) << 2);
>> #endif
>> size = ALIGN(size + size1, 4096);
>> + if (size < old_size)
>> + size = old_size;
>
> if (size && size < old_size)
>
> here instead? Just a nit because the if (old_size == 1) seemed a bit
> magical at first. Same thing applies later on.
>
no
it will shrink.
YH
On 01/15/2010 11:12 AM, Jesse Barnes wrote:
> On Tue, 22 Dec 2009 15:02:28 -0800
> Yinghai Lu <[email protected]> wrote:
>
>> BIOS separate IO range between several IOHs, and on some slots, BIOS
>> assign the resource to the bridge, but stop assigning resource to the
>> device under that bridge, because the device need big resource.
>>
>> 1. pci assign unassign and record the failed device resource.
>> 2. clear the BIOS assigned resource of the parent bridge of fail
>> device 3. go back and call pci assign unsigned
>> 4. if it still fail, will go up more bridges. and clear and try again.
>>
>> use pci_try_num to control back track bridge levels.
>>
>> -v2: update it with resource_list_x
>> -v3: make pci_try_num default to 1. and when pci_try_num is set to
>> more than 1 will check it with max_depth, and adjust that to make
>> sure it is bigger enough
>
> I really don't like the 'try' argument. Either we can assign the
> resource or not; 'try=' just makes the whole thing scarier, as if we
> expect problems if we release too many resources. If that's the case,
> then the whole approach must be flawed, since it means we're not taking
> into account some resources, or we're missing something about the
> system configuration.
before this patchset, acctually try = 1, and will not touch pci bridge resource if that are assigned by BIOS.
with this patchset, try = 1, will just like the old ways.
try = 2, it will find the deepest bridge, and increase the try.
YH
On 01/15/2010 10:56 AM, Jesse Barnes wrote:
> On Tue, 22 Dec 2009 15:02:25 -0800
> Yinghai Lu <[email protected]> wrote:
>
>> so later we can do sth with those failed one
>>
>> -v2: store start, end, flags aside. so could keep res cleared when
>> assign failed. and make following assignment of its children do not
>> go wild
>>
>> Signed-off-by: Yinghai Lu <[email protected]>
>> ---
>> drivers/pci/setup-bus.c | 66
>> ++++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 62
>> insertions(+), 4 deletions(-)
>
> Bjorn's comments still stand here, this patch doesn't make any sense by
> itself. Either the failed list functions and struct def should stand
> alone, or the whole thing should be rolled into the patch that actually
> makes use of these functions.
>
just want to make patch to be reviewed easyly.
YH
On 01/15/2010 11:14 AM, Jesse Barnes wrote:
> On Tue, 22 Dec 2009 15:02:30 -0800
> Yinghai Lu <[email protected]> wrote:
>
>> move out bus_size_bridges and assign resources out of
>> pciehp_add_bridge() and at last do them all together one time
>> including slot bridge, to avoid to call assign resources several
>> times, when there are several bridges under the slot bridge.
>> use pci_assign_nassigned_bridge_resources
>>
>> Signed-off-by: Yinghai Lu <[email protected]>
>> ---
>> drivers/pci/hotplug/pciehp_pci.c | 23 +++++++++++++++++------
>> 1 files changed, 17 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/pci/hotplug/pciehp_pci.c
>> b/drivers/pci/hotplug/pciehp_pci.c index 2173310..0a16444 100644
>> --- a/drivers/pci/hotplug/pciehp_pci.c
>> +++ b/drivers/pci/hotplug/pciehp_pci.c
>> @@ -53,17 +53,15 @@ static int __ref pciehp_add_bridge(struct pci_dev
>> *dev) busnr = pci_scan_bridge(parent, dev, busnr, pass);
>> if (!dev->subordinate)
>> return -1;
>> - pci_bus_size_bridges(dev->subordinate);
>> - pci_bus_assign_resources(parent);
>> - pci_enable_bridges(parent);
>> - pci_bus_add_devices(parent);
>> +
>> return 0;
>> }
>>
>> int pciehp_configure_device(struct slot *p_slot)
>> {
>> struct pci_dev *dev;
>> - struct pci_bus *parent =
>> p_slot->ctrl->pcie->port->subordinate;
>> + struct pci_dev *bridge = p_slot->ctrl->pcie->port;
>> + struct pci_bus *parent = bridge->subordinate;
>> int num, fn;
>> struct controller *ctrl = p_slot->ctrl;
>>
>> @@ -96,12 +94,25 @@ int pciehp_configure_device(struct slot *p_slot)
>> (dev->hdr_type ==
>> PCI_HEADER_TYPE_CARDBUS)) { pciehp_add_bridge(dev);
>> }
>> + pci_dev_put(dev);
>> + }
>> +
>> + pci_assign_unassigned_bridge_resources(bridge);
>> +
>> + for (fn = 0; fn < 8; fn++) {
>> + dev = pci_get_slot(parent, PCI_DEVFN(0, fn));
>> + if (!dev)
>> + continue;
>> + if ((dev->class >> 16) == PCI_BASE_CLASS_DISPLAY) {
>> + pci_dev_put(dev);
>> + continue;
>> + }
>> pci_configure_slot(dev);
>> pci_dev_put(dev);
>> }
>>
>> - pci_bus_assign_resources(parent);
>> pci_bus_add_devices(parent);
>> +
>> return 0;
>> }
>
> This seems like a nice cleanup, can you rebase on the merged patches
> from 9/14 (including the master bit change Kenji-san wanted)?
that share some functions from 0 - 6...
YH
On 01/15/2010 11:19 AM, Jesse Barnes wrote:
> On Tue, 22 Dec 2009 15:02:26 -0800
> Yinghai Lu <[email protected]> wrote:
>
>> that is wrong.
>>
>> exposed by that patch that doesn's shrink pci bridge res.
>>
>> -v2: change to "bar reading" to "reg reading"
>>
>> Signed-off-by: Yinghai Lu <[email protected]>
>> ---
>
> Is this really necessary? Won't we conflict with the reserved zero
> page anyway?
>
just try to print out the work for pci bridge BAR by BIOS, and just like we did for pci device BAR
before we touch it.
YH
On Fri, 15 Jan 2010 13:09:31 -0800
Yinghai Lu <[email protected]> wrote:
> On 01/15/2010 11:04 AM, Jesse Barnes wrote:
> > On Tue, 22 Dec 2009 15:02:27 -0800
> > Yinghai Lu <[email protected]> wrote:
> >
> >> when we are clearing leaf bridge resource and try to get big one,
> >> we could shrink the bridge if there is no resource under it.
> >>
> >> let check with old resource size and make sure we are trying to get
> >> big one.
> >>
> >> -v2: keep disable window print out, still could happen on non pci
> >> hotplug system
> >>
> >> Signed-off-by: Yinghai Lu <[email protected]>
> >> ---
> >> drivers/pci/setup-bus.c | 14 ++++++++++++--
> >> 1 files changed, 12 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> >> index 9bb4435..d53b42e 100644
> >> --- a/drivers/pci/setup-bus.c
> >> +++ b/drivers/pci/setup-bus.c
> >> @@ -387,7 +387,7 @@ static void pbus_size_io(struct pci_bus *bus,
> >> resource_size_t min_size) {
> >> struct pci_dev *dev;
> >> struct resource *b_res = find_free_bus_resource(bus,
> >> IORESOURCE_IO);
> >> - unsigned long size = 0, size1 = 0;
> >> + unsigned long size = 0, size1 = 0, old_size;
> >>
> >> if (!b_res)
> >> return;
> >> @@ -412,12 +412,17 @@ static void pbus_size_io(struct pci_bus *bus,
> >> resource_size_t min_size) }
> >> if (size < min_size)
> >> size = min_size;
> >> + old_size = resource_size(b_res);
> >> + if (old_size == 1)
> >> + old_size = 0;
> >
> > Do we even need these == 1 checks? If old_size really was 1, it
> > means we had a very small decode range. Might make more sense to
> > do...
>
> when start=0 and end =0, will get old_size = 1
...
if (old_size == 1)
old_size = 0
...
if (size < old_size)
size = old_size
...
If old_size > 1 we'll make sure size doesn't decrease.
If old_size == 1, we'll never touch the changed size because size < 0
will never be true for size (unsigned).
However, if old_size == 1 and we left it at 1, we'd only set size =
old_size if size was 0, which is why I suggested the size check.
What am I missing? When will size shrink? I know it's just
nitpicking, I just want to make sure the code is clear, and that it's
obvious that we're preventing size from decreasing.
--
Jesse Barnes, Intel Open Source Technology Center
On Friday 15 January 2010 02:12:39 pm Yinghai Lu wrote:
> On 01/15/2010 11:12 AM, Jesse Barnes wrote:
> > On Tue, 22 Dec 2009 15:02:28 -0800
> > Yinghai Lu <[email protected]> wrote:
> >
> >> BIOS separate IO range between several IOHs, and on some slots, BIOS
> >> assign the resource to the bridge, but stop assigning resource to the
> >> device under that bridge, because the device need big resource.
> >>
> >> 1. pci assign unassign and record the failed device resource.
> >> 2. clear the BIOS assigned resource of the parent bridge of fail
> >> device 3. go back and call pci assign unsigned
> >> 4. if it still fail, will go up more bridges. and clear and try again.
> >>
> >> use pci_try_num to control back track bridge levels.
> >>
> >> -v2: update it with resource_list_x
> >> -v3: make pci_try_num default to 1. and when pci_try_num is set to
> >> more than 1 will check it with max_depth, and adjust that to make
> >> sure it is bigger enough
> >
> > I really don't like the 'try' argument. Either we can assign the
> > resource or not; 'try=' just makes the whole thing scarier, as if we
> > expect problems if we release too many resources. If that's the case,
> > then the whole approach must be flawed, since it means we're not taking
> > into account some resources, or we're missing something about the
> > system configuration.
>
> before this patchset, acctually try = 1, and will not touch pci bridge resource if that are assigned by BIOS.
>
> with this patchset, try = 1, will just like the old ways.
> try = 2, it will find the deepest bridge, and increase the try.
I think Jesse understands how this works.
My opinion is that it's just an unacceptable user interface. We
can't tell users "boot Linux, if it doesn't work boot with 'try=1',
if *that* doesn't work boot with 'try=2', etc." That just makes
us look stupid, IMHO.
Bjorn
On Fri, 15 Jan 2010 13:12:39 -0800
Yinghai Lu <[email protected]> wrote:
> On 01/15/2010 11:12 AM, Jesse Barnes wrote:
> > On Tue, 22 Dec 2009 15:02:28 -0800
> > Yinghai Lu <[email protected]> wrote:
> >
> >> BIOS separate IO range between several IOHs, and on some slots,
> >> BIOS assign the resource to the bridge, but stop assigning
> >> resource to the device under that bridge, because the device need
> >> big resource.
> >>
> >> 1. pci assign unassign and record the failed device resource.
> >> 2. clear the BIOS assigned resource of the parent bridge of fail
> >> device 3. go back and call pci assign unsigned
> >> 4. if it still fail, will go up more bridges. and clear and try
> >> again.
> >>
> >> use pci_try_num to control back track bridge levels.
> >>
> >> -v2: update it with resource_list_x
> >> -v3: make pci_try_num default to 1. and when pci_try_num is set to
> >> more than 1 will check it with max_depth, and adjust that to make
> >> sure it is bigger enough
> >
> > I really don't like the 'try' argument. Either we can assign the
> > resource or not; 'try=' just makes the whole thing scarier, as if we
> > expect problems if we release too many resources. If that's the
> > case, then the whole approach must be flawed, since it means we're
> > not taking into account some resources, or we're missing something
> > about the system configuration.
>
> before this patchset, acctually try = 1, and will not touch pci
> bridge resource if that are assigned by BIOS.
>
> with this patchset, try = 1, will just like the old ways.
> try = 2, it will find the deepest bridge, and increase the try.
Yeah, I know what it's supposed to do, I just don't like it in
principle. We should be getting this right in the first place, not
providing the user magical options. For debugging maybe it's ok, but
it should at least be called pci_reassign_depth= or something instead
to make it clear. I guess the options would be =leaf, =<num>, or =top?
--
Jesse Barnes, Intel Open Source Technology Center
On Tuesday 22 December 2009 04:02:25 pm Yinghai Lu wrote:
> so later we can do sth with those failed one
I feel like the grumpy old man yelling at the kids to stay off his lawn,
but I would like it a lot better if we avoided the IRC/texting shorthand
like "sth" in commit logs. I can sort of deal with it in email, but the
commit logs are forever. I *know* real English makes things easier for
me, and I suspect it makes an even bigger difference for non-native
speakers.
Bjorn
On 01/15/2010 10:53 AM, Jesse Barnes wrote:
> On Tue, 22 Dec 2009 15:02:23 -0800
> Yinghai Lu <[email protected]> wrote:
>> +static void pci_bridge_release_unused_res(struct pci_bus *bus,
>> + unsigned long type)
>> +{
>> + int idx;
>> + bool changed = false;
>> + struct pci_dev *dev;
>> + struct resource *r;
>> + unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
>> + IORESOURCE_PREFETCH;
>> +
>> + dev = bus->self;
>> + for (idx = PCI_BRIDGE_RESOURCES; idx <=
>> PCI_BRIDGE_RESOURCE_END;
>> + idx++) {
>> + r = &dev->resource[idx];
>> + if ((r->flags & type_mask) != type)
>> + continue;
>> + if (!r->parent)
>> + continue;
>> + /*
>> + * if there are children under that, we should
>> release them
>> + * all
>> + */
>> + release_child_resources(r);
>> + if (!release_resource(r)) {
>> + dev_printk(KERN_DEBUG, &dev->dev,
>> + "resource %d %pR released\n", idx,
>> r);
>> + /* keep the old size */
>> + r->end = resource_size(r) - 1;
>> + r->start = 0;
>> + r->flags = 0;
>> + changed = true;
>> + }
>> + }
>> +
>> + if (changed) {
>> + if (type & IORESOURCE_PREFETCH) {
>> + /* avoiding touch the one without PREF */
>> + type = IORESOURCE_PREFETCH;
>> + }
>> + __pci_setup_bridge(bus, type);
>> + }
>> +}
>
> Isn't this freeing resources regardless of whether there are children?
> If so, shouldn't it just be called pci_bridge_release_resources?
>
ok
>> +
>> +/*
>> + * try to release pci bridge resources that is from leaf bridge,
>> + * so we can allocate big new one later
>> + * check:
>> + * 0: only release the bridge and only the bridge is leaf
>> + * 1: release all down side bridge for third shoot
>> + */
>> +static void __ref pci_bus_release_unused_bridge_res(struct pci_bus
>> *bus,
>> + unsigned long
>> type,
>> + int check_leaf)
>> +{
>> + struct pci_dev *dev;
>> + bool is_leaf_bridge = true;
>> +
>> + list_for_each_entry(dev, &bus->devices, bus_list) {
>> + struct pci_bus *b = dev->subordinate;
>> + if (!b)
>> + continue;
>> +
>> + switch (dev->class >> 8) {
>> + case PCI_CLASS_BRIDGE_CARDBUS:
>> + is_leaf_bridge = false;
>> + break;
>> +
>> + case PCI_CLASS_BRIDGE_PCI:
>> + default:
>> + is_leaf_bridge = false;
>> + if (!check_leaf)
>> + pci_bus_release_unused_bridge_res(b,
>> type,
>> + check_leaf);
>> + break;
>> + }
>> + }
>> +
>> + /* The root bus? */
>> + if (!bus->self)
>> + return;
>> +
>> + switch (bus->self->class >> 8) {
>> + case PCI_CLASS_BRIDGE_CARDBUS:
>> + break;
>> +
>> + case PCI_CLASS_BRIDGE_PCI:
>> + default:
>> + if ((check_leaf && is_leaf_bridge) || !check_leaf)
>> + pci_bridge_release_unused_res(bus, type);
>> + break;
>> + }
>> +}
>
> Naming comment applies here too. I'd also rather see the "check_leaf"
> flag be an enum, that makes the callers more self documenting. The
> enums should probably be called "leaf_only" and "whole_subtree" or
> similar , since the function will only release the resources of a leaf
> bridge when the former is passed, while the whole bridge and its
> subtree will be released in the latter case.
ok
YH
On 01/15/2010 01:31 PM, Jesse Barnes wrote:
> On Fri, 15 Jan 2010 13:09:31 -0800
> Yinghai Lu <[email protected]> wrote:
>
>> On 01/15/2010 11:04 AM, Jesse Barnes wrote:
>>> On Tue, 22 Dec 2009 15:02:27 -0800
>>> Yinghai Lu <[email protected]> wrote:
>>>
>>>> when we are clearing leaf bridge resource and try to get big one,
>>>> we could shrink the bridge if there is no resource under it.
>>>>
>>>> let check with old resource size and make sure we are trying to get
>>>> big one.
>>>>
>>>> -v2: keep disable window print out, still could happen on non pci
>>>> hotplug system
>>>>
>>>> Signed-off-by: Yinghai Lu <[email protected]>
>>>> ---
>>>> drivers/pci/setup-bus.c | 14 ++++++++++++--
>>>> 1 files changed, 12 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
>>>> index 9bb4435..d53b42e 100644
>>>> --- a/drivers/pci/setup-bus.c
>>>> +++ b/drivers/pci/setup-bus.c
>>>> @@ -387,7 +387,7 @@ static void pbus_size_io(struct pci_bus *bus,
>>>> resource_size_t min_size) {
>>>> struct pci_dev *dev;
>>>> struct resource *b_res = find_free_bus_resource(bus,
>>>> IORESOURCE_IO);
>>>> - unsigned long size = 0, size1 = 0;
>>>> + unsigned long size = 0, size1 = 0, old_size;
>>>>
>>>> if (!b_res)
>>>> return;
>>>> @@ -412,12 +412,17 @@ static void pbus_size_io(struct pci_bus *bus,
>>>> resource_size_t min_size) }
>>>> if (size < min_size)
>>>> size = min_size;
>>>> + old_size = resource_size(b_res);
>>>> + if (old_size == 1)
>>>> + old_size = 0;
>>>
>>> Do we even need these == 1 checks? If old_size really was 1, it
>>> means we had a very small decode range. Might make more sense to
>>> do...
>>
>> when start=0 and end =0, will get old_size = 1
>
> ...
> if (old_size == 1)
> old_size = 0
> ...
> if (size < old_size)
> size = old_size
> ...
>
>
> If old_size > 1 we'll make sure size doesn't decrease.
>
> If old_size == 1, we'll never touch the changed size because size < 0
> will never be true for size (unsigned).
>
> However, if old_size == 1 and we left it at 1, we'd only set size =
> old_size if size was 0, which is why I suggested the size check.
>
but:
for
if (size && size < old_size)
if the new one will 0 ( no device under it and it is not hotplug supported <so min_size = 0> )
then we will overwrite the real old size...during second try.
YH