2010-01-16 02:45:34

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 00/11] pci: update pci bridge resources

this patchset is trying to update pci bridge BAR when that BAR is big enough.

1. boot time:

BIOS separate IO range between several IOHs, and on some slots, BIOS assign
the resource to the bridge, but stop assigning resource to the device under
that bridge, because the device need big resource.

so patches (first 8) are trying to
a. pci assign unassign and record the failed device resource.
b. clear the BIOS assigned resource of the parent bridge of fail device
c. go back and call pci assign unsigned
d. if it still fail, will go up more bridges. and clear and try again.

v2: Jesse doesn't like it is in find_free_bus_resource...
try to move out of pci_bus_size_bridges loop.
v3: add pci_setup_bridge calling after pci_bridge_release_not_used_res.
only clear release those res for x86.
v4: Bjorn want to release use dev instead of bus.
v5: Kenji pointed out it will have problem with several level bridge.
so let only handle leaf bridge.
v6: address Kenji's request (new pci_bus_release...). and change applying order
move back release to pci_assign_unassigned_resource
v7: change functions name pci_bus_release_unused_bridge_res according to Jesse
v8: address Eric's concern, only overwrite leaf bridge resource that is not big
enough need to do it in two steps, and first step recore the failed res,
and don't touch bridge res that programmed by firmware. second step will
try to release bridge resource that is too small at first.
v9: refresh to be applied after bjorn's patch, and remove trick about save
size and restore resource second try.
v11:add pci=try=5, about more try to change more bridge
v12:not shrink pci bridge resource

2. hotplug:
BIOS separate IO range between several IOHs, and on some slots, BIOS assign
the resource to every bridge. (8M) but when insert one card that big resource,
the card can not get resource. because kernel will not touch the bridge
resource.

so patches (9-11) are trying to
a. assign resource to devices with that slot. and record fail devices
b. if there is some failed, will clear sepcifically io port of bridge, or mmio of bridge, or mmio pref of bridge.
c. try to assign the parent bridge of the slot.

v2: address Alex's concern about pci remove/rescan feature about
pci_setup_bridge changes.
v3: Kenji pointed out that pci_config_slot need to be called before
pci_bus_add_devices()
v4: move out pci_is_enabled checkout of pci_setup_bridge()
v5: change the applying sequence.
v6: change the functions name according to Jesse
v8: address Eric's concern, only overwrite leaf bridge resource that is not
big enough
v9: refresh to be applied after bjorn's patch, and remove trick about save
size and restore resource second try.
v10:alex found need to have export for pci_assign_unassigned_bridge_resources
v11: pass check_leaf with pci_bus_release_unused_bridge_res

patches (12-13)
will try to shrink, the pci hotplug related pcie port, so could save some for
others.

-v13: change resource_list to resource_list_x, to save size and flags aside,
otherwise grandchild res will get confused with son's res as could be used

-v14: default it is disabled. could use pci=try=2 to enable it.
-v15: seperate release_child_resources aside.

-v16: refresh according to new pci/linux-next of 2009_01_15
change some name according to Jesse

Thanks

Yinghai




2010-01-16 23:20:09

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 10/11] pciehp: add support for bridge resource reallocation

On Sat, 2010-01-16 at 01:34 -0800, Yinghai Lu wrote:
> On 01/15/2010 08:59 PM, Alex Chiang wrote:
> > * Yinghai Lu <[email protected]>:
> >> From: Kenji Kaneshige <[email protected]>
> >>
> >> With this patch, pciehp driver try to clear PCI bridge resources to
> >> parent bridge (root port or switch downstream port) of the slot
> >>
> >> so we can shrink pci bridge resource for those port
> >>
> >> This feature is enabled when 'pciehp_realloc' option is specified.
> >>
> >> -v2: make it could be appiled after Yinghai patchset that touch pci bridge
> >> resource also remove poweron check, because pci_bridge_release_res will
> >> check child at first
> >
> > Same comment as my earlier patch. Why not just make this the
> > default behavior, instead of introducing yet another command line
> > parameter for users to guess at?
>
> it will break Eric's setup/

I think this is a clue that we don't understand the problem well enough
yet.

I'm opposed to adding kernel parameters for this sort of thing. It is
unreasonable to expect users to figure out whether they need to use this
parameter or not.

Special-case switches like this make it much harder to maintain the code
in the future.

Bjorn

2010-01-17 17:34:18

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 10/11] pciehp: add support for bridge resource reallocation

On Sat, 16 Jan 2010 16:14:31 -0700
Bjorn Helgaas <[email protected]> wrote:

> On Sat, 2010-01-16 at 01:34 -0800, Yinghai Lu wrote:
> > On 01/15/2010 08:59 PM, Alex Chiang wrote:
> > > * Yinghai Lu <[email protected]>:
> > >> From: Kenji Kaneshige <[email protected]>
> > >>
> > >> With this patch, pciehp driver try to clear PCI bridge resources
> > >> to parent bridge (root port or switch downstream port) of the
> > >> slot
> > >>
> > >> so we can shrink pci bridge resource for those port
> > >>
> > >> This feature is enabled when 'pciehp_realloc' option is
> > >> specified.
> > >>
> > >> -v2: make it could be appiled after Yinghai patchset that touch
> > >> pci bridge resource also remove poweron check, because
> > >> pci_bridge_release_res will check child at first
> > >
> > > Same comment as my earlier patch. Why not just make this the
> > > default behavior, instead of introducing yet another command line
> > > parameter for users to guess at?
> >
> > it will break Eric's setup/
>
> I think this is a clue that we don't understand the problem well
> enough yet.
>
> I'm opposed to adding kernel parameters for this sort of thing. It is
> unreasonable to expect users to figure out whether they need to use
> this parameter or not.
>
> Special-case switches like this make it much harder to maintain the
> code in the future.

Agreed. Yinghai, what's wrong in the new reassignment code that
causes Eric's setup to break? Can we just fix that instead and enable
reallocation by default?

--
Jesse Barnes, Intel Open Source Technology Center

2010-01-17 22:37:16

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 10/11] pciehp: add support for bridge resource reallocation

On 01/17/2010 09:34 AM, Jesse Barnes wrote:
> On Sat, 16 Jan 2010 16:14:31 -0700
> Bjorn Helgaas <[email protected]> wrote:
>
>> On Sat, 2010-01-16 at 01:34 -0800, Yinghai Lu wrote:
>>> On 01/15/2010 08:59 PM, Alex Chiang wrote:
>>>> * Yinghai Lu <[email protected]>:
>>>>> From: Kenji Kaneshige <[email protected]>
>>>>>
>>>>> With this patch, pciehp driver try to clear PCI bridge resources
>>>>> to parent bridge (root port or switch downstream port) of the
>>>>> slot
>>>>>
>>>>> so we can shrink pci bridge resource for those port
>>>>>
>>>>> This feature is enabled when 'pciehp_realloc' option is
>>>>> specified.
>>>>>
>>>>> -v2: make it could be appiled after Yinghai patchset that touch
>>>>> pci bridge resource also remove poweron check, because
>>>>> pci_bridge_release_res will check child at first
>>>>
>>>> Same comment as my earlier patch. Why not just make this the
>>>> default behavior, instead of introducing yet another command line
>>>> parameter for users to guess at?
>>>
>>> it will break Eric's setup/
>>
>> I think this is a clue that we don't understand the problem well
>> enough yet.
>>
>> I'm opposed to adding kernel parameters for this sort of thing. It is
>> unreasonable to expect users to figure out whether they need to use
>> this parameter or not.
>>
>> Special-case switches like this make it much harder to maintain the
>> code in the future.
>
> Agreed. Yinghai, what's wrong in the new reassignment code that
> causes Eric's setup to break? Can we just fix that instead and enable
> reallocation by default?

no.

Eric's setup is:
hot plug one addon card into one hotplug slots and new addon cards have more hot plug slots.

SLOTA will have SLOT B, SLOT C, and SLOT D as children. and not devices in SLOTB/C/D at first.

with the pciehp_realloc is specified, the SLOTA will not get any device assigned, and later
if put devices in SLOTB/SLOTC/SLOTD, will not get resource for those devices.
and we can not go above to update SLOT A, because: some devices on SLOTB could already have driver loaded.

if you don't like pciehp_realloc at this point, please only apply 1-8 and 11.

Thanks

Yinghai Lu

2010-01-16 02:43:33

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 08/11] pci: pciehp second try to get big range for pcie devices

handle the case the slot bridge that doesn't get pre-allocated big enough res
from FW.
for example pcie devices need 256M, but the bridge only get preallocated 2M...

-v2: use resource_list_x

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 49 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index c348882..2ff34ca 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1009,12 +1009,60 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
{
struct pci_bus *bus;
struct pci_bus *parent = bridge->subordinate;
+ bool second_tried = false;
+ struct resource_list_x head, *list;
int retval;
+ unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+ IORESOURCE_PREFETCH;
+
+ head.next = NULL;

+again:
pci_bus_size_bridges(parent);
- __pci_bridge_assign_resources(bridge, NULL);
+ __pci_bridge_assign_resources(bridge, &head);
retval = pci_reenable_device(bridge);
pci_set_master(bridge);
pci_enable_bridges(parent);
+
+ /* any device complain? */
+ if (!head.next)
+ return;
+
+ if (second_tried) {
+ /* still fail, don't need to try more */
+ free_failed_list(&head);
+ return;
+ }
+
+ second_tried = true;
+ printk(KERN_DEBUG "PCI: second try to assign unassigned res\n");
+
+ /*
+ * Try to release leaf bridge's resources that doesn't fit resource of
+ * child device under that bridge
+ */
+ for (list = head.next; list;) {
+ unsigned long flags = list->flags;
+
+ bus = list->dev->bus;
+ pci_bus_release_bridge_resources(bus, flags & type_mask,
+ whole_subtree);
+ list = list->next;
+ }
+ /* retore size and flags */
+ for (list = head.next; list;) {
+ struct resource *res = list->res;
+
+ res->start = list->start;
+ res->end = list->end;
+ res->flags = list->flags;
+ if (list->dev->subordinate)
+ res->flags = 0;
+
+ list = list->next;
+ }
+ free_failed_list(&head);
+
+ goto again;
}
EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
--
1.6.4.2

2010-01-16 02:43:41

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 11/11] pci: set PCI_PREF_RANGE_TYPE_64 in pci_bridge_check_ranges

make pci_bridge_check_ranges() to store the PCI_PREF_RANGE_TYPE_64 in addition
to IORESOURCE_MEM_64. just like pci_read_bridge_bases()

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 89a531f..6ca0152 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -380,8 +380,10 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
}
if (pmem) {
b_res[2].flags |= IORESOURCE_MEM | IORESOURCE_PREFETCH;
- if ((pmem & PCI_PREF_RANGE_TYPE_MASK) == PCI_PREF_RANGE_TYPE_64)
+ if ((pmem & PCI_PREF_RANGE_TYPE_MASK) == PCI_PREF_RANGE_TYPE_64){
b_res[2].flags |= IORESOURCE_MEM_64;
+ b_res[2].flags |= PCI_PREF_RANGE_TYPE_64;
+ }
}

/* double check if bridge does support 64 bit pref */
--
1.6.4.2

2010-01-16 02:43:31

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 05/11] pci: update bridge res to get more big range in pci assign unssign

BIOS separate IO range between several IOHs, and on some slots, BIOS assign the
resource to the bridge, but stop assigning resource to the device under that
bridge, because the device need big resource.

1. pci assign unassign and record the failed device resource.
2. clear the BIOS assigned resource of the parent bridge of fail device
3. go back and call pci assign unsigned
4. if it still fail, will go up more bridges. and clear and try again.

use pci_try_num to control back track bridge levels.

-v2: update it with resource_list_x
-v3: make pci_try_num default to 1. and when pci_try_num is set to more than 1
will check it with max_depth, and adjust that to make sure it is bigger
enough
-v4: update to enum release_type and _used_calling

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/pci.c | 5 ++
drivers/pci/pci.h | 2 +
drivers/pci/setup-bus.c | 117 ++++++++++++++++++++++++++++++++++++++++++++++-
3 files changed, 123 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d62a5de..9dc358d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2951,6 +2951,11 @@ static int __init pci_setup(char *str)
pci_no_aer();
} else if (!strcmp(str, "nodomains")) {
pci_no_domains();
+ } else if (!strncmp(str, "try=", 4)) {
+ int try_num = memparse(str + 4, &str);
+
+ if (try_num > 0 && try_num < 16)
+ pci_try_num = try_num;
} else if (!strncmp(str, "cbiosize=", 9)) {
pci_cardbus_io_size = memparse(str + 9, &str);
} else if (!strncmp(str, "cbmemsize=", 10)) {
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4eb10f4..f6582c4 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -212,6 +212,8 @@ static inline int pci_ari_enabled(struct pci_bus *bus)
return bus->self && bus->self->ari_enabled;
}

+extern int pci_try_num;
+
#ifdef CONFIG_PCI_QUIRKS
extern int pci_is_reassigndev(struct pci_dev *dev);
resource_size_t pci_specified_resource_alignment(struct pci_dev *dev);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 5eebec8..8ad515e 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -798,11 +798,69 @@ static void pci_bus_dump_resources(struct pci_bus *bus)
}
}

+static int __init pci_bus_get_depth(struct pci_bus *bus)
+{
+ int depth = 0;
+ struct pci_dev *dev;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ int ret;
+ struct pci_bus *b = dev->subordinate;
+ if (!b)
+ continue;
+
+ ret = pci_bus_get_depth(b);
+ if (ret + 1 > depth)
+ depth = ret + 1;
+ }
+
+ return depth;
+}
+static int __init pci_get_max_depth(void)
+{
+ int depth = 0;
+ struct pci_bus *bus;
+
+ list_for_each_entry(bus, &pci_root_buses, node) {
+ int ret;
+
+ ret = pci_bus_get_depth(bus);
+ if (ret > depth)
+ depth = ret;
+ }
+
+ return depth;
+}
+
+/*
+ * first try will not touch pci bridge res
+ * second try will clear small leaf bridge res
+ * third try will clear related bridge: some aggressive
+ */
+int pci_try_num = 1;
void __init
pci_assign_unassigned_resources(void)
{
struct pci_bus *bus;
+ int tried_times = 0;
+ enum release_type rel_type = leaf_only;
+ struct resource_list_x head, *list;
+ unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+ IORESOURCE_PREFETCH;
+ unsigned long failed_type;
+ int max_depth = pci_get_max_depth();
+
+ head.next = NULL;
+
+ if (pci_try_num > 1) {
+ if (max_depth + 1 > pci_try_num)
+ pci_try_num = max_depth + 1;
+ }
+
+ printk(KERN_DEBUG "PCI: max depth: %d pci_try_num: %d\n",
+ max_depth, pci_try_num);

+again:
/* Depth first, calculate sizes and alignments of all
subordinate buses. */
list_for_each_entry(bus, &pci_root_buses, node) {
@@ -810,7 +868,64 @@ pci_assign_unassigned_resources(void)
}
/* Depth last, allocate resources and update the hardware. */
list_for_each_entry(bus, &pci_root_buses, node) {
- pci_bus_assign_resources(bus);
+ __pci_bus_assign_resources(bus, &head);
+ }
+ tried_times++;
+
+ /* any device complain? */
+ if (!head.next)
+ goto enable_and_dump;
+ failed_type = 0;
+ for (list = head.next; list;) {
+ failed_type |= list->flags;
+ list = list->next;
+ }
+ /*
+ * io port are tight, don't try extra
+ * or if reach the limit, don't want to try more
+ */
+ failed_type &= type_mask;
+ if ((failed_type == IORESOURCE_IO) || (tried_times >= pci_try_num)) {
+ free_failed_list(&head);
+ goto enable_and_dump;
+ }
+
+ printk(KERN_DEBUG "PCI: No. %d try to assign unassigned res\n",
+ tried_times + 1);
+
+ /* third times and later will not check if it is leaf */
+ if ((tried_times + 1) > 2)
+ rel_type = whole_subtree;
+
+ /*
+ * Try to release leaf bridge's resources that doesn't fit resource of
+ * child device under that bridge
+ */
+ for (list = head.next; list;) {
+ bus = list->dev->bus;
+ pci_bus_release_bridge_resources(bus, list->flags & type_mask,
+ rel_type);
+ list = list->next;
+ }
+ /* retore size and flags */
+ for (list = head.next; list;) {
+ struct resource *res = list->res;
+
+ res->start = list->start;
+ res->end = list->end;
+ res->flags = list->flags;
+ if (list->dev->subordinate)
+ res->flags = 0;
+
+ list = list->next;
+ }
+ free_failed_list(&head);
+
+ goto again;
+
+enable_and_dump:
+ /* Depth last, update the hardware. */
+ list_for_each_entry(bus, &pci_root_buses, node) {
pci_enable_bridges(bus);
}

--
1.6.4.2

2010-01-16 02:43:29

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 03/11] pci: reject mmio range start from 0 on pci_bridge read

that is wrong.

exposed by that patch that doesn's shrink pci bridge res.

-v2: change to "bar reading" to "reg reading"

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/probe.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 11824d7..70b1f74 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -316,13 +316,17 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
limit |= (io_limit_hi << 16);
}

- if (base <= limit) {
+ if (base <= limit && base) {
res->flags = (io_base_lo & PCI_IO_RANGE_TYPE_MASK) | IORESOURCE_IO;
if (!res->start)
res->start = base;
if (!res->end)
res->end = limit + 0xfff;
dev_printk(KERN_DEBUG, &dev->dev, " bridge window %pR\n", res);
+ } else {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ " bridge window [io %04lx - %04lx] reg reading\n",
+ base, limit);
}

res = child->resource[1];
@@ -330,11 +334,15 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
pci_read_config_word(dev, PCI_MEMORY_LIMIT, &mem_limit_lo);
base = (mem_base_lo & PCI_MEMORY_RANGE_MASK) << 16;
limit = (mem_limit_lo & PCI_MEMORY_RANGE_MASK) << 16;
- if (base <= limit) {
+ if (base <= limit && base) {
res->flags = (mem_base_lo & PCI_MEMORY_RANGE_TYPE_MASK) | IORESOURCE_MEM;
res->start = base;
res->end = limit + 0xfffff;
dev_printk(KERN_DEBUG, &dev->dev, " bridge window %pR\n", res);
+ } else {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ " bridge window [mem 0x%08lx - 0x%08lx] reg reading\n",
+ base, limit + 0xfffff);
}

res = child->resource[2];
@@ -366,7 +374,7 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
#endif
}
}
- if (base <= limit) {
+ if (base <= limit && base) {
res->flags = (mem_base_lo & PCI_PREF_RANGE_TYPE_MASK) |
IORESOURCE_MEM | IORESOURCE_PREFETCH;
if (res->flags & PCI_PREF_RANGE_TYPE_64)
@@ -374,6 +382,10 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
res->start = base;
res->end = limit + 0xfffff;
dev_printk(KERN_DEBUG, &dev->dev, " bridge window %pR\n", res);
+ } else {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ " bridge window [mem 0x%08lx - %08lx pref] reg reading\n",
+ base, limit + 0xfffff);
}
}

--
1.6.4.2

2010-01-16 02:44:14

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 07/11] pci: pciehp clean flow in pciehp_configure_device

move out bus_size_bridges and assign resources out of pciehp_add_bridge()
and at last do them all together one time including slot bridge, to avoid to
call assign resources several times, when there are several bridges under the
slot bridge.
use pci_assign_nassigned_bridge_resources

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/hotplug/pciehp_pci.c | 23 +++++++++++++++++------
1 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 2173310..0a16444 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -53,17 +53,15 @@ static int __ref pciehp_add_bridge(struct pci_dev *dev)
busnr = pci_scan_bridge(parent, dev, busnr, pass);
if (!dev->subordinate)
return -1;
- pci_bus_size_bridges(dev->subordinate);
- pci_bus_assign_resources(parent);
- pci_enable_bridges(parent);
- pci_bus_add_devices(parent);
+
return 0;
}

int pciehp_configure_device(struct slot *p_slot)
{
struct pci_dev *dev;
- struct pci_bus *parent = p_slot->ctrl->pcie->port->subordinate;
+ struct pci_dev *bridge = p_slot->ctrl->pcie->port;
+ struct pci_bus *parent = bridge->subordinate;
int num, fn;
struct controller *ctrl = p_slot->ctrl;

@@ -96,12 +94,25 @@ int pciehp_configure_device(struct slot *p_slot)
(dev->hdr_type == PCI_HEADER_TYPE_CARDBUS)) {
pciehp_add_bridge(dev);
}
+ pci_dev_put(dev);
+ }
+
+ pci_assign_unassigned_bridge_resources(bridge);
+
+ for (fn = 0; fn < 8; fn++) {
+ dev = pci_get_slot(parent, PCI_DEVFN(0, fn));
+ if (!dev)
+ continue;
+ if ((dev->class >> 16) == PCI_BASE_CLASS_DISPLAY) {
+ pci_dev_put(dev);
+ continue;
+ }
pci_configure_slot(dev);
pci_dev_put(dev);
}

- pci_bus_assign_resources(parent);
pci_bus_add_devices(parent);
+
return 0;
}

--
1.6.4.2

2010-01-16 02:44:31

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 04/11] pci: don't shrink bridge resources

when we are clearing leaf bridge resource and try to get big one, we could
shrink the bridge if there is no resource under it.

let check with old resource size and make sure we are trying to get big one.

-v2: keep disable window print out, still could happen on non pci hotplug system

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 14 ++++++++++++--
1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index c704e5c..5eebec8 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -387,7 +387,7 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size)
{
struct pci_dev *dev;
struct resource *b_res = find_free_bus_resource(bus, IORESOURCE_IO);
- unsigned long size = 0, size1 = 0;
+ unsigned long size = 0, size1 = 0, old_size;

if (!b_res)
return;
@@ -412,12 +412,17 @@ static void pbus_size_io(struct pci_bus *bus, resource_size_t min_size)
}
if (size < min_size)
size = min_size;
+ old_size = resource_size(b_res);
+ if (old_size == 1)
+ old_size = 0;
/* To be fixed in 2.5: we should have sort of HAVE_ISA
flag in the struct pci_bus. */
#if defined(CONFIG_ISA) || defined(CONFIG_EISA)
size = (size & 0xff) + ((size & ~0xffUL) << 2);
#endif
size = ALIGN(size + size1, 4096);
+ if (size < old_size)
+ size = old_size;
if (!size) {
if (b_res->start || b_res->end)
dev_info(&bus->self->dev, "disabling bridge window "
@@ -438,7 +443,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
unsigned long type, resource_size_t min_size)
{
struct pci_dev *dev;
- resource_size_t min_align, align, size;
+ resource_size_t min_align, align, size, old_size;
resource_size_t aligns[12]; /* Alignments from 1Mb to 2Gb */
int order, max_order;
struct resource *b_res = find_free_bus_resource(bus, type);
@@ -488,6 +493,11 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
}
if (size < min_size)
size = min_size;
+ old_size = resource_size(b_res);
+ if (old_size == 1)
+ old_size = 0;
+ if (size < old_size)
+ size = old_size;

align = 0;
min_align = 0;
--
1.6.4.2

2010-01-16 02:44:45

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 02/11] pci: add failed_list to record failed one for pci_bus_assign_resources

so later we can do sth with those failed one

-v2: store start, end, flags aside. so could keep res cleared when assign
failed. and make following assignment of its children do not go wild

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 66 ++++++++++++++++++++++++++++++++++++++++++++---
1 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 3dcbb65..c704e5c 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -27,7 +27,52 @@
#include <linux/slab.h>
#include "pci.h"

-static void pbus_assign_resources_sorted(const struct pci_bus *bus)
+struct resource_list_x {
+ struct resource_list_x *next;
+ struct resource *res;
+ struct pci_dev *dev;
+ resource_size_t start;
+ resource_size_t end;
+ unsigned long flags;
+};
+
+static void add_to_failed_list(struct resource_list_x *head,
+ struct pci_dev *dev, struct resource *res)
+{
+ struct resource_list_x *list = head;
+ struct resource_list_x *ln = list->next;
+ struct resource_list_x *tmp;
+
+ tmp = kmalloc(sizeof(*tmp), GFP_KERNEL);
+ if (!tmp) {
+ pr_warning("add_to_failed_list: kmalloc() failed!\n");
+ return;
+ }
+
+ tmp->next = ln;
+ tmp->res = res;
+ tmp->dev = dev;
+ tmp->start = res->start;
+ tmp->end = res->end;
+ tmp->flags = res->flags;
+ list->next = tmp;
+}
+
+static void free_failed_list(struct resource_list_x *head)
+{
+ struct resource_list_x *list, *tmp;
+
+ for (list = head->next; list;) {
+ tmp = list;
+ list = list->next;
+ kfree(tmp);
+ }
+
+ head->next = NULL;
+}
+
+static void pbus_assign_resources_sorted(const struct pci_bus *bus,
+ struct resource_list_x *fail_head)
{
struct pci_dev *dev;
struct resource *res;
@@ -58,6 +103,13 @@ static void pbus_assign_resources_sorted(const struct pci_bus *bus)
res = list->res;
idx = res - &list->dev->resource[0];
if (pci_assign_resource(list->dev, idx)) {
+ if (fail_head && !pci_is_root_bus(list->dev->bus)) {
+ /*
+ * device need to keep flags and size
+ * for next try
+ */
+ add_to_failed_list(fail_head, list->dev, res);
+ }
res->start = 0;
res->end = 0;
res->flags = 0;
@@ -572,19 +624,20 @@ void __ref pci_bus_size_bridges(struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_size_bridges);

-void __ref pci_bus_assign_resources(const struct pci_bus *bus)
+static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,
+ struct resource_list_x *fail_head)
{
struct pci_bus *b;
struct pci_dev *dev;

- pbus_assign_resources_sorted(bus);
+ pbus_assign_resources_sorted(bus, fail_head);

list_for_each_entry(dev, &bus->devices, bus_list) {
b = dev->subordinate;
if (!b)
continue;

- pci_bus_assign_resources(b);
+ __pci_bus_assign_resources(b, fail_head);

switch (dev->class >> 8) {
case PCI_CLASS_BRIDGE_PCI:
@@ -602,6 +655,11 @@ void __ref pci_bus_assign_resources(const struct pci_bus *bus)
}
}
}
+
+void __ref pci_bus_assign_resources(const struct pci_bus *bus)
+{
+ __pci_bus_assign_resources(bus, NULL);
+}
EXPORT_SYMBOL(pci_bus_assign_resources);

static void pci_bridge_release_resources(struct pci_bus *bus,
--
1.6.4.2

2010-01-16 02:43:28

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 01/11] pci: add pci_bridge_release_unused_res and pci_bus_release_unused_bridge_res

so later we could use it to release small resource before pci assign unassign res

-v2: change name to release_child_resources according to Jesse
-v3: according to Linus, move release_child_resources to resource.c
also need to put the lock around them all to avoid recursive deep.
(my test case only have one level that need to be released)
-v4: seperate release_child_resources to another patch
according to Bjorn, change hard code 3 to PCI_BRIDGE_RESOURCE_END
-v5: remove unused in the name, and also use enum for release type according
to Jesse

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 95 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 52fbd42..3dcbb65 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -604,6 +604,101 @@ void __ref pci_bus_assign_resources(const struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_assign_resources);

+static void pci_bridge_release_resources(struct pci_bus *bus,
+ unsigned long type)
+{
+ int idx;
+ bool changed = false;
+ struct pci_dev *dev;
+ struct resource *r;
+ unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+ IORESOURCE_PREFETCH;
+
+ dev = bus->self;
+ for (idx = PCI_BRIDGE_RESOURCES; idx <= PCI_BRIDGE_RESOURCE_END;
+ idx++) {
+ r = &dev->resource[idx];
+ if ((r->flags & type_mask) != type)
+ continue;
+ if (!r->parent)
+ continue;
+ /*
+ * if there are children under that, we should release them
+ * all
+ */
+ release_child_resources(r);
+ if (!release_resource(r)) {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ "resource %d %pR released\n", idx, r);
+ /* keep the old size */
+ r->end = resource_size(r) - 1;
+ r->start = 0;
+ r->flags = 0;
+ changed = true;
+ }
+ }
+
+ if (changed) {
+ if (type & IORESOURCE_PREFETCH) {
+ /* avoiding touch the one without PREF */
+ type = IORESOURCE_PREFETCH;
+ }
+ __pci_setup_bridge(bus, type);
+ }
+}
+
+enum release_type {
+ leaf_only,
+ whole_subtree,
+};
+/*
+ * try to release pci bridge resources that is from leaf bridge,
+ * so we can allocate big new one later
+ */
+static void __ref pci_bus_release_bridge_resources(struct pci_bus *bus,
+ unsigned long type,
+ enum release_type rel_type)
+{
+ struct pci_dev *dev;
+ bool is_leaf_bridge = true;
+
+ list_for_each_entry(dev, &bus->devices, bus_list) {
+ struct pci_bus *b = dev->subordinate;
+ if (!b)
+ continue;
+
+ switch (dev->class >> 8) {
+ case PCI_CLASS_BRIDGE_CARDBUS:
+ is_leaf_bridge = false;
+ break;
+
+ case PCI_CLASS_BRIDGE_PCI:
+ default:
+ is_leaf_bridge = false;
+ if (rel_type == whole_subtree)
+ pci_bus_release_bridge_resources(b, type,
+ whole_subtree);
+ break;
+ }
+ }
+
+ /* The root bus? */
+ if (!bus->self)
+ return;
+
+ switch (bus->self->class >> 8) {
+ case PCI_CLASS_BRIDGE_CARDBUS:
+ break;
+
+ case PCI_CLASS_BRIDGE_PCI:
+ default:
+ if (((rel_type == leaf_only) && is_leaf_bridge) ||
+ (rel_type == whole_subtree))
+ pci_bridge_release_resources(bus, type);
+ break;
+ }
+}
+
static void pci_bus_dump_res(struct pci_bus *bus)
{
int i;
--
1.6.4.2

2010-01-16 02:45:56

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 10/11] pciehp: add support for bridge resource reallocation

From: Kenji Kaneshige <[email protected]>

With this patch, pciehp driver try to clear PCI bridge resources to
parent bridge (root port or switch downstream port) of the slot

so we can shrink pci bridge resource for those port

This feature is enabled when 'pciehp_realloc' option is specified.

-v2: make it could be appiled after Yinghai patchset that touch pci bridge
resource also remove poweron check, because pci_bridge_release_res will
check child at first

Signed-off-by: Kenji Kaneshige <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/hotplug/pciehp.h | 1 +
drivers/pci/hotplug/pciehp_core.c | 7 +++++++
drivers/pci/hotplug/pciehp_pci.c | 4 ++++
3 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 4ed76b4..78cf0f4 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -43,6 +43,7 @@ extern int pciehp_poll_mode;
extern int pciehp_poll_time;
extern int pciehp_debug;
extern int pciehp_force;
+extern int pciehp_realloc;
extern struct workqueue_struct *pciehp_wq;

#define dbg(format, arg...) \
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 920f820..8f990ad 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -41,6 +41,7 @@ int pciehp_debug;
int pciehp_poll_mode;
int pciehp_poll_time;
int pciehp_force;
+int pciehp_realloc;
struct workqueue_struct *pciehp_wq;

#define DRIVER_VERSION "0.4"
@@ -55,10 +56,12 @@ module_param(pciehp_debug, bool, 0644);
module_param(pciehp_poll_mode, bool, 0644);
module_param(pciehp_poll_time, int, 0644);
module_param(pciehp_force, bool, 0644);
+module_param(pciehp_realloc, bool, 0644);
MODULE_PARM_DESC(pciehp_debug, "Debugging mode enabled or not");
MODULE_PARM_DESC(pciehp_poll_mode, "Using polling mechanism for hot-plug events or not");
MODULE_PARM_DESC(pciehp_poll_time, "Polling mechanism frequency, in seconds");
MODULE_PARM_DESC(pciehp_force, "Force pciehp, even if _OSC and OSHP are missing");
+MODULE_PARM_DESC(pciehp_realloc, "Realloc resources for slot's parent bridge");

#define PCIE_MODULE_NAME "pciehp"

@@ -272,6 +275,10 @@ static int pciehp_probe(struct pcie_device *dev)
if (!occupied && poweron && POWER_CTRL(ctrl))
pciehp_power_off_slot(slot);

+ /* Release I/O window of the slots's parent bridge */
+ if (pciehp_realloc)
+ pci_bridge_release_res(dev->port->subordinate);
+
return 0;

err_out_free_ctrl_slot:
diff --git a/drivers/pci/hotplug/pciehp_pci.c b/drivers/pci/hotplug/pciehp_pci.c
index 0a16444..f6119e4 100644
--- a/drivers/pci/hotplug/pciehp_pci.c
+++ b/drivers/pci/hotplug/pciehp_pci.c
@@ -166,5 +166,9 @@ int pciehp_unconfigure_device(struct slot *p_slot)
pci_dev_put(temp);
}

+ /* Release I/O window of the slots's parent bridge */
+ if (pciehp_realloc)
+ pci_bridge_release_res(parent);
+
return rc;
}
--
1.6.4.2

2010-01-16 02:46:07

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 06/11] pci: introduce pci_assign_unassigned_bridge_resources

for pciehp to use it later

pci_setup_bridge() will not check enabled for the slot bridge, otherwise
update res is not updated to bridge BAR. that is bridge is enabled already for
port service.

-v2: update it with resource_list_x
-v3: remove the clear busmaster. according to Kenji

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 92 ++++++++++++++++++++++++++++++++++++++++++++--
include/linux/pci.h | 1 +
2 files changed, 89 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 8ad515e..c348882 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -71,6 +71,50 @@ static void free_failed_list(struct resource_list_x *head)
head->next = NULL;
}

+static void pdev_assign_resources_sorted(struct pci_dev *dev,
+ struct resource_list_x *fail_head)
+{
+ struct resource *res;
+ struct resource_list head, *list, *tmp;
+ int idx;
+ u16 class = dev->class >> 8;
+
+ head.next = NULL;
+
+ /* Don't touch classless devices or host bridges or ioapics. */
+ if (class == PCI_CLASS_NOT_DEFINED || class == PCI_CLASS_BRIDGE_HOST)
+ return;
+
+ /* Don't touch ioapic devices already enabled by firmware */
+ if (class == PCI_CLASS_SYSTEM_PIC) {
+ u16 command;
+ pci_read_config_word(dev, PCI_COMMAND, &command);
+ if (command & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY))
+ return;
+ }
+
+ pdev_sort_resources(dev, &head);
+
+ for (list = head.next; list;) {
+ res = list->res;
+ idx = res - &list->dev->resource[0];
+ if (pci_assign_resource(list->dev, idx)) {
+ if (fail_head && !pci_is_root_bus(list->dev->bus)) {
+ /*
+ * device need to keep flags and size
+ * for second try
+ */
+ add_to_failed_list(fail_head, list->dev, res);
+ }
+ res->start = 0;
+ res->end = 0;
+ res->flags = 0;
+ }
+ tmp = list;
+ list = list->next;
+ kfree(tmp);
+ }
+}
static void pbus_assign_resources_sorted(const struct pci_bus *bus,
struct resource_list_x *fail_head)
{
@@ -278,9 +322,6 @@ static void __pci_setup_bridge(struct pci_bus *bus, unsigned long type)
{
struct pci_dev *bridge = bus->self;

- if (pci_is_enabled(bridge))
- return;
-
dev_info(&bridge->dev, "PCI bridge to [bus %02x-%02x]\n",
bus->secondary, bus->subordinate);

@@ -651,7 +692,8 @@ static void __ref __pci_bus_assign_resources(const struct pci_bus *bus,

switch (dev->class >> 8) {
case PCI_CLASS_BRIDGE_PCI:
- pci_setup_bridge(b);
+ if (!pci_is_enabled(dev))
+ pci_setup_bridge(b);
break;

case PCI_CLASS_BRIDGE_CARDBUS:
@@ -672,6 +714,34 @@ void __ref pci_bus_assign_resources(const struct pci_bus *bus)
}
EXPORT_SYMBOL(pci_bus_assign_resources);

+static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
+ struct resource_list_x *fail_head)
+{
+ struct pci_bus *b;
+
+ pdev_assign_resources_sorted((struct pci_dev *)bridge, fail_head);
+
+ b = bridge->subordinate;
+ if (!b)
+ return;
+
+ __pci_bus_assign_resources(b, fail_head);
+
+ switch (bridge->class >> 8) {
+ case PCI_CLASS_BRIDGE_PCI:
+ pci_setup_bridge(b);
+ break;
+
+ case PCI_CLASS_BRIDGE_CARDBUS:
+ pci_setup_cardbus(b);
+ break;
+
+ default:
+ dev_info(&bridge->dev, "not setting up bridge for bus "
+ "%04x:%02x\n", pci_domain_nr(b), b->number);
+ break;
+ }
+}
static void pci_bridge_release_resources(struct pci_bus *bus,
unsigned long type)
{
@@ -934,3 +1004,17 @@ enable_and_dump:
pci_bus_dump_resources(bus);
}
}
+
+void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
+{
+ struct pci_bus *bus;
+ struct pci_bus *parent = bridge->subordinate;
+ int retval;
+
+ pci_bus_size_bridges(parent);
+ __pci_bridge_assign_resources(bridge, NULL);
+ retval = pci_reenable_device(bridge);
+ pci_set_master(bridge);
+ pci_enable_bridges(parent);
+}
+EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 6b2949c..0e240ed 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -806,6 +806,7 @@ void pci_bus_assign_resources(const struct pci_bus *bus);
void pci_bus_size_bridges(struct pci_bus *bus);
int pci_claim_resource(struct pci_dev *, int);
void pci_assign_unassigned_resources(void);
+void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
void pdev_enable_device(struct pci_dev *);
void pdev_sort_resources(struct pci_dev *, struct resource_list *);
int pci_enable_resources(struct pci_dev *, int mask);
--
1.6.4.2

2010-01-16 02:46:25

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 09/11] pci: pci_bridge_release_res

prepare for pciehp_realloc

it will clear the resource size for bridge

-v2: patrick Keller pointed out need to export it...

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/setup-bus.c | 43 +++++++++++++++++++++++++++++++++++++++++++
include/linux/pci.h | 1 +
2 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 2ff34ca..89a531f 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -742,6 +742,49 @@ static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
break;
}
}
+
+void pci_bridge_release_res(struct pci_bus *bus)
+{
+ int idx;
+ bool changed = false;
+ struct pci_dev *dev;
+ struct resource *r;
+
+ /* The root bus? */
+ if (!bus->self)
+ return;
+
+ /* for pci bridges res only */
+ dev = bus->self;
+ if ((dev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
+ return;
+
+ for (idx = PCI_BRIDGE_RESOURCES; idx < PCI_BRIDGE_RESOURCES + 3;
+ idx++) {
+ r = &dev->resource[idx];
+ if (!r->parent)
+ continue;
+
+ /* if there are children under that, we should not release it */
+ if (r->child)
+ continue;
+
+ if (!release_resource(r)) {
+ dev_printk(KERN_DEBUG, &dev->dev,
+ "resource %d %pR released\n", idx, r);
+ /* old size is not kept */
+ r->start = 0;
+ r->end = 0;
+ r->flags = 0;
+ changed = true;
+ }
+ }
+
+ if (changed)
+ pci_setup_bridge(bus);
+}
+EXPORT_SYMBOL_GPL(pci_bridge_release_res);
+
static void pci_bridge_release_resources(struct pci_bus *bus,
unsigned long type)
{
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0e240ed..b8c7e45 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -805,6 +805,7 @@ int pci_vpd_truncate(struct pci_dev *dev, size_t size);
void pci_bus_assign_resources(const struct pci_bus *bus);
void pci_bus_size_bridges(struct pci_bus *bus);
int pci_claim_resource(struct pci_dev *, int);
+void pci_bridge_release_res(struct pci_bus *bus);
void pci_assign_unassigned_resources(void);
void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge);
void pdev_enable_device(struct pci_dev *);
--
1.6.4.2

2010-01-16 03:57:55

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 01/11] pci: add pci_bridge_release_unused_res and pci_bus_release_unused_bridge_res

Hi Yinghai,

First, thank you very much for writing the summary in patch 0/11.
I feel like I finally understand what you are trying to do with
this series.

> +/*
> + * try to release pci bridge resources that is from leaf bridge,
> + * so we can allocate big new one later
> + */
> +static void __ref pci_bus_release_bridge_resources(struct pci_bus *bus,
> + unsigned long type,
> + enum release_type rel_type)
> +{
> + struct pci_dev *dev;
> + bool is_leaf_bridge = true;
> +
> + list_for_each_entry(dev, &bus->devices, bus_list) {
> + struct pci_bus *b = dev->subordinate;
> + if (!b)
> + continue;
> +

How about settting

is_leaf_bridge = false;

here, and then you don't need the repeated assignments below?

> + switch (dev->class >> 8) {
> + case PCI_CLASS_BRIDGE_CARDBUS:
> + is_leaf_bridge = false;
> + break;
> +
> + case PCI_CLASS_BRIDGE_PCI:
> + default:
> + is_leaf_bridge = false;
> + if (rel_type == whole_subtree)
> + pci_bus_release_bridge_resources(b, type,
> + whole_subtree);
> + break;
> + }
> + }
> +
> + /* The root bus? */
> + if (!bus->self)
> + return;

Won't work correctly on a non-materialized root bridge.

Use pci_is_root_bus() instead.

Thanks,
/ac

2010-01-16 04:05:54

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 02/11] pci: add failed_list to record failed one for pci_bus_assign_resources

Please change the title of this patch to:

pci: pbus_assign_resources_sorted now records failed insertions

> so later we can do sth with those failed one

I don't understand what "sth" means here.

thanks,
/ac

2010-01-16 04:20:15

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 03/11] pci: reject mmio range start from 0 on pci_bridge read

Please change title of patch to:

pci: pci_read_bridge_bases rejects MMIO ranges starting at 0

* Yinghai Lu <[email protected]>:
> that is wrong.
>
> exposed by that patch that doesn's shrink pci bridge res.

Ok, I don't understand how this patch interacts with patch 4/11.
Am I correct in understanding that something in 4/11 exposes a
problem that 3/11 fixes?

I'm just looking for a better explanation of *why* we need this
patch here.

> -v2: change to "bar reading" to "reg reading"
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/probe.c | 18 +++++++++++++++---
> 1 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 11824d7..70b1f74 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -316,13 +316,17 @@ void __devinit pci_read_bridge_bases(struct pci_bus *child)
> limit |= (io_limit_hi << 16);
> }
>
> - if (base <= limit) {
> + if (base <= limit && base) {

This construct (and the similar one below) is a little
non-idiomatic.

if (base && base <= limit)

reads a lot more naturally to me.

thanks,
/ac

2010-01-16 04:38:52

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 05/11] pci: update bridge res to get more big range in pci assign unssign

* Yinghai Lu <[email protected]>:
> BIOS separate IO range between several IOHs, and on some slots,
> BIOS assign the resource to the bridge, but stop assigning
> resource to the device under that bridge, because the device
> need big resource.
>
> 1. pci assign unassign and record the failed device resource.
> 2. clear the BIOS assigned resource of the parent bridge of fail device
> 3. go back and call pci assign unsigned
> 4. if it still fail, will go up more bridges. and clear and try again.

I agree with Jesse and Bjorn. I really hate introducing a new
command line param.

The goal here should be to do the right thing, all the time,
without requiring command line parameters.

I understand that you are avoiding a change to existing behavior
by defaulting to try=1.

What I don't understand is how you expect your change to benefit
any actual users. The usage model you propose is:

- user encounters failure

- user emails lkml/linux-pci and says "why doesn't my
device get resources?"

- we tell user, "please reboot with try=N for increasing
values of N until it works"

Why shouldn't we be aggressive all the time?

If this patch series is going to be accepted, we should turn it
on all the time. Everyone will get the benefits.

If it breaks machines, depending on the bug reports, it will end
up either:

a) we missed an implementation detail (easily fixed)

or

b) the design/strategy is wrong, in which case we need to
take a step back and think about a different approach to
solve the problem you're trying to solve

For this patch, I don't mind the changes to
pci_assign_unassigned_resources(), but I really do not agree with
the configurability introduced in pci_setup().

thanks,
/ac

2010-01-16 04:47:38

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 06/11] pci: introduce pci_assign_unassigned_bridge_resources

> +static void pdev_assign_resources_sorted(struct pci_dev *dev,
> + struct resource_list_x *fail_head)
> +{
> + struct resource *res;
> + struct resource_list head, *list, *tmp;
> + int idx;
> + u16 class = dev->class >> 8;
> +
> + head.next = NULL;
> +
> + /* Don't touch classless devices or host bridges or ioapics. */
> + if (class == PCI_CLASS_NOT_DEFINED || class == PCI_CLASS_BRIDGE_HOST)
> + return;
> +
> + /* Don't touch ioapic devices already enabled by firmware */
> + if (class == PCI_CLASS_SYSTEM_PIC) {
> + u16 command;
> + pci_read_config_word(dev, PCI_COMMAND, &command);
> + if (command & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY))
> + return;
> + }
> +
> + pdev_sort_resources(dev, &head);
> +
> + for (list = head.next; list;) {
> + res = list->res;
> + idx = res - &list->dev->resource[0];
> + if (pci_assign_resource(list->dev, idx)) {
> + if (fail_head && !pci_is_root_bus(list->dev->bus)) {
> + /*
> + * device need to keep flags and size
> + * for second try
> + */
> + add_to_failed_list(fail_head, list->dev, res);
> + }
> + res->start = 0;
> + res->end = 0;
> + res->flags = 0;
> + }
> + tmp = list;
> + list = list->next;
> + kfree(tmp);
> + }
> +}

This is extremely similar to pbus_assign_resources_sorted(). Any
chance you could factor out the commonalities and reduce the
amount of duplicated code?

> +static void __ref __pci_bridge_assign_resources(const struct pci_dev *bridge,
> + struct resource_list_x *fail_head)
> +{
> + struct pci_bus *b;
> +
> + pdev_assign_resources_sorted((struct pci_dev *)bridge, fail_head);
> +
> + b = bridge->subordinate;
> + if (!b)
> + return;
> +
> + __pci_bus_assign_resources(b, fail_head);
> +
> + switch (bridge->class >> 8) {
> + case PCI_CLASS_BRIDGE_PCI:
> + pci_setup_bridge(b);
> + break;
> +
> + case PCI_CLASS_BRIDGE_CARDBUS:
> + pci_setup_cardbus(b);
> + break;
> +
> + default:
> + dev_info(&bridge->dev, "not setting up bridge for bus "
> + "%04x:%02x\n", pci_domain_nr(b), b->number);

I think this should probably be a dev_dbg.

thanks,
/ac

2010-01-16 04:52:23

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 08/11] pci: pciehp second try to get big range for pcie devices

Code looks ok, just need to fix up some of the explanations.

> handle the case the slot bridge that doesn't get pre-allocated big enough res
> from FW.

Please change "res" to "resource" for changelog posterity.

> @@ -1009,12 +1009,60 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
> {
> struct pci_bus *bus;
> struct pci_bus *parent = bridge->subordinate;
> + bool second_tried = false;

second_try ?

> + struct resource_list_x head, *list;
> int retval;
> + unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
> + IORESOURCE_PREFETCH;
> +
> + head.next = NULL;
>
> +again:
> pci_bus_size_bridges(parent);
> - __pci_bridge_assign_resources(bridge, NULL);
> + __pci_bridge_assign_resources(bridge, &head);
> retval = pci_reenable_device(bridge);
> pci_set_master(bridge);
> pci_enable_bridges(parent);
> +
> + /* any device complain? */
> + if (!head.next)
> + return;
> +
> + if (second_tried) {
> + /* still fail, don't need to try more */
> + free_failed_list(&head);
> + return;
> + }
> +
> + second_tried = true;
> + printk(KERN_DEBUG "PCI: second try to assign unassigned res\n");

res -> resource

> + /*
> + * Try to release leaf bridge's resources that doesn't fit resource of
> + * child device under that bridge
> + */
> + for (list = head.next; list;) {
> + unsigned long flags = list->flags;
> +
> + bus = list->dev->bus;
> + pci_bus_release_bridge_resources(bus, flags & type_mask,
> + whole_subtree);
> + list = list->next;
> + }
> + /* retore size and flags */

retore -> restore

/ac

2010-01-16 04:58:33

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 09/11] pci: pci_bridge_release_res

> +void pci_bridge_release_res(struct pci_bus *bus)
> +{
> + int idx;
> + bool changed = false;
> + struct pci_dev *dev;
> + struct resource *r;
> +
> + /* The root bus? */
> + if (!bus->self)
> + return;
> +
> + /* for pci bridges res only */
> + dev = bus->self;
> + if ((dev->class >> 8) != PCI_CLASS_BRIDGE_PCI)
> + return;
> +
> + for (idx = PCI_BRIDGE_RESOURCES; idx < PCI_BRIDGE_RESOURCES + 3;
> + idx++) {
> + r = &dev->resource[idx];
> + if (!r->parent)
> + continue;
> +
> + /* if there are children under that, we should not release it */
> + if (r->child)
> + continue;
> +
> + if (!release_resource(r)) {
> + dev_printk(KERN_DEBUG, &dev->dev,
> + "resource %d %pR released\n", idx, r);
> + /* old size is not kept */
> + r->start = 0;
> + r->end = 0;
> + r->flags = 0;
> + changed = true;
> + }
> + }
> +
> + if (changed)
> + pci_setup_bridge(bus);
> +}
> +EXPORT_SYMBOL_GPL(pci_bridge_release_res);
> +
> static void pci_bridge_release_resources(struct pci_bus *bus,
> unsigned long type)

My brain is melting.

Why do we have pci_bridge_release_res and pci_bridge_release_resources?

The code is similar, but not the same. The names are similar but
not the same.

Any chance you could factor similarities, and then write a
comment that explains the differences to a developer?

thanks,
/ac

2010-01-16 04:59:15

by Alex Chiang

[permalink] [raw]
Subject: Re: [PATCH 10/11] pciehp: add support for bridge resource reallocation

* Yinghai Lu <[email protected]>:
> From: Kenji Kaneshige <[email protected]>
>
> With this patch, pciehp driver try to clear PCI bridge resources to
> parent bridge (root port or switch downstream port) of the slot
>
> so we can shrink pci bridge resource for those port
>
> This feature is enabled when 'pciehp_realloc' option is specified.
>
> -v2: make it could be appiled after Yinghai patchset that touch pci bridge
> resource also remove poweron check, because pci_bridge_release_res will
> check child at first

Same comment as my earlier patch. Why not just make this the
default behavior, instead of introducing yet another command line
parameter for users to guess at?

/ac

2010-01-16 05:23:51

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 10/11] pciehp: add support for bridge resource reallocation

thanks for checking the patches.

will address them.

Yinghai

2010-01-16 09:36:18

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 10/11] pciehp: add support for bridge resource reallocation

On 01/15/2010 08:59 PM, Alex Chiang wrote:
> * Yinghai Lu <[email protected]>:
>> From: Kenji Kaneshige <[email protected]>
>>
>> With this patch, pciehp driver try to clear PCI bridge resources to
>> parent bridge (root port or switch downstream port) of the slot
>>
>> so we can shrink pci bridge resource for those port
>>
>> This feature is enabled when 'pciehp_realloc' option is specified.
>>
>> -v2: make it could be appiled after Yinghai patchset that touch pci bridge
>> resource also remove poweron check, because pci_bridge_release_res will
>> check child at first
>
> Same comment as my earlier patch. Why not just make this the
> default behavior, instead of introducing yet another command line
> parameter for users to guess at?

it will break Eric's setup/

YH

2010-01-16 10:12:17

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 01/11] pci: add pci_bridge_release_unused_res and pci_bus_release_unused_bridge_res

On 01/15/2010 07:57 PM, Alex Chiang wrote:
> Hi Yinghai,
>
> First, thank you very much for writing the summary in patch 0/11.
> I feel like I finally understand what you are trying to do with
> this series.
>
>> +/*
>> + * try to release pci bridge resources that is from leaf bridge,
>> + * so we can allocate big new one later
>> + */
>> +static void __ref pci_bus_release_bridge_resources(struct pci_bus *bus,
>> + unsigned long type,
>> + enum release_type rel_type)
>> +{
>> + struct pci_dev *dev;
>> + bool is_leaf_bridge = true;
>> +
>> + list_for_each_entry(dev, &bus->devices, bus_list) {
>> + struct pci_bus *b = dev->subordinate;
>> + if (!b)
>> + continue;
>> +
>
> How about settting
>
> is_leaf_bridge = false;
>
> here, and then you don't need the repeated assignments below?
>

no
that is AND

>> + switch (dev->class >> 8) {
>> + case PCI_CLASS_BRIDGE_CARDBUS:
>> + is_leaf_bridge = false;
>> + break;
>> +
>> + case PCI_CLASS_BRIDGE_PCI:
>> + default:
>> + is_leaf_bridge = false;
>> + if (rel_type == whole_subtree)
>> + pci_bus_release_bridge_resources(b, type,
>> + whole_subtree);
>> + break;
>> + }
>> + }
>> +
>> + /* The root bus? */
>> + if (!bus->self)
>> + return;
>
> Won't work correctly on a non-materialized root bridge.
>
> Use pci_is_root_bus() instead.


OK

YH