2013-05-23 17:14:22

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional

BenH reported that there is some assign unassigned resource problem
in powerpc.

It turns out after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date: Thu Feb 23 19:23:29 2012 -0800
|
| PCI: Retry on IORESOURCE_IO type allocations

even the root bus does not have io port range, it will keep retrying
to realloc with mmio.

Current retry logic is : try with must+optional at first, and if
it fails with any ioport or mmio, it will try must then try to extend
must with optional.
That will fail as mmio-non-pref and mmio-pref for bridge will
be next to each other. So we have no chance to extend mmio-non-pref.

We can check fail type and only fall back for io port only, that will
keep mmio type still have must+optional.

This will be become more often when we have x86 8 sockets or 32 sockets
system, and those system will have one root bus per socket.
They will have some root buses do not have ioport range.

-v2: need to remove assigned entries from optional list too.

Reported-by: Benjamin Herrenschmidt <[email protected]>
Tested-by: Gavin Shan <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>

---
drivers/pci/setup-bus.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -317,6 +317,10 @@ static void __assign_resources_sorted(st
LIST_HEAD(local_fail_head);
struct pci_dev_resource *save_res;
struct pci_dev_resource *dev_res;
+ unsigned long fail_type = 0;
+ struct pci_dev_resource *fail_res;
+ unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
+ IORESOURCE_PREFETCH;

/* Check if optional add_size is there */
if (!realloc_head || list_empty(realloc_head))
@@ -348,6 +352,25 @@ static void __assign_resources_sorted(st
return;
}

+ /* check failed type */
+ list_for_each_entry(fail_res, &local_fail_head, list)
+ fail_type |= fail_res->flags & type_mask;
+ /* only io port fails */
+ if ((fail_type & type_mask) == IORESOURCE_IO) {
+ struct pci_dev_resource *tmp_res;
+
+ /* remove assigned non ioport from head list etc */
+ list_for_each_entry_safe(dev_res, tmp_res, head, list)
+ if (dev_res->res->parent &&
+ !(dev_res->res->flags & IORESOURCE_IO)) {
+ /* remove it from realloc_head list */
+ remove_from_list(realloc_head, dev_res->res);
+ remove_from_list(&save_head, dev_res->res);
+ list_del(&dev_res->list);
+ kfree(dev_res);
+ }
+ }
+
free_list(&local_fail_head);
/* Release assigned resource */
list_for_each_entry(dev_res, head, list)


2013-05-24 17:26:10

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional

On Thu, May 23, 2013 at 11:11 AM, Yinghai Lu <[email protected]> wrote:
> BenH reported that there is some assign unassigned resource problem
> in powerpc.
>
> It turns out after
> | commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
> | Date: Thu Feb 23 19:23:29 2012 -0800
> |
> | PCI: Retry on IORESOURCE_IO type allocations
>
> even the root bus does not have io port range, it will keep retrying
> to realloc with mmio.
>
> Current retry logic is : try with must+optional at first, and if
> it fails with any ioport or mmio, it will try must then try to extend
> must with optional.
> That will fail as mmio-non-pref and mmio-pref for bridge will
> be next to each other. So we have no chance to extend mmio-non-pref.
>
> We can check fail type and only fall back for io port only, that will
> keep mmio type still have must+optional.
>
> This will be become more often when we have x86 8 sockets or 32 sockets
> system, and those system will have one root bus per socket.
> They will have some root buses do not have ioport range.
>
> -v2: need to remove assigned entries from optional list too.
>
> Reported-by: Benjamin Herrenschmidt <[email protected]>
> Tested-by: Gavin Shan <[email protected]>
> Signed-off-by: Yinghai Lu <[email protected]>
>
> ---
> drivers/pci/setup-bus.c | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> Index: linux-2.6/drivers/pci/setup-bus.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/setup-bus.c
> +++ linux-2.6/drivers/pci/setup-bus.c
> @@ -317,6 +317,10 @@ static void __assign_resources_sorted(st
> LIST_HEAD(local_fail_head);
> struct pci_dev_resource *save_res;
> struct pci_dev_resource *dev_res;
> + unsigned long fail_type = 0;
> + struct pci_dev_resource *fail_res;
> + unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
> + IORESOURCE_PREFETCH;
>
> /* Check if optional add_size is there */
> if (!realloc_head || list_empty(realloc_head))
> @@ -348,6 +352,25 @@ static void __assign_resources_sorted(st
> return;
> }
>
> + /* check failed type */
> + list_for_each_entry(fail_res, &local_fail_head, list)
> + fail_type |= fail_res->flags & type_mask;
> + /* only io port fails */
> + if ((fail_type & type_mask) == IORESOURCE_IO) {
> + struct pci_dev_resource *tmp_res;
> +
> + /* remove assigned non ioport from head list etc */
> + list_for_each_entry_safe(dev_res, tmp_res, head, list)
> + if (dev_res->res->parent &&
> + !(dev_res->res->flags & IORESOURCE_IO)) {
> + /* remove it from realloc_head list */
> + remove_from_list(realloc_head, dev_res->res);
> + remove_from_list(&save_head, dev_res->res);
> + list_del(&dev_res->list);
> + kfree(dev_res);
> + }
> + }

The problem we're trying to solve is that when allocation for type X
fails, we retry allocation for type Y.

This patch handles IO specially. I think it basically says, "if we
only have IO allocation failures, don't retry MEM allocation." But a
clean strategy would also avoid retrying IO allocation if we only had
MEM allocation failures.

Bjorn

> free_list(&local_fail_head);
> /* Release assigned resource */
> list_for_each_entry(dev_res, head, list)

2013-05-24 23:31:59

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH v3] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional

BenH reported that there is some assign unassigned resource problem
in powerpc.

It turns out after
| commit 0c5be0cb0edfe3b5c4b62eac68aa2aa15ec681af
| Date: Thu Feb 23 19:23:29 2012 -0800
|
| PCI: Retry on IORESOURCE_IO type allocations

even the root bus does not have io port range, it will keep retrying
to realloc with mmio.

Current retry logic is : try with must+optional at first, and if
it fails with any ioport or mmio, it will try must then try to extend
must with optional. The reassign will put extended mmio or pref mmio
in the middle of parent resource range.
That will prevent other sibling resources to get must-have resources
or get extended properly.

We can check fail type to see if can we need fall back to must-have
only, that will keep not needed release resource to be must+optional.

Separate three resource type checking if we need to release
assigned resource after requested + add_size try.
1. if there is io port assign fail, will release assigned io port.
2. if there is pref mmio assign fail, release assigned pref mmio.
if assigned pref mmio's parent is non-pref mmio and there
is non-pref mmio assign fail, will release that assigned pref mmio.
3. if there is non-pref mmio assign fail or pref mmio assigned fail,
will release assigned non-pref mmio.

This will be become more often when we have x86 8 sockets or 32 sockets
system, and those system will have one root bus per socket.
They will have some root buses do not have ioport range.

-v2: need to remove assigned entries from optional list too.
-v3: not just checking ioport related, requested by Bjorn.


Reported-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Yinghai Lu <[email protected]>

---
drivers/pci/setup-bus.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 69 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/setup-bus.c
===================================================================
--- linux-2.6.orig/drivers/pci/setup-bus.c
+++ linux-2.6/drivers/pci/setup-bus.c
@@ -300,6 +300,48 @@ static void assign_requested_resources_s
}
}

+static unsigned long pci_fail_res_type_mask(struct list_head *fail_head)
+{
+ struct pci_dev_resource *fail_res;
+ unsigned long mask = 0;
+
+ /* check failed type */
+ list_for_each_entry(fail_res, fail_head, list)
+ mask |= fail_res->flags;
+
+ /*
+ * one pref failed resource will set IORESOURCE_MEM,
+ * as we can allocate pref in non-pref range.
+ * Will release all asssigned non-pref sibling resources
+ * according to that bit.
+ */
+ return mask & (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH);
+}
+
+static bool pci_need_to_release(unsigned long mask, struct resource *res)
+{
+ if (res->flags & IORESOURCE_IO)
+ return !!(mask & IORESOURCE_IO);
+
+ /* check pref at first */
+ if (res->flags & IORESOURCE_PREFETCH) {
+ if (mask & IORESOURCE_PREFETCH)
+ return true;
+ /* count pref if its parent is non-pref */
+ else if ((mask & IORESOURCE_MEM) &&
+ !(res->parent->flags & IORESOURCE_PREFETCH))
+ return true;
+ else
+ return false;
+ }
+
+ if (res->flags & IORESOURCE_MEM)
+ return !!(mask & IORESOURCE_MEM);
+
+ /* should not get here */
+ return false;
+}
+
static void __assign_resources_sorted(struct list_head *head,
struct list_head *realloc_head,
struct list_head *fail_head)
@@ -312,11 +354,24 @@ static void __assign_resources_sorted(st
* if could do that, could get out early.
* if could not do that, we still try to assign requested at first,
* then try to reassign add_size for some resources.
+ *
+ * Separate three resource type checking if we need to release
+ * assigned resource after requested + add_size try.
+ * 1. if there is io port assign fail, will release assigned
+ * io port.
+ * 2. if there is pref mmio assign fail, release assigned
+ * pref mmio.
+ * if assigned pref mmio's parent is non-pref mmio and there
+ * is non-pref mmio assign fail, will release that assigned
+ * pref mmio.
+ * 3. if there is non-pref mmio assign fail or pref mmio
+ * assigned fail, will release assigned non-pref mmio.
*/
LIST_HEAD(save_head);
LIST_HEAD(local_fail_head);
struct pci_dev_resource *save_res;
- struct pci_dev_resource *dev_res;
+ struct pci_dev_resource *dev_res, *tmp_res;
+ unsigned long fail_type;

/* Check if optional add_size is there */
if (!realloc_head || list_empty(realloc_head))
@@ -348,6 +403,19 @@ static void __assign_resources_sorted(st
return;
}

+ /* check failed type */
+ fail_type = pci_fail_res_type_mask(&local_fail_head);
+ /* remove not need to be released assigned res from head list etc */
+ list_for_each_entry_safe(dev_res, tmp_res, head, list)
+ if (dev_res->res->parent &&
+ !pci_need_to_release(fail_type, dev_res->res)) {
+ /* remove it from realloc_head list */
+ remove_from_list(realloc_head, dev_res->res);
+ remove_from_list(&save_head, dev_res->res);
+ list_del(&dev_res->list);
+ kfree(dev_res);
+ }
+
free_list(&local_fail_head);
/* Release assigned resource */
list_for_each_entry(dev_res, head, list)

2013-05-24 23:34:40

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH] PCI: Don't let mmio fallback to must-only, if ioport fails with must+optional

On Fri, May 24, 2013 at 10:25 AM, Bjorn Helgaas <[email protected]> wrote:
>
> The problem we're trying to solve is that when allocation for type X
> fails, we retry allocation for type Y.
>
> This patch handles IO specially. I think it basically says, "if we
> only have IO allocation failures, don't retry MEM allocation." But a
> clean strategy would also avoid retrying IO allocation if we only had
> MEM allocation failures.

Well, that will make it little bit complicated as v3 that is sent in
another mail.

Need to separate ioport, mmio, mmio pref three types.

Yinghai