2015-06-09 23:51:01

by Roland Dreier

[permalink] [raw]
Subject: Regression in 3.10.80 vs. 3.10.79

Hi, I recently updated from 3.10.79 to 3.10.80, and my system wouldn't
boot any more. I tracked this down to commit 92c934b10ec3 ("ACPI /
init: Fix the ordering of acpi_reserve_resources()"). With that
commit reverted, my system is OK again.

What happens is that ahci fails to initialize because
pcim_iomap_regions_request_all() fails with EBUSY, due to a resource
conflict on the first IO region of the ahci device. Since my root
device is on ahci, that's the end of that. I'm sure this is due to a
BIOS / ACPI table bug on my particular platform, but that's scant
comfort when the system won't boot :)

I patched 3.10.80 so that ahci continues to initialize after the
EBUSY, and relevant parts of the kernel log seem to be:

[ 3.836643,26] system 00:06: [io 0x0400-0x047f] could not be reserved
...
[ 3.844112,26] pci 0000:00:1f.2: BAR 0: assigned [io 0x0410-0x0417]
...
[ 6.020040,00] ahci 0000:00:1f.2: BAR 0: can't reserve [io 0x0410-0x0417]

and /proc/ioports shows

0410-0415 : ACPI CPU throttle

So if I'm understanding properly, for some reason we discover but fail
to reserve the region with the ACPI resources, then PCI decides to
assign ahci IO ports into that range, then ACPI loads and reserves
0x0410-0x0415, and then ahci fails to load.

If I fully revert the patch, then I see

[ 3.853857,08] system 00:06: [io 0x0400-0x047f] has been reserved
...
[ 3.861806,08] pci 0000:00:1f.2: BAR 0: assigned [io 0x0820-0x0827]

We're able to reserve the range, and then PCI assigns ahci into a
non-conflicting range.

I understand that the change here fixed another regression, but I'm
wondering if there's a way to make everyone happy here? I can provide
debugging info from my system as required...

Thanks,
Roland


2015-06-09 23:52:05

by Roland Dreier

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Tue, Jun 9, 2015 at 4:43 PM, Roland Dreier <[email protected]> wrote:
> I understand that the change here fixed another regression, but I'm
> wondering if there's a way to make everyone happy here? I can provide
> debugging info from my system as required...

Maybe sent my mail too quickly, as I have some thoughts after looking
at the code.

>From the link order, drivers/acpi init wll be called before
drivers/pnp init, right? In my case, the acpi resources ("ACPI
PM1a_EVT_BLK") etc are under a pnp bus. But if acpi requests the
resources first, then pnp can't request the enclosing range.

Is the right fix to make sure the pnp init happens before acpi
requests resources?

2015-06-10 22:57:47

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Tuesday, June 09, 2015 04:51:35 PM Roland Dreier wrote:
> On Tue, Jun 9, 2015 at 4:43 PM, Roland Dreier <[email protected]> wrote:
> > I understand that the change here fixed another regression, but I'm
> > wondering if there's a way to make everyone happy here? I can provide
> > debugging info from my system as required...
>
> Maybe sent my mail too quickly, as I have some thoughts after looking
> at the code.
>
> From the link order, drivers/acpi init wll be called before
> drivers/pnp init, right? In my case, the acpi resources ("ACPI
> PM1a_EVT_BLK") etc are under a pnp bus. But if acpi requests the
> resources first, then pnp can't request the enclosing range.
>
> Is the right fix to make sure the pnp init happens before acpi
> requests resources?

I need to have a deeper look at things.

Can you please file a bug at bugzilla.kernel.org to track this and attach
the output of acpidump from the affected system in there?

Rafael

2015-06-11 19:02:07

by Roland Dreier

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Wed, Jun 10, 2015 at 4:23 PM, Rafael J. Wysocki <[email protected]> wrote:
> Can you please file a bug at bugzilla.kernel.org to track this and attach
> the output of acpidump from the affected system in there?

Done: https://bugzilla.kernel.org/show_bug.cgi?id=99831

Thanks!

2015-06-11 20:24:58

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Thursday, June 11, 2015 12:01:40 PM Roland Dreier wrote:
> On Wed, Jun 10, 2015 at 4:23 PM, Rafael J. Wysocki <[email protected]> wrote:
> > Can you please file a bug at bugzilla.kernel.org to track this and attach
> > the output of acpidump from the affected system in there?
>
> Done: https://bugzilla.kernel.org/show_bug.cgi?id=99831

Thanks!

I've looked at things in the meantime and my current theory about what happens
is that the code in reserve_range() (drivers/pnp/system.c) fails to reserve
addres ranges that overlap with the ones previously reserverd by acpi_reserve_resources()
and so the PCI subsystem feels free to use them and then everything falls apart.

Changing the ordering between those two routines would work around that problem,
but in my view that wouldn't be a proper fix. In fact, the role of reserve_range()
is to reserve the resources so as to prevent them from being used going forward,
so they need not be reserved each in one piece. Instead, we can just check if they
overlap with the ones reserved by acpi_reserve_resources() and only request the
non-overlapping parts of them to avoid conflicts.

So I wonder if the patch below makes any difference?

---
drivers/acpi/osl.c | 6 --
drivers/acpi/resource.c | 129 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/pnp/system.c | 51 +++++++++++++++---
include/linux/acpi.h | 10 +++
4 files changed, 182 insertions(+), 14 deletions(-)

Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -175,11 +175,7 @@ static void __init acpi_request_region (
if (!addr || !length)
return;

- /* Resources are never freed */
- if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO)
- request_region(addr, length, desc);
- else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
- request_mem_region(addr, length, desc);
+ acpi_reserve_region(addr, length, gas->space_id, 0, desc);
}

static void __init acpi_reserve_resources(void)
Index: linux-pm/drivers/acpi/resource.c
===================================================================
--- linux-pm.orig/drivers/acpi/resource.c
+++ linux-pm/drivers/acpi/resource.c
@@ -26,6 +26,7 @@
#include <linux/device.h>
#include <linux/export.h>
#include <linux/ioport.h>
+#include <linux/list.h>
#include <linux/slab.h>

#ifdef CONFIG_X86
@@ -621,3 +622,131 @@ int acpi_dev_filter_resource_type(struct
return (type & types) ? 0 : 1;
}
EXPORT_SYMBOL_GPL(acpi_dev_filter_resource_type);
+
+struct acpi_region {
+ struct list_head node;
+ u64 start;
+ u64 end;
+};
+
+static LIST_HEAD(acpi_io_regions);
+static LIST_HEAD(acpi_mem_regions);
+
+static int acpi_add_region(u64 start, u64 end, struct list_head *head,
+ u8 space_id, unsigned long flags, char *desc)
+{
+ struct acpi_region *reg;
+ struct resource *res;
+ unsigned int length = end - start + 1;
+
+ reg = kmalloc(sizeof(*reg), GFP_KERNEL);
+ if (!reg)
+ return -ENOMEM;
+
+ reg->start = start;
+ reg->end = end;
+ list_add(&reg->node, head);
+ res = space_id == ACPI_ADR_SPACE_SYSTEM_IO ?
+ request_region(start, length, desc) :
+ request_mem_region(start, length, desc);
+ if (res) {
+ res->flags &= ~flags;
+ return 0;
+ }
+ return -EIO;
+}
+
+/**
+ * acpi_reserve_region - Reserve an I/O or memory region as a system resource.
+ * @start: Starting address of the region.
+ * @length: Length of the region.
+ * @space_id: Identifier of address space to reserve the region from.
+ * @flags: Resource flags to clear for the region after requesting it.
+ * @desc: Region description (for messages).
+ *
+ * Reserve an I/O or memory region as a system resource to prevent others from
+ * using it. If the new region overlaps with one of the regions (in the given
+ * address space) already reserved by this routine, only the non-overlapping
+ * parts of it will be reserved.
+ *
+ * Returned is either 0 (success) or a negative error code indicating a resource
+ * reservation problem. It is the code of the first encountered error, but the
+ * routine doesn't abort until it has attempted to request all of the parts of
+ * the new region that don't overlap with other regions reserved previously.
+ *
+ * The resources requested by this routine are never released.
+ */
+int acpi_reserve_region(u64 start, unsigned int length, u8 space_id,
+ unsigned long flags, char *desc)
+{
+ struct list_head *regions, *head;
+ struct acpi_region *reg;
+ u64 end = start + length - 1;
+ int ret = 0;
+
+ if (space_id == ACPI_ADR_SPACE_SYSTEM_IO)
+ regions = &acpi_io_regions;
+ else if (space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
+ regions = &acpi_mem_regions;
+ else
+ return -EINVAL;
+
+ if (list_empty(regions))
+ return acpi_add_region(start, end, regions, space_id, flags, desc);
+
+ reg = list_first_entry(regions, struct acpi_region, node);
+ if (reg->start > end) {
+ /* The new region goes in front of the first existing one. */
+ return acpi_add_region(start, end, regions, space_id, flags, desc);
+ } else if (reg->end >= start) {
+ head = reg->node.prev;
+ goto overlap;
+ }
+
+ loop:
+ head = NULL;
+ list_for_each_entry_continue(reg, regions, node)
+ if (reg->end < start) {
+ continue;
+ } else if (reg->start > end) {
+ /* No overlap, the new region can be inserted here. */
+ int auxret = acpi_add_region(start, end, reg->node.prev,
+ space_id, flags, desc);
+ return ret ? ret : auxret;
+ } else {
+ head = reg->node.prev;
+ break;
+ }
+
+ if (!head) {
+ /* The new region goes after the last existing one. */
+ int auxret = acpi_add_region(start, end, regions->prev,
+ space_id, flags, desc);
+ return ret ? ret : auxret;
+ }
+
+ overlap:
+ /*
+ * We know that reg->end >= start and reg->start <= end at this point.
+ * The part of the new region immediately preceding the existing
+ * overlapping one can be added right away.
+ */
+ if (reg->start > start) {
+ int auxret = acpi_add_region(start, reg->start - 1, head,
+ space_id, flags, desc);
+ if (!ret)
+ ret = auxret;
+ }
+
+ /*
+ * Skip the overlapping part of the new region and handle the rest as
+ * a new region to insert.
+ */
+ if (reg->end < end) {
+ start = reg->end + 1;
+ goto loop;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(acpi_reserve_region);
Index: linux-pm/include/linux/acpi.h
===================================================================
--- linux-pm.orig/include/linux/acpi.h
+++ linux-pm/include/linux/acpi.h
@@ -342,6 +342,9 @@ int acpi_check_region(resource_size_t st

int acpi_resources_are_enforced(void);

+int acpi_reserve_region(u64 start, unsigned int length, u8 space_id,
+ unsigned long flags, char *desc);
+
#ifdef CONFIG_HIBERNATION
void __init acpi_no_s4_hw_signature(void);
#endif
@@ -537,6 +540,13 @@ static inline int acpi_check_region(reso
return 0;
}

+static inline int acpi_reserve_region(u64 start, unsigned int length,
+ u8 space_id, unsigned long flags,
+ char *desc)
+{
+ return -ENXIO;
+}
+
struct acpi_table_header;
static inline int acpi_table_parse(char *id,
int (*handler)(struct acpi_table_header *))
Index: linux-pm/drivers/pnp/system.c
===================================================================
--- linux-pm.orig/drivers/pnp/system.c
+++ linux-pm/drivers/pnp/system.c
@@ -7,6 +7,7 @@
* Bjorn Helgaas <[email protected]>
*/

+#include <linux/acpi.h>
#include <linux/pnp.h>
#include <linux/device.h>
#include <linux/init.h>
@@ -22,25 +23,57 @@ static const struct pnp_device_id pnp_de
{"", 0}
};

+#ifdef CONFIG_ACPI
+static inline bool reserve_region(u64 start, unsigned int length, char *desc)
+{
+ return !acpi_reserve_region(start, length, ACPI_ADR_SPACE_SYSTEM_IO,
+ IORESOURCE_BUSY, desc);
+}
+static inline bool reserve_mem_region(u64 start, unsigned int length, char *desc)
+{
+ return !acpi_reserve_region(start, length, ACPI_ADR_SPACE_SYSTEM_MEMORY,
+ IORESOURCE_BUSY, desc);
+}
+#else
+static inline bool reserve_region(u64 start, unsigned int length, char *desc)
+{
+ struct resource *res;
+
+ res = request_region(start, length, desc);
+ if (res) {
+ res->flags &= ~IORESOURCE_BUSY;
+ return true;
+ }
+ return false;
+}
+static inline bool reserve_mem_region(u64 start, unsigned int length, char *desc)
+{
+ struct resource *res;
+
+ res = request_mem_region(start, length, desc);
+ if (res) {
+ res->flags &= ~IORESOURCE_BUSY;
+ return true;
+ }
+ return false;
+}
+#endif
+
static void reserve_range(struct pnp_dev *dev, struct resource *r, int port)
{
char *regionid;
const char *pnpid = dev_name(&dev->dev);
resource_size_t start = r->start, end = r->end;
- struct resource *res;
+ bool reserved;

regionid = kmalloc(16, GFP_KERNEL);
if (!regionid)
return;

snprintf(regionid, 16, "pnp %s", pnpid);
- if (port)
- res = request_region(start, end - start + 1, regionid);
- else
- res = request_mem_region(start, end - start + 1, regionid);
- if (res)
- res->flags &= ~IORESOURCE_BUSY;
- else
+ reserved = port ? reserve_region(start, end - start + 1, regionid) :
+ reserve_mem_region(start, end - start + 1, regionid);
+ if (!reserved)
kfree(regionid);

/*
@@ -49,7 +82,7 @@ static void reserve_range(struct pnp_dev
* have double reservations.
*/
dev_info(&dev->dev, "%pR %s reserved\n", r,
- res ? "has been" : "could not be");
+ reserved ? "has been" : "could not be");
}

static void reserve_resources_of_dev(struct pnp_dev *dev)

2015-06-12 15:01:40

by Roland Dreier

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Thu, Jun 11, 2015 at 1:50 PM, Rafael J. Wysocki <[email protected]> wrote:
> Changing the ordering between those two routines would work around that problem,
> but in my view that wouldn't be a proper fix. In fact, the role of reserve_range()
> is to reserve the resources so as to prevent them from being used going forward,
> so they need not be reserved each in one piece. Instead, we can just check if they
> overlap with the ones reserved by acpi_reserve_resources() and only request the
> non-overlapping parts of them to avoid conflicts.
>
> So I wonder if the patch below makes any difference?

I will give this a try and make sure it fixes my system, although I'm
pretty sure it will.

However I'm not sure I agree that this is a better fix than just
having pnp reserve ranges before acpi. It already creates a special
relationship between pnp and acpi, and acpi_reserve_region is a bunch
of extra code. Could we really have a system where the hierarchy of
acpi being a subset of a pnp bus doesn't work? I looked at a few
other systems I have, and things like the following seem quite common:

supermicro:

03e0-0cf7 : PCI Bus 0000:00
03f8-03ff : serial
0400-0453 : pnp 00:0c
0400-0403 : ACPI PM1a_EVT_BLK
0404-0405 : ACPI PM1a_CNT_BLK
0408-040b : ACPI PM_TMR
0410-0415 : ACPI CPU throttle
0420-042f : ACPI GPE0_BLK
0430-0433 : iTCO_wdt
0450-0450 : ACPI PM2_CNT_BLK

dell:

03e0-0cf7 : PCI Bus 0000:00
03f8-03ff : serial
0800-087f : pnp 00:06
0800-0803 : ACPI PM1a_EVT_BLK
0804-0805 : ACPI PM1a_CNT_BLK
0808-080b : ACPI PM_TMR
0810-0815 : ACPI CPU throttle
0820-082f : ACPI GPE0_BLK
0830-0833 : iTCO_wdt
0830-0833 : iTCO_wdt
0850-0850 : ACPI PM2_CNT_BLK
0860-087f : iTCO_wdt
0860-087f : iTCO_wdt

but I wasn't able to find anything that required more generality...

2015-06-13 02:26:22

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Friday, June 12, 2015 08:01:15 AM Roland Dreier wrote:
> On Thu, Jun 11, 2015 at 1:50 PM, Rafael J. Wysocki <[email protected]> wrote:
> > Changing the ordering between those two routines would work around that problem,
> > but in my view that wouldn't be a proper fix. In fact, the role of reserve_range()
> > is to reserve the resources so as to prevent them from being used going forward,
> > so they need not be reserved each in one piece. Instead, we can just check if they
> > overlap with the ones reserved by acpi_reserve_resources() and only request the
> > non-overlapping parts of them to avoid conflicts.
> >
> > So I wonder if the patch below makes any difference?
>
> I will give this a try and make sure it fixes my system, although I'm
> pretty sure it will.

OK, thanks!

Below is a more sophisticated, so to speak, version of it with a changelog and
all. It works for me, but more testing would be much appreciated.

> However I'm not sure I agree that this is a better fix than just
> having pnp reserve ranges before acpi. It already creates a special
> relationship between pnp and acpi, and acpi_reserve_region is a bunch
> of extra code.

There is a special relationship between ACPI and PNP on ACPI-based systems
anyway, because PNP devices are discovered via ACPI on them (and there really
is no difference between "PNP devices" and "devices enumerated via ACPI" on
those systems).

Yes, acpi_reserve_region() is quite a bit of extra code, but what it does is
safe from the perspective of introducing more problems due to initialization
ordering changes.

> Could we really have a system where the hierarchy of acpi being a subset of
> a pnp bus doesn't work?

That is not a problem.

I've reordered acpi_reserve_region() before the initial ACPI namespace scan
to make sure that address ranges used by the fixed ACPI hardware features will
not be stepped on in that process (which includes things like the enumeration
of the PCI host bridge). It simply makes sense to have it in there.

Now, reordering the PNP "system" driver reservations before the acpi_reserve_region()
call site is not quite straightforward, because it is not suffcient to simply invoke
pnp_system_init() from there as it only registers the driver. A matching device
has to be discovered to trigger reserve_resources_of_dev() and that only happens
during the initial ACPI namespace scan mentioned above. While in principle
something like acpi_get_devices() could be used to force an extra namespace walk
to look for the specific devices matching the "system" driver earlier, that also
would be extra code and walking the entire ACPI namespace is not a lightweight
operation.

Moreover, it might lead to further regressions on some systems, because some
reservations made by the "system" driver fail on the systems I have access
to, so presumably something else already uses those address ranges when
reserve_resources_of_dev() is called for them and I'm not sure what would
happen if reserve_resources_of_dev() was called before that thing, in general.

Thanks,
Rafael


---
From: Rafael J. Wysocki <[email protected]>
Subject: ACPI / PNP: Avoid conflicting resource reservations

Commit b9a5e5e18fbf "ACPI / init: Fix the ordering of
acpi_reserve_resources()" overlooked the fact that the memory
and/or I/O regions reserved by acpi_reserve_resources() may
conflict with those reserved by the PNP "system" driver.

If that conflict actually takes place, it causes the reservations
made by the "system" driver to fail while before commit b9a5e5e18fbf
it would cause the reservations made by acpi_reserve_resources() to
fail. In turn, that causes the resources that haven't been reserved
by the "system" driver to be allocated by others (e.g. PCI) which
sometimes leads to functional problems (up to and including boot
failures).

To fix that issue, introduce a common resource reservation routine,
acpi_reserve_region(), to be used by both acpi_reserve_resources()
and the "system" driver, that will track all resources reserved by
it and avoid making conflicting requests.

Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()"
Reported-by: Roland Dreier <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/osl.c | 6 -
drivers/acpi/resource.c | 189 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/pnp/system.c | 35 ++++++--
include/linux/acpi.h | 10 ++
4 files changed, 226 insertions(+), 14 deletions(-)

Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -175,11 +175,7 @@ static void __init acpi_request_region (
if (!addr || !length)
return;

- /* Resources are never freed */
- if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO)
- request_region(addr, length, desc);
- else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
- request_mem_region(addr, length, desc);
+ acpi_reserve_region(addr, length, gas->space_id, 0, desc);
}

static void __init acpi_reserve_resources(void)
Index: linux-pm/drivers/acpi/resource.c
===================================================================
--- linux-pm.orig/drivers/acpi/resource.c
+++ linux-pm/drivers/acpi/resource.c
@@ -26,6 +26,7 @@
#include <linux/device.h>
#include <linux/export.h>
#include <linux/ioport.h>
+#include <linux/list.h>
#include <linux/slab.h>

#ifdef CONFIG_X86
@@ -621,3 +622,191 @@ int acpi_dev_filter_resource_type(struct
return (type & types) ? 0 : 1;
}
EXPORT_SYMBOL_GPL(acpi_dev_filter_resource_type);
+
+struct reserved_region {
+ struct list_head node;
+ u64 start;
+ u64 end;
+};
+
+static LIST_HEAD(reserved_io_regions);
+static LIST_HEAD(reserved_mem_regions);
+
+static int request_range(u64 start, u64 end, u8 space_id, unsigned long flags,
+ char *desc)
+{
+ unsigned int length = end - start + 1;
+ struct resource *res;
+
+ res = space_id == ACPI_ADR_SPACE_SYSTEM_IO ?
+ request_region(start, length, desc) :
+ request_mem_region(start, length, desc);
+ if (!res)
+ return -EIO;
+
+ res->flags &= ~flags;
+ return 0;
+}
+
+static int acpi_add_region(u64 start, u64 end, u8 space_id, unsigned long flags,
+ char *desc, struct list_head *head)
+{
+ struct reserved_region *reg;
+ int error;
+
+ reg = kmalloc(sizeof(*reg), GFP_KERNEL);
+ if (!reg)
+ return -ENOMEM;
+
+ error = request_range(start, end, space_id, flags, desc);
+ if (error)
+ return error;
+
+ reg->start = start;
+ reg->end = end;
+ list_add(&reg->node, head);
+ return 0;
+}
+
+static int acpi_add_region_before(u64 start, u64 end, u8 space_id,
+ unsigned long flags, char *desc,
+ struct reserved_region *reg)
+{
+ int ret;
+
+ if (reg->start == end + 1) {
+ /* Try to combine. */
+ ret = request_range(start, end, space_id, flags, desc);
+ if (!ret)
+ reg->start = start;
+ } else {
+ ret = acpi_add_region(start, end, space_id, flags, desc,
+ reg->node.prev);
+ }
+ return ret;
+}
+
+static int acpi_add_region_after(u64 start, u64 end, u8 space_id,
+ unsigned long flags, char *desc,
+ struct reserved_region *reg)
+{
+ int ret;
+
+ if (reg->end == start - 1) {
+ /* Try to combine. */
+ ret = request_range(start, end, space_id, flags, desc);
+ if (!ret)
+ reg->end = end;
+ } else {
+ ret = acpi_add_region(start, end, space_id, flags, desc,
+ &reg->node);
+ }
+ return ret;
+}
+
+/**
+ * acpi_reserve_region - Reserve an I/O or memory region as a system resource.
+ * @start: Starting address of the region.
+ * @length: Length of the region.
+ * @space_id: Identifier of address space to reserve the region from.
+ * @flags: Resource flags to clear for the region after requesting it.
+ * @desc: Region description (for messages).
+ *
+ * Reserve an I/O or memory region as a system resource to prevent others from
+ * using it. If the new region overlaps with one of the regions (in the given
+ * address space) already reserved by this routine, only the non-overlapping
+ * parts of it will be reserved.
+ *
+ * Returned is either 0 (success) or a negative error code indicating a resource
+ * reservation problem. It is the code of the first encountered error, but the
+ * routine doesn't abort until it has attempted to request all of the parts of
+ * the new region that don't overlap with other regions reserved previously.
+ *
+ * The resources requested by this routine are never released.
+ */
+int acpi_reserve_region(u64 start, unsigned int length, u8 space_id,
+ unsigned long flags, char *desc)
+{
+ struct list_head *regions;
+ struct reserved_region *reg;
+ u64 end = start + length - 1;
+ int ret = 0, error = 0;
+
+ if (space_id == ACPI_ADR_SPACE_SYSTEM_IO)
+ regions = &reserved_io_regions;
+ else if (space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
+ regions = &reserved_mem_regions;
+ else
+ return -EINVAL;
+
+ if (list_empty(regions))
+ return acpi_add_region(start, end, space_id, flags, desc, regions);
+
+ list_for_each_entry(reg, regions, node)
+ if (reg->start > end) {
+ /* No overlap. Add the new region here and get out. */
+ return acpi_add_region_before(start, end, space_id,
+ flags, desc, reg);
+ } else if (reg->end == start - 1) {
+ goto combine;
+ } else if (reg->end >= start) {
+ goto overlap;
+ }
+
+ /* The new region goes after the last existing one. */
+ reg = list_last_entry(regions, struct reserved_region, node);
+ return acpi_add_region_after(start, end, space_id, flags, desc, reg);
+
+ overlap:
+ /*
+ * The new region overlaps an existing one.
+ *
+ * The head part of the new region immediately preceding the existing
+ * overlapping one can be combined with it right away.
+ */
+ if (reg->start > start) {
+ error = request_range(start, reg->start - 1, space_id, flags, desc);
+ if (error)
+ ret = error;
+ else
+ reg->start = start;
+ }
+
+ combine:
+ /*
+ * The new region is adjacent to an existing one. If it extends beyond
+ * that region all the way to the next one, it is possible to combine
+ * all three of them.
+ */
+ if (reg->end < end) {
+ struct reserved_region *next = NULL;
+ u64 a = reg->end + 1, b = end;
+
+ if (!list_is_last(&reg->node, regions)) {
+ next = list_next_entry(reg, node);
+ if (next->start <= end)
+ b = next->start - 1;
+ }
+ error = request_range(a, b, space_id, flags, desc);
+ if (!error) {
+ if (next && next->start == b + 1) {
+ reg->end = next->end;
+ list_del(&next->node);
+ kfree(next);
+ start = reg->end + 1;
+ goto combine;
+ } else {
+ reg->end = end;
+ }
+ } else if (next) {
+ if (!ret)
+ ret = error;
+
+ reg = next;
+ goto combine;
+ }
+ }
+
+ return ret ? ret : error;
+}
+EXPORT_SYMBOL_GPL(acpi_reserve_region);
Index: linux-pm/include/linux/acpi.h
===================================================================
--- linux-pm.orig/include/linux/acpi.h
+++ linux-pm/include/linux/acpi.h
@@ -342,6 +342,9 @@ int acpi_check_region(resource_size_t st

int acpi_resources_are_enforced(void);

+int acpi_reserve_region(u64 start, unsigned int length, u8 space_id,
+ unsigned long flags, char *desc);
+
#ifdef CONFIG_HIBERNATION
void __init acpi_no_s4_hw_signature(void);
#endif
@@ -537,6 +540,13 @@ static inline int acpi_check_region(reso
return 0;
}

+static inline int acpi_reserve_region(u64 start, unsigned int length,
+ u8 space_id, unsigned long flags,
+ char *desc)
+{
+ return -ENXIO;
+}
+
struct acpi_table_header;
static inline int acpi_table_parse(char *id,
int (*handler)(struct acpi_table_header *))
Index: linux-pm/drivers/pnp/system.c
===================================================================
--- linux-pm.orig/drivers/pnp/system.c
+++ linux-pm/drivers/pnp/system.c
@@ -7,6 +7,7 @@
* Bjorn Helgaas <[email protected]>
*/

+#include <linux/acpi.h>
#include <linux/pnp.h>
#include <linux/device.h>
#include <linux/init.h>
@@ -22,25 +23,41 @@ static const struct pnp_device_id pnp_de
{"", 0}
};

+#ifdef CONFIG_ACPI
+static bool __reserve_range(u64 start, unsigned int length, bool io, char *desc)
+{
+ u8 space_id = io ? ACPI_ADR_SPACE_SYSTEM_IO : ACPI_ADR_SPACE_SYSTEM_MEMORY;
+ return !acpi_reserve_region(start, length, space_id, IORESOURCE_BUSY, desc);
+}
+#else
+static bool __reserve_range(u64 start, unsigned int length, bool io, char *desc)
+{
+ struct resource *res;
+
+ res = io ? request_region(start, length, desc) :
+ request_mem_region(start, length, desc);
+ if (res) {
+ res->flags &= ~IORESOURCE_BUSY;
+ return true;
+ }
+ return false;
+}
+#endif
+
static void reserve_range(struct pnp_dev *dev, struct resource *r, int port)
{
char *regionid;
const char *pnpid = dev_name(&dev->dev);
resource_size_t start = r->start, end = r->end;
- struct resource *res;
+ bool reserved;

regionid = kmalloc(16, GFP_KERNEL);
if (!regionid)
return;

snprintf(regionid, 16, "pnp %s", pnpid);
- if (port)
- res = request_region(start, end - start + 1, regionid);
- else
- res = request_mem_region(start, end - start + 1, regionid);
- if (res)
- res->flags &= ~IORESOURCE_BUSY;
- else
+ reserved = __reserve_range(start, end - start + 1, !!port, regionid);
+ if (!reserved)
kfree(regionid);

/*
@@ -49,7 +66,7 @@ static void reserve_range(struct pnp_dev
* have double reservations.
*/
dev_info(&dev->dev, "%pR %s reserved\n", r,
- res ? "has been" : "could not be");
+ reserved ? "has been" : "could not be");
}

static void reserve_resources_of_dev(struct pnp_dev *dev)

2015-06-13 16:56:35

by Roland Dreier

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Fri, Jun 12, 2015 at 7:52 PM, Rafael J. Wysocki <[email protected]> wrote:
> Below is a more sophisticated, so to speak, version of it with a changelog and
> all. It works for me, but more testing would be much appreciated.

Great, I'm convinced by your reasoning that this makes sense. I'm
building 3.10.80 patched with this (needed a tiny bit of context
adjustment because acpi_dev_filter_resource_type() hadn't been added
to 3.10 yet), and will confirm that it fixes the issue I saw.

Thanks!
Roland

2015-06-15 14:56:35

by Roland Dreier

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Sat, Jun 13, 2015 at 9:56 AM, Roland Dreier <[email protected]> wrote:
> Below is a more sophisticated, so to speak, version of it with a changelog and
> all. It works for me, but more testing would be much appreciated.

Yes, the patch works as expected:

Tested-by: Roland Dreier <[email protected]>


It does change /proc/ioports heirarchy to

0400-0403 : ACPI PM1a_EVT_BLK
0404-0405 : ACPI PM1a_CNT_BLK
0406-0407 : pnp 00:06
0408-040b : ACPI PM_TMR
040c-041f : pnp 00:06
0410-0415 : ACPI CPU throttle
0420-042f : ACPI GPE0_BLK
0430-044f : pnp 00:06
0430-0433 : iTCO_wdt
0430-0433 : iTCO_wdt
0450-0450 : ACPI PM2_CNT_BLK
0451-047f : pnp 00:06
0460-047f : iTCO_wdt
0460-047f : iTCO_wdt

where the old kernel had

0400-047f : pnp 00:06
0400-0403 : ACPI PM1a_EVT_BLK
0404-0405 : ACPI PM1a_CNT_BLK
0408-040b : ACPI PM_TMR
0410-0415 : ACPI CPU throttle
0420-042f : ACPI GPE0_BLK
0430-0433 : iTCO_wdt
0430-0433 : iTCO_wdt
0450-0450 : ACPI PM2_CNT_BLK
0460-047f : iTCO_wdt
0460-047f : iTCO_wdt

but I don't think that matters.

Thanks,
- R.

2015-06-15 22:34:26

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Monday, June 15, 2015 07:56:05 AM Roland Dreier wrote:
> On Sat, Jun 13, 2015 at 9:56 AM, Roland Dreier <[email protected]> wrote:
> > Below is a more sophisticated, so to speak, version of it with a changelog and
> > all. It works for me, but more testing would be much appreciated.
>
> Yes, the patch works as expected:
>
> Tested-by: Roland Dreier <[email protected]>

Awesome, thanks a lot for the confirmation!

I'll send it Linuswards for final 4.1.

> It does change /proc/ioports heirarchy to
>
> 0400-0403 : ACPI PM1a_EVT_BLK
> 0404-0405 : ACPI PM1a_CNT_BLK
> 0406-0407 : pnp 00:06
> 0408-040b : ACPI PM_TMR
> 040c-041f : pnp 00:06
> 0410-0415 : ACPI CPU throttle
> 0420-042f : ACPI GPE0_BLK
> 0430-044f : pnp 00:06
> 0430-0433 : iTCO_wdt
> 0430-0433 : iTCO_wdt
> 0450-0450 : ACPI PM2_CNT_BLK
> 0451-047f : pnp 00:06
> 0460-047f : iTCO_wdt
> 0460-047f : iTCO_wdt
>
> where the old kernel had
>
> 0400-047f : pnp 00:06
> 0400-0403 : ACPI PM1a_EVT_BLK
> 0404-0405 : ACPI PM1a_CNT_BLK
> 0408-040b : ACPI PM_TMR
> 0410-0415 : ACPI CPU throttle
> 0420-042f : ACPI GPE0_BLK
> 0430-0433 : iTCO_wdt
> 0430-0433 : iTCO_wdt
> 0450-0450 : ACPI PM2_CNT_BLK
> 0460-047f : iTCO_wdt
> 0460-047f : iTCO_wdt
>
> but I don't think that matters.

No, it doesn't.

Thanks,
Rafael

2015-06-17 12:16:51

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Tuesday, June 16, 2015 01:00:13 AM Rafael J. Wysocki wrote:
> On Monday, June 15, 2015 07:56:05 AM Roland Dreier wrote:
> > On Sat, Jun 13, 2015 at 9:56 AM, Roland Dreier <[email protected]> wrote:
> > > Below is a more sophisticated, so to speak, version of it with a changelog and
> > > all. It works for me, but more testing would be much appreciated.
> >
> > Yes, the patch works as expected:
> >
> > Tested-by: Roland Dreier <[email protected]>
>
> Awesome, thanks a lot for the confirmation!

During the final review of the patch I noticed that it could be simplified
by dropping the acpi_add_region_after() routine and calling acpi_add_region()
directly instead of it.

I'll send an updated patch to you later today.

Thanks,
Rafael

2015-06-17 21:27:30

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: Regression in 3.10.80 vs. 3.10.79

On Wednesday, June 17, 2015 02:42:40 PM Rafael J. Wysocki wrote:
> On Tuesday, June 16, 2015 01:00:13 AM Rafael J. Wysocki wrote:
> > On Monday, June 15, 2015 07:56:05 AM Roland Dreier wrote:
> > > On Sat, Jun 13, 2015 at 9:56 AM, Roland Dreier <[email protected]> wrote:
> > > > Below is a more sophisticated, so to speak, version of it with a changelog and
> > > > all. It works for me, but more testing would be much appreciated.
> > >
> > > Yes, the patch works as expected:
> > >
> > > Tested-by: Roland Dreier <[email protected]>
> >
> > Awesome, thanks a lot for the confirmation!
>
> During the final review of the patch I noticed that it could be simplified
> by dropping the acpi_add_region_after() routine and calling acpi_add_region()
> directly instead of it.
>
> I'll send an updated patch to you later today.

OK, so the patch below achieves the same result in fewer lines of code.

I've tested it on the systems available to me, but please verify that it still
works for you too.

Thanks,
Rafael


---
From: Rafael J. Wysocki <[email protected]>
Subject: ACPI / PNP: Avoid conflicting resource reservations

Commit b9a5e5e18fbf "ACPI / init: Fix the ordering of
acpi_reserve_resources()" overlooked the fact that the memory
and/or I/O regions reserved by acpi_reserve_resources() may
conflict with those reserved by the PNP "system" driver.

If that conflict actually takes place, it causes the reservations
made by the "system" driver to fail while before commit b9a5e5e18fbf
all reservations made by it and by acpi_reserve_resources() would be
successful. In turn, that allows the resources that haven't been
reserved by the "system" driver to be used by others (e.g. PCI) which
sometimes leads to functional problems (up to and including boot
failures).

To fix that issue, introduce a common resource reservation routine,
acpi_reserve_region(), to be used by both acpi_reserve_resources()
and the "system" driver, that will track all resources reserved by
it and avoid making conflicting requests.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831
Link: http://marc.info/?t=143389402600001&r=1&w=2
Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()"
Reported-by: Roland Dreier <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/osl.c | 6 -
drivers/acpi/resource.c | 160 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/pnp/system.c | 35 +++++++---
include/linux/acpi.h | 10 +++
4 files changed, 197 insertions(+), 14 deletions(-)

Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -175,11 +175,7 @@ static void __init acpi_request_region (
if (!addr || !length)
return;

- /* Resources are never freed */
- if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO)
- request_region(addr, length, desc);
- else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
- request_mem_region(addr, length, desc);
+ acpi_reserve_region(addr, length, gas->space_id, 0, desc);
}

static void __init acpi_reserve_resources(void)
Index: linux-pm/drivers/acpi/resource.c
===================================================================
--- linux-pm.orig/drivers/acpi/resource.c
+++ linux-pm/drivers/acpi/resource.c
@@ -26,6 +26,7 @@
#include <linux/device.h>
#include <linux/export.h>
#include <linux/ioport.h>
+#include <linux/list.h>
#include <linux/slab.h>

#ifdef CONFIG_X86
@@ -621,3 +622,162 @@ int acpi_dev_filter_resource_type(struct
return (type & types) ? 0 : 1;
}
EXPORT_SYMBOL_GPL(acpi_dev_filter_resource_type);
+
+struct reserved_region {
+ struct list_head node;
+ u64 start;
+ u64 end;
+};
+
+static LIST_HEAD(reserved_io_regions);
+static LIST_HEAD(reserved_mem_regions);
+
+static int request_range(u64 start, u64 end, u8 space_id, unsigned long flags,
+ char *desc)
+{
+ unsigned int length = end - start + 1;
+ struct resource *res;
+
+ res = space_id == ACPI_ADR_SPACE_SYSTEM_IO ?
+ request_region(start, length, desc) :
+ request_mem_region(start, length, desc);
+ if (!res)
+ return -EIO;
+
+ res->flags &= ~flags;
+ return 0;
+}
+
+static int add_region_before(u64 start, u64 end, u8 space_id,
+ unsigned long flags, char *desc,
+ struct list_head *head)
+{
+ struct reserved_region *reg;
+ int error;
+
+ reg = kmalloc(sizeof(*reg), GFP_KERNEL);
+ if (!reg)
+ return -ENOMEM;
+
+ error = request_range(start, end, space_id, flags, desc);
+ if (error)
+ return error;
+
+ reg->start = start;
+ reg->end = end;
+ list_add_tail(&reg->node, head);
+ return 0;
+}
+
+/**
+ * acpi_reserve_region - Reserve an I/O or memory region as a system resource.
+ * @start: Starting address of the region.
+ * @length: Length of the region.
+ * @space_id: Identifier of address space to reserve the region from.
+ * @flags: Resource flags to clear for the region after requesting it.
+ * @desc: Region description (for messages).
+ *
+ * Reserve an I/O or memory region as a system resource to prevent others from
+ * using it. If the new region overlaps with one of the regions (in the given
+ * address space) already reserved by this routine, only the non-overlapping
+ * parts of it will be reserved.
+ *
+ * Returned is either 0 (success) or a negative error code indicating a resource
+ * reservation problem. It is the code of the first encountered error, but the
+ * routine doesn't abort until it has attempted to request all of the parts of
+ * the new region that don't overlap with other regions reserved previously.
+ *
+ * The resources requested by this routine are never released.
+ */
+int acpi_reserve_region(u64 start, unsigned int length, u8 space_id,
+ unsigned long flags, char *desc)
+{
+ struct list_head *regions;
+ struct reserved_region *reg;
+ u64 end = start + length - 1;
+ int ret = 0, error = 0;
+
+ if (space_id == ACPI_ADR_SPACE_SYSTEM_IO)
+ regions = &reserved_io_regions;
+ else if (space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY)
+ regions = &reserved_mem_regions;
+ else
+ return -EINVAL;
+
+ if (list_empty(regions))
+ return add_region_before(start, end, space_id, flags, desc, regions);
+
+ list_for_each_entry(reg, regions, node)
+ if (reg->start == end + 1) {
+ /* The new region can be prepended to this one. */
+ ret = request_range(start, end, space_id, flags, desc);
+ if (!ret)
+ reg->start = start;
+
+ return ret;
+ } else if (reg->start > end) {
+ /* No overlap. Add the new region here and get out. */
+ return add_region_before(start, end, space_id, flags,
+ desc, &reg->node);
+ } else if (reg->end == start - 1) {
+ goto combine;
+ } else if (reg->end >= start) {
+ goto overlap;
+ }
+
+ /* The new region goes after the last existing one. */
+ return add_region_before(start, end, space_id, flags, desc, regions);
+
+ overlap:
+ /*
+ * The new region overlaps an existing one.
+ *
+ * The head part of the new region immediately preceding the existing
+ * overlapping one can be combined with it right away.
+ */
+ if (reg->start > start) {
+ error = request_range(start, reg->start - 1, space_id, flags, desc);
+ if (error)
+ ret = error;
+ else
+ reg->start = start;
+ }
+
+ combine:
+ /*
+ * The new region is adjacent to an existing one. If it extends beyond
+ * that region all the way to the next one, it is possible to combine
+ * all three of them.
+ */
+ while (reg->end < end) {
+ struct reserved_region *next = NULL;
+ u64 a = reg->end + 1, b = end;
+
+ if (!list_is_last(&reg->node, regions)) {
+ next = list_next_entry(reg, node);
+ if (next->start <= end)
+ b = next->start - 1;
+ }
+ error = request_range(a, b, space_id, flags, desc);
+ if (!error) {
+ if (next && next->start == b + 1) {
+ reg->end = next->end;
+ list_del(&next->node);
+ kfree(next);
+ } else {
+ reg->end = end;
+ break;
+ }
+ } else if (next) {
+ if (!ret)
+ ret = error;
+
+ reg = next;
+ } else {
+ break;
+ }
+ }
+
+ return ret ? ret : error;
+}
+EXPORT_SYMBOL_GPL(acpi_reserve_region);
Index: linux-pm/include/linux/acpi.h
===================================================================
--- linux-pm.orig/include/linux/acpi.h
+++ linux-pm/include/linux/acpi.h
@@ -332,6 +332,9 @@ int acpi_check_region(resource_size_t st

int acpi_resources_are_enforced(void);

+int acpi_reserve_region(u64 start, unsigned int length, u8 space_id,
+ unsigned long flags, char *desc);
+
#ifdef CONFIG_HIBERNATION
void __init acpi_no_s4_hw_signature(void);
#endif
@@ -525,6 +528,13 @@ static inline int acpi_check_region(reso
return 0;
}

+static inline int acpi_reserve_region(u64 start, unsigned int length,
+ u8 space_id, unsigned long flags,
+ char *desc)
+{
+ return -ENXIO;
+}
+
struct acpi_table_header;
static inline int acpi_table_parse(char *id,
int (*handler)(struct acpi_table_header *))
Index: linux-pm/drivers/pnp/system.c
===================================================================
--- linux-pm.orig/drivers/pnp/system.c
+++ linux-pm/drivers/pnp/system.c
@@ -7,6 +7,7 @@
* Bjorn Helgaas <[email protected]>
*/

+#include <linux/acpi.h>
#include <linux/pnp.h>
#include <linux/device.h>
#include <linux/init.h>
@@ -22,25 +23,41 @@ static const struct pnp_device_id pnp_de
{"", 0}
};

+#ifdef CONFIG_ACPI
+static bool __reserve_range(u64 start, unsigned int length, bool io, char *desc)
+{
+ u8 space_id = io ? ACPI_ADR_SPACE_SYSTEM_IO : ACPI_ADR_SPACE_SYSTEM_MEMORY;
+ return !acpi_reserve_region(start, length, space_id, IORESOURCE_BUSY, desc);
+}
+#else
+static bool __reserve_range(u64 start, unsigned int length, bool io, char *desc)
+{
+ struct resource *res;
+
+ res = io ? request_region(start, length, desc) :
+ request_mem_region(start, length, desc);
+ if (res) {
+ res->flags &= ~IORESOURCE_BUSY;
+ return true;
+ }
+ return false;
+}
+#endif
+
static void reserve_range(struct pnp_dev *dev, struct resource *r, int port)
{
char *regionid;
const char *pnpid = dev_name(&dev->dev);
resource_size_t start = r->start, end = r->end;
- struct resource *res;
+ bool reserved;

regionid = kmalloc(16, GFP_KERNEL);
if (!regionid)
return;

snprintf(regionid, 16, "pnp %s", pnpid);
- if (port)
- res = request_region(start, end - start + 1, regionid);
- else
- res = request_mem_region(start, end - start + 1, regionid);
- if (res)
- res->flags &= ~IORESOURCE_BUSY;
- else
+ reserved = __reserve_range(start, end - start + 1, !!port, regionid);
+ if (!reserved)
kfree(regionid);

/*
@@ -49,7 +66,7 @@ static void reserve_range(struct pnp_dev
* have double reservations.
*/
dev_info(&dev->dev, "%pR %s reserved\n", r,
- res ? "has been" : "could not be");
+ reserved ? "has been" : "could not be");
}

static void reserve_resources_of_dev(struct pnp_dev *dev)