2013-09-23 06:15:51

by Alexey Neyman

Subject: [PATCH] Fix coalescing host bridge windows in arch/x86/pci/acpi.c

[Resending due to no response to the original message in a week]

Hi all,

I have a board with a BIOS bug that reports the following I/O port regions in
_CRS on one of the host bridges:

0x0000-0x03af // #0
0x03e0-0x0cf7 // #1
0x03b0-0x03bb // #2
0x03c0-0x03df // #3
0x0000-0xdfff // #4
0xf000-0xffff // #5

Region #4 is clearly erroneous, as it overlaps regions #0..3.
The code in coalesce_windows() in arch/x86/pci/acpi.c attempts to recover from
this kind of BIOS bug by merging the overlapping regions. The current code
expands region #0 to 0x0000-0xdfff and marks region #4 as ignored. The overlap
of the expanded region #0 with regions #1..3 then goes undetected, because the
inner loop compared them with region #0 before it was expanded. As a result,
regions #1..3 are inserted into the resource tree even though they overlap the
adjusted region #0, which later causes resource conflicts for PCI devices
with I/O ports in one of those regions (e.g., a PCI IDE controller in
legacy mode, which uses port 0x3f6). The kernel then refuses to initialize
these devices.

The fix: instead of expanding res1 and ignoring res2, expand res2 and ignore
res1. The res2 window is yet to be compared against all windows between res1
and res2 (regions #1..3 in the above example), so each of them is merged into
it in turn, and the resulting resource map includes just the expanded region
with no overlapping ones left.

Signed-off-by: Alexey Neyman <[email protected]>


Attachments:
acpi.c.diff (722.00 B)

2013-09-24 03:49:18

by Yijing Wang

Subject: Re: [PATCH] Fix coalescing host bridge windows in arch/x86/pci/acpi.c

On 2013/9/23 14:15, Alexey Neyman wrote:
> [Resending due to no response to the original message in a week]
>
> Hi all,
>
> I have a board with a BIOS bug that reports the following I/O port regions in
> _CRS on one of the host bridges:
>
> 0x0000-0x03af // #0
> 0x03e0-0x0cf7 // #1
> 0x03b0-0x03bb // #2
> 0x03c0-0x03df // #3
> 0x0000-0xdfff // #4
> 0xf000-0xffff // #5
>
> Obviously, region number #4 is erroneous as it overlaps with regions #0..3.
> The code in coalesce_windows() in arch/x86/pci/acpi.c attempts to recover from
> such kind of BIOS bugs by merging the overlapping regions. Current code
> expands region #0 to 0x0000-0xdfff and makes region #4 ignored. As a result,
> overlap of the expanded region #0 with regions #1..3 remains undetected (as
> the inner loop already compared them with region #0). As a result, regions
> #1..3 are inserted into the resource tree even though they overlap with
> adjusted region #0 - which later results in resource conflicts for PCI devices
> with IO ports in one of those regions (e.g., for a PCI IDE controller in
> legacy mode - which has port 0x3f6). The kernel then refuses to initialize
> these devices.
>
> The fix: instead of expanding res1 and ignoring res2, do the opposite. The
> res2 window is yet to be compared against all windows between res1 and res2
> (regions #1..3 in the above example), so the resulting resource map will
> include just the expanded region - and will ignore any overlapping ones.
>
> Signed-off-by: Alexey Neyman <[email protected]>
>

It looks fine to me, but I have no platform to test it. :)


--
Thanks!
Yijing

2013-09-28 07:14:06

by Alexey Neyman

Subject: Re: [PATCH] Fix coalescing host bridge windows in arch/x86/pci/acpi.c

On Tuesday, September 24, 2013 11:47:21 AM Yijing Wang wrote:
> On 2013/9/23 14:15, Alexey Neyman wrote:
> > [Resending due to no response to the original message in a week]
> >
> > Hi all,
> >
> > I have a board with a BIOS bug that reports the following I/O port regions
> > in _CRS on one of the host bridges:
> >
> > 0x0000-0x03af // #0
> > 0x03e0-0x0cf7 // #1
> > 0x03b0-0x03bb // #2
> > 0x03c0-0x03df // #3
> > 0x0000-0xdfff // #4
> > 0xf000-0xffff // #5
> >
> > Obviously, region number #4 is erroneous as it overlaps with regions
> > #0..3.
> > The code in coalesce_windows() in arch/x86/pci/acpi.c attempts to recover
> > from such kind of BIOS bugs by merging the overlapping regions. Current
> > code expands region #0 to 0x0000-0xdfff and makes region #4 ignored. As
> > a result, overlap of the expanded region #0 with regions #1..3 remains
> > undetected (as the inner loop already compared them with region #0). As a
> > result, regions #1..3 are inserted into the resource tree even though
> > they overlap with adjusted region #0 - which later results in resource
> > conflicts for PCI devices with IO ports in one of those regions (e.g.,
> > for a PCI IDE controller in legacy mode - which has port 0x3f6). The
> > kernel then refuses to initialize these devices.
> >
> > The fix: instead of expanding res1 and ignoring res2, do the opposite. The
> > res2 window is yet to be compared against all windows between res1 and
> > res2
> > (regions #1..3 in the above example), so the resulting resource map will
> > include just the expanded region - and will ignore any overlapping ones.
> >
> > Signed-off-by: Alexey Neyman <[email protected]>
>
> It looks fine to me, but I have no platform to test it. :)

Thanks for the review. Could anybody push it into the tree? :)
For convenience, patch attached again.

Regards,
Alexey.


Attachments:
acpi.c.diff (722.00 B)

2013-10-03 18:44:27

by Alexey Neyman

Subject: Re: [PATCH] Fix coalescing host bridge windows in arch/x86/pci/acpi.c

[Patch ping #4...]

On Tuesday, September 24, 2013 11:47:21 AM Yijing Wang wrote:
> On 2013/9/23 14:15, Alexey Neyman wrote:
> > [Resending due to no response to the original message in a week]
> >
> > Hi all,
> >
> > I have a board with a BIOS bug that reports the following I/O port regions
> > in _CRS on one of the host bridges:
> >
> > 0x0000-0x03af // #0
> > 0x03e0-0x0cf7 // #1
> > 0x03b0-0x03bb // #2
> > 0x03c0-0x03df // #3
> > 0x0000-0xdfff // #4
> > 0xf000-0xffff // #5
> >
> > Obviously, region number #4 is erroneous as it overlaps with regions
> > #0..3.
> > The code in coalesce_windows() in arch/x86/pci/acpi.c attempts to recover
> > from such kind of BIOS bugs by merging the overlapping regions. Current
> > code expands region #0 to 0x0000-0xdfff and makes region #4 ignored. As
> > a result, overlap of the expanded region #0 with regions #1..3 remains
> > undetected (as the inner loop already compared them with region #0). As a
> > result, regions #1..3 are inserted into the resource tree even though
> > they overlap with adjusted region #0 - which later results in resource
> > conflicts for PCI devices with IO ports in one of those regions (e.g.,
> > for a PCI IDE controller in legacy mode - which has port 0x3f6). The
> > kernel then refuses to initialize these devices.
> >
> > The fix: instead of expanding res1 and ignoring res2, do the opposite. The
> > res2 window is yet to be compared against all windows between res1 and
> > res2
> > (regions #1..3 in the above example), so the resulting resource map will
> > include just the expanded region - and will ignore any overlapping ones.
> >
> > Signed-off-by: Alexey Neyman <[email protected]>
>
> It looks fine to me, but I have no platform to test it. :)

Thanks for the review. Could anybody push it into the tree? :)
For convenience, patch attached again.

Regards,
Alexey.


Attachments:
acpi.c.diff (722.00 B)

2013-10-03 19:15:04

by Bjorn Helgaas

Subject: Re: [PATCH] Fix coalescing host bridge windows in arch/x86/pci/acpi.c

On Mon, Sep 23, 2013 at 12:15 AM, Alexey Neyman <[email protected]> wrote:
> [Resending due to no response to the original message in a week]
>
> Hi all,
>
> I have a board with a BIOS bug that reports the following I/O port regions in
> _CRS on one of the host bridges:
>
> 0x0000-0x03af // #0
> 0x03e0-0x0cf7 // #1
> 0x03b0-0x03bb // #2
> 0x03c0-0x03df // #3
> 0x0000-0xdfff // #4
> 0xf000-0xffff // #5
>
> Obviously, region number #4 is erroneous as it overlaps with regions #0..3.
> The code in coalesce_windows() in arch/x86/pci/acpi.c attempts to recover from
> such kind of BIOS bugs by merging the overlapping regions. Current code
> expands region #0 to 0x0000-0xdfff and makes region #4 ignored. As a result,
> overlap of the expanded region #0 with regions #1..3 remains undetected (as
> the inner loop already compared them with region #0). As a result, regions
> #1..3 are inserted into the resource tree even though they overlap with
> adjusted region #0 - which later results in resource conflicts for PCI devices
> with IO ports in one of those regions (e.g., for a PCI IDE controller in
> legacy mode - which has port 0x3f6). The kernel then refuses to initialize
> these devices.
>
> The fix: instead of expanding res1 and ignoring res2, do the opposite. The
> res2 window is yet to be compared against all windows between res1 and res2
> (regions #1..3 in the above example), so the resulting resource map will
> include just the expanded region - and will ignore any overlapping ones.
>
> Signed-off-by: Alexey Neyman <[email protected]>

Can you please open a report at http://bugzilla.kernel.org, assign it
to drivers/pci, and attach a complete dmesg log?

Thanks,
Bjorn

2013-10-03 23:18:28

by Alexey Neyman

Subject: Re: [PATCH] Fix coalescing host bridge windows in arch/x86/pci/acpi.c

On Thursday, October 03, 2013 12:14:38 pm Bjorn Helgaas wrote:
> On Mon, Sep 23, 2013 at 12:15 AM, Alexey Neyman <[email protected]> wrote:
> > [Resending due to no response to the original message in a week]
> >
> > Hi all,
> >
> > I have a board with a BIOS bug that reports the following I/O port
> > regions in _CRS on one of the host bridges:
> >
> > 0x0000-0x03af // #0
> > 0x03e0-0x0cf7 // #1
> > 0x03b0-0x03bb // #2
> > 0x03c0-0x03df // #3
> > 0x0000-0xdfff // #4
> > 0xf000-0xffff // #5
> >
> > Obviously, region number #4 is erroneous as it overlaps with regions
> > #0..3. The code in coalesce_windows() in arch/x86/pci/acpi.c attempts to
> > recover from such kind of BIOS bugs by merging the overlapping regions.
> > Current code expands region #0 to 0x0000-0xdfff and makes region #4
> > ignored. As a result, overlap of the expanded region #0 with regions
> > #1..3 remains undetected (as the inner loop already compared them with
> > region #0). As a result, regions #1..3 are inserted into the resource
> > tree even though they overlap with adjusted region #0 - which later
> > results in resource conflicts for PCI devices with IO ports in one of
> > those regions (e.g., for a PCI IDE controller in legacy mode - which
> > has port 0x3f6). The kernel then refuses to initialize these devices.
> >
> > The fix: instead of expanding res1 and ignoring res2, do the opposite.
> > The res2 window is yet to be compared against all windows between res1
> > and res2 (regions #1..3 in the above example), so the resulting resource
> > map will include just the expanded region - and will ignore any
> > overlapping ones.
> >
> > Signed-off-by: Alexey Neyman <[email protected]>
>
> Can you please open a report at http://bugzilla.kernel.org, assign it
> to drivers/pci, and attach a complete dmesg log?

https://bugzilla.kernel.org/show_bug.cgi?id=62511

I am not at liberty to disclose the full hardware configuration of the board,
as it is a proprietary prototype board not in production. I've provided the
relevant messages from dmesg though.

Regards,
Alexey.

2013-10-09 19:30:44

by Alexey Neyman

Subject: Re: [PATCH] Fix coalescing host bridge windows in arch/x86/pci/acpi.c

[Patch ping #5]

On Thursday, October 03, 2013 04:16:07 pm Alexey Neyman wrote:
> On Thursday, October 03, 2013 12:14:38 pm Bjorn Helgaas wrote:
> > On Mon, Sep 23, 2013 at 12:15 AM, Alexey Neyman <[email protected]> wrote:
> > > [Resending due to no response to the original message in a week]
> > >
> > > Hi all,
> > >
> > > I have a board with a BIOS bug that reports the following I/O port
> > > regions in _CRS on one of the host bridges:
> > >
> > > 0x0000-0x03af // #0
> > > 0x03e0-0x0cf7 // #1
> > > 0x03b0-0x03bb // #2
> > > 0x03c0-0x03df // #3
> > > 0x0000-0xdfff // #4
> > > 0xf000-0xffff // #5
> > >
> > > Obviously, region number #4 is erroneous as it overlaps with regions
> > > #0..3. The code in coalesce_windows() in arch/x86/pci/acpi.c attempts
> > > to recover from such kind of BIOS bugs by merging the overlapping
> > > regions. Current code expands region #0 to 0x0000-0xdfff and makes
> > > region #4 ignored. As a result, overlap of the expanded region #0 with
> > > regions #1..3 remains undetected (as the inner loop already compared
> > > them with region #0). As a result, regions #1..3 are inserted into the
> > > resource tree even though they overlap with adjusted region #0 - which
> > > later results in resource conflicts for PCI devices with IO ports in
> > > one of those regions (e.g., for a PCI IDE controller in legacy mode -
> > > which has port 0x3f6). The kernel then refuses to initialize these
> > > devices.
> > >
> > > The fix: instead of expanding res1 and ignoring res2, do the opposite.
> > > The res2 window is yet to be compared against all windows between res1
> > > and res2 (regions #1..3 in the above example), so the resulting
> > > resource map will include just the expanded region - and will ignore
> > > any overlapping ones.
> > >
> > > Signed-off-by: Alexey Neyman <[email protected]>
> >
> > Can you please open a report at http://bugzilla.kernel.org, assign it
> > to drivers/pci, and attach a complete dmesg log?
>
> https://bugzilla.kernel.org/show_bug.cgi?id=62511
>
> I am not at liberty to disclose full hardware configuration of the board,
> as it is a proprietary prototype board not in production. I've provided
> the relevant messages from dmesg though.

Could anybody review/commit?

Regards,
Alexey.

2013-10-09 23:25:10

by Bjorn Helgaas

Subject: Re: [PATCH] Fix coalescing host bridge windows in arch/x86/pci/acpi.c

[+cc linux-pci]

On Sun, Sep 22, 2013 at 11:15:46PM -0700, Alexey Neyman wrote:
> [Resending due to no response to the original message in a week]
>
> Hi all,
>
> I have a board with a BIOS bug that reports the following I/O port regions in
> _CRS on one of the host bridges:
>
> 0x0000-0x03af // #0
> 0x03e0-0x0cf7 // #1
> 0x03b0-0x03bb // #2
> 0x03c0-0x03df // #3
> 0x0000-0xdfff // #4
> 0xf000-0xffff // #5
>
> Obviously, region number #4 is erroneous as it overlaps with regions #0..3.
> The code in coalesce_windows() in arch/x86/pci/acpi.c attempts to recover from
> such kind of BIOS bugs by merging the overlapping regions. Current code
> expands region #0 to 0x0000-0xdfff and makes region #4 ignored. As a result,
> overlap of the expanded region #0 with regions #1..3 remains undetected (as
> the inner loop already compared them with region #0). As a result, regions
> #1..3 are inserted into the resource tree even though they overlap with
> adjusted region #0 - which later results in resource conflicts for PCI devices
> with IO ports in one of those regions (e.g., for a PCI IDE controller in
> legacy mode - which has port 0x3f6). The kernel then refuses to initialize
> these devices.
>
> The fix: instead of expanding res1 and ignoring res2, do the opposite. The
> res2 window is yet to be compared against all windows between res1 and res2
> (regions #1..3 in the above example), so the resulting resource map will
> include just the expanded region - and will ignore any overlapping ones.
>
> Signed-off-by: Alexey Neyman <[email protected]>

I added the bugzilla reference (thanks for that) and applied the
following patch to my pci/misc branch for v3.13.

Sorry this took so long; part of the reason was that it wasn't on
linux-pci, so it didn't show up in the PCI patchwork. I also added
a MAINTAINERS patch so in the future, get_maintainer.pl will include
linux-pci and me as co-maintainer of arch/x86/pci/*. Also, it helps
if you include the patch in-line rather than as an attachment, because
otherwise I have to manually combine the in-line changelog with the
attached patch.

Thanks for the fix and your persistence!

Bjorn


x86/PCI: Coalesce multiple overlapping host bridge windows

From: Alexey Neyman <[email protected]>

Previously we coalesced windows by expanding the first overlapping one and
making the second invalid. But we never look at the expanded first window
again, so we fail to notice other windows that overlap it. For example, we
coalesced these:

[io 0x0000-0x03af] // #0
[io 0x03e0-0x0cf7] // #1
[io 0x0000-0xdfff] // #2

into these, which still overlap:

[io 0x0000-0xdfff] // #0
[io 0x03e0-0x0cf7] // #1

The fix is to expand the *second* overlapping resource and ignore the
first, so we get this instead with no overlaps:

[io 0x0000-0xdfff] // #2

[bhelgaas: changelog]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=62511
Signed-off-by: Alexey Neyman <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
---
arch/x86/pci/acpi.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index b30e937..7fb24e5 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -354,12 +354,12 @@ static void coalesce_windows(struct pci_root_info *info, unsigned long type)
 			 * the kernel resource tree doesn't allow overlaps.
 			 */
 			if (resource_overlaps(res1, res2)) {
-				res1->start = min(res1->start, res2->start);
-				res1->end = max(res1->end, res2->end);
+				res2->start = min(res1->start, res2->start);
+				res2->end = max(res1->end, res2->end);
 				dev_info(&info->bridge->dev,
 					 "host bridge window expanded to %pR; %pR ignored\n",
-					 res1, res2);
-				res2->flags = 0;
+					 res2, res1);
+				res1->flags = 0;
 			}
 		}
 	}
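
The effect of the merge order can be checked outside the kernel with a
small userspace sketch. This is not the kernel code: struct window, its
valid field (standing in for a non-zero res->flags), and the helpers
overlaps(), coalesce(), count_overlaps(), print_windows() and
demo_remaining_overlaps() are all names made up for this illustration. Only
the nested-loop structure follows coalesce_windows() as described in this
thread, and the input is the six I/O windows from the original report.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct resource. */
struct window {
	unsigned long start;
	unsigned long end;
	int valid;		/* plays the role of a non-zero res->flags */
};

/* Inclusive ranges overlap if each one starts before the other ends. */
static int overlaps(const struct window *a, const struct window *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/*
 * Pairwise merge of overlapping windows.
 * expand_second == 0: grow the earlier window, drop the later one (old code).
 * expand_second != 0: grow the later window, drop the earlier one (patch).
 */
void coalesce(struct window *w, int n, int expand_second)
{
	for (int i = 0; i < n; i++) {
		struct window *res1 = &w[i];

		if (!res1->valid)
			continue;

		for (int j = i + 1; j < n; j++) {
			struct window *res2 = &w[j];
			struct window *keep, *drop;
			unsigned long start, end;

			if (!res2->valid || !overlaps(res1, res2))
				continue;

			start = res1->start < res2->start ? res1->start : res2->start;
			end = res1->end > res2->end ? res1->end : res2->end;
			keep = expand_second ? res2 : res1;
			drop = expand_second ? res1 : res2;
			keep->start = start;
			keep->end = end;
			drop->valid = 0;
		}
	}
}

/* Count pairs of still-valid windows that overlap each other. */
int count_overlaps(const struct window *w, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		for (int j = i + 1; j < n; j++)
			if (w[i].valid && w[j].valid && overlaps(&w[i], &w[j]))
				count++;
	return count;
}

/* Print the windows that are still valid, with their original index. */
void print_windows(const struct window *w, int n)
{
	for (int i = 0; i < n; i++)
		if (w[i].valid)
			printf("[io 0x%04lx-0x%04lx] // #%d\n",
			       w[i].start, w[i].end, i);
}

/* Run one merge order on the windows from the report. */
int demo_remaining_overlaps(int expand_second)
{
	struct window w[] = {
		{ 0x0000, 0x03af, 1 },	/* #0 */
		{ 0x03e0, 0x0cf7, 1 },	/* #1 */
		{ 0x03b0, 0x03bb, 1 },	/* #2 */
		{ 0x03c0, 0x03df, 1 },	/* #3 */
		{ 0x0000, 0xdfff, 1 },	/* #4 */
		{ 0xf000, 0xffff, 1 },	/* #5 */
	};
	int n = sizeof(w) / sizeof(w[0]);

	coalesce(w, n, expand_second);
	print_windows(w, n);
	return count_overlaps(w, n);
}
```

With these inputs, demo_remaining_overlaps(0) leaves three overlapping
pairs (the expanded #0 against each of #1..3), while
demo_remaining_overlaps(1) leaves none: only the expanded #4 and the
untouched #5 remain valid.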