LinuxLists.cc - fixing "pci=use

2009-09-16 23:15:27

Subject: fixing "pci=use_crs"

The user currently has to boot with "pci=use_crs" to make
hot-add work on some machines. I think this is a poor
user experience, and I'd like to figure out a better solution.

We tried making "pci=use_crs" the default, which didn't work
because it broke machines like Larry's. I'd like to look
at that machine in more detail and figure out what it's doing.

Larry, would you mind collecting the output of:

# dmesg
# cat /proc/iomem
# lspci -vv
# cd /sys/devices/; grep . pnp*/*/{id,resources}

and attaching them here:

http://bugzilla.kernel.org/show_bug.cgi?id=14183

Thanks,
Bjorn

2009-09-17 16:16:50

by Bjorn Helgaas

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Wednesday 16 September 2009 05:15:23 pm Bjorn Helgaas wrote:
> The user currently has to boot with "pci=use_crs" to make
> hot-add work on some machines. I think this is a poor
> user experience, and I'd like to figure out a better solution.
>
> We tried making "pci=use_crs" the default, which didn't work
> because it broke machines like Larry's. I'd like to look
> at that machine in more detail and figure out what it's doing.
>
> Larry, would you mind collecting the output of:
>
> # dmesg
> # cat /proc/iomem
> # lspci -vv
> # cd /sys/devices/; grep . pnp*/*/{id,resources}
>
> and attaching them here:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=14183

Thanks a lot, Larry.

It looks like you have an HP box -- what exactly is it and
what BIOS version do you have? Maybe I can borrow one to play
with myself so I don't have to bug you as much.

You don't happen to have Windows on it also, do you? If you do,
I'd like to know what the device manager says about the PCI bridges.

Your dmesg from LKML
(http://thread.gmane.org/gmane.linux.kernel/856413/focus=856458)
suggests that ACPI told us about a whole bunch of resources:

pci_bus 0000:00: resource 0 io: [0x00-0xcf7]
pci_bus 0000:00: resource 1 io: [0xd00-0xffff]
pci_bus 0000:00: resource 2 mem: [0x0a0000-0x0bffff]
pci_bus 0000:00: resource 3 mem: [0x0c0000-0x0c3fff]
pci_bus 0000:00: resource 4 mem: [0x0c4000-0x0c7fff]
pci_bus 0000:00: resource 5 mem: [0x0c8000-0x0cbfff]
pci_bus 0000:00: resource 6 mem: [0x0cc000-0x0cffff]
pci_bus 0000:00: resource 7 mem: [0x0d4000-0x0d7fff]
pci_bus 0000:00: resource 8 mem: [0x0d8000-0x0dbfff]
pci_bus 0000:00: resource 9 mem: [0x0dc000-0x0dffff]
...

but the new dmesg (http://bugzilla.kernel.org/attachment.cgi?id=23106)
only has a few:

pci_bus 0000:00: resource 0 io: [0x00-0xffff]
pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
pci_bus 0000:01: resource 1 mem: [0xfc100000-0xfc1fffff]
pci_bus 0000:01: resource 3 io: [0x00-0xffff]
pci_bus 0000:01: resource 4 mem: [0x000000-0xffffffffffffffff]
pci_bus 0000:06: resource 0 io: [0x4000-0x4fff]
pci_bus 0000:06: resource 1 mem: [0xf8000000-0xfbffffff]
pci_bus 0000:04: resource 1 mem: [0xfc000000-0xfc0fffff]

The extra resources in the old dmesg could be from "pci=use_crs"
being the default, but if that were the case, they should be in
the PNP resource dump. Did you update the BIOS between those
boots?

Bjorn

2009-09-17 16:45:34

by Jesse Barnes

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Thu, 17 Sep 2009 10:16:49 -0600
Bjorn Helgaas <[email protected]> wrote:

> On Wednesday 16 September 2009 05:15:23 pm Bjorn Helgaas wrote:
> > The user currently has to boot with "pci=use_crs" to make
> > hot-add work on some machines. I think this is a poor
> > user experience, and I'd like to figure out a better solution.
> >
> > We tried making "pci=use_crs" the default, which didn't work
> > because it broke machines like Larry's. I'd like to look
> > at that machine in more detail and figure out what it's doing.
> >
> > Larry, would you mind collecting the output of:
> >
> > # dmesg
> > # cat /proc/iomem
> > # lspci -vv
> > # cd /sys/devices/; grep . pnp*/*/{id,resources}
> >
> > and attaching them here:
> >
> > http://bugzilla.kernel.org/show_bug.cgi?id=14183
>
> Thanks a lot, Larry.
>
> It looks like you have an HP box -- what exactly is it and
> what BIOS version do you have? Maybe I can borrow one to play
> with myself so I don't have to bug you as much.
>
> You don't happen to have Windows on it also, do you? If you do,
> I'd like to know what the device manager says about the PCI bridges.
>
> Your dmesg from LKML
> (http://thread.gmane.org/gmane.linux.kernel/856413/focus=856458)
> suggests that ACPI told us about a whole bunch of resources:
>
> pci_bus 0000:00: resource 0 io: [0x00-0xcf7]
> pci_bus 0000:00: resource 1 io: [0xd00-0xffff]
> pci_bus 0000:00: resource 2 mem: [0x0a0000-0x0bffff]
> pci_bus 0000:00: resource 3 mem: [0x0c0000-0x0c3fff]
> pci_bus 0000:00: resource 4 mem: [0x0c4000-0x0c7fff]
> pci_bus 0000:00: resource 5 mem: [0x0c8000-0x0cbfff]
> pci_bus 0000:00: resource 6 mem: [0x0cc000-0x0cffff]
> pci_bus 0000:00: resource 7 mem: [0x0d4000-0x0d7fff]
> pci_bus 0000:00: resource 8 mem: [0x0d8000-0x0dbfff]
> pci_bus 0000:00: resource 9 mem: [0x0dc000-0x0dffff]
> ...
>
> but the new dmesg (http://bugzilla.kernel.org/attachment.cgi?id=23106)
> only has a few:
>
> pci_bus 0000:00: resource 0 io: [0x00-0xffff]
> pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
> pci_bus 0000:01: resource 1 mem: [0xfc100000-0xfc1fffff]
> pci_bus 0000:01: resource 3 io: [0x00-0xffff]
> pci_bus 0000:01: resource 4 mem: [0x000000-0xffffffffffffffff]
> pci_bus 0000:06: resource 0 io: [0x4000-0x4fff]
> pci_bus 0000:06: resource 1 mem: [0xf8000000-0xfbffffff]
> pci_bus 0000:04: resource 1 mem: [0xfc000000-0xfc0fffff]
>
> The extra resources in the old dmesg could be from "pci=use_crs"
> being the default, but if that were the case, they should be in
> the PNP resource dump. Did you update the BIOS between those
> boots?

FWIW in the CRS problem thread there were other reports of large
numbers of PNP resources causing problems. One solution I considered
before we ended up reverting the patch was to increase the number of
bus resources we track. Rather than a small array of resources per
bus, we could have a linked list, which would allow us to track an
arbitrary number.

We probably want to distinguish between what we read from the hw regs
and what's reported in PNP though, so maybe a new list in addition to
the existing resource set would be the way to go. That would allow us
to selectively ignore PNP resources on machines where they report bogus
ranges (or selectively look at them, either way).

--
Jesse Barnes, Intel Open Source Technology Center

2009-09-17 18:57:48

by Bjorn Helgaas

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Thursday 17 September 2009 10:45:27 am Jesse Barnes wrote:
> > The extra resources in the old dmesg could be from "pci=use_crs"
> > being the default, but if that were the case, they should be in
> > the PNP resource dump. ?Did you update the BIOS between those
> > boots?
>
> FWIW in the CRS problem thread there were other reports of large
> numbers of PNP resources causing problems. ?One solution I considered
> before we ended up reverting the patch was to increase the number of
> bus resources we track. ?Rather than a small array of resources per
> bus, we could have a linked list, which would allow us to track an
> arbitrary number.

I think that might be a good starting point anyway, since I'm not
aware of any spec that limits the number of apertures a host bridge
can have.

> We probably want to distinguish between what we read from the hw regs
> and what's reported in PNP though, so maybe a new list in addition to
> the existing resource set would be the way to go. ?That would allow us
> to selectively ignore PNP resources on machines where they report bogus
> ranges (or selectively look at them, either way).

When we can read things from hardware registers, I think we have to
trust that more than anything from other sources. But I'm not proposing
any particular solution yet; I just want to get a better understanding
of what we're seeing.

Bjorn

2009-09-19 17:08:01

by Larry Finger

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

Bjorn Helgaas wrote:
> It looks like you have an HP box -- what exactly is it and
> what BIOS version do you have? Maybe I can borrow one to play
> with myself so I don't have to bug you as much.

It is an HP dv2815nr. The BIOS is F.21. I understand that debugging
is much easier when you have the machine in hand, but I don't mind
providing information.

> You don't happen to have Windows on it also, do you? If you do,
> I'd like to know what the device manager says about the PCI bridges.

The only Windows I have is on a VirtualBox VM. The machine came with
Vista, but it kept self destructing, thus I blew it away. I cannot do
it now as I am traveling, but I have a Bart's PE disk at home. I
should be able to look at the device manager picture with it.

> Your dmesg from LKML
> (http://thread.gmane.org/gmane.linux.kernel/856413/focus=856458)
> suggests that ACPI told us about a whole bunch of resources:
>
> pci_bus 0000:00: resource 0 io: [0x00-0xcf7]
> pci_bus 0000:00: resource 1 io: [0xd00-0xffff]
> pci_bus 0000:00: resource 2 mem: [0x0a0000-0x0bffff]
> pci_bus 0000:00: resource 3 mem: [0x0c0000-0x0c3fff]
> pci_bus 0000:00: resource 4 mem: [0x0c4000-0x0c7fff]
> pci_bus 0000:00: resource 5 mem: [0x0c8000-0x0cbfff]
> pci_bus 0000:00: resource 6 mem: [0x0cc000-0x0cffff]
> pci_bus 0000:00: resource 7 mem: [0x0d4000-0x0d7fff]
> pci_bus 0000:00: resource 8 mem: [0x0d8000-0x0dbfff]
> pci_bus 0000:00: resource 9 mem: [0x0dc000-0x0dffff]
> ...
>
> but the new dmesg (http://bugzilla.kernel.org/attachment.cgi?id=23106)
> only has a few:
>
> pci_bus 0000:00: resource 0 io: [0x00-0xffff]
> pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
> pci_bus 0000:01: resource 1 mem: [0xfc100000-0xfc1fffff]
> pci_bus 0000:01: resource 3 io: [0x00-0xffff]
> pci_bus 0000:01: resource 4 mem: [0x000000-0xffffffffffffffff]
> pci_bus 0000:06: resource 0 io: [0x4000-0x4fff]
> pci_bus 0000:06: resource 1 mem: [0xf8000000-0xfbffffff]
> pci_bus 0000:04: resource 1 mem: [0xfc000000-0xfc0fffff]
>
> The extra resources in the old dmesg could be from "pci=use_crs"
> being the default, but if that were the case, they should be in
> the PNP resource dump. Did you update the BIOS between those
> boots?

No new BIOS. I'm still running the one that was in the machine when I
bought it. Do you want me to boot with "pci=use_crs"?

Larry

2009-09-21 21:20:59

by Bjorn Helgaas

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Saturday 19 September 2009 11:08:02 am Larry Finger wrote:
> Bjorn Helgaas wrote:
> > It looks like you have an HP box -- what exactly is it and
> > what BIOS version do you have? Maybe I can borrow one to play
> > with myself so I don't have to bug you as much.
>
> It is an HP dv2815nr. The BIOS is F.21. I understand that debugging
> is much easier when you have the machine in hand, but I don't mind
> providing information.
> ...
> Do you want me to boot with "pci=use_crs"?

That'd be great. HP's bureaucracy makes it hard for me to borrow
machines. If you could also turn on CONFIG_PNP_DEBUG_MESSAGES
and boot with "pnp.debug", we should get some clues about what _CRS
returns for all the ACPI devices. Just attach the resulting dmesg
to the bugzilla (http://bugzilla.kernel.org/show_bug.cgi?id=14183).

Bjorn

2009-09-22 23:40:51

by Larry Finger

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

Bjorn Helgaas wrote:
> On Saturday 19 September 2009 11:08:02 am Larry Finger wrote:
>> Bjorn Helgaas wrote:
>>> It looks like you have an HP box -- what exactly is it and
>>> what BIOS version do you have? Maybe I can borrow one to play
>>> with myself so I don't have to bug you as much.
>> It is an HP dv2815nr. The BIOS is F.21. I understand that debugging
>> is much easier when you have the machine in hand, but I don't mind
>> providing information.
>> ...
>> Do you want me to boot with "pci=use_crs"?
>
> That'd be great. HP's bureaucracy makes it hard for me to borrow
> machines. If you could also turn on CONFIG_PNP_DEBUG_MESSAGES
> and boot with "pnp.debug", we should get some clues about what _CRS
> returns for all the ACPI devices. Just attach the resulting dmesg
> to the bugzilla (http://bugzilla.kernel.org/show_bug.cgi?id=14183).

I put the dmesg output with pci=use_crs and pnp.debug in the bugzilla.

Larry

2009-09-23 23:23:32

by Bjorn Helgaas

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Tuesday 22 September 2009 05:35:29 pm Larry Finger wrote:
> Bjorn Helgaas wrote:
> > On Saturday 19 September 2009 11:08:02 am Larry Finger wrote:
> >> Bjorn Helgaas wrote:
> >>> It looks like you have an HP box -- what exactly is it and
> >>> what BIOS version do you have? Maybe I can borrow one to play
> >>> with myself so I don't have to bug you as much.
> >> It is an HP dv2815nr. The BIOS is F.21. I understand that debugging
> >> is much easier when you have the machine in hand, but I don't mind
> >> providing information.
> >> ...
> >> Do you want me to boot with "pci=use_crs"?
> >
> > That'd be great. HP's bureaucracy makes it hard for me to borrow
> > machines. If you could also turn on CONFIG_PNP_DEBUG_MESSAGES
> > and boot with "pnp.debug", we should get some clues about what _CRS
> > returns for all the ACPI devices. Just attach the resulting dmesg
> > to the bugzilla (http://bugzilla.kernel.org/show_bug.cgi?id=14183).
>
> I put the dmesg output with pci=use_crs and pnp.debug in the bugzilla.

Thanks a lot. These resources:

pci_bus 0000:00: resource 0 io: [0x00-0xcf7]
pci_bus 0000:00: resource 1 io: [0xd00-0xffff]
pci_bus 0000:00: resource 2 mem: [0x0a0000-0x0bffff]

are obviously coming from _CRS, and we only exercise the
pci_acpi_scan_root() -> get_current_resources() -> setup_resource()
path when we have an acpi_device.

But we don't have a PNPACPI device with those resources. There are
a couple cases where we don't build PNPACPI devices for an ACPI
device: HID isn't valid PNP ID ("AAAXXXX"), device is excluded
(EC, PIC, interrupt links, timers), or status is "!present".

The most likely seems like a device that is "!present && functional".
We build acpi_devices for those, but not PNP devices.

Would you mind trying the attached debug patch to try to confirm this?

Bjorn

P.S. Yinghai, you posted some patches earlier dealing with "only one
HT chain." You apparently have some insight into what's going on here,
but unfortunately, the changelogs mean absolutely nothing to me. Can
you give me any clues?

diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 1014eb4..7c817f8 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -7,6 +7,7 @@
#include <asm/pci_x86.h>

struct pci_root_info {
+ struct acpi_device *device;
char *name;
unsigned int res_num;
struct resource *res;
@@ -77,6 +78,10 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
if (!ACPI_SUCCESS(status))
return AE_OK;

+ dev_info(&info->device->dev, "res %d addr %d min %#llx len %#llx tra %#llx\n",
+ acpi_res->type,
+ addr.resource_type, addr.minimum, addr.address_length,
+ addr.translation_offset);
if (addr.resource_type == ACPI_MEMORY_RANGE) {
root = &iomem_resource;
flags = IORESOURCE_MEM;
@@ -117,6 +122,8 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
return AE_OK;
}

+#define STRUCT_TO_INT(s) (*((int*)&s))
+
static void
get_current_resources(struct acpi_device *device, int busnum,
int domain, struct pci_bus *bus)
@@ -124,6 +131,10 @@ get_current_resources(struct acpi_device *device, int busnum,
struct pci_root_info info;
size_t size;

+ dev_info(&device->dev, "getting resources (status 0x%x)\n",
+ STRUCT_TO_INT(device->status));
+
+ info.device = device;
info.bus = bus;
info.res_num = 0;
acpi_walk_resources(device->handle, METHOD_NAME__CRS, count_resource,
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 31b961c..78a5fd6 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -461,6 +461,7 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
acpi_handle handle;
struct acpi_device *child;
u32 flags, base_flags;
+ struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };

segment = 0;
status = acpi_evaluate_integer(device->handle, METHOD_NAME__SEG, NULL,
@@ -508,9 +509,11 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
/* TBD: Locking */
list_add_tail(&root->node, &acpi_pci_roots);

+ acpi_get_name(device->handle, ACPI_FULL_PATHNAME, &buffer);
printk(KERN_INFO PREFIX "%s [%s] (%04x:%02x)\n",
- acpi_device_name(device), acpi_device_bid(device),
+ acpi_device_name(device), (char *) buffer.pointer,
root->segment, root->bus_nr);
+ kfree(buffer.pointer);

/*
* Scan the Root Bridge
diff --git a/drivers/pnp/pnpacpi/core.c b/drivers/pnp/pnpacpi/core.c
index 83b8b5a..3a0a93b 100644
--- a/drivers/pnp/pnpacpi/core.c
+++ b/drivers/pnp/pnpacpi/core.c
@@ -154,6 +154,7 @@ static int __init pnpacpi_add_device(struct acpi_device *device)
acpi_status status;
struct pnp_dev *dev;
struct acpi_hardware_id *id;
+ struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };

/*
* If a PnPacpi device is not present , the device
@@ -205,7 +206,11 @@ static int __init pnpacpi_add_device(struct acpi_device *device)
/* clear out the damaged flags */
if (!dev->active)
pnp_init_resources(dev);
+
+ acpi_get_name(temp, ACPI_FULL_PATHNAME, &buffer);
+ pnp_dbg(&dev->dev, "pnp device for [%s]\n", (char *) buffer.pointer);
pnp_add_device(dev);
+ kfree(buffer.pointer);
num++;

return AE_OK;

2009-09-23 23:28:34

by Yinghai Lu

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Wed, Sep 23, 2009 at 4:23 PM, Bjorn Helgaas <[email protected]> wrote:

>
> P.S. ?Yinghai, you posted some patches earlier dealing with "only one
> HT chain." ?You apparently have some insight into what's going on here,
> but unfortunately, the changelogs mean absolutely nothing to me. ?Can
> you give me any clues?

which commit?

normally we only need to have split root resource into several pieces
when we have two HT chains or other io chains...

YH

2009-09-24 03:24:13

by Bjorn Helgaas

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Wed, 2009-09-23 at 16:28 -0700, Yinghai Lu wrote:
> On Wed, Sep 23, 2009 at 4:23 PM, Bjorn Helgaas <[email protected]> wrote:

> > P.S. Yinghai, you posted some patches earlier dealing with "only one
> > HT chain." You apparently have some insight into what's going on here,
> > but unfortunately, the changelogs mean absolutely nothing to me. Can
> > you give me any clues?
>
> which commit?
>
> normally we only need to have split root resource into several pieces
> when we have two HT chains or other io chains...

I meant the patches here:
http://lkml.org/lkml/2009/6/24/557

My opinion is that ACPI is there to give us an abstract description of
the machine, and we shouldn't have to introduce knowledge like "this
machine has two HT chains" or add checks in amd_bus.c about
"pci_root_num <= 1".

But maybe if I knew what an HT chain was and why you think it affects
the description returned by _CRS, it would give me a clue about how to
deal with this in a generic way.

Bjorn

2009-09-24 04:42:09

by Yinghai Lu

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Wed, Sep 23, 2009 at 8:21 PM, Bjorn Helgaas <[email protected]> wrote:
> On Wed, 2009-09-23 at 16:28 -0700, Yinghai Lu wrote:
>> On Wed, Sep 23, 2009 at 4:23 PM, Bjorn Helgaas <[email protected]> wrote:
>
>> > P.S. ?Yinghai, you posted some patches earlier dealing with "only one
>> > HT chain." ?You apparently have some insight into what's going on here,
>> > but unfortunately, the changelogs mean absolutely nothing to me. ?Can
>> > you give me any clues?
>>
>> which commit?
>>
>> normally we only need to have split root resource into several pieces
>> when we have two HT chains or other io chains...
>
> I meant the patches here:
> ?http://lkml.org/lkml/2009/6/24/557
>
> My opinion is that ACPI is there to give us an abstract description of
> the machine, and we shouldn't have to introduce knowledge like "this
> machine has two HT chains" or add checks in amd_bus.c about
> "pci_root_num <= 1".
>
> But maybe if I knew what an HT chain was and why you think it affects
> the description returned by _CRS, it would give me a clue about how to
> deal with this in a generic way.

we could use _CRS, but lots of BIOS just provide messed up resources
in _CRS to OS.

for example, the HW conf register does have mmio high range there, but
_CRS doesn't report them.

thought we can use whilelist to use _CRS for them.

YH

2009-09-24 13:26:52

by Bjorn Helgaas

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Wed, 2009-09-23 at 21:42 -0700, Yinghai Lu wrote:
> On Wed, Sep 23, 2009 at 8:21 PM, Bjorn Helgaas <[email protected]> wrote:
> > On Wed, 2009-09-23 at 16:28 -0700, Yinghai Lu wrote:
> >> On Wed, Sep 23, 2009 at 4:23 PM, Bjorn Helgaas <[email protected]> wrote:
> >
> >> > P.S. Yinghai, you posted some patches earlier dealing with "only one
> >> > HT chain." You apparently have some insight into what's going on here,
> >> > but unfortunately, the changelogs mean absolutely nothing to me. Can
> >> > you give me any clues?
> >>
> >> which commit?
> >>
> >> normally we only need to have split root resource into several pieces
> >> when we have two HT chains or other io chains...
> >
> > I meant the patches here:
> > http://lkml.org/lkml/2009/6/24/557
> >
> > My opinion is that ACPI is there to give us an abstract description of
> > the machine, and we shouldn't have to introduce knowledge like "this
> > machine has two HT chains" or add checks in amd_bus.c about
> > "pci_root_num <= 1".
> >
> > But maybe if I knew what an HT chain was and why you think it affects
> > the description returned by _CRS, it would give me a clue about how to
> > deal with this in a generic way.
>
> we could use _CRS, but lots of BIOS just provide messed up resources
> in _CRS to OS.

We do have to assume there are BIOS defects here, but in most cases, I
look for Linux deficiencies first. I'm assuming (without real evidence)
that Windows does look at the _CRS, so the worst BIOS defects should be
weeded out by Windows testing.

> for example, the HW conf register does have mmio high range there, but
> _CRS doesn't report them.

On Larry's box, _CRS reports *more* ranges than Linux was prepared for.
This would be a bug in the other direction, where _CRS reports *less*
than it should.

> thought we can use whilelist to use _CRS for them.

I'm opposed to a whitelist for this issue because it means we have to
continually update the whitelist for new, correctly working machines.
If we can't figure out anything better, we could use a date-based
blacklist (ignore _CRS for all machines older than today).

But first, we have to establish that there really is a requirement to
look at _CRS, then have a good try at making Linux smart enough to deal
with whatever it finds.

Bjorn

2009-09-24 15:27:55

by Bjorn Helgaas

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Wednesday 23 September 2009 05:23:30 pm Bjorn Helgaas wrote:
> Would you mind trying the attached debug patch to try to confirm this?

Let me refine this patch a little before you try it out. For
one thing, it applies on top of a bunch of patches in my queue,
so it won't apply directly for you.

> diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> index 1014eb4..7c817f8 100644
> --- a/arch/x86/pci/acpi.c
> +++ b/arch/x86/pci/acpi.c
> @@ -7,6 +7,7 @@
> #include <asm/pci_x86.h>
>
> struct pci_root_info {
> + struct acpi_device *device;
> char *name;
> unsigned int res_num;
> struct resource *res;
> @@ -77,6 +78,10 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
> if (!ACPI_SUCCESS(status))
> return AE_OK;
>
> + dev_info(&info->device->dev, "res %d addr %d min %#llx len %#llx tra %#llx\n",
> + acpi_res->type,
> + addr.resource_type, addr.minimum, addr.address_length,
> + addr.translation_offset);
> if (addr.resource_type == ACPI_MEMORY_RANGE) {
> root = &iomem_resource;
> flags = IORESOURCE_MEM;
> @@ -117,6 +122,8 @@ setup_resource(struct acpi_resource *acpi_res, void *data)
> return AE_OK;
> }
>
> +#define STRUCT_TO_INT(s) (*((int*)&s))
> +
> static void
> get_current_resources(struct acpi_device *device, int busnum,
> int domain, struct pci_bus *bus)
> @@ -124,6 +131,10 @@ get_current_resources(struct acpi_device *device, int busnum,
> struct pci_root_info info;
> size_t size;
>
> + dev_info(&device->dev, "getting resources (status 0x%x)\n",
> + STRUCT_TO_INT(device->status));
> +
> + info.device = device;
> info.bus = bus;
> info.res_num = 0;
> acpi_walk_resources(device->handle, METHOD_NAME__CRS, count_resource,
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 31b961c..78a5fd6 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -461,6 +461,7 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
> acpi_handle handle;
> struct acpi_device *child;
> u32 flags, base_flags;
> + struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>
> segment = 0;
> status = acpi_evaluate_integer(device->handle, METHOD_NAME__SEG, NULL,
> @@ -508,9 +509,11 @@ static int __devinit acpi_pci_root_add(struct acpi_device *device)
> /* TBD: Locking */
> list_add_tail(&root->node, &acpi_pci_roots);
>
> + acpi_get_name(device->handle, ACPI_FULL_PATHNAME, &buffer);
> printk(KERN_INFO PREFIX "%s [%s] (%04x:%02x)\n",
> - acpi_device_name(device), acpi_device_bid(device),
> + acpi_device_name(device), (char *) buffer.pointer,
> root->segment, root->bus_nr);
> + kfree(buffer.pointer);
>
> /*
> * Scan the Root Bridge
> diff --git a/drivers/pnp/pnpacpi/core.c b/drivers/pnp/pnpacpi/core.c
> index 83b8b5a..3a0a93b 100644
> --- a/drivers/pnp/pnpacpi/core.c
> +++ b/drivers/pnp/pnpacpi/core.c
> @@ -154,6 +154,7 @@ static int __init pnpacpi_add_device(struct acpi_device *device)
> acpi_status status;
> struct pnp_dev *dev;
> struct acpi_hardware_id *id;
> + struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>
> /*
> * If a PnPacpi device is not present , the device
> @@ -205,7 +206,11 @@ static int __init pnpacpi_add_device(struct acpi_device *device)
> /* clear out the damaged flags */
> if (!dev->active)
> pnp_init_resources(dev);
> +
> + acpi_get_name(temp, ACPI_FULL_PATHNAME, &buffer);
> + pnp_dbg(&dev->dev, "pnp device for [%s]\n", (char *) buffer.pointer);
> pnp_add_device(dev);
> + kfree(buffer.pointer);
> num++;
>
> return AE_OK;
>

2009-09-24 22:04:12

by Gary Hade

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

On Thu, Sep 24, 2009 at 07:24:18AM -0600, Bjorn Helgaas wrote:
> On Wed, 2009-09-23 at 21:42 -0700, Yinghai Lu wrote:
> > On Wed, Sep 23, 2009 at 8:21 PM, Bjorn Helgaas <[email protected]> wrote:
> > > On Wed, 2009-09-23 at 16:28 -0700, Yinghai Lu wrote:
> > >> On Wed, Sep 23, 2009 at 4:23 PM, Bjorn Helgaas <[email protected]> wrote:
> > >
> > >> > P.S. Yinghai, you posted some patches earlier dealing with "only one
> > >> > HT chain." You apparently have some insight into what's going on here,
> > >> > but unfortunately, the changelogs mean absolutely nothing to me. Can
> > >> > you give me any clues?
> > >>
> > >> which commit?
> > >>
> > >> normally we only need to have split root resource into several pieces
> > >> when we have two HT chains or other io chains...
> > >
> > > I meant the patches here:
> > > http://lkml.org/lkml/2009/6/24/557
> > >
> > > My opinion is that ACPI is there to give us an abstract description of
> > > the machine, and we shouldn't have to introduce knowledge like "this
> > > machine has two HT chains" or add checks in amd_bus.c about
> > > "pci_root_num <= 1".
> > >
> > > But maybe if I knew what an HT chain was and why you think it affects
> > > the description returned by _CRS, it would give me a clue about how to
> > > deal with this in a generic way.
> >
> > we could use _CRS, but lots of BIOS just provide messed up resources
> > in _CRS to OS.
>
> We do have to assume there are BIOS defects here, but in most cases, I
> look for Linux deficiencies first. I'm assuming (without real evidence)
> that Windows does look at the _CRS, so the worst BIOS defects should be
> weeded out by Windows testing.
>
> > for example, the HW conf register does have mmio high range there, but
> > _CRS doesn't report them.
>
> On Larry's box, _CRS reports *more* ranges than Linux was prepared for.
> This would be a bug in the other direction, where _CRS reports *less*
> than it should.
>
> > thought we can use whilelist to use _CRS for them.
>
> I'm opposed to a whitelist for this issue because it means we have to
> continually update the whitelist for new, correctly working machines.
> If we can't figure out anything better, we could use a date-based
> blacklist (ignore _CRS for all machines older than today).

Unless someone has a strong need other than PCI hot-add for
root bridge _CRS to be used by default, would it possibly make
sense (at least as a first cut) to limit the default use of
_CRS to only those root bridges that (1) are located above
PCI hotplug capable slots and (2) have _CRS that returns a
quantity of ranges that will not overrun the current fixed
size resource arrays?

Gary

--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
[email protected]
http://www.ibm.com/linux/ltc

2009-09-25 21:00:55

by Larry Finger

[permalink] [raw]

Subject: Re: fixing "pci=use_crs"

Bjorn Helgaas wrote:
> On Wednesday 23 September 2009 05:23:30 pm Bjorn Helgaas wrote:
>> Would you mind trying the attached debug patch to try to confirm this?
>
> Let me refine this patch a little before you try it out. For
> one thing, it applies on top of a bunch of patches in my queue,
> so it won't apply directly for you.

This mail was on my computer before I had a chance to try to patch.
I'll await a revised one.

Larry