2008-06-24 18:49:11

by Alok Kataria

[permalink] [raw]
Subject: [PATCH 2/2] acpi based pci gap caluculation v2

Evaluates the _CRS object under PCI0 looking for producer resources.
Then searches the e820 memory space for a gap within these producer resources.

Allows us to find a gap for the unclaimed pci resources or MMIO resources
for hotplug devices within the BIOS allowed pci regions.

v1->v2: Some changes in the print strings.
Also note the prototype for e820_search_gap has changes in this
iteration, but the return value is ignored while calling it over here.

Signed-off-by: Alok N Kataria <[email protected]>

Index: linux-x86-tree.git/arch/x86/pci/acpi.c
===================================================================
--- linux-x86-tree.git.orig/arch/x86/pci/acpi.c 2008-06-23 17:24:07.000000000 -0700
+++ linux-x86-tree.git/arch/x86/pci/acpi.c 2008-06-24 10:57:43.000000000 -0700
@@ -4,6 +4,7 @@
#include <linux/irq.h>
#include <linux/dmi.h>
#include <asm/numa.h>
+#include <asm/e820.h>
#include "pci.h"

struct pci_root_info {
@@ -14,6 +15,11 @@
int busnum;
};

+struct gap_info {
+ unsigned long gapstart;
+ unsigned long gapsize;
+};
+
static acpi_status
resource_to_addr(struct acpi_resource *resource,
struct acpi_resource_address64 *addr)
@@ -110,6 +116,70 @@
}
}

+static acpi_status search_gap(struct acpi_resource *resource, void *data)
+{
+ struct acpi_resource_address64 addr;
+ acpi_status status;
+ struct gap_info *gap = data;
+
+ status = resource_to_addr(resource, &addr);
+ if (ACPI_SUCCESS(status) &&
+ addr.resource_type == ACPI_MEMORY_RANGE &&
+ addr.address_length > gap->gapsize) {
+ e820_search_gap(&gap->gapstart, &gap->gapsize,
+ (addr.minimum + addr.translation_offset));
+ }
+
+ return AE_OK;
+}
+
+/*
+ * Search for a hole in the 32 bit address space for PCI to assign MMIO
+ * resources for hotplug or unconfigured resources.
+ * We query the CRS object of the PCI root device to look for possible producer
+ * resources in the tree and consider these while calulating the start address
+ * for this hole.
+ */
+static void pci_setup_gap(acpi_handle *handle)
+{
+ struct gap_info gap;
+ acpi_status status;
+
+ gap.gapstart = 0;
+ gap.gapsize = 0x400000;
+
+ status = acpi_walk_resources(handle, METHOD_NAME__CRS,
+ search_gap, &gap);
+
+ if (ACPI_SUCCESS(status)) {
+ unsigned long round;
+
+ if (!gap.gapstart) {
+ printk(KERN_ERR "ACPI: Warning: Cannot find a gap "
+ "in the 32bit address range for PCI\n"
+ "ACPI: PCI devices may collide with "
+ "hotpluggable memory address range\n");
+ }
+ /*
+ * Round the gapstart, uses the same logic as in
+ * e820_gap_setup
+ */
+ round = 0x100000;
+ while ((gap.gapsize >> 4) > round)
+ round += round;
+ /* Fun with two's complement */
+ pci_mem_start = (gap.gapstart + round) & -round;
+
+ printk(KERN_INFO "ACPI: PCI resources should "
+ "start at %lx (gap: %lx:%lx)\n",
+ pci_mem_start, gap.gapstart, gap.gapsize);
+ } else {
+ printk(KERN_ERR "ACPI: Error while searching for gap in "
+ "the 32bit address range for PCI\n");
+ }
+}
+
+
static void
get_current_resources(struct acpi_device *device, int busnum,
int domain, struct pci_bus *bus)
@@ -220,6 +290,8 @@

if (bus && (pci_probe & PCI_USE__CRS))
get_current_resources(device, busnum, domain, bus);
+
+ pci_setup_gap(device->handle);
return bus;
}



2008-06-25 02:38:38

by Zhao, Yakui

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

On Tue, 2008-06-24 at 11:48 -0700, Alok Kataria wrote:
> Evaluates the _CRS object under PCI0 looking for producer resources.
> Then searches the e820 memory space for a gap within these producer resources.
>
> Allows us to find a gap for the unclaimed pci resources or MMIO resources
> for hotplug devices within the BIOS allowed pci regions.
>
> v1->v2: Some changes in the print strings.
> Also note the prototype for e820_search_gap has changes in this
> iteration, but the return value is ignored while calling it over here.
>
> Signed-off-by: Alok N Kataria <[email protected]>
>
> Index: linux-x86-tree.git/arch/x86/pci/acpi.c
> ===================================================================
> --- linux-x86-tree.git.orig/arch/x86/pci/acpi.c 2008-06-23 17:24:07.000000000 -0700
> +++ linux-x86-tree.git/arch/x86/pci/acpi.c 2008-06-24 10:57:43.000000000 -0700
> @@ -4,6 +4,7 @@
> #include <linux/irq.h>
> #include <linux/dmi.h>
> #include <asm/numa.h>
> +#include <asm/e820.h>
> #include "pci.h"
>
> struct pci_root_info {
> @@ -14,6 +15,11 @@
> int busnum;
> };
>
> +struct gap_info {
> + unsigned long gapstart;
> + unsigned long gapsize;
> +};
> +
> static acpi_status
> resource_to_addr(struct acpi_resource *resource,
> struct acpi_resource_address64 *addr)
> @@ -110,6 +116,70 @@
> }
> }
>
> +static acpi_status search_gap(struct acpi_resource *resource, void *data)
> +{
> + struct acpi_resource_address64 addr;
> + acpi_status status;
> + struct gap_info *gap = data;
> +
> + status = resource_to_addr(resource, &addr);
> + if (ACPI_SUCCESS(status) &&
> + addr.resource_type == ACPI_MEMORY_RANGE &&
> + addr.address_length > gap->gapsize) {
> + e820_search_gap(&gap->gapstart, &gap->gapsize,
> + (addr.minimum + addr.translation_offset));
> + }
> +
> + return AE_OK;
> +}
> +
> +/*
> + * Search for a hole in the 32 bit address space for PCI to assign MMIO
> + * resources for hotplug or unconfigured resources.
> + * We query the CRS object of the PCI root device to look for possible producer
> + * resources in the tree and consider these while calulating the start address
> + * for this hole.
> + */
> +static void pci_setup_gap(acpi_handle *handle)
> +{
> + struct gap_info gap;
> + acpi_status status;
> +
> + gap.gapstart = 0;
> + gap.gapsize = 0x400000;
> +
> + status = acpi_walk_resources(handle, METHOD_NAME__CRS,
> + search_gap, &gap);
> +
> + if (ACPI_SUCCESS(status)) {
> + unsigned long round;
> +
> + if (!gap.gapstart) {
> + printk(KERN_ERR "ACPI: Warning: Cannot find a gap "
> + "in the 32bit address range for PCI\n"
> + "ACPI: PCI devices may collide with "
> + "hotpluggable memory address range\n");
> + }
> + /*
> + * Round the gapstart, uses the same logic as in
> + * e820_gap_setup
> + */
> + round = 0x100000;
> + while ((gap.gapsize >> 4) > round)
> + round += round;
> + /* Fun with two's complement */
> + pci_mem_start = (gap.gapstart + round) & -round;
> +
> + printk(KERN_INFO "ACPI: PCI resources should "
> + "start at %lx (gap: %lx:%lx)\n",
> + pci_mem_start, gap.gapstart, gap.gapsize);
> + } else {
> + printk(KERN_ERR "ACPI: Error while searching for gap in "
> + "the 32bit address range for PCI\n");
> + }
> +}
> +
> +
> static void
> get_current_resources(struct acpi_device *device, int busnum,
> int domain, struct pci_bus *bus)
> @@ -220,6 +290,8 @@
>
> if (bus && (pci_probe & PCI_USE__CRS))
> get_current_resources(device, busnum, domain, bus);
> +
> + pci_setup_gap(device->handle);
> return bus;
> }
It seems reasonable.
But if the resource obtained from the PCI0 _CRS method is incorrect, we
will get the incorrect pci_mem_start.

At the same time after the patch is applied, pci_mem_start will be
parsed in two different ways. If the result is different, maybe the
incorrect pci_mem_start will be used.

Best regards.
Yakui
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2008-06-25 04:17:56

by Alok Kataria

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

On Tue, Jun 24, 2008 at 7:49 PM, Zhao Yakui <[email protected]> wrote:
> On Tue, 2008-06-24 at 11:48 -0700, Alok Kataria wrote:
>> Evaluates the _CRS object under PCI0 looking for producer resources.
>> Then searches the e820 memory space for a gap within these producer resources.
>>
>> Allows us to find a gap for the unclaimed pci resources or MMIO resources
>> for hotplug devices within the BIOS allowed pci regions.
>>
> It seems reasonable.
> But if the resource obtained from the PCI0 _CRS method is incorrect, we
> will get the incorrect pci_mem_start.

Hi Yakui,

What do you mean by the PCI0 _CRS being incorrect ? Why would the BIOS
give a incorrect _CRS object ?
Also we don't just take the value given from the _CRS method, we still
read the e820_map to search for an unallocated resource. So even if
(by chance) the _CRS method returns incorrect value we would still
figure out if there is a collision with an already allocated resource.

>
> At the same time after the patch is applied, pci_mem_start will be
> parsed in two different ways.

Yes pci_mem_start would be initialized in 2 different ways but we
still have to parse the e820_map the old way because there could be
systems without ACPI.

> If the result is different, maybe the
> incorrect pci_mem_start will be used.

Yeah, The result is different in my case. Though my BIOS reserves
hotpluggable memory region, kernel doesn't respect that right now and
just parses the e820_map to calculate the gap and pci_mem_start value.
I have explained the problem in this mail.

http://marc.info/?l=linux-acpi&m=121391675711763&w=2

Maybe nobody has seen this problem yet, because there are no systems
out there with less than 4GB memory to start with and which allow
memory hotplug.

But still i don't understand what do you mean by, we can get incorrect
pci_mem_start, in which case ?

Thanks,
Alok
>
> Best regards.
> Yakui
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2008-06-25 05:28:42

by Zhao, Yakui

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

On Tue, 2008-06-24 at 21:17 -0700, Alok kataria wrote:
> On Tue, Jun 24, 2008 at 7:49 PM, Zhao Yakui <[email protected]> wrote:
> > On Tue, 2008-06-24 at 11:48 -0700, Alok Kataria wrote:
> >> Evaluates the _CRS object under PCI0 looking for producer resources.
> >> Then searches the e820 memory space for a gap within these producer resources.
> >>
> >> Allows us to find a gap for the unclaimed pci resources or MMIO resources
> >> for hotplug devices within the BIOS allowed pci regions.
> >>
> > It seems reasonable.
> > But if the resource obtained from the PCI0 _CRS method is incorrect, we
> > will get the incorrect pci_mem_start.
>
> Hi Yakui,
>
> What do you mean by the PCI0 _CRS being incorrect ? Why would the BIOS
> give a incorrect _CRS object ?
> Also we don't just take the value given from the _CRS method, we still
> read the e820_map to search for an unallocated resource. So even if
> (by chance) the _CRS method returns incorrect value we would still
> figure out if there is a collision with an already allocated resource.
In the patch the address obtained from the _CRS object will be passed
into the function of e820_search_gap. In such case maybe we will get the
pci_mem_start different with the e820_setup_gap.

> >
> > At the same time after the patch is applied, pci_mem_start will be
> > parsed in two different ways.
>
> Yes pci_mem_start would be initialized in 2 different ways but we
> still have to parse the e820_map the old way because there could be
> systems without ACPI.
>
> > If the result is different, maybe the
> > incorrect pci_mem_start will be used.
>
> Yeah, The result is different in my case. Though my BIOS reserves
> hotpluggable memory region, kernel doesn't respect that right now and
> just parses the e820_map to calculate the gap and pci_mem_start value.
> I have explained the problem in this mail.
>
> http://marc.info/?l=linux-acpi&m=121391675711763&w=2
>
> Maybe nobody has seen this problem yet, because there are no systems
> out there with less than 4GB memory to start with and which allow
> memory hotplug.
>
> But still i don't understand what do you mean by, we can get incorrect
> pci_mem_start, in which case ?

In the function of setup_arch the pci_mem_start will be parsed by
searching the e820 table. After the patch is applied, we will parse the
pci_mem_start again in the function of pci_acpi_scan_init and it will
override the value parsed in the function of setup_arch. If the
pci_mem_start is incorrect in the second case, maybe it will have side
effect.

Thanks.
Yakui.

> Thanks,
> Alok
> >
> > Best regards.
> > Yakui
> >>
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> >> the body of a message to [email protected]
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
> >

2008-06-25 06:04:28

by Alok Kataria

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

Hi Yakui,

On Tue, Jun 24, 2008 at 10:39 PM, Zhao Yakui <[email protected]> wrote:
> On Tue, 2008-06-24 at 21:17 -0700, Alok kataria wrote:
>> On Tue, Jun 24, 2008 at 7:49 PM, Zhao Yakui <[email protected]> wrote:
>> > On Tue, 2008-06-24 at 11:48 -0700, Alok Kataria wrote:
>> >> Evaluates the _CRS object under PCI0 looking for producer resources.
>> >> Then searches the e820 memory space for a gap within these producer resources.
>> >>
>> >> Allows us to find a gap for the unclaimed pci resources or MMIO resources
>> >> for hotplug devices within the BIOS allowed pci regions.
>> >>
>> > It seems reasonable.
>> > But if the resource obtained from the PCI0 _CRS method is incorrect, we
>> > will get the incorrect pci_mem_start.
>>
>> Hi Yakui,
>>
>> What do you mean by the PCI0 _CRS being incorrect ? Why would the BIOS
>> give a incorrect _CRS object ?
>> Also we don't just take the value given from the _CRS method, we still
>> read the e820_map to search for an unallocated resource. So even if
>> (by chance) the _CRS method returns incorrect value we would still
>> figure out if there is a collision with an already allocated resource.
> In the patch the address obtained from the _CRS object will be passed
> into the function of e820_search_gap. In such case maybe we will get the
> pci_mem_start different with the e820_setup_gap.

True..the whole idea behind doing this patch was to get a correct
(different) value for pci_mem_start.
We read the _CRS object over here to make sure that we assign the
pci_mem_start from the address range which is reserved by the BIOS for
PCI devices.

Also this reading of _CRS object would be done before we start
initializing the pci devices, i.e. before we start using the value of
pci_mem_start, so the original value assigned by pci_setup_gap is just
overwritten by this function. So that should be fine IMHO.
Also we would still want the call for e820_setup_gap because there can
be systems with no acpi support or acpi disabled


>
>> >
>> > At the same time after the patch is applied, pci_mem_start will be
>> > parsed in two different ways.
>>
>> Yes pci_mem_start would be initialized in 2 different ways but we
>> still have to parse the e820_map the old way because there could be
>> systems without ACPI.
>>
>> > If the result is different, maybe the
>> > incorrect pci_mem_start will be used.
>>
>> Yeah, The result is different in my case. Though my BIOS reserves
>> hotpluggable memory region, kernel doesn't respect that right now and
>> just parses the e820_map to calculate the gap and pci_mem_start value.
>> I have explained the problem in this mail.
>>
>> http://marc.info/?l=linux-acpi&m=121391675711763&w=2
>>
>> Maybe nobody has seen this problem yet, because there are no systems
>> out there with less than 4GB memory to start with and which allow
>> memory hotplug.
>>
>> But still i don't understand what do you mean by, we can get incorrect
>> pci_mem_start, in which case ?
>
> In the function of setup_arch the pci_mem_start will be parsed by
> searching the e820 table. After the patch is applied, we will parse the
> pci_mem_start again in the function of pci_acpi_scan_init and it will
> override the value parsed in the function of setup_arch. If the
> pci_mem_start is incorrect in the second case, maybe it will have side
> effect.

Yes it will override. But how can the value be incorrect in the second
case. As explained in my previous mail we still parse the e820_map to
check if we have unclaimed resources between start_address (that read
from _CRS) to 2^32. So even if this start_address is wrong we would
catch that during parsing e820_map. But again why would the _CRS
return incorrect values, are you talking about errors in BIOS ?

Thanks,
Alok

2008-06-25 08:27:33

by Zhao, Yakui

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

On Tue, 2008-06-24 at 23:04 -0700, Alok kataria wrote:
> Hi Yakui,
>
> On Tue, Jun 24, 2008 at 10:39 PM, Zhao Yakui <[email protected]> wrote:
> > On Tue, 2008-06-24 at 21:17 -0700, Alok kataria wrote:
> >> On Tue, Jun 24, 2008 at 7:49 PM, Zhao Yakui <[email protected]> wrote:
> >> > On Tue, 2008-06-24 at 11:48 -0700, Alok Kataria wrote:
> >> >> Evaluates the _CRS object under PCI0 looking for producer resources.
> >> >> Then searches the e820 memory space for a gap within these producer resources.
> >> >>
> >> >> Allows us to find a gap for the unclaimed pci resources or MMIO resources
> >> >> for hotplug devices within the BIOS allowed pci regions.
> >> >>
> >> > It seems reasonable.
> >> > But if the resource obtained from the PCI0 _CRS method is incorrect, we
> >> > will get the incorrect pci_mem_start.
> >>
> >> Hi Yakui,
> >>
> >> What do you mean by the PCI0 _CRS being incorrect ? Why would the BIOS
> >> give a incorrect _CRS object ?
> >> Also we don't just take the value given from the _CRS method, we still
> >> read the e820_map to search for an unallocated resource. So even if
> >> (by chance) the _CRS method returns incorrect value we would still
> >> figure out if there is a collision with an already allocated resource.
> > In the patch the address obtained from the _CRS object will be passed
> > into the function of e820_search_gap. In such case maybe we will get the
> > pci_mem_start different with the e820_setup_gap.
>
> True..the whole idea behind doing this patch was to get a correct
> (different) value for pci_mem_start.
> We read the _CRS object over here to make sure that we assign the
> pci_mem_start from the address range which is reserved by the BIOS for
> PCI devices.
>
> Also this reading of _CRS object would be done before we start
> initializing the pci devices, i.e. before we start using the value of
> pci_mem_start, so the original value assigned by pci_setup_gap is just
> overwritten by this function. So that should be fine IMHO.
> Also we would still want the call for e820_setup_gap because there can
> be systems with no acpi support or acpi disabled
>
>
> >
> >> >
> >> > At the same time after the patch is applied, pci_mem_start will be
> >> > parsed in two different ways.
> >>
> >> Yes pci_mem_start would be initialized in 2 different ways but we
> >> still have to parse the e820_map the old way because there could be
> >> systems without ACPI.
> >>
> >> > If the result is different, maybe the
> >> > incorrect pci_mem_start will be used.
> >>
> >> Yeah, The result is different in my case. Though my BIOS reserves
> >> hotpluggable memory region, kernel doesn't respect that right now and
> >> just parses the e820_map to calculate the gap and pci_mem_start value.
> >> I have explained the problem in this mail.
> >>
> >> http://marc.info/?l=linux-acpi&m=121391675711763&w=2
> >>
> >> Maybe nobody has seen this problem yet, because there are no systems
> >> out there with less than 4GB memory to start with and which allow
> >> memory hotplug.
> >>
> >> But still i don't understand what do you mean by, we can get incorrect
> >> pci_mem_start, in which case ?
> >
> > In the function of setup_arch the pci_mem_start will be parsed by
> > searching the e820 table. After the patch is applied, we will parse the
> > pci_mem_start again in the function of pci_acpi_scan_init and it will
> > override the value parsed in the function of setup_arch. If the
> > pci_mem_start is incorrect in the second case, maybe it will have side
> > effect.
>
> Yes it will override. But how can the value be incorrect in the second
> case. As explained in my previous mail we still parse the e820_map to
> check if we have unclaimed resources between start_address (that read
> from _CRS) to 2^32. So even if this start_address is wrong we would
> catch that during parsing e820_map. But again why would the _CRS
> return incorrect values, are you talking about errors in BIOS ?
The pci_mem_start is still gotten by parsing the E820 table.But the
input parameter start_addr will be used in the function of
e820_search_gap.
If the following is the resource start address returned by the PCI0
_CRS object , maybe the different pci_mem_start will be gotten.
0xE0000000
0xE4000000

At the same time if several start address is returned by the _CRS
object, the e820 table will be parsed several times.

Thanks.
Yakui

> Thanks,
> Alok

2008-06-25 16:58:42

by Alok Kataria

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

On Wed, 2008-06-25 at 01:38 -0700, Zhao Yakui wrote:
> On Tue, 2008-06-24 at 23:04 -0700, Alok kataria wrote:
> > Hi Yakui,
> >
> > On Tue, Jun 24, 2008 at 10:39 PM, Zhao Yakui <[email protected]> wrote:
> > > On Tue, 2008-06-24 at 21:17 -0700, Alok kataria wrote:
> > >> On Tue, Jun 24, 2008 at 7:49 PM, Zhao Yakui <[email protected]> wrote:
> > >> > On Tue, 2008-06-24 at 11:48 -0700, Alok Kataria wrote:
> > >> >> Evaluates the _CRS object under PCI0 looking for producer resources.
> > >> >> Then searches the e820 memory space for a gap within these producer resources.
> > >> >>
> > >> >> Allows us to find a gap for the unclaimed pci resources or MMIO resources
> > >> >> for hotplug devices within the BIOS allowed pci regions.
> > >> >>
> > >> > It seems reasonable.
> > >> > But if the resource obtained from the PCI0 _CRS method is incorrect, we
> > >> > will get the incorrect pci_mem_start.
> > >>
> > >> Hi Yakui,
> > >>
> > >> What do you mean by the PCI0 _CRS being incorrect ? Why would the BIOS
> > >> give a incorrect _CRS object ?
> > >> Also we don't just take the value given from the _CRS method, we still
> > >> read the e820_map to search for an unallocated resource. So even if
> > >> (by chance) the _CRS method returns incorrect value we would still
> > >> figure out if there is a collision with an already allocated resource.
> > > In the patch the address obtained from the _CRS object will be passed
> > > into the function of e820_search_gap. In such case maybe we will get the
> > > pci_mem_start different with the e820_setup_gap.
> >
> > True..the whole idea behind doing this patch was to get a correct
> > (different) value for pci_mem_start.
> > We read the _CRS object over here to make sure that we assign the
> > pci_mem_start from the address range which is reserved by the BIOS for
> > PCI devices.
> >
> > Also this reading of _CRS object would be done before we start
> > initializing the pci devices, i.e. before we start using the value of
> > pci_mem_start, so the original value assigned by pci_setup_gap is just
> > overwritten by this function. So that should be fine IMHO.
> > Also we would still want the call for e820_setup_gap because there can
> > be systems with no acpi support or acpi disabled
> >
> >
> > >
> > >> >
> > >> > At the same time after the patch is applied, pci_mem_start will be
> > >> > parsed in two different ways.
> > >>
> > >> Yes pci_mem_start would be initialized in 2 different ways but we
> > >> still have to parse the e820_map the old way because there could be
> > >> systems without ACPI.
> > >>
> > >> > If the result is different, maybe the
> > >> > incorrect pci_mem_start will be used.
> > >>
> > >> Yeah, The result is different in my case. Though my BIOS reserves
> > >> hotpluggable memory region, kernel doesn't respect that right now and
> > >> just parses the e820_map to calculate the gap and pci_mem_start value.
> > >> I have explained the problem in this mail.
> > >>
> > >> http://marc.info/?l=linux-acpi&m=121391675711763&w=2
> > >>
> > >> Maybe nobody has seen this problem yet, because there are no systems
> > >> out there with less than 4GB memory to start with and which allow
> > >> memory hotplug.
> > >>
> > >> But still i don't understand what do you mean by, we can get incorrect
> > >> pci_mem_start, in which case ?
> > >
> > > In the function of setup_arch the pci_mem_start will be parsed by
> > > searching the e820 table. After the patch is applied, we will parse the
> > > pci_mem_start again in the function of pci_acpi_scan_init and it will
> > > override the value parsed in the function of setup_arch. If the
> > > pci_mem_start is incorrect in the second case, maybe it will have side
> > > effect.
> >
> > Yes it will override. But how can the value be incorrect in the second
> > case. As explained in my previous mail we still parse the e820_map to
> > check if we have unclaimed resources between start_address (that read
> > from _CRS) to 2^32. So even if this start_address is wrong we would
> > catch that during parsing e820_map. But again why would the _CRS
> > return incorrect values, are you talking about errors in BIOS ?
> The pci_mem_start is still gotten by parsing the E820 table.But the
> input parameter start_addr will be used in the function of
> e820_search_gap.
> If the following is the resource start address returned by the PCI0
> _CRS object , maybe the different pci_mem_start will be gotten.
> 0xE0000000
> 0xE4000000
>
> At the same time if several start address is returned by the _CRS
> object, the e820 table will be parsed several times.

Yes we will parse e820 several times, but we don't initialize
pci_mem_start in every pass.
It will be initialized only twice once via the e820_setup_gap code path
and once via pci_setup_gap.
And i think you agree that both of these are required ?

During the gap calculation the previous code or the code now in
e820_setup_gap does this...
calculates a gap in e820_map from 0 to 2^32.
Initializes pci_mem_start.

And now with this patch, the code in pci_setup_gap
does this...
for each _CRS under PCI0
search gap from start_addr of _CRS to 2^32 *[1].
Initialize pci_mem_start with the biggest gap that we could
find.

Essentially, what we are doing is just limiting the gap calculation to a
smaller address space depending on the ACPI information we get.

Now then, what problem do you see with this approach ?

*[1]
While writing this mail i figured out that, instead of searching from
start_addr of _CRS to 2^32 we should just search till *end_addr* of _CRS resource.
I will send a patch to that effect.

Thanks,
Alok


>
> Thanks.
> Yakui
>
> > Thanks,
> > Alok
>

2008-06-25 18:22:53

by Alok Kataria

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

On Wed, 2008-06-25 at 09:57 -0700, Alok Kataria wrote:

> While writing this mail i figured out that, instead of searching from
> start_addr of _CRS to 2^32 we should just search till *end_addr* of _CRS resource.
> I will send a patch to that effect.
>
> Thanks,
> Alok
>

Ingo, Please also apply this patch to the tip/out-of-tree
--

Incremental patch . This patch passes the pci producer resources
end_addr (got from _CRS) to the e820_search_gap function.
Without this patch the code will always search from the producer regions
start_address to 2^32, even though the producer region may be smaller.

Now , we limit the search only within the producer region's address
space.
Also we put a condition to check if the resource lies in the 32bit
address range.

Signed-off-by: Alok N Kataria <[email protected]>

Index: linux-x86-tree.git/arch/x86/pci/acpi.c
===================================================================
--- linux-x86-tree.git.orig/arch/x86/pci/acpi.c 2008-06-25 10:11:35.000000000 -0700
+++ linux-x86-tree.git/arch/x86/pci/acpi.c 2008-06-25 11:17:31.000000000 -0700
@@ -121,13 +121,21 @@
struct acpi_resource_address64 addr;
acpi_status status;
struct gap_info *gap = data;
+ unsigned long long start_addr, end_addr;

status = resource_to_addr(resource, &addr);
if (ACPI_SUCCESS(status) &&
addr.resource_type == ACPI_MEMORY_RANGE &&
addr.address_length > gap->gapsize) {
- e820_search_gap(&gap->gapstart, &gap->gapsize,
- (addr.minimum + addr.translation_offset));
+ start_addr = addr.minimum + addr.translation_offset;
+ /*
+ * We want space only in the 32bit address range
+ */
+ if (start_addr < UINT_MAX) {
+ end_addr = start_addr + addr.address_length;
+ e820_search_gap(&gap->gapstart, &gap->gapsize,
+ start_addr, end_addr);
+ }
}

return AE_OK;

2008-06-26 00:50:02

by Zhao, Yakui

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

On Wed, 2008-06-25 at 09:57 -0700, Alok Kataria wrote:
> On Wed, 2008-06-25 at 01:38 -0700, Zhao Yakui wrote:
> > On Tue, 2008-06-24 at 23:04 -0700, Alok kataria wrote:
> > > Hi Yakui,
> > >
> > > On Tue, Jun 24, 2008 at 10:39 PM, Zhao Yakui <[email protected]> wrote:
> > > > On Tue, 2008-06-24 at 21:17 -0700, Alok kataria wrote:
> > > >> On Tue, Jun 24, 2008 at 7:49 PM, Zhao Yakui <[email protected]> wrote:
> > > >> > On Tue, 2008-06-24 at 11:48 -0700, Alok Kataria wrote:
> > > >> >> Evaluates the _CRS object under PCI0 looking for producer resources.
> > > >> >> Then searches the e820 memory space for a gap within these producer resources.
> > > >> >>
> > > >> >> Allows us to find a gap for the unclaimed pci resources or MMIO resources
> > > >> >> for hotplug devices within the BIOS allowed pci regions.
> > > >> >>
> > > >> > It seems reasonable.
> > > >> > But if the resource obtained from the PCI0 _CRS method is incorrect, we
> > > >> > will get the incorrect pci_mem_start.
> > > >>
> > > >> Hi Yakui,
> > > >>
> > > >> What do you mean by the PCI0 _CRS being incorrect ? Why would the BIOS
> > > >> give a incorrect _CRS object ?
> > > >> Also we don't just take the value given from the _CRS method, we still
> > > >> read the e820_map to search for an unallocated resource. So even if
> > > >> (by chance) the _CRS method returns incorrect value we would still
> > > >> figure out if there is a collision with an already allocated resource.
> > > > In the patch the address obtained from the _CRS object will be passed
> > > > into the function of e820_search_gap. In such case maybe we will get the
> > > > pci_mem_start different with the e820_setup_gap.
> > >
> > > True..the whole idea behind doing this patch was to get a correct
> > > (different) value for pci_mem_start.
> > > We read the _CRS object over here to make sure that we assign the
> > > pci_mem_start from the address range which is reserved by the BIOS for
> > > PCI devices.
> > >
> > > Also this reading of _CRS object would be done before we start
> > > initializing the pci devices, i.e. before we start using the value of
> > > pci_mem_start, so the original value assigned by pci_setup_gap is just
> > > overwritten by this function. So that should be fine IMHO.
> > > Also we would still want the call for e820_setup_gap because there can
> > > be systems with no acpi support or acpi disabled
> > >
> > >
> > > >
> > > >> >
> > > >> > At the same time after the patch is applied, pci_mem_start will be
> > > >> > parsed in two different ways.
> > > >>
> > > >> Yes pci_mem_start would be initialized in 2 different ways but we
> > > >> still have to parse the e820_map the old way because there could be
> > > >> systems without ACPI.
> > > >>
> > > >> > If the result is different, maybe the
> > > >> > incorrect pci_mem_start will be used.
> > > >>
> > > >> Yeah, The result is different in my case. Though my BIOS reserves
> > > >> hotpluggable memory region, kernel doesn't respect that right now and
> > > >> just parses the e820_map to calculate the gap and pci_mem_start value.
> > > >> I have explained the problem in this mail.
> > > >>
> > > >> http://marc.info/?l=linux-acpi&m=121391675711763&w=2
> > > >>
> > > >> Maybe nobody has seen this problem yet, because there are no systems
> > > >> out there with less than 4GB memory to start with and which allow
> > > >> memory hotplug.
> > > >>
> > > >> But still i don't understand what do you mean by, we can get incorrect
> > > >> pci_mem_start, in which case ?
> > > >
> > > > In the function of setup_arch the pci_mem_start will be parsed by
> > > > searching the e820 table. After the patch is applied, we will parse the
> > > > pci_mem_start again in the function of pci_acpi_scan_init and it will
> > > > override the value parsed in the function of setup_arch. If the
> > > > pci_mem_start is incorrect in the second case, maybe it will have side
> > > > effect.
> > >
> > > Yes it will override. But how can the value be incorrect in the second
> > > case. As explained in my previous mail we still parse the e820_map to
> > > check if we have unclaimed resources between start_address (that read
> > > from _CRS) to 2^32. So even if this start_address is wrong we would
> > > catch that during parsing e820_map. But again why would the _CRS
> > > return incorrect values, are you talking about errors in BIOS ?
> > The pci_mem_start is still gotten by parsing the E820 table.But the
> > input parameter start_addr will be used in the function of
> > e820_search_gap.
> > If the following is the resource start address returned by the PCI0
> > _CRS object , maybe the different pci_mem_start will be gotten.
> > 0xE0000000
> > 0xE4000000
> >
> > At the same time if several start address is returned by the _CRS
> > object, the e820 table will be parsed several times.
>
> Yes we will parse e820 several times, but we don't initialize
> pci_mem_start in every pass.
> It will be initialized only twice once via the e820_setup_gap code path
> and once via pci_setup_gap.
> And i think you agree that both of these are required ?
Agree with what you said and IMO it is OK.
>
> During the gap calculation the previous code or the code now in
> e820_setup_gap does this...
> calculates a gap in e820_map from 0 to 2^32.
> Initializes pci_mem_start.
>
> And now with this patch, the code in pci_setup_gap
> does this...
> for each _CRS under PCI0
> search gap from start_addr of _CRS to 2^32 *[1].
> Initialize pci_mem_start with the biggest gap that we could
> find.
Yes. But maybe the pci_mem_start obtained in pci_setup_gap will be
different with that obtained in e820_setup_gap on some bogus BIOS.
If the pci_mem_start obtained in pci_setup_gap can meet the requirement,
it is also reasonable.
> Essentially, what we are doing is just limiting the gap calculation to a
> smaller address space depending on the ACPI information we get.
>
> Now then, what problem do you see with this approach ?
> *[1]
> While writing this mail i figured out that, instead of searching from
> start_addr of _CRS to 2^32 we should just search till *end_addr* of _CRS resource.
> I will send a patch to that effect.
If this, IMO it will be OK.
> Thanks,
> Alok
>
>
> >
> > Thanks.
> > Yakui
> >
> > > Thanks,
> > > Alok
> >
>

2008-06-26 01:18:43

by Alok Kataria

[permalink] [raw]
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2

On Wed, 2008-06-25 at 18:00 -0700, Zhao Yakui wrote:
> On Wed, 2008-06-25 at 09:57 -0700, Alok Kataria wrote:
> > On Wed, 2008-06-25 at 01:38 -0700, Zhao Yakui wrote:
> > > On Tue, 2008-06-24 at 23:04 -0700, Alok kataria wrote:

> > > The pci_mem_start is still gotten by parsing the E820 table.But the
> > > input parameter start_addr will be used in the function of
> > > e820_search_gap.
> > > If the following is the resource start address returned by the PCI0
> > > _CRS object , maybe the different pci_mem_start will be gotten.
> > > 0xE0000000
> > > 0xE4000000
> > >
> > > At the same time if several start address is returned by the _CRS
> > > object, the e820 table will be parsed several times.
> >
> > Yes we will parse e820 several times, but we don't initialize
> > pci_mem_start in every pass.
> > It will be initialized only twice once via the e820_setup_gap code path
> > and once via pci_setup_gap.
> > And i think you agree that both of these are required ?
> Agree with what you said and IMO it is OK.
> >
> > During the gap calculation the previous code or the code now in
> > e820_setup_gap does this...
> > calculates a gap in e820_map from 0 to 2^32.
> > Initializes pci_mem_start.
> >
> > And now with this patch, the code in pci_setup_gap
> > does this...
> > for each _CRS under PCI0
> > search gap from start_addr of _CRS to 2^32 *[1].
> > Initialize pci_mem_start with the biggest gap that we could
> > find.
> Yes. But maybe the pci_mem_start obtained in pci_setup_gap will be
> different with that obtained in e820_setup_gap on some bogus BIOS.
> If the pci_mem_start obtained in pci_setup_gap can meet the requirement,
> it is also reasonable.

Ok, so the whole problem crops up from what if the BIOS itself is bogus.
Ok in that case too we take proper care that pci_mem_start is within the
expected limits, i.e. we make sure that
1. pci_mem_start is not initialized to an already allocated resource
address and..
2. pci_mem_start is within the lower 32bit of address space.

I assume you also agree that all the checks are in place.
Also it would be great if you could ACK both the patches as well.

Thanks for the review.
Alok

> > Essentially, what we are doing is just limiting the gap calculation to a
> > smaller address space depending on the ACPI information we get.
> >
> > Now then, what problem do you see with this approach ?
> > *[1]
> > While writing this mail i figured out that, instead of searching from
> > start_addr of _CRS to 2^32 we should just search till *end_addr* of _CRS resource.
> > I will send a patch to that effect.
> If this, IMO it will be OK.
> > Thanks,
> > Alok
> >
> >
> > >
> > > Thanks.
> > > Yakui
> > >
> > > > Thanks,
> > > > Alok
> > >
> >
>