2003-11-20 12:29:07

by Richard Curnow

[permalink] [raw]
Subject: Simplification in pbus_size_mem

The following patch is against 2.4, but the 2.6 code looks identical.

===== setup-bus.c 1.6 vs edited =====
--- 1.6/drivers/pci/setup-bus.c Thu Dec 12 22:14:01 2002
+++ edited/setup-bus.c Thu Nov 20 11:54:28 2003
@@ -311,18 +311,8 @@
}
}

- align = 0;
- min_align = 0;
- for (order = 0; order <= max_order; order++) {
- unsigned long align1 = 1UL << (order + 20);
+ min_align = 1UL << (max_order + 20);

- if (!align)
- min_align = align1;
- else if (ROUND_UP(align + min_align, min_align) < align1)
- min_align = align1 >> 1;
- align += aligns[order];
- }
- size = ROUND_UP(size, min_align);
if (!size) {
b_res->flags = 0;
return;


This is fixing the allocation on a system which looks like this

* 96Mb PCI memory aperture
* Kyro graphics card, requiring 64Mb + 768kb prefetchable
* USB card requiring 4x4k non-prefetchable

Without the change, 'min_align' is computed as 32Mb (the algorithm in the
loop basically seems to make 'min_align' end up as 1/2 the largest
alignment requirement that was found?), hence in the pass where the
prefetchable block is sized, 'size' ends up as 96Mb, which means there
is no space left in which to place the non-prefetchable blocks for the
USB card.

With the patch above, the alignment requirement for the prefetchable
memory actually ends up as the alignment required for the framebuffer,
and the size isn't rounded up unnecessarily. The USB card gets
allocated successfully as a result.

I couldn't be sure what the code in the loop is attempting to do, so I'm
sure I'm overlooking something subtle. Any comments?

--
Richard \\\ SuperH Core+Debug Architect /// .. At home ..
P. /// [email protected] /// [email protected]
Curnow \\\ http://www.superh.com/ /// http://www.rc0.org.uk


2003-11-20 14:16:45

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: Simplification in pbus_size_mem

On Thu, Nov 20, 2003 at 12:28:38PM +0000, Richard Curnow wrote:
> * 96Mb PCI memory aperture

Also there is a PCI-PCI bridge, I guess? ;-)

> * Kyro graphics card, requiring 64Mb + 768kb prefetchable

768Kb sounds strange. It must be power of 2. Perhaps it's 512Kb MMIO
and 256Kb ROM? But MMIO registers must be non-prefetchable. Weird.

> * USB card requiring 4x4k non-prefetchable
>
> Without the change, 'min_align' is computed as 32Mb (the algorithm in the
> loop basically seems to make 'min_align' end up as 1/2 the largest

No. For example, if you have 64Mb + 2x16Mb then 'min_align' will be 16Mb.

> alignment requirement that was found?), hence in the pass where the
> prefetchable block is sized, 'size' ends up as 96Mb, which means there
> is no space left in which to place the non-prefetchable blocks for the
> USB card.

Yes, it's a trade-off - minimizing alignment vs. size requirements.
In most situations the former approach gives much better allocations.

> With the patch above, the alignment requirement for the prefetchable
> memory actually ends up as the alignment required for the framebuffer,
> and the size isn't rounded up unnecessarily. The USB card gets
> allocated successfully as a result.

Well, it works only because your 96Mb PCI aperture is aligned at 64Mb
(or more). If it was aligned at 32Mb, you wouldn't be able to allocate
prefetchable memory at all with your patch.

As a workaround, you can mark those additional 768Kb regions as
non-prefetchable and be done with it.

Ivan.

2003-11-20 15:26:16

by Richard Curnow

[permalink] [raw]
Subject: Re: Simplification in pbus_size_mem

* Ivan Kokshaysky <[email protected]> [2003-11-20]:
> On Thu, Nov 20, 2003 at 12:28:38PM +0000, Richard Curnow wrote:
> > * 96Mb PCI memory aperture
>
> Also there is a PCI-PCI bridge, I guess? ;-)

Yes. 'cat /proc/pci' (with my patch so the USB card gets allocations)
gives

Bus 0, device 2, function 0:
PCI bridge: Digital Equipment Corporation DECchip 21150 (rev 6).
Master Capable. Latency=32. Min Gnt=12.
Bus 1, device 9, function 0:
USB Controller: Lucent Microelectronics USS-344S USB Controller (rev 17).
IRQ 4.
Master Capable. Latency=32. Min Gnt=3.Max Lat=86.
Non-prefetchable 32 bit memory at 0x14100000 [0x14100fff].
Bus 1, device 9, function 1:
USB Controller: Lucent Microelectronics USS-344S USB Controller (#2) (rev 17).
IRQ 4.
Master Capable. Latency=32. Min Gnt=3.Max Lat=86.
Non-prefetchable 32 bit memory at 0x14101000 [0x14101fff].
Bus 1, device 9, function 2:
USB Controller: Lucent Microelectronics USS-344S USB Controller (#3) (rev 17).
IRQ 4.
Master Capable. Latency=32. Min Gnt=3.Max Lat=86.
Non-prefetchable 32 bit memory at 0x14102000 [0x14102fff].
Bus 1, device 9, function 3:
USB Controller: Lucent Microelectronics USS-344S USB Controller (#4) (rev 17).
IRQ 4.
Master Capable. Latency=32. Min Gnt=3.Max Lat=86.
Non-prefetchable 32 bit memory at 0x14103000 [0x14103fff].
Bus 1, device 11, function 0:
VGA compatible controller: SGS Thomson Microelectronics STG4000 [3D Prophet Kyro Series] (rev 1).
IRQ 2.
Master Capable. Latency=32.
Prefetchable 32 bit memory at 0x10000000 [0x13ffffff].
Prefetchable 32 bit memory at 0x14000000 [0x1407ffff].
I/O at 0x2000 [0x20ff].
Bus 1, device 12, function 0:
Ethernet controller: SGS Thomson Microelectronics DEC-Tulip compatible 10/100 Ethernet (rev 161).
IRQ 1.
Master Capable. Latency=32. Min Gnt=255.Max Lat=255.
I/O at 0x2400 [0x24ff].
Non-prefetchable 32 bit memory at 0x14104000 [0x141040ff].

(This is an SH-4 platform.)

> 768Kb sounds strange. It must be power of 2. Perhaps it's 512Kb MMIO
> and 256Kb ROM? But MMIO registers must be non-prefetchable. Weird.

Looks like it's actually 512kB. I'm not sure why I thought it was
768kB.

> > alignment requirement that was found?), hence in the pass where the
> > prefetchable block is sized, 'size' ends up as 96Mb, which means there
> > is no space left in which to place the non-prefetchable blocks for the
> > USB card.
>
> Yes, it's a trade-off - minimizing alignment vs. size requirements.
> In most situations the former approach gives much better allocations.

So is the idea that by rounding up 'size' to 96Mb in this case, it's
guaranteed that there will be a 64Mb aligned chunk inside where the
framebuffer can go, still leaving enough room around for the other
allocation, _regardless_ of the alignment of the base of the memory
aperture? (Or if there are multiple PCI-to-PCI bridges, the aperture
base for any one bridge is going to depend on the sizes of the apertures
forwarded by the others, I suppose).

If this is so, I can begin to see what the loop in the existing code is
doing.

>
> > With the patch above, the alignment requirement for the prefetchable
> > memory actually ends up as the alignment required for the framebuffer,
> > and the size isn't rounded up unnecessarily. The USB card gets
> > allocated successfully as a result.
>
> Well, it works only because your 96Mb PCI aperture is aligned at 64Mb
> (or more).

It's aligned at 256Mb in fact, as shown above.

> If it was aligned at 32Mb, you wouldn't be able to allocate
> prefetchable memory at all with your patch.

Good point.

I'll think about this some more.

> As a workaround, you can mark those additional 768Kb regions as
> non-prefetchable and be done with it.

How do I do that?

Many thanks for your help
Richard

--
Richard \\\ SuperH Core+Debug Architect /// .. At home ..
P. /// [email protected] /// [email protected]
Curnow \\\ http://www.superh.com/ /// http://www.rc0.org.uk

2003-11-20 16:36:29

by Ivan Kokshaysky

[permalink] [raw]
Subject: Re: Simplification in pbus_size_mem

On Thu, Nov 20, 2003 at 03:25:58PM +0000, Richard Curnow wrote:
> Bus 1, device 11, function 0:
> VGA compatible controller: SGS Thomson Microelectronics STG4000 [3D Prophet Kyro Series] (rev 1).
> IRQ 2.
> Master Capable. Latency=32.
> Prefetchable 32 bit memory at 0x10000000 [0x13ffffff].
> Prefetchable 32 bit memory at 0x14000000 [0x1407ffff].
> I/O at 0x2000 [0x20ff].

Probably the second memory region is ROM - 'lspci -vvv' would clarify that.

> So is the idea that by rounding up 'size' to 96Mb in this case, it's
> guaranteed that there will be a 64Mb aligned chunk inside where the
> framebuffer can go, still leaving enough room around for the other
> allocation, _regardless_ of the alignment of the base of the memory
> aperture? (Or if there are multiple PCI-to-PCI bridges, the aperture
> base for any one bridge is going to depend on the sizes of the apertures
> forwarded by the others, I suppose).

Exactly.

> > As a workaround, you can mark those additional 768Kb regions as
> > non-prefetchable and be done with it.
>
> How do I do that?

Add a 'pcibios_fixup' routine for this platform, which does

dev->resource[x].flags &= ~IORESOURCE_PREFETCH;

It can be either specific for that VGA controller (if it's built-in)
or for PCI devices with a class code of PCI_CLASS_DISPLAY_VGA or
(if that region is indeed a ROM) you can just mark ROM resources as
non-prefetchable for all PCI devices (i.e. x = PCI_ROM_RESOURCE).

Ivan.

2003-11-24 15:28:42

by Richard Curnow

[permalink] [raw]
Subject: Re: Simplification in pbus_size_mem

* Ivan Kokshaysky <[email protected]> [2003-11-20]:
> On Thu, Nov 20, 2003 at 03:25:58PM +0000, Richard Curnow wrote:
> > So is the idea that by rounding up 'size' to 96Mb in this case, it's
> > guaranteed that there will be a 64Mb aligned chunk inside where the
> > framebuffer can go, still leaving enough room around for the other
> > allocation, _regardless_ of the alignment of the base of the memory
> > aperture? (Or if there are multiple PCI-to-PCI bridges, the aperture
> > base for any one bridge is going to depend on the sizes of the apertures
> > forwarded by the others, I suppose).
>
> Exactly.

OK, so it's a tricky general-case problem then. After a day or two
pondering, I doubt there's much that can be improved without sinking a
_lot_ of time into this.

> > How do I do that?
>
> Add a 'pcibios_fixup' routine for this platform, which does

Thanks for this info! It's useful to have at least another solution to
consider instead of the bodge we're already using.

> It can be either specific for that VGA controller (if it's built-in)

no, it's not built in.

Thanks again
Richard

--
Richard \\\ SuperH Core+Debug Architect /// .. At home ..
P. /// [email protected] /// [email protected]
Curnow \\\ http://www.superh.com/ /// http://www.rc0.org.uk