2008-06-30 18:03:40

by Pierre Ossman

Subject: How to alloc highmem page below 4GB on i386?

Simple question. How do I allocate a page from highmem, that's still
within 32 bits? x86_64 has the DMA32 zone, but i386 has just HIGHMEM.
As most devices can't DMA above 32 bit, I have 3 GB of memory that's
not getting decent usage (or results in needless bouncing). What to do?

I tried just enabling CONFIG_DMA32 for i386, but there is some guard
against too many memory zones. I'm assuming this is there for a good
reason?

--
-- Pierre Ossman

Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org

WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.



2008-07-04 17:58:19

by Pierre Ossman

Subject: Re: How to alloc highmem page below 4GB on i386?

On Mon, 30 Jun 2008 20:03:23 +0200
Pierre Ossman <[email protected]> wrote:

> Simple question. How do I allocate a page from highmem, that's still
> within 32 bits? x86_64 has the DMA32 zone, but i386 has just HIGHMEM.
> As most devices can't DMA above 32 bit, I have 3 GB of memory that's
> not getting decent usage (or results in needless bouncing). What to do?
>
> I tried just enabling CONFIG_DMA32 for i386, but there is some guard
> against too many memory zones. I'm assuming this is there for a good
> reason?
>

Anyone?

--
-- Pierre Ossman


2008-07-04 18:12:32

by Arjan van de Ven

Subject: Re: How to alloc highmem page below 4GB on i386?

On Fri, 4 Jul 2008 19:58:00 +0200
Pierre Ossman <[email protected]> wrote:

> On Mon, 30 Jun 2008 20:03:23 +0200
> Pierre Ossman <[email protected]> wrote:
>
> > Simple question. How do I allocate a page from highmem, that's still
> > within 32 bits? x86_64 has the DMA32 zone, but i386 has just
> > HIGHMEM. As most devices can't DMA above 32 bit, I have 3 GB of
> > memory that's not getting decent usage (or results in needless
> > bouncing). What to do?
> >
> > I tried just enabling CONFIG_DMA32 for i386, but there is some guard
> > against too many memory zones. I'm assuming this is there for a good
> > reason?
> >
>
> Anyone?
>

well... the assumption sort of is that all high-perf devices are 64 bit
capable. For the rest... well you get what you get. There are IOMMUs in
modern systems from Intel (and soon AMD) that help you avoid the bounce
if you really care.

The second assumption sort of is that you don't have 'too much' above
4GB; once you're over 16GB or so people assume you will run the 64 bit
kernel instead...
(you're hard pressed to find any system nowadays that can support > 4GB
but cannot support 64 bit... a few years ago that was different but 64
bit has been with us for many years now)


--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-07-04 20:23:41

by Pierre Ossman

Subject: Re: How to alloc highmem page below 4GB on i386?

On Fri, 4 Jul 2008 11:12:24 -0700
Arjan van de Ven <[email protected]> wrote:

> On Fri, 4 Jul 2008 19:58:00 +0200
> Pierre Ossman <[email protected]> wrote:
>
> > On Mon, 30 Jun 2008 20:03:23 +0200
> > Pierre Ossman <[email protected]> wrote:
> >
> > > Simple question. How do I allocate a page from highmem, that's still
> > > within 32 bits? x86_64 has the DMA32 zone, but i386 has just
> > > HIGHMEM. As most devices can't DMA above 32 bit, I have 3 GB of
> > > memory that's not getting decent usage (or results in needless
> > > bouncing). What to do?
> > >
> > > I tried just enabling CONFIG_DMA32 for i386, but there is some guard
> > > against too many memory zones. I'm assuming this is there for a good
> > > reason?
> > >
> >
> > Anyone?
> >
>
> well... the assumption sort of is that all high-perf devices are 64 bit
> capable. For the rest... well you get what you get. There are IOMMUs in
> modern systems from Intel (and soon AMD) that help you avoid the bounce
> if you really care.

I was under the impression that the PCI bus was utterly incapable of
any larger address than 32 bits? But perhaps you only consider PCIE
stuff high-perf. :)

>
> The second assumption sort of is that you don't have 'too much' above
> 4GB; once you're over 16GB or so people assume you will run the 64 bit
> kernel instead...

Unfortunately some proprietary crud keeps migration somewhat annoying.
And in my case it's a 4 GB system, where 1 GB gets remapped above 4 GB
to make room for devices, so it's not that uncommon.

The strange thing is that I keep getting pages from > 4GB all the time,
even on a loaded system. I would have expected mostly getting pages
below that limit as that's where most of the memory is. Do you have any
insight into which areas tend to fill up first?
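
As a rough check on where the memory sits, here is a minimal sketch of the layout arithmetic. It assumes the common 896 MB lowmem limit of a 3G/1G split and a 1 GB hole below 4 GB. Neither figure is stated in the thread, and the helper name is hypothetical.

```python
# Rough layout arithmetic for a 32-bit x86 box; all figures are assumptions.
MB = 1 << 20
GB = 1 << 30

def highmem_split(ram, pci_hole, lowmem_limit=896 * MB):
    """Return (highmem below 4 GB, highmem above 4 GB) in bytes."""
    below_4g = min(ram, 4 * GB - pci_hole)   # RAM that stays under 4 GB
    above_4g = ram - below_4g                # RAM remapped past 4 GB
    high_below = max(0, below_4g - lowmem_limit)
    return high_below, above_4g

below, above = highmem_split(ram=4 * GB, pci_hole=1 * GB)
print(below // MB, above // MB)  # 2176 1024
```

Under those assumptions about two thirds of highmem lies below 4 GB, which is why pages above the limit showing up "all the time" looks surprising.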

Rgds
--
-- Pierre Ossman


2008-07-04 20:37:56

by Arjan van de Ven

Subject: Re: How to alloc highmem page below 4GB on i386?

On Fri, 4 Jul 2008 22:23:23 +0200
Pierre Ossman <[email protected]> wrote:

> On Fri, 4 Jul 2008 11:12:24 -0700
> Arjan van de Ven <[email protected]> wrote:
>
> > On Fri, 4 Jul 2008 19:58:00 +0200
> > Pierre Ossman <[email protected]> wrote:
> >
> > > On Mon, 30 Jun 2008 20:03:23 +0200
> > > Pierre Ossman <[email protected]> wrote:
> > >
> > > > Simple question. How do I allocate a page from highmem, that's
> > > > still within 32 bits? x86_64 has the DMA32 zone, but i386 has
> > > > just HIGHMEM. As most devices can't DMA above 32 bit, I have 3
> > > > GB of memory that's not getting decent usage (or results in
> > > > needless bouncing). What to do?
> > > >
> > > > I tried just enabling CONFIG_DMA32 for i386, but there is some
> > > > guard against too many memory zones. I'm assuming this is there
> > > > for a good reason?
> > > >
> > >
> > > Anyone?
> > >
> >
> > well... the assumption sort of is that all high-perf devices are 64
> > bit capable. For the rest... well you get what you get. There are
> > IOMMUs in modern systems from Intel (and soon AMD) that help you
> > avoid the bounce if you really care.
>
> I was under the impression that the PCI bus was utterly incapable of
> any larger address than 32 bits? But perhaps you only consider PCIE
> stuff high-perf. :)

actually your impression is not correct. There's a difference between
how many physical address lines the bus has and how wide an address it
can carry. Specifically, PCI (and PCIE etc) have something that's called
"Dual Address Cycle", which is a PCI bus transaction that sends the 64
bit address using 2 cycles on the bus even if the bus itself is only
32 bits wide.
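
To make the idea concrete, here is a minimal sketch of how a 64 bit address can be carried as two 32 bit halves. It only models the arithmetic, not the bus protocol, and the function names are hypothetical.

```python
# Illustrative only: models a 64-bit address carried as two 32-bit halves,
# in the spirit of a PCI Dual Address Cycle. Function names are made up.
MASK32 = 0xFFFFFFFF

def split_address(addr):
    """Return the (low, high) 32-bit halves of a 64-bit bus address."""
    if not 0 <= addr < 1 << 64:
        raise ValueError("address must fit in 64 bits")
    return addr & MASK32, addr >> 32

def needs_dual_cycle(addr):
    """One address cycle is enough when the upper 32 bits are all zero."""
    return (addr >> 32) != 0

def join_address(low, high):
    """Reassemble the address on the receiving side."""
    return (high << 32) | low

addr = 0x1_2345_6000  # a page just above the 4 GB boundary
low, high = split_address(addr)
print(hex(low), hex(high), needs_dual_cycle(addr))
```

A device whose address registers are only 32 bits wide has nowhere to store the high half, so it cannot benefit from this regardless of what the bus allows.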


> > The second assumption sort of is that you don't have 'too much'
> > above 4GB; once you're over 16GB or so people assume you will run
> > the 64 bit kernel instead...
>
> Unfortunately some proprietary crud keeps migration somewhat annoying.
> And in my case it's a 4 GB system, where 1 GB gets mapped up to make
> room for devices, so it's not that uncommon.

4GB systems are still entirely reasonable with 32 bit kernels (I'm
typing on one right now ;-); it gets problematic in the 12-16GB range.

>
> The strange thing is that I keep getting pages from > 4GB all the
> time, even on a loaded system. I would have expected mostly getting
> pages below that limit as that's where most of the memory is. Do you
> have any insight into which areas tend to fill up first?

ok this is tricky and goes way deep into buddy allocator internals.
On the highest level (2MB chunks iirc, but it could be a bit or
two bigger now) we allocate top down. But once we split such a top level
chunk up, inside the chunk we allocate bottom up (so that the scatter
gather IOs tend to group nicer).
In addition, the kernel will prefer allocating userspace/pagecache
memory from highmem over lowmem, in an effort to keep memory
pressure in the lowmem zones lower.
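
The highmem-first preference can be sketched as a simple fallback list. This is a toy model, not kernel code: it ignores watermarks, lowmem reserves and the buddy orders, the zone sizes are made up, and the "highuser" and "kernel" labels are hypothetical stand-ins that only loosely correspond to GFP_HIGHUSER and GFP_KERNEL.

```python
# Simplified model of zone fallback on 32-bit x86; not kernel code.
# Zone names mirror the kernel's, but sizes and the helper are made up.
FALLBACK = {
    "highuser": ["HIGHMEM", "NORMAL", "DMA"],  # userspace / page cache
    "kernel":   ["NORMAL", "DMA"],             # needs a permanent mapping
}

def alloc_page(free_pages, kind):
    """Take one page from the first zone in the fallback list with room."""
    for zone in FALLBACK[kind]:
        if free_pages[zone] > 0:
            free_pages[zone] -= 1
            return zone
    return None  # every eligible zone is exhausted

free = {"DMA": 4, "NORMAL": 8, "HIGHMEM": 2}
print([alloc_page(free, "highuser") for _ in range(4)])
```

Since ZONE_HIGHMEM on such a machine covers both the highmem below 4GB and everything above it, this preference alone says nothing about which side of the 4GB line a page comes from; that part is down to the top-down behaviour described above.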




2008-07-04 22:03:21

by Pierre Ossman

Subject: Re: How to alloc highmem page below 4GB on i386?

On Fri, 4 Jul 2008 13:37:33 -0700
Arjan van de Ven <[email protected]> wrote:

> On Fri, 4 Jul 2008 22:23:23 +0200
> Pierre Ossman <[email protected]> wrote:
> >
> > I was under the impression that the PCI bus was utterly incapable of
> > any larger address than 32 bits? But perhaps you only consider PCIE
> > stuff high-perf. :)
>
> actually your impression is not correct. There's a difference between
> how many physical address lines the bus has and how wide an address it
> can carry. Specifically, PCI (and PCIE etc) have something that's called
> "Dual Address Cycle", which is a PCI bus transaction that sends the 64
> bit address using 2 cycles on the bus even if the bus itself is only
> 32 bits wide.
>

Ah, I see. I have to admit to having only read the PCI spec briefly. :)

Still, the devices I'm poking have 32-bit fields, so the limitation is
still there for my case.
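
For a device like that, whether a buffer has to be bounced comes down to a mask check. The following is a minimal sketch of that test, assuming a plain 32 bit address limit. The helper name is hypothetical and this is not the kernel's DMA API.

```python
# Sketch of the "does this buffer need bouncing?" test for a device that
# can only address 32 bits. Names are illustrative, not a kernel API.
DMA_32BIT_MASK = 0xFFFFFFFF
PAGE_SIZE = 4096

def needs_bounce(phys_addr, length, dma_mask=DMA_32BIT_MASK):
    """True if any byte of the buffer lies beyond what the device can reach."""
    last_byte = phys_addr + length - 1
    return last_byte > dma_mask

print(needs_bounce(0xFFFFF000, PAGE_SIZE))    # last page below 4 GB
print(needs_bounce(0x1_0000_0000, PAGE_SIZE)) # first page above 4 GB
```

The check uses the last byte of the buffer rather than its start, so a buffer that straddles the 4 GB boundary is also treated as unreachable.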

> >
> > The strange thing is that I keep getting pages from > 4GB all the
> > time, even on a loaded system. I would have expected mostly getting
> > pages below that limit as that's where most of the memory is. Do you
> > have any insight into which areas tend to fill up first?
>
> ok this is tricky and goes way deep into buddy allocator internals.
> On the highest level (2MB chunks iirc, but it could be a bit or
> two bigger now) we allocate top down. But once we split such a top level
> chunk up, inside the chunk we allocate bottom up (so that the scatter
> gather IOs tend to group nicer).
> In addition, the kernel will prefer allocating userspace/pagecache
> memory from highmem over lowmem, in an effort to keep memory
> pressure in the lowmem zones lower.
>

For the test I'm playing with, it does a second order allocation, which
I suppose has good odds of finding a suitable hole somewhere in the
upper GB.

Ah well, I suppose this highmem business will eventually blow over. ;)

Thanks
--
-- Pierre Ossman


2008-07-04 22:24:32

by Arjan van de Ven

Subject: Re: How to alloc highmem page below 4GB on i386?

On Sat, 5 Jul 2008 00:02:59 +0200
Pierre Ossman <[email protected]> wrote:

> On Fri, 4 Jul 2008 13:37:33 -0700
> Arjan van de Ven <[email protected]> wrote:
>
> > On Fri, 4 Jul 2008 22:23:23 +0200
> > Pierre Ossman <[email protected]> wrote:
> > >
> > > I was under the impression that the PCI bus was utterly incapable
> > > of any larger address than 32 bits? But perhaps you only consider
> > > PCIE stuff high-perf. :)
> >
> > actually your impression is not correct. There's a difference
> > between how many physical address lines the bus has and how wide an
> > address it can carry. Specifically, PCI (and PCIE etc) have something
> > that's called "Dual Address Cycle", which is a PCI bus transaction
> > that sends the 64 bit address using 2 cycles on the bus even if the
> > bus itself is only 32 bits wide.
> >
>
> Ah, I see. I have to admit to having only read the PCI spec briefly. :)
>
> Still, the devices I'm poking have 32-bit fields, so the limitation is
> still there for my case.

yeah only a portion of the devices out there support the higher
addresses unfortunately. (This comes back to: "the assumption is that
high performance devices support 64 bit". It's an assumption but it
doesn't seem to be too far off the mark)

>
> > >
> > > The strange thing is that I keep getting pages from > 4GB all the
> > > time, even on a loaded system. I would have expected mostly
> > > getting pages below that limit as that's where most of the memory
> > > is. Do you have any insight into which areas tend to fill up
> > > first?
> >
> > ok this is tricky and goes way deep into buddy allocator internals.
> > On the highest level (2MB chunks iirc, but it could be a bit or
> > two bigger now) we allocate top down. But once we split such a top
> > level chunk up, inside the chunk we allocate bottom up (so that the
> > scatter gather IOs tend to group nicer).
> > In addition, the kernel will prefer allocating userspace/pagecache
> > memory from highmem over lowmem, in an effort to keep memory
> > pressure in the lowmem zones lower.
> >
>
> For the test I'm playing with, it does a second order allocation,
> which I suppose has good odds of finding a suitable hole somewhere in
> the upper GB.
>
> Ah well, I suppose this highmem business will eventually blow over. ;)

hehe

well... a copy isn't free, but it's also not THAT expensive. On the
order of 3000 to 4000 cycles or so for a 4KB copy (of course this varies
with hardware, but as a rough estimate it's in that ballpark)
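
To put that estimate in more familiar units, here is a small back-of-the-envelope sketch. The 2 GHz clock is purely an assumption for illustration and not a figure from this thread. The helper name is hypothetical.

```python
# Back-of-the-envelope bounce cost. The cycle counts come from the estimate
# above; the 2 GHz clock is an assumed value, not a measurement.
def bounce_cost(cycles_per_page, cpu_hz, page_size=4096):
    """Return (microseconds per page, copy throughput in MB/s)."""
    seconds = cycles_per_page / cpu_hz
    return seconds * 1e6, page_size / seconds / 1e6

for cycles in (3000, 4000):
    usec, mb_per_s = bounce_cost(cycles, cpu_hz=2e9)
    print(f"{cycles} cycles: {usec:.2f} us/page, {mb_per_s:.0f} MB/s")
```

At that assumed clock the copy takes a couple of microseconds per page, so whether it matters depends mostly on how fast the device behind it is.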

Another thing is.. use the iommu ;)
