2009-10-05 15:58:16

by Jeff Chua

Subject: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?



I've 3 systems with 4GB, 16GB and 64GB all running 32bit with these set:

CONFIG_X86_32=y
CONFIG_X86=y
CONFIG_ZONE_DMA=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HIGHMEM64G=y
CONFIG_HIGHMEM=y
CONFIG_X86_PAE=y
CONFIG_KSM=y
CONFIG_HIGHPTE=y


# using "free -lm" ...

# with 4GB
             total       used       free     shared    buffers     cached
Mem:          3983       3862        120          0        112       3542
Low:           850        738        112
High:         3132       3123          8
-/+ buffers/cache:        207       3775
Swap:         8008          0       8008


# with 16GB
             total       used       free     shared    buffers     cached
Mem:         16244       6570       9673          0        320       5559
Low:           750        717         33
High:        15493       5853       9640
-/+ buffers/cache:        690      15554
Swap:        26709          0      26709


# with 64GB
             total       used       free     shared    buffers     cached
Mem:         63995        524      63470          0          4        365
Low:           378         32        345
High:        63616        492      63124
-/+ buffers/cache:        154      63840
Swap:        28003          0      28003


Question is ... is there any way to increase "low mem" without resorting to
migrating to 64bit? (Look... it only has 378MB total low mem vs 850MB on
the 4GB system). I have Oracle installed on the 64GB system and it keeps
getting OOMs!

I thought CONFIG_HIGHPTE (Allocate 3rd-level pagetables from highmem) is
supposed to help with low mem as stated here ...

CONFIG_HIGHPTE:
The VM uses one page table entry for each page of physical memory.
For systems with a lot of RAM, this can be wasteful of precious
low memory. Setting this option will put user-space page table
entries in high memory.


Anything I should do? Slow death using sysctl like this ...
vm.min_free_kbytes = 16384
vm.overcommit_memory = 2
vm.overcommit_ratio = 75
vm.vfs_cache_pressure = 10000


Thanks,
Jeff.

P.S. I was trying to "cross-compile" a gcc compiler, but didn't get far.
That's a different topic ... offline, please write to me directly if
you've done this ... would need binutils 64 (compiles OK), gcc 64 (stuck
... seems to be looking for libc 64bit) ...


2009-10-05 18:12:48

by Linus Torvalds

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?



On Mon, 5 Oct 2009, Jeff Chua wrote:
>
> I've 3 systems with 4GB, 16GB and 64GB all running 32bit with these set:

Don't.

4GB is useable. 8GB is borderline. 16GB+ and nobody will care.

> # with 64GB
>              total       used       free     shared    buffers     cached
> Mem:         63995        524      63470          0          4        365
> Low:           378         32        345

All your low memory is used for the 'struct page' arrays that describe
everything else.

> Anything I should do?

Use a 64-bit CPU and kernel or limit your memory to 8GB or so. No ifs,
buts and maybe's.

Linus

2009-10-05 18:43:38

by Byron Stanoszek

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Mon, 5 Oct 2009, Jeff Chua wrote:

> Question is ... is there any way to increase "low mem" without resorting to
> migrating to 64bit? (Look... it only has 378MB total low mem vs 850MB on the
> 4GB system). I have Oracle installed on the 64GB system and it keeps getting
> OOMs!

I have some installations using a 64-bit kernel and a 32-bit OS (including
Oracle). This setup works really well and lets you use all available memory.
You don't even have to upgrade your OS; just swap out the kernel.

-Byron

2009-10-05 19:16:24

by Dave Hansen

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Mon, 2009-10-05 at 23:57 +0800, Jeff Chua wrote:
> # with 64GB
>              total       used       free     shared    buffers     cached
> Mem:         63995        524      63470          0          4        365
> Low:           378         32        345
> High:        63616        492      63124
> -/+ buffers/cache:        154      63840
> Swap:        28003          0      28003
>
>
> Question is ... is there any way to increase "low mem" without resorting to
> migrating to 64bit? (Look... it only has 378MB total low mem vs 850MB on
> the 4GB system). I have Oracle installed on the 64GB system and it keeps
> getting OOMs!

Heh. You've really squeezed yourself into a bad situation. Go get a
64-bit kernel... please. You should be able to run 32-bit userspace
with a 64-bit kernel. Do you have some 32-bit kernel component that you
are relying on?

The kernel has a structure called 'struct page'. We allocate one of
those for each 4k page of physical memory on x86. But, each 'struct
page' is/was 32 bytes (is it still??). That means that on a 64GB
system, you've used at *least* 512MB of your 896MB of lowmem before
you're even out of early boot. That's just one structure.
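
The arithmetic behind that 512MB figure can be sketched as follows. This is
only a back-of-the-envelope estimate, assuming 4 KiB pages, 32 bytes per
'struct page' and an 896 MiB lowmem ceiling as stated above; the function
name mem_map_mib is made up for illustration.

```python
# Back-of-the-envelope estimate of lowmem consumed by the 'struct page' array.
# Assumptions (taken from the post above, not measured): 4 KiB pages,
# 32 bytes per struct page, 896 MiB of lowmem with the default split.

PAGE_SIZE = 4 * 1024      # bytes per physical page on x86
STRUCT_PAGE_SIZE = 32     # bytes per 'struct page' (assumed)
LOWMEM_MIB = 896          # lowmem ceiling with the default split


def mem_map_mib(ram_gib: int) -> float:
    """Return the size in MiB of the struct page array for ram_gib of RAM."""
    pages = ram_gib * 1024**3 // PAGE_SIZE
    return pages * STRUCT_PAGE_SIZE / 1024**2


for ram_gib in (4, 16, 64):
    used = mem_map_mib(ram_gib)
    left = LOWMEM_MIB - used
    print(f"{ram_gib:>2} GiB RAM: struct page array ~{used:.0f} MiB, "
          f"~{left:.0f} MiB of lowmem left")
```

Under these assumptions the estimate leaves roughly 864, 768 and 384 MiB of
lowmem for the 4GB, 16GB and 64GB machines, which is close to the "Low:"
totals of 850, 750 and 378 in the original report.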

The practical options are to use a different VMSPLIT or to use the RHEL
4/4 kernel. The VMSPLIT option is in mainline and it will let you chop up
the user/kernel virtual address boundary in different ways. Looking at
arch/x86/Kconfig, it doesn't look like mainline's code works with PAE.
It's theoretically possible, but not very practical. I think I hacked
up a custom kernel for a customer to do this once a long time ago, but
it was painful.
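
To see why a different split would help, here is a rough sketch of the
lowmem ceiling for a few user/kernel splits. It assumes that about 128 MiB
of the kernel's address space is set aside for vmalloc and similar mappings,
which is where the 896MB figure for the 3G/1G split comes from. The reserve
size and the function name are assumptions for illustration, not values read
from any particular kernel.

```python
# Rough lowmem ceiling for different user/kernel address-space splits.
# Assumption: about 128 MiB of kernel address space is reserved for
# vmalloc and other mappings, so lowmem ~= kernel space - 128 MiB.

RESERVED_MIB = 128  # assumed reserve; the real value can differ


def lowmem_ceiling_mib(kernel_space_gib: int) -> int:
    """Approximate directly mapped memory for a given kernel address space."""
    return kernel_space_gib * 1024 - RESERVED_MIB


for user_gib, kernel_gib in ((3, 1), (2, 2), (1, 3)):
    ceiling = lowmem_ceiling_mib(kernel_gib)
    print(f"{user_gib}G/{kernel_gib}G split: ~{ceiling} MiB lowmem")
```

The trade-off is that every extra gigabyte given to the kernel is taken away
from each process's user address space.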

The RHEL 4/4 kernel is a big fat hack. I think they called it "hugemem"
or something. It gives the kernel (and userspace) ~4GB of vaddr
space each, but costs you some extra context switch time. It lived in
-mm for a while and never made it to mainline.

I see it mentioned here:

http://www.redhat.com/rhel/previous_versions/rhel3/

and I don't know if it was continued in other RHEL releases.

You can get around the 896MB limit, but it's painful. You'll almost
certainly need a hacked kernel.

-- Dave

2009-10-05 19:42:56

by Tomasz Chmielewski

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

> Heh. You've really squeezed yourself into a bad situation. Go get a
> 64-bit kernel... please. You should be able to run 32-bit userspace
> with a 64-bit kernel. Do you have some 32-bit kernel component that you
> are relying on?

Note that in such setups (64 bit kernels and 32 bit userspace) *some*
software will not work correctly (perhaps it's limited to tightly
kernel-userspace integrated software).

One example is open-iscsi - you won't run 32 bit open-iscsi userspace on
a 64 bit kernel (at least it was impossible a couple of months ago).


--
Tomasz Chmielewski
http://wpkg.org

2009-10-05 20:17:23

by Daniel J Blueman

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Oct 5, 5:10 pm, Jeff Chua <[email protected]> wrote:
> I've 3 systems with 4GB, 16GB and 64GB all running 32bit with these set:
[snip]
> Question is ... is there any way to increase "low mem" without resorting to
> migrating to 64bit? (Look... it only has 378MB total low mem vs 850MB on
> the 4GB system). I have Oracle installed on the 64GB system and it keeps
> getting OOMs!

Whilst linux may be able to address all 64GB of memory (running 32 or
64-bit), every 32-bit process will be limited to ~3GB virtual address
space, which may be the limiting factor for your out-of-memory issue
with Oracle. The only solution is to run 64-bit Oracle, alas.

Daniel
--
Daniel J Blueman

2009-10-05 20:32:58

by Valdis Klētnieks

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Mon, 05 Oct 2009 21:42:09 +0200, Tomasz Chmielewski said:
> > Heh. You've really squeezed yourself into a bad situation. Go get a
> > 64-bit kernel... please. You should be able to run 32-bit userspace
> > with a 64-bit kernel. Do you have some 32-bit kernel component that you
> > are relying on?
>
> Note that in such setups (64 bit kernels and 32 bit userspace) *some*
> software will not work correctly (perhaps it's limited to tightly
> kernel-userspace integrated software).
>
> One example is open-iscsi - you won't run 32 bit open-iscsi userspace on
> a 64 bit kernel (at least it was impossible a couple of months ago).

Sounds like somebody blew it on 32/64 bit compatibility on a syscall,
most likely a parameter to an ioctl?



2009-10-06 00:32:22

by Linus Torvalds

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?



On Tue, 6 Oct 2009, Jeff Chua wrote:
>
> Loud and clear. And appreciated all the replies. I think it's time for
> upgrade!

As Byron said, it really should be sufficient to just upgrade the kernel
(I presume that you already have a CPU that is 64-bit capable: not very
many boards with old CPU's can even fit 64GB of ram).

In fact, I wish more people did that, just so that we'd get better
coverage of the 32-bit compat code. We occasionally find issues there,
although I think it's getting rarer.

Linus

2009-10-06 10:07:14

by Tvrtko Ursulin

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Tuesday 06 October 2009 01:30:51 Linus Torvalds wrote:
> On Tue, 6 Oct 2009, Jeff Chua wrote:
> > Loud and clear. And appreciated all the replies. I think it's time for
> > upgrade!
>
> As Byron said, it really should be sufficient to just upgrade the kernel
> (I presume that you already have a CPU that is 64-bit capable: not very
> many boards with old CPU's can even fit 64GB of ram).
>
> In fact, I wish more people did that, just so that we'd get better
> coverage of the 32-bit compat code. We occasionally find issues there,
> although I think it's getting rarer.

Maybe convince some distro to offer this setup as an option at least? I always
wondered why no one has, or maybe I missed it.

Tvrtko

2009-10-06 11:02:18

by Frans Pop

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

Tvrtko Ursulin wrote:
>> In fact, I wish more people did that, just so that we'd get better
>> coverage of the 32-bit compat code. We occasionally find issues there,
>> although I think it's getting rarer.
>
> Maybe convince some distro to offer this setup as an option at least? I
> always wondered why no one has, or maybe I missed it.

Debian has a 64-bit kernel for its 32-bit "i386" architecture (and also for
its 64-bit "amd64" architecture obviously):
- for stable: http://packages.debian.org/lenny/linux-image-2.6.26-2-amd64
- for unstable: http://packages.debian.org/sid/linux-image-2.6.30-2-amd64

Cheers,
FJP

2009-10-06 13:00:41

by David Woodhouse

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Mon, 2009-10-05 at 17:30 -0700, Linus Torvalds wrote:
> In fact, I wish more people did that, just so that we'd get better
> coverage of the 32-bit compat code. We occasionally find issues there,
> although I think it's getting rarer.

We've been running PPC64 with 32-bit userspace for a long time, so most
of the issues with the 32-bit compat code ought to be dealt with.

There may be a few arch-specific issues for x86_64/i386, but not many.

--
dwmw2

2009-10-06 14:28:26

by Linus Torvalds

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?



On Tue, 6 Oct 2009, David Woodhouse wrote:

> On Mon, 2009-10-05 at 17:30 -0700, Linus Torvalds wrote:
> > In fact, I wish more people did that, just so that we'd get better
> > coverage of the 32-bit compat code. We occasionally find issues there,
> > although I think it's getting rarer.
>
> We've been running PPC64 with 32-bit userspace for a long time, so most
> of the issues with the 32-bit compat code ought to be dealt with.
>
> There may be a few arch-specific issues for x86_64/i386, but not many.

The Intel Xorg guys used to do everything in 32-bit kernels because they
were also testing that they didn't break compatibility (which is a big
deal with that whole crazy DRM-in-kernel/direct-rendering-in-user-space
thing) and seemingly didn't realize that the compat layer was _supposed_
to mean that they could run a 64-bit kernel and still have a working
32-bit land.

It's driver interfaces like that that tend to break. ioctl's etc. But I
have heard less noise about it lately, so I do think it's mostly working.

Linus

2009-10-06 14:36:41

by Linus Torvalds

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?



On Tue, 6 Oct 2009, Jeff Chua wrote:
>
> First, have to compile gcc so that it can compile 64 bit kernel. That's a
> tough one!

It's supposed to be easier these days, but I guess distros don't compile
in support for 64-bit mode by default. Just using another machine may be
the simplest approach, if you have any 64-bit distro around.

Linus

2009-10-06 16:51:09

by Yuhong Bao

Subject: RE: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?





> On Mon, 5 Oct 2009, Jeff Chua wrote:
>>
>> I've 3 systems with 4GB, 16GB and 64GB all running 32bit with these set:
>
> Don't.
>
> 4GB is useable. 8GB is borderline. 16GB+ and nobody will care.
>
>> # with 64GB
>>              total       used       free     shared    buffers     cached
>> Mem:         63995        524      63470          0          4        365
>> Low:           378         32        345
>
> All your low memory is used for the 'struct page' arrays that describe
> everything else.
>
>> Anything I should do?
>
> Use a 64-bit CPU and kernel or limit your memory to 8GB or so. No ifs,
> buts and maybe's.
>
> Linus

Even Windows limits the amount of RAM to 16 GB when both PAE and its 3G/1G
split mode are enabled, for precisely the same reason. (It defaults to 2G/2G
split mode.)
See this for more info:
http://blogs.msdn.com/oldnewthing/archive/2004/08/18/216492.aspx

Yuhong Bao

2009-10-08 16:36:40

by Dave Hansen

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Fri, 2009-10-09 at 00:26 +0800, Jeff Chua wrote:
> Newly compiled 64-bit kernel hangs on a previously working pure 32-bit
> 2.6.32-rc3 system ...
>
> VFS: Mounted root (reiserfs filesystem) readonly on device 8:2.
> Freeing unused kernel memory: 444k freed
> request_module: runaway loop modprobe binfmt-464c
> request_module: runaway loop modprobe binfmt-464c
> request_module: runaway loop modprobe binfmt-464c
> request_module: runaway loop modprobe binfmt-464c
> request_module: runaway loop modprobe binfmt-464c

Take a quick look in your .config in the "Executable file formats /
Emulations" section. Make sure that you have ia32 emulation enabled:

CONFIG_IA32_EMULATION=y

That might not have been turned on if you used a .config from an old
32-bit kernel.
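
A quick way to check this without opening menuconfig is to scan the .config
text for the option. The snippet below is only a sketch: the helper name
config_enabled and the sample text are made up for illustration, and it only
looks for a plain "=y" setting.

```python
def config_enabled(config_text: str, option: str) -> bool:
    """Return True if `option` is set to 'y' in kernel .config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments: skip them.
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == option:
            return value == "y"
    return False


# Hypothetical excerpt of a .config carried over from a 32-bit build.
sample = """\
CONFIG_X86_64=y
# CONFIG_IA32_EMULATION is not set
"""

print(config_enabled(sample, "CONFIG_IA32_EMULATION"))
```

To run it against a real tree, read the file first, for example with
open(".config").read(), and pass the result as config_text.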

-- Dave

2009-10-10 18:11:29

by Valdis Klētnieks

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Sat, 10 Oct 2009 00:28:42 +0800, Jeff Chua said:

> From all the reading I've read about how slow 64-bit was, after doing all
> the lean and mean compiling, 64-bit is definitely the way to go! Fast and
> worth every bit switching to 64-bit! Now I can go for 128GB ram.

When the MIPS, PowerPC, and Sparc architectures went from 32 to 64 bits,
they *did* take a bit of a performance hit because it basically doubled
the memory bandwidth usage. However, they all had a reasonably large
number of registers in 32-bit mode. When the x86 went 64-bit, the register
pressure relief from the additional registers usually more than outweighs
the additional memory bandwidth (basically, if you're spending twice as
much time on each load/store, but only doing it 40% as often, you come out
ahead...)
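
Spelled out as arithmetic, that trade-off looks like the sketch below. The
factor of two and the 40% figure are the illustrative numbers from the
paragraph above, not measurements, and the function name is made up.

```python
def relative_memory_cost(cost_per_access: float, access_ratio: float) -> float:
    """Memory-access time relative to the 32-bit baseline (1.0 = no change)."""
    return cost_per_access * access_ratio


# Twice the time per load/store, but only 40% as many of them.
print(relative_memory_cost(2.0, 0.4))
```

Anything below 1.0 means less total time spent on loads and stores than
before, which is the "you come out ahead" case.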



2009-10-10 18:38:46

by Linus Torvalds

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?



On Sat, 10 Oct 2009, [email protected] wrote:
>
> When the x86 went 64-bit, the register pressure relief from the
> additional registers usually more than outweighs the additional memory
> bandwidth (basically, if you're spending twice as much time on each
> load/store, but only doing it 40% as often, you come out ahead...)

That's mainly stack traffic, and x86 has always been good at it. More
registers makes for simpler (and fewer) instructions due to fewer reloads,
but for kernel loads, it's not the biggest advantage.

If you have 8GB of RAM or more, the biggest advantage _by_far_ for the
kernel is that you don't spend 25% of your system time playing with
k[un]map() and the TLB flushing that goes along with it. You also have
much more freedom to allocate (and thus cache) inodes, dentries and
various other fundamental kernel data structures.

Also, the reason MIPS and Sparc had a slowdown for 64-bit code was only
partially the bigger cache footprint (and that depends a lot on the app
anyway: many applications aren't that pointer-intensive. The kernel is
_very_ pointer-intensive, but even for something like that, most data
structures tend to blow up by 50%, not 100%).

The other reason for slowdown is that generating those pointers (for
function calls in particular) is more complex, and x86-64 is better at
that than MIPS and Sparc. That complex instruction encoding with
variable-size instructions means that you don't have to try to fit all
constants in the instruction stream either in the fixed-sized instruction,
or by doing indirect data access to memory through a GP register.

So x86-64 not only had the register expansion advantage, it had less of a
code generation downside to 64-bit mode to begin with. Want to have large
constants in the code? No problem. Sure, it makes your code bigger, but
you can still have them predecoded in the instruction stream rather than
have to load them from memory. Much nicer for everybody.

And for the kernel, the bigger virtual address space really is a _huge_
deal. HIGHMEM accesses really are very slow. You don't see that in user
space, but I really have seen 25% performance differences between
non-highmem builds and CONFIG_HIGHMEM4G enabled for things that try to put
a lot of data in highmem (and the 64G one is even more expensive). And
that was just with 2GB of RAM.

And when it makes the difference between doing IO or not doing IO (ie
caching or not caching - when the dentry cache can't grow any more because
it _needs_ to be in lowmem), you can literally see an order-of-magnitude
difference.

With 8GB+ of ram, I guarantee you that the kernel spent tons of time on
just mapping high pages, _and_ it couldn't grow inodes and dentry caches
nearly as big as it would have wanted to. Going to x86-64 makes all those
issues just go away entirely.

So it's not "you can save a few instructions by not spilling to stack as
much". It's a much bigger deal than that. There's a reason I personally
refuse to even care about >2GB 32-bit machines. There's just no excuse
these days to do that.

Linus

2009-10-11 09:07:48

by Benjamin Herrenschmidt

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Tue, 2009-10-06 at 07:35 -0700, Linus Torvalds wrote:
>
> On Tue, 6 Oct 2009, Jeff Chua wrote:
> >
> > First, have to compile gcc so that it can compile 64 bit kernel. That's a
> > tough one!
>
> It's supposed to be easier these days, but I guess distros don't compile
> in support for 64-bit mode by default. Just using another machine may be
> the simplest approach, if you have any 64-bit distro around.

My experience is that most distros have a compiler capable of generating
a 64-bit kernel, if not 64-bit userspace (the latter depends on whether
the "other" bits such as libgcc, glibc, etc... are there for 64-bit,
which is also generally available, though optional).

I regularly compile 64-bit kernels with a 32-bit Ubuntu or Debian on i386.

Cheers,
Ben.

2009-10-11 09:11:11

by Benjamin Herrenschmidt

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Tue, 2009-10-06 at 07:26 -0700, Linus Torvalds wrote:

> The Intel Xorg guys used to do everything in 32-bit kernels because they
> were also testing that they didn't break compatibility (which is a big
> deal with that whole crazy DRM-in-kernel/direct-rendering-in-user-space
> thing) and seemingly didn't realize that the compat layer was _supposed_
> to mean that they could run a 64-bit kernel and still have a working
> 32-bit land.
>
> It's driver interfaces like that that tend to break. ioctl's etc. But I
> have heard less noise about it lately, so I do think it's mostly working.

I'm running a 32-bit Ubuntu karmic distro with a 64-bit kernel on my
thinkpad and so far, I have yet to encounter a single obvious problem
due to the compat layer. Everything seems to work fine including 3D,
compiz fanciness in X etc... :-)

Cheers,
Ben.


2009-10-11 17:35:12

by Linus Torvalds

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?



On Sun, 11 Oct 2009, Benjamin Herrenschmidt wrote:
> >
> > It's supposed to be easier these days, but I guess distros don't compile
> > in support for 64-bit mode by default. Just using another machine may be
> > the simplest approach, if you have any 64-bit distro around.
>
> My experience is that most distros have a compiler capable of generating
> a 64-bit kernel

At least not Fedora x86. Doing "gcc -m64" results in

sorry, unimplemented: 64-bit mode not compiled in

on my laptop.

Linus

2009-10-14 21:33:35

by Lennart Sorensen

Subject: Re: 2.6.32-rc3: low mem - only 378MB on x86_32 with 64GB. Why?

On Tue, Oct 06, 2009 at 01:01:35PM +0200, Frans Pop wrote:
> Tvrtko Ursulin wrote:
> >> In fact, I wish more people did that, just so that we'd get better
> >> coverage of the 32-bit compat code. We occasionally find issues there,
> >> although I think it's getting rarer.
> >
> > Maybe convince some distro to offer this setup as an option at least? I
> > always wondered why no one has, or maybe I missed it.
>
> Debian has a 64-bit kernel for its 32-bit "i386" architecture (and also for
> its 64-bit "amd64" architecture obviously):
> - for stable: http://packages.debian.org/lenny/linux-image-2.6.26-2-amd64
> - for unstable: http://packages.debian.org/sid/linux-image-2.6.30-2-amd64

I have been running that Debian setup for 5 years now. It's great.
Other than 32 bit iptables not liking the 64bit kernel, I can't currently
recall any issues at any point. Well, and configure scripts often have to
be run with the 'linux32' wrapper to make them not try something stupid.

--
Len Sorensen