2008-12-16 14:56:41

by Martin Steigerwald

Subject: physical memory limit of 64-bit linux


Hi!

What is the physical memory limit for 64-bit Linux? I read about 40 bit
address bus for AMD Athlon X2 (1 TB) and 48 bit for Barcelona X4 (256 TB).

Is 64-bit linux able to use that amount - provided that one would manage to
build it into a machine? Or does it have a lower limit?

Looking into the Google crystal ball gives unclear pictures... I tend to
assume that Linux would handle it, but I am not sure.

Ciao,
--
Martin Steigerwald - team(ix) GmbH - http://www.teamix.de
gpg: 19E3 8D42 896F D004 08AC A0CA 1E10 C593 0399 AE90



2008-12-16 15:56:04

by Rafael J. Wysocki

Subject: Re: physical memory limit of 64-bit linux

On Tuesday, 16 of December 2008, Martin Steigerwald wrote:
>
> Hi!
>
> What is the physical memory limit for 64-bit Linux? I read about 40 bit
> address bus for AMD Athlon X2 (1 TB) and 48 bit for Barcelona X4 (256 TB).
>
> Is 64-bit linux able to use that amount - provided that one would manage to
> build it into a machine? Or does it have a lower limit?
>
> Looking into the Google crystal ball gives unclear pictures... I tend to
> assume that Linux would handle it, but I am not sure.

IIRC, the current maximal virtual memory space size of the kernel on x86_64 is 2^46.

Thanks,
Rafael

2008-12-16 16:32:43

by Andi Kleen

Subject: Re: physical memory limit of 64-bit linux

On Tue, Dec 16, 2008 at 04:54:16PM +0100, Rafael J. Wysocki wrote:
> On Tuesday, 16 of December 2008, Martin Steigerwald wrote:
> >
> > Hi!
> >
> > What is the physical memory limit for 64-bit Linux? I read about 40 bit
> > address bus for AMD Athlon X2 (1 TB) and 48 bit for Barcelona X4 (256 TB).
> >
> > Is 64-bit linux able to use that amount - provided that one would manage to
> > build it into a machine? Or does it have a lower limit?

It depends on which 64bit Linux; e.g. IA64 has larger limits.

> > Looking into the Google crystal ball gives unclear pictures... I tend to
> > assume that Linux would handle it, but I am not sure.
>
> IIRC, the current maximal virtual memory space size of the kernel on x86_64 is 2^46.

Correct.

-Andi

--
[email protected]

2008-12-16 18:48:21

by Ingo Molnar

Subject: Re: physical memory limit of 64-bit linux


* Rafael J. Wysocki <[email protected]> wrote:

> On Tuesday, 16 of December 2008, Martin Steigerwald wrote:
> >
> > Hi!
> >
> > What is the physical memory limit for 64-bit Linux? I read about 40
> > bit address bus for AMD Athlon X2 (1 TB) and 48 bit for Barcelona X4
> > (256 TB).
> >
> > Is 64-bit linux able to use that amount - provided that one would
> > manage to build it into a machine? Or does it have a lower limit?
> >
> > Looking into the Google crystal ball gives unclear pictures... I tend
> > to assume that Linux would handle it, but I am not sure.
>
> IIRC, the current maximal virtual memory space size of the kernel on
> x86_64 is 2^46.

Almost: the real current upstream kernel hard memory limit on x86-64 is 44
bits, i.e. 16 TB.

There's a couple of limits to consider here.

Firstly, there's the architectural limit imposed by the CPU - that is 48
bits, 256 TB. That is the full virtual memory range that x86-64 CPUs are
able to address: non-canonical addresses outside that range create an
exception.

I.e. valid addresses on x86-64 are in the range of:

[ 0xffff800000000000...0x00007fffffffffff ]

Which is minus 128 TB to plus 128 TB.
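
As an illustration only (this is not kernel code), the canonical-address rule
can be sketched in C. The helper names is_canonical() and addr_to_signed_tb()
and the VADDR_BITS constant are made up for this example; the only assumption
is a CPU with 48 implemented virtual address bits, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Implemented virtual address bits on the CPUs discussed here (assumption). */
#define VADDR_BITS 48

/*
 * An address is canonical when bits 63..47 are all copies of bit 47,
 * i.e. the upper 17 bits are either all zero or all one.
 */
bool is_canonical(uint64_t addr)
{
    uint64_t upper = addr >> (VADDR_BITS - 1);  /* bits 63..47 */
    return upper == 0 || upper == 0x1ffff;
}

/* Read a canonical address as a signed value, in whole TB (truncated). */
int64_t addr_to_signed_tb(uint64_t addr)
{
    assert(is_canonical(addr));
    return (int64_t)addr / ((int64_t)1 << 40);
}
```

With these helpers, 0xffff800000000000 comes out as -128 TB and
0x00007fffffffffff as +127 TB (the last byte just below +128 TB), while
0x0000800000000000 is rejected as non-canonical.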

Traditionally (and because it's practical) that max range is split in two:
negative addresses to kernel-space-only addresses [the same on all tasks],
positive addresses to user-space addresses [unique to each process MM].

The kernel starts at minus 120 TB, far far down, to take maximum advantage
of the negative range:

arch/x86/include/asm/page_64.h:
#define __PAGE_OFFSET _AC(0xffff880000000000, UL)

(and we start with an 8 TB empty, unmapped hole below it.)

That is where all physical memory is mapped to, linearly. Then we have a
64 TB limit imposed on the maximum size of this linear kernel memory
range:

#define MAXMEM _AC(0x00003fffffffffff, UL)

that is sized a bit optimistically - it ends at ffffc7ffffffffff, which
overlaps by 6 TB into the vmalloc area, which starts at:

#define VMALLOC_START _AC(0xffffc20000000000, UL)

We want to set MAXMEM to VMALLOC_START-hole instead, where the hole is say
0x20000000000. (2 TB)
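
The overlap arithmetic can be checked from the three constants quoted above.
This is a standalone sketch, not kernel code: the macros below are local
stand-ins that merely copy the quoted values, and linear_map_last() and
vmalloc_overlap() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the values quoted in the mail (not the kernel headers). */
#define PAGE_OFFSET    0xffff880000000000ULL  /* start of the linear mapping */
#define OLD_MAXMEM     0x00003fffffffffffULL  /* the 46-bit MAXMEM */
#define VMALLOC_START  0xffffc20000000000ULL
#define TB             (1ULL << 40)

/* Last byte of a linear mapping whose highest physical offset is max_offset. */
uint64_t linear_map_last(uint64_t max_offset)
{
    return PAGE_OFFSET + max_offset;
}

/* Bytes by which the linear mapping runs into the vmalloc area (0 if none). */
uint64_t vmalloc_overlap(uint64_t max_offset)
{
    uint64_t end = linear_map_last(max_offset) + 1;  /* exclusive end */
    return end > VMALLOC_START ? end - VMALLOC_START : 0;
}
```

linear_map_last(OLD_MAXMEM) is 0xffffc7ffffffffff, and
vmalloc_overlap(OLD_MAXMEM) is 6 * TB; with a 44-bit limit, i.e.
vmalloc_overlap((1ULL << 44) - 1), there is no overlap at all.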

This problem is academic because there are no such systems in existence,
and because we have another limit on the size of physical memory:

arch/x86/include/asm/sparsemem.h:
# define MAX_PHYSMEM_BITS 44

So in reality MAXMEM should be limited to the max sparsemem-covered
physical memory range, via the patch below.

In terms of future extensibility:

phase 1) We could go to 45 bits (32 TB) via a twoliner patch,
should the need arise

phase 2) We can then go to 46 bits (64 TB) with small changes too - by
moving the vmalloc area up a notch and moving the followon
dynamic kernel mappings areas too.

phase 3) We could also go close to 47 bits: with various more invasive
movings of VMALLOC and rest upwards, and other considerations
such as the elimination of the generous start of 8 TB hole at
__PAGE_OFFSET - i.e. moving __PAGE_OFFSET straight down to
minus 128 TB. 120 TB would be doable.

phase 4) If the 48 bits limit is ever lifted on the CPU side, we can move
__PAGE_OFFSET down. This is actually less invasive than phase
3), because moving __PAGE_OFFSET is relatively easy. The far
more invasive change would be the necessary changes to the
virtual memory code: the current 4-level paging has a 256 TB
limit which comes from the 512*512*512*512*4K split of
pgd/pud/pmd/pte entries. Either PGDIR_SHIFT would have to be
increased, moving the root pgtable's size from 4K to 8K or more,
or another pgdir level would have to be introduced (which is
even more intrusive and much less likely to be implemented by hw
makers).
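
The 512*512*512*512*4K split mentioned in phase 4 can be sketched as follows.
This is a simplified illustration of 4-level paging with 4K pages, not the
kernel's pgtable code; struct pt_index, split_vaddr() and addressable_bytes()
are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12  /* 4K pages */
#define LEVEL_BITS   9  /* 512 entries per page-table level */

/* Per-level table indices for one virtual address (hypothetical type). */
struct pt_index {
    unsigned pgd, pud, pmd, pte;
    uint64_t offset;  /* byte offset within the 4K page */
};

/* Split the low 48 bits of a virtual address into 9+9+9+9+12 bits. */
struct pt_index split_vaddr(uint64_t vaddr)
{
    const uint64_t mask = (1ULL << LEVEL_BITS) - 1;
    struct pt_index ix;

    ix.offset = vaddr & ((1ULL << PAGE_SHIFT) - 1);
    ix.pte = (unsigned)((vaddr >> PAGE_SHIFT) & mask);
    ix.pmd = (unsigned)((vaddr >> (PAGE_SHIFT + LEVEL_BITS)) & mask);
    ix.pud = (unsigned)((vaddr >> (PAGE_SHIFT + 2 * LEVEL_BITS)) & mask);
    ix.pgd = (unsigned)((vaddr >> (PAGE_SHIFT + 3 * LEVEL_BITS)) & mask);
    return ix;
}

/* Total range covered by four 512-entry levels of 4K pages. */
uint64_t addressable_bytes(void)
{
    return 1ULL << (PAGE_SHIFT + 4 * LEVEL_BITS);
}
```

addressable_bytes() is 2^48, i.e. 256 TB, which is where the limit comes
from. Covering more than 48 bits needs either more than 9 bits at the top
level (a root table larger than 4K) or a fifth level, as described above.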

Ingo

-------------------->
From b6fd6f26733e864fba2ea3eb1d716e23d2e66f3a Mon Sep 17 00:00:00 2001
From: Ingo Molnar <[email protected]>
Date: Tue, 16 Dec 2008 19:23:36 +0100
Subject: [PATCH] x86, mm: limit MAXMEM on 64-bit

on 64-bit x86 the physical memory limit is controlled by the sparsemem
bits - which are 44 bits right now. But MAXMEM (the max pfn number
e820 parsing will allow to enter our sizing routines) is set to
0x00003fffffffffff, i.e. 46 bits - that's too large because it overlaps
into the vmalloc range.

So couple MAXMEM to MAX_PHYSMEM_BITS, and add a comment that the
maximum of MAX_PHYSMEM_BITS is 45 bits.

Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/pgtable_64.h | 2 +-
arch/x86/include/asm/sparsemem.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 65b6be6..c54ba69 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -146,7 +146,7 @@ static inline void native_pgd_clear(pgd_t *pgd)
#define PGDIR_MASK (~(PGDIR_SIZE - 1))


-#define MAXMEM _AC(0x00003fffffffffff, UL)
+#define MAXMEM _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
#define VMALLOC_START _AC(0xffffc20000000000, UL)
#define VMALLOC_END _AC(0xffffe1ffffffffff, UL)
#define VMEMMAP_START _AC(0xffffe20000000000, UL)
diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
index be44f7d..e3cc3c0 100644
--- a/arch/x86/include/asm/sparsemem.h
+++ b/arch/x86/include/asm/sparsemem.h
@@ -27,7 +27,7 @@
#else /* CONFIG_X86_32 */
# define SECTION_SIZE_BITS 27 /* matt - 128 is convenient right now */
# define MAX_PHYSADDR_BITS 44
-# define MAX_PHYSMEM_BITS 44
+# define MAX_PHYSMEM_BITS 44 /* Can be max 45 bits */
#endif

#endif /* CONFIG_SPARSEMEM */
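
A rough way to see where the "Can be max 45 bits" comment in the patch above
comes from: with the layout quoted in this thread, the linear mapping has to
fit between __PAGE_OFFSET and VMALLOC_START. The sketch below is illustrative
only; the macros are local copies of the quoted values and max_physmem_bits()
is a hypothetical helper, not something the kernel provides.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the values quoted in this thread (not the kernel headers). */
#define PAGE_OFFSET    0xffff880000000000ULL
#define VMALLOC_START  0xffffc20000000000ULL

/*
 * Largest number of physical address bits such that a linear mapping of
 * 2^bits bytes starting at PAGE_OFFSET, followed by an unmapped hole of
 * `hole` bytes, still ends at or below VMALLOC_START.
 */
unsigned max_physmem_bits(uint64_t hole)
{
    assert(hole < VMALLOC_START - PAGE_OFFSET);
    uint64_t room = VMALLOC_START - PAGE_OFFSET - hole;
    unsigned bits = 0;

    /* Grow while the next power of two still fits into the room. */
    while (bits < 63 && (1ULL << (bits + 1)) <= room)
        bits++;
    return bits;
}
```

The room between the two constants is 58 TB, so max_physmem_bits(0) and
max_physmem_bits(2ULL << 40) (a 2 TB hole) both give 45: 32 TB fits, 64 TB
does not.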

2008-12-16 19:07:33

by Jeremy Fitzhardinge

Subject: Re: physical memory limit of 64-bit linux

Ingo Molnar wrote:
> phase 3) We could also go close to 47 bits: with various more invasive
> movings of VMALLOC and rest upwards, and other considerations
> such as the elimination of the generous start of 8 TB hole at
> __PAGE_OFFSET - i.e. moving __PAGE_OFFSET straight down to
> minus 128 TB. 120 TB would be doable.
>

Originally it was there, but I moved it up because that's where Xen puts
itself when running a PV 64-bit guest. It is also properly
parameterised now, so we could make it move on the basis of a config
setting.

> phase 4) If the 48 bits limit is ever lifted on the CPU side, we can move
> __PAGE_OFFSET down. This is actually less invasive than phase
> 3), because moving __PAGE_OFFSET is relatively easy. The far
> more invasive change would be the necessary changes to the
> virtual memory code: the current 4-level paging has a 256 TB
> limit which comes from the 512*512*512*512*4K split of
> pgd/pud/pmd/pte entries. Either PGDIR_SHIFT would have to be
> increased, moving the root pgtable's size from 4K to 8K or more,
> or another pgdir level would have to be introduced (which is
> even more intrusive and much less likely to be implemented by hw
> makers).
>

...or we could just reintroduce highmem ;)

J

2008-12-16 19:16:32

by Andi Kleen

Subject: Re: physical memory limit of 64-bit linux

On Tue, Dec 16, 2008 at 07:47:42PM +0100, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <[email protected]> wrote:
>
> > On Tuesday, 16 of December 2008, Martin Steigerwald wrote:
> > >
> > > Hi!
> > >
> > > What is the physical memory limit for 64-bit Linux? I read about 40
> > > bit address bus for AMD Athlon X2 (1 TB) and 48 bit for Barcelona X4
> > > (256 TB).
> > >
> > > Is 64-bit linux able to use that amount - provided that one would
> > > manage to build it into a machine? Or does it have a lower limit?
> > >
> > > Looking into the Google crystal ball gives unclear pictures... I tend
> > > to assume that Linux would handle it, but I am not sure.
> >
> > IIRC, the current maximal virtual memory space size of the kernel on
> > x86_64 is 2^46.
>
> Almost: the real current upstream kernel hard memory limit on x86-64 is 44
> bits, i.e. 16 TB.
>
> There's a couple of limits to consider here.

Good point.

>
> So in reality MAXMEM should be limited to the max sparsemem-covered
> physical memory range, via the patch below.

I think the SPARSEMEM limit problem only happens in the !VMEMMAP case.
It might be better to just disable the config option for !VMEMMAP;
afaik on 64bit NUMA you always want to have VMEMMAP enabled anyway.
That would avoid that particular limit at least and be more efficient
in general.

Like in this patch.

-Andi

---

Always enable vmemmap on x86-64 NUMA

To my knowledge vmemmap is the most efficient option on x86-64, so there's no
sense in supporting old-style non-virtual sparsemem too. On 32bit x86
it still makes some sense due to the limited address space.

Signed-off-by: Andi Kleen <[email protected]>

---
arch/x86/Kconfig | 1 +
mm/Kconfig | 5 ++++-
2 files changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6.28-rc4-test/arch/x86/Kconfig
===================================================================
--- linux-2.6.28-rc4-test.orig/arch/x86/Kconfig 2008-11-10 08:50:23.000000000 +0100
+++ linux-2.6.28-rc4-test/arch/x86/Kconfig 2008-12-16 20:12:48.000000000 +0100
@@ -1060,6 +1060,7 @@
depends on X86_64 || NUMA || (EXPERIMENTAL && X86_PC) || X86_GENERICARCH
select SPARSEMEM_STATIC if X86_32
select SPARSEMEM_VMEMMAP_ENABLE if X86_64
+ select SPARSEMEM_VMEMMAP_FORCE if X86_64

config ARCH_SELECT_MEMORY_MODEL
def_bool y
Index: linux-2.6.28-rc4-test/mm/Kconfig
===================================================================
--- linux-2.6.28-rc4-test.orig/mm/Kconfig 2008-10-24 13:35:09.000000000 +0200
+++ linux-2.6.28-rc4-test/mm/Kconfig 2008-12-16 20:13:11.000000000 +0100
@@ -115,8 +115,11 @@
config SPARSEMEM_VMEMMAP_ENABLE
bool

+config SPARSEMEM_VMEMMAP_FORCE
+ bool
+
config SPARSEMEM_VMEMMAP
- bool "Sparse Memory virtual memmap"
+ bool "Sparse Memory virtual memmap" if !SPARSEMEM_VMEMMAP_FORCE
depends on SPARSEMEM && SPARSEMEM_VMEMMAP_ENABLE
default y
help

2008-12-16 19:24:23

by Ingo Molnar

Subject: Re: physical memory limit of 64-bit linux


* Jeremy Fitzhardinge <[email protected]> wrote:

> Ingo Molnar wrote:
>> phase 3) We could also go close to 47 bits: with various more invasive
>> movings of VMALLOC and rest upwards, and other considerations
>> such as the elimination of the generous start of 8 TB hole at
>> __PAGE_OFFSET - i.e. moving __PAGE_OFFSET straight down to
>> minus 128 TB. 120 TB would be doable.
>>
>
> Originally it was there, but I moved it up because that's where Xen puts
> itself when running a PV 64-bit guest. It is also properly
> parameterised now, so we could make it move on the basis of a config
> setting.

[ i knew why i Cc:-ed you ;-) ]

>> phase 4) If the 48 bits limit is ever lifted on the CPU side, we can move
>> __PAGE_OFFSET down. This is actually less invasive than phase
>> 3), because moving __PAGE_OFFSET is relatively easy. The far
>> more invasive change would be the necessary changes to the
>> virtual memory code: the current 4-level paging has a 256 TB
>> limit which comes from the 512*512*512*512*4K split of
>> pgd/pud/pmd/pte entries. Either PGDIR_SHIFT would have to be
>> increased, moving the root pgtable's size from 4K to 8K or more,
>> or another pgdir level would have to be introduced (which is
>> even more intrusive and much less likely to be implemented by hw
>> makers).
>>
>
> ...or we could just reintroduce highmem ;)

Only over my cold dead body ;-)

It would also be utterly impractical: it gives at most one or two more
bits in practice before it starts breaking down seriously.

The thing that made highmem on 32-bit a necessity was the slow (and still
ongoing) transition to the 64-bit world.

There's no such necessity at 48 bits - hw makers will just have to extend
the pagetable bits in some suitable fashion, if they want to extend the
number of the outgoing physical pins. There's no bitness migration hard
barrier here.

[ I suspect other OSs are a lot less flexible about their x86-64 limits
than us. ]

[ And i hope i won't be around by the time we get close to the 64-bit limit
;-) If it ever happens (it is not certain it will) it will be seriously
unfunny. ]

Ingo

2008-12-17 14:12:42

by Martin Steigerwald

Subject: Re: physical memory limit of 64-bit linux


Thanks to all for the answers and the detailed discussions.

On Tuesday, 16 December 2008, Andi Kleen wrote:
> On Tue, Dec 16, 2008 at 04:54:16PM +0100, Rafael J. Wysocki wrote:
> > On Tuesday, 16 of December 2008, Martin Steigerwald wrote:
> > > Hi!
> > >
> > > What is the physical memory limit for 64-bit Linux? I read about 40 bit
> > > address bus for AMD Athlon X2 (1 TB) and 48 bit for Barcelona X4 (256
> > > TB).
> > >
> > > Is 64-bit linux able to use that amount - provided that one would
> > > manage to build it into a machine? Or does it have a lower limit?
>
> It depends on which 64bit Linux; e.g. IA64 has larger limits.

Is it 50 bits (1024 TB)?

ms@mango:~/lokal/Kernel/linux-2.6.27/arch/ia64> egrep -r "(define.*MAX_PHYS_MEMORY|define.*MAX_PHYS_BITS|define.*IA64_MAX_PHYSBITS)" *
include/asm/pgtable.h:#define IA64_MAX_PHYS_BITS 50 /* max. number of physical address bits (architected) */
include/asm/pgtable.h:#define _PAGE_PPN_MASK (((__IA64_UL(1) << IA64_MAX_PHYS_BITS) - 1) & ~0xfffUL)
sn/kernel/setup.c:#define MAX_PHYS_MEMORY (1UL << IA64_MAX_PHYS_BITS) /* Max physical address supported */

Ciao,
--
Martin Steigerwald - team(ix) GmbH - http://www.teamix.de
gpg: 19E3 8D42 896F D004 08AC A0CA 1E10 C593 0399 AE90



2008-12-17 14:30:28

by Martin Steigerwald

Subject: Re: physical memory limit of 64-bit linux

On Tuesday, 16 December 2008, Ingo Molnar wrote:
> * Rafael J. Wysocki <[email protected]> wrote:
> > On Tuesday, 16 of December 2008, Martin Steigerwald wrote:
> > > Hi!
> > >
> > > What is the physical memory limit for 64-bit Linux? I read about 40
> > > bit address bus for AMD Athlon X2 (1 TB) and 48 bit for Barcelona X4
> > > (256 TB).
> > >
> > > Is 64-bit linux able to use that amount - provided that one would
> > > manage to build it into a machine? Or does it have a lower limit?
> > >
> > > Looking into the Google crystal ball gives unclear pictures... I tend
> > > to assume that Linux would handle it, but I am not sure.
> >
> > IIRC, the current maximal virtual memory space size of the kernel on
> > x86_64 is 2^46.
>
> Almost: the real current upstream kernel hard memory limit on x86-64 is 44
> bits, i.e. 16 TB.
>
> There's a couple of limits to consider here.
>
> Firstly, there's the architectural limit imposed by the CPU - that is 48
> bits, 256 TB. That is the full virtual memory range that x86-64 CPUs are
> able to address: non-canonical addresses outside that range create an
> exception.
>
> I.e. valid addresses on x86-64 are in the range of:
>
> [ 0xffff800000000000...0x00007fffffffffff ]
>
> Which is minus 128 TB to plus 128 TB.

[...]

> This problem is academic because there are no such systems in existence,
> and because we have another limit on the size of physical memory:
>
> arch/x86/include/asm/sparsemem.h:
> # define MAX_PHYSMEM_BITS 44

So does this give a real 16 TB for userspace applications, or is it split into
minus 8 TB for kernel and plus 8 TB for userspace again?

How much memory can a process consume? On 32-bit with a 1GB/3GB split it's 3
GB... are there special process limits on x86_64 or IA64?

BTW, should any of those limits be documented outside of the source or this
mailing list? Maybe that doesn't make too much sense because it could change
anytime anyway.

Ciao,
--
Martin Steigerwald - team(ix) GmbH - http://www.teamix.de
gpg: 19E3 8D42 896F D004 08AC A0CA 1E10 C593 0399 AE90

