2018-10-05 14:35:40

by Punit Agrawal

Subject: [PATCH] Documentation/arm64: HugeTLB page implementation

Arm v8 architecture supports multiple page sizes - 4k, 16k and
64k. Based on the active page size, the Linux port supports
corresponding hugepage sizes at PMD and PUD(4k only) levels.

In addition, the architecture also supports caching larger sized
ranges (composed of multiple entries) at the PTE and PMD level in the
TLBs using the contiguous bit. The Linux port makes use of this
architectural support to enable additional hugepage sizes.

Describe the two different types of hugepages supported by the arm64
kernel and the hugepage sizes enabled by each.

Signed-off-by: Punit Agrawal <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Jonathan Corbet <[email protected]>
---
Documentation/arm64/hugetlbpage.txt | 39 +++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
create mode 100644 Documentation/arm64/hugetlbpage.txt

diff --git a/Documentation/arm64/hugetlbpage.txt b/Documentation/arm64/hugetlbpage.txt
new file mode 100644
index 000000000000..64ee24b88d27
--- /dev/null
+++ b/Documentation/arm64/hugetlbpage.txt
@@ -0,0 +1,39 @@
+HugeTLBpage on ARM64
+====================
+
+Hugepage relies on making efficient use of TLBs to improve performance of
+address translations. The benefit depends on both -
+
+ - the size of hugepages
+ - size of entries supported by the TLBs
+
+The ARM64 port supports two flavours of hugepages.
+
+1) Block mappings at the pud/pmd level
+--------------------------------------
+
+These are regular hugepages where a pmd or a pud page table entry points to a
+block of memory. Regardless of the supported size of entries in TLB, block
+mappings reduces the depth of page table walk needed to translate hugepage
+addresses.
+
+2) Using the Contiguous bit
+---------------------------
+
+The architecture provides a contiguous bit in the translation table entries
+(D4.5.3, ARM DDI 0487C.a) that hints to the mmu to indicate that it is one of a
+contiguous set of entries that can be cached in a single TLB entry.
+
+The contiguous bit is used in Linux to increase the mapping size at the pmd and
+pte (last) level. The number of supported contiguous entries vary by page size
+and level of the page table.
+
+
+
+The following hugepage sizes are supported -
+
+        CONT PTE     PMD    CONT PMD    PUD
+        --------     ---    --------    ---
+  4K:        64K      2M         32M     1G
+  16K:        2M     32M          1G
+  64K:        2M    512M         16G
--
2.18.0



2018-10-06 16:32:21

by Randy Dunlap

Subject: Re: [PATCH] Documentation/arm64: HugeTLB page implementation

Hi,
Just some minor stuff (below).

On 10/5/18 7:34 AM, Punit Agrawal wrote:
> Arm v8 architecture supports multiple page sizes - 4k, 16k and
> 64k. Based on the active page size, the Linux port supports
> corresponding hugepage sizes at PMD and PUD(4k only) levels.
>
> In addition, the architecture also supports caching larger sized
> ranges (composed of multiple entries) at the PTE and PMD level in the
> TLBs using the contiguous bit. The Linux port makes use of this
> architectural support to enable additional hugepage sizes.
>
> Describe the two different types of hugepages supported by the arm64
> kernel and the hugepage sizes enabled by each.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Jonathan Corbet <[email protected]>
> ---
> Documentation/arm64/hugetlbpage.txt | 39 +++++++++++++++++++++++++++++
> 1 file changed, 39 insertions(+)
> create mode 100644 Documentation/arm64/hugetlbpage.txt
>
> diff --git a/Documentation/arm64/hugetlbpage.txt b/Documentation/arm64/hugetlbpage.txt
> new file mode 100644
> index 000000000000..64ee24b88d27
> --- /dev/null
> +++ b/Documentation/arm64/hugetlbpage.txt
> @@ -0,0 +1,39 @@
> +HugeTLBpage on ARM64
> +====================
> +
> +Hugepage relies on making efficient use of TLBs to improve performance of
> +address translations. The benefit depends on both -
> +
> + - the size of hugepages
> + - size of entries supported by the TLBs
> +
> +The ARM64 port supports two flavours of hugepages.
> +
> +1) Block mappings at the pud/pmd level
> +--------------------------------------
> +
> +These are regular hugepages where a pmd or a pud page table entry points to a
> +block of memory. Regardless of the supported size of entries in TLB, block
> +mappings reduces the depth of page table walk needed to translate hugepage

reduce

> +addresses.
> +
> +2) Using the Contiguous bit
> +---------------------------
> +
> +The architecture provides a contiguous bit in the translation table entries
> +(D4.5.3, ARM DDI 0487C.a) that hints to the mmu to indicate that it is one of a

preferably MMU

> +contiguous set of entries that can be cached in a single TLB entry.
> +
> +The contiguous bit is used in Linux to increase the mapping size at the pmd and
> +pte (last) level. The number of supported contiguous entries vary by page size

varies

> +and level of the page table.
> +
> +
> +
> +The following hugepage sizes are supported -
> +
> +        CONT PTE     PMD    CONT PMD    PUD
> +        --------     ---    --------    ---
> +  4K:        64K      2M         32M     1G
> +  16K:        2M     32M          1G
> +  64K:        2M    512M         16G
>


thanks,
--
~Randy

2018-10-08 09:39:41

by Punit Agrawal

Subject: Re: [PATCH] Documentation/arm64: HugeTLB page implementation

Hi Randy,

Randy Dunlap <[email protected]> writes:

> Hi,
> Just some minor stuff (below).
>
> On 10/5/18 7:34 AM, Punit Agrawal wrote:
>> Arm v8 architecture supports multiple page sizes - 4k, 16k and
>> 64k. Based on the active page size, the Linux port supports
>> corresponding hugepage sizes at PMD and PUD(4k only) levels.
>>
>> In addition, the architecture also supports caching larger sized
>> ranges (composed of multiple entries) at the PTE and PMD level in the
>> TLBs using the contiguous bit. The Linux port makes use of this
>> architectural support to enable additional hugepage sizes.
>>
>> Describe the two different types of hugepages supported by the arm64
>> kernel and the hugepage sizes enabled by each.
>>
>> Signed-off-by: Punit Agrawal <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Jonathan Corbet <[email protected]>
>> ---
>> Documentation/arm64/hugetlbpage.txt | 39 +++++++++++++++++++++++++++++
>> 1 file changed, 39 insertions(+)
>> create mode 100644 Documentation/arm64/hugetlbpage.txt
>>
>> diff --git a/Documentation/arm64/hugetlbpage.txt b/Documentation/arm64/hugetlbpage.txt
>> new file mode 100644
>> index 000000000000..64ee24b88d27
>> --- /dev/null
>> +++ b/Documentation/arm64/hugetlbpage.txt
>> @@ -0,0 +1,39 @@
>> +HugeTLBpage on ARM64
>> +====================
>> +
>> +Hugepage relies on making efficient use of TLBs to improve performance of
>> +address translations. The benefit depends on both -
>> +
>> + - the size of hugepages
>> + - size of entries supported by the TLBs
>> +
>> +The ARM64 port supports two flavours of hugepages.
>> +
>> +1) Block mappings at the pud/pmd level
>> +--------------------------------------
>> +
>> +These are regular hugepages where a pmd or a pud page table entry points to a
>> +block of memory. Regardless of the supported size of entries in TLB, block
>> +mappings reduces the depth of page table walk needed to translate hugepage
>
> reduce
>
>> +addresses.
>> +
>> +2) Using the Contiguous bit
>> +---------------------------
>> +
>> +The architecture provides a contiguous bit in the translation table entries
>> +(D4.5.3, ARM DDI 0487C.a) that hints to the mmu to indicate that it is one of a
>
> preferably MMU
>
>> +contiguous set of entries that can be cached in a single TLB entry.
>> +
>> +The contiguous bit is used in Linux to increase the mapping size at the pmd and
>> +pte (last) level. The number of supported contiguous entries vary by page size
>
> varies
>
>> +and level of the page table.
>> +
>> +
>> +
>> +The following hugepage sizes are supported -
>> +
>> +        CONT PTE     PMD    CONT PMD    PUD
>> +        --------     ---    --------    ---
>> +  4K:        64K      2M         32M     1G
>> +  16K:        2M     32M          1G
>> +  64K:        2M    512M         16G
>>

I've updated the patch to incorporate your feedback and will post an
update shortly.

Thanks a lot for taking a look.

Punit

2018-10-08 10:04:39

by Punit Agrawal

Subject: [PATCH v2] Documentation/arm64: HugeTLB page implementation

Arm v8 architecture supports multiple page sizes - 4k, 16k and
64k. Based on the active page size, the Linux port supports
corresponding hugepage sizes at PMD and PUD(4k only) levels.

In addition, the architecture also supports caching larger sized
ranges (composed of multiple entries) at the PTE and PMD level in the
TLBs using the contiguous bit. The Linux port makes use of this
architectural support to enable additional hugepage sizes.

Describe the two different types of hugepages supported by the arm64
kernel and the hugepage sizes enabled by each.

Signed-off-by: Punit Agrawal <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Jonathan Corbet <[email protected]>
---
Hi,

This version incorporates the feedback on v1.

Thanks,
Punit

Documentation/arm64/hugetlbpage.txt | 38 +++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
create mode 100644 Documentation/arm64/hugetlbpage.txt

diff --git a/Documentation/arm64/hugetlbpage.txt b/Documentation/arm64/hugetlbpage.txt
new file mode 100644
index 000000000000..cfae87dc653b
--- /dev/null
+++ b/Documentation/arm64/hugetlbpage.txt
@@ -0,0 +1,38 @@
+HugeTLBpage on ARM64
+====================
+
+Hugepage relies on making efficient use of TLBs to improve performance of
+address translations. The benefit depends on both -
+
+ - the size of hugepages
+ - size of entries supported by the TLBs
+
+The ARM64 port supports two flavours of hugepages.
+
+1) Block mappings at the pud/pmd level
+--------------------------------------
+
+These are regular hugepages where a pmd or a pud page table entry points to a
+block of memory. Regardless of the supported size of entries in TLB, block
+mappings reduce the depth of page table walk needed to translate hugepage
+addresses.
+
+2) Using the Contiguous bit
+---------------------------
+
+The architecture provides a contiguous bit in the translation table entries
+(D4.5.3, ARM DDI 0487C.a) that hints to the MMU to indicate that it is one of a
+contiguous set of entries that can be cached in a single TLB entry.
+
+The contiguous bit is used in Linux to increase the mapping size at the pmd and
+pte (last) level. The number of supported contiguous entries varies by page size
+and level of the page table.
+
+
+The following hugepage sizes are supported -
+
+        CONT PTE     PMD    CONT PMD    PUD
+        --------     ---    --------    ---
+  4K:        64K      2M         32M     1G
+  16K:        2M     32M          1G
+  64K:        2M    512M         16G
--
2.18.0
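
[Illustration, not part of the patch] The sizes in the table can be
reproduced with a little arithmetic. The sketch below is only a worked
example: it assumes 8-byte translation table entries and the following
number of entries grouped by the contiguous bit at the PTE and PMD
levels: 16 and 16 for 4K pages, 128 and 32 for 16K pages, 32 and 32 for
64K pages. Those counts are not stated in the patch and should be
checked against the architecture manual it cites. The names
hugepage_sizes() and human() are made up for this example.

```python
KIB = 1024
ENTRY_SIZE = 8  # assumed: bytes per translation table entry

# Assumed number of entries grouped by the contiguous bit,
# as (PTE level, PMD level), keyed by base page size in bytes.
CONT_ENTRIES = {
    4 * KIB: (16, 16),
    16 * KIB: (128, 32),
    64 * KIB: (32, 32),
}


def hugepage_sizes(page_size):
    """Return the hugepage sizes (in bytes) for a given base page size."""
    entries_per_table = page_size // ENTRY_SIZE
    cont_ptes, cont_pmds = CONT_ENTRIES[page_size]
    # A PMD block maps everything one full table of PTEs would map.
    pmd_size = page_size * entries_per_table
    sizes = {
        "CONT PTE": page_size * cont_ptes,
        "PMD": pmd_size,
        "CONT PMD": pmd_size * cont_pmds,
    }
    # PUD-level hugepages are listed for the 4K page size only.
    if page_size == 4 * KIB:
        sizes["PUD"] = pmd_size * entries_per_table
    return sizes


def human(n):
    """Format a byte count using the K/M/G suffixes from the table."""
    for unit, shift in (("G", 30), ("M", 20), ("K", 10)):
        if n >= (1 << shift) and n % (1 << shift) == 0:
            return f"{n >> shift}{unit}"
    return str(n)


if __name__ == "__main__":
    for page_size in CONT_ENTRIES:
        row = {k: human(v) for k, v in hugepage_sizes(page_size).items()}
        print(f"{human(page_size)}: {row}")
```

Under those assumptions, each row of the table is the base page size
scaled by either the contiguous entry count or the number of entries in
a table.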


2018-10-08 19:49:53

by Randy Dunlap

Subject: Re: [PATCH v2] Documentation/arm64: HugeTLB page implementation

On 10/8/18 3:03 AM, Punit Agrawal wrote:
> Arm v8 architecture supports multiple page sizes - 4k, 16k and
> 64k. Based on the active page size, the Linux port supports
> corresponding hugepage sizes at PMD and PUD(4k only) levels.
>
> In addition, the architecture also supports caching larger sized
> ranges (composed of multiple entries) at the PTE and PMD level in the
> TLBs using the contiguous bit. The Linux port makes use of this
> architectural support to enable additional hugepage sizes.
>
> Describe the two different types of hugepages supported by the arm64
> kernel and the hugepage sizes enabled by each.
>
> Signed-off-by: Punit Agrawal <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Jonathan Corbet <[email protected]>

Acked-by: Randy Dunlap <[email protected]>

Thanks.

> ---
> Hi,
>
> This version incorporates the feedback on v1.
>
> Thanks,
> Punit
>
> Documentation/arm64/hugetlbpage.txt | 38 +++++++++++++++++++++++++++++
> 1 file changed, 38 insertions(+)
> create mode 100644 Documentation/arm64/hugetlbpage.txt
>
> diff --git a/Documentation/arm64/hugetlbpage.txt b/Documentation/arm64/hugetlbpage.txt
> new file mode 100644
> index 000000000000..cfae87dc653b
> --- /dev/null
> +++ b/Documentation/arm64/hugetlbpage.txt
> @@ -0,0 +1,38 @@
> +HugeTLBpage on ARM64
> +====================
> +
> +Hugepage relies on making efficient use of TLBs to improve performance of
> +address translations. The benefit depends on both -
> +
> + - the size of hugepages
> + - size of entries supported by the TLBs
> +
> +The ARM64 port supports two flavours of hugepages.
> +
> +1) Block mappings at the pud/pmd level
> +--------------------------------------
> +
> +These are regular hugepages where a pmd or a pud page table entry points to a
> +block of memory. Regardless of the supported size of entries in TLB, block
> +mappings reduce the depth of page table walk needed to translate hugepage
> +addresses.
> +
> +2) Using the Contiguous bit
> +---------------------------
> +
> +The architecture provides a contiguous bit in the translation table entries
> +(D4.5.3, ARM DDI 0487C.a) that hints to the MMU to indicate that it is one of a
> +contiguous set of entries that can be cached in a single TLB entry.
> +
> +The contiguous bit is used in Linux to increase the mapping size at the pmd and
> +pte (last) level. The number of supported contiguous entries varies by page size
> +and level of the page table.
> +
> +
> +The following hugepage sizes are supported -
> +
> +        CONT PTE     PMD    CONT PMD    PUD
> +        --------     ---    --------    ---
> +  4K:        64K      2M         32M     1G
> +  16K:        2M     32M          1G
> +  64K:        2M    512M         16G
>


--
~Randy

2018-10-09 10:02:47

by Punit Agrawal

Subject: Re: [PATCH v2] Documentation/arm64: HugeTLB page implementation

Randy Dunlap <[email protected]> writes:

> On 10/8/18 3:03 AM, Punit Agrawal wrote:
>> Arm v8 architecture supports multiple page sizes - 4k, 16k and
>> 64k. Based on the active page size, the Linux port supports
>> corresponding hugepage sizes at PMD and PUD(4k only) levels.
>>
>> In addition, the architecture also supports caching larger sized
>> ranges (composed of multiple entries) at the PTE and PMD level in the
>> TLBs using the contiguous bit. The Linux port makes use of this
>> architectural support to enable additional hugepage sizes.
>>
>> Describe the two different types of hugepages supported by the arm64
>> kernel and the hugepage sizes enabled by each.
>>
>> Signed-off-by: Punit Agrawal <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Jonathan Corbet <[email protected]>
>
> Acked-by: Randy Dunlap <[email protected]>

Thanks!

Catalin, Will - I assume you'll pick this up at some point? Or do arm64
documentation patches get routed by another tree?

>
> Thanks.
>
>> ---
>> Hi,
>>
>> This version incorporates the feedback on v1.
>>
>> Thanks,
>> Punit
>>
>> Documentation/arm64/hugetlbpage.txt | 38 +++++++++++++++++++++++++++++
>> 1 file changed, 38 insertions(+)
>> create mode 100644 Documentation/arm64/hugetlbpage.txt
>>
>> diff --git a/Documentation/arm64/hugetlbpage.txt b/Documentation/arm64/hugetlbpage.txt
>> new file mode 100644
>> index 000000000000..cfae87dc653b
>> --- /dev/null
>> +++ b/Documentation/arm64/hugetlbpage.txt
>> @@ -0,0 +1,38 @@
>> +HugeTLBpage on ARM64
>> +====================
>> +
>> +Hugepage relies on making efficient use of TLBs to improve performance of
>> +address translations. The benefit depends on both -
>> +
>> + - the size of hugepages
>> + - size of entries supported by the TLBs
>> +
>> +The ARM64 port supports two flavours of hugepages.
>> +
>> +1) Block mappings at the pud/pmd level
>> +--------------------------------------
>> +
>> +These are regular hugepages where a pmd or a pud page table entry points to a
>> +block of memory. Regardless of the supported size of entries in TLB, block
>> +mappings reduce the depth of page table walk needed to translate hugepage
>> +addresses.
>> +
>> +2) Using the Contiguous bit
>> +---------------------------
>> +
>> +The architecture provides a contiguous bit in the translation table entries
>> +(D4.5.3, ARM DDI 0487C.a) that hints to the MMU to indicate that it is one of a
>> +contiguous set of entries that can be cached in a single TLB entry.
>> +
>> +The contiguous bit is used in Linux to increase the mapping size at the pmd and
>> +pte (last) level. The number of supported contiguous entries varies by page size
>> +and level of the page table.
>> +
>> +
>> +The following hugepage sizes are supported -
>> +
>> +        CONT PTE     PMD    CONT PMD    PUD
>> +        --------     ---    --------    ---
>> +  4K:        64K      2M         32M     1G
>> +  16K:        2M     32M          1G
>> +  64K:        2M    512M         16G
>>

2018-10-09 11:51:52

by Will Deacon

Subject: Re: [PATCH v2] Documentation/arm64: HugeTLB page implementation

On Tue, Oct 09, 2018 at 11:02:01AM +0100, Punit Agrawal wrote:
> Randy Dunlap <[email protected]> writes:
>
> > On 10/8/18 3:03 AM, Punit Agrawal wrote:
> >> Arm v8 architecture supports multiple page sizes - 4k, 16k and
> >> 64k. Based on the active page size, the Linux port supports
> >> corresponding hugepage sizes at PMD and PUD(4k only) levels.
> >>
> >> In addition, the architecture also supports caching larger sized
> >> ranges (composed of multiple entries) at the PTE and PMD level in the
> >> TLBs using the contiguous bit. The Linux port makes use of this
> >> architectural support to enable additional hugepage sizes.
> >>
> >> Describe the two different types of hugepages supported by the arm64
> >> kernel and the hugepage sizes enabled by each.
> >>
> >> Signed-off-by: Punit Agrawal <[email protected]>
> >> Cc: Catalin Marinas <[email protected]>
> >> Cc: Will Deacon <[email protected]>
> >> Cc: Jonathan Corbet <[email protected]>
> >
> > Acked-by: Randy Dunlap <[email protected]>
>
> Thanks!
>
> Catalin, Will - I assume you'll pick this up at some point? Or do arm64
> documentation patches get routed by another tree?

Acked-by: Will Deacon <[email protected]>

Catalin can pick this up for 4.20.

Will

2018-10-10 17:10:11

by Catalin Marinas

Subject: Re: [PATCH v2] Documentation/arm64: HugeTLB page implementation

On Tue, Oct 09, 2018 at 12:50:49PM +0100, Will Deacon wrote:
> On Tue, Oct 09, 2018 at 11:02:01AM +0100, Punit Agrawal wrote:
> > Randy Dunlap <[email protected]> writes:
> >
> > > On 10/8/18 3:03 AM, Punit Agrawal wrote:
> > >> Arm v8 architecture supports multiple page sizes - 4k, 16k and
> > >> 64k. Based on the active page size, the Linux port supports
> > >> corresponding hugepage sizes at PMD and PUD(4k only) levels.
> > >>
> > >> In addition, the architecture also supports caching larger sized
> > >> ranges (composed of multiple entries) at the PTE and PMD level in the
> > >> TLBs using the contiguous bit. The Linux port makes use of this
> > >> architectural support to enable additional hugepage sizes.
> > >>
> > >> Describe the two different types of hugepages supported by the arm64
> > >> kernel and the hugepage sizes enabled by each.
> > >>
> > >> Signed-off-by: Punit Agrawal <[email protected]>
> > >> Cc: Catalin Marinas <[email protected]>
> > >> Cc: Will Deacon <[email protected]>
> > >> Cc: Jonathan Corbet <[email protected]>
> > >
> > > Acked-by: Randy Dunlap <[email protected]>
> >
> > Thanks!
> >
> > Catalin, Will - I assume you'll pick this up at some point? Or do arm64
> > documentation patches get routed by another tree?
>
> Acked-by: Will Deacon <[email protected]>
>
> Catalin can pick this up for 4.20.

Done.

--
Catalin