Hi,
The patch below describes some of the concepts in Linux mm. It does not aim
to provide an in-depth description of the mm internals, but rather to help
an unprepared reader understand cryptic texts, e.g.
Documentation/sysctl/vm.txt.
I covered what seemed to me the essential minimum required for a
user/administrator to read the existing docs without searching the web for
an explanation of every other term.
--
Sincerely yours,
Mike.
From 2d3ec7ea101a66b1535d5bec4acfc1e0f737fd53 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <[email protected]>
Date: Tue, 29 May 2018 14:12:39 +0300
Subject: [PATCH] docs/admin-guide/mm: add high level concepts overview
The are terms that seem obvious to the mm developers, but may be somewhat
obscure for, say, less involved readers.
The concepts overview can be seen as an "extended glossary" that introduces
such terms to the readers of the kernel documentation.
Signed-off-by: Mike Rapoport <[email protected]>
---
Documentation/admin-guide/mm/concepts.rst | 222 ++++++++++++++++++++++++++++++
Documentation/admin-guide/mm/index.rst | 5 +
2 files changed, 227 insertions(+)
create mode 100644 Documentation/admin-guide/mm/concepts.rst
diff --git a/Documentation/admin-guide/mm/concepts.rst b/Documentation/admin-guide/mm/concepts.rst
new file mode 100644
index 0000000..291699c
--- /dev/null
+++ b/Documentation/admin-guide/mm/concepts.rst
@@ -0,0 +1,222 @@
+.. _mm_concepts:
+
+=================
+Concepts overview
+=================
+
+The memory management in Linux is complex system that evolved over the
+years and included more and more functionality to support variety of
+systems from MMU-less microcontrollers to supercomputers. The memory
+management for systems without MMU is called ``nommu`` and it
+definitely deserves a dedicated document, which hopefully will be
+eventually written. Yet, although some of the concepts are the same,
+here we assume that MMU is available and CPU can translate a virtual
+address to a physical address.
+
+.. contents:: :local:
+
+Virtual Memory Primer
+=====================
+
+The physical memory in a computer system is a limited resource and
+even for systems that support memory hotplug there is a hard limit on
+the amount of memory that can be installed. The physical memory is not
+necessary contiguous, it might be accessible as a set of distinct
+address ranges. Besides, different CPU architectures, and even
+different implementations of the same architecture have different view
+how these address ranges defined.
+
+All this makes dealing directly with physical memory quite complex and
+to avoid this complexity a concept of virtual memory was developed.
+
+The virtual memory abstracts the details of physical memory from the
+application software, allows to keep only needed information in the
+physical memory (demand paging) and provides a mechanism for the
+protection and controlled sharing of data between processes.
+
+With virtual memory, each and every memory access uses a virtual
+address. When the CPU decodes the an instruction that reads (or
+writes) from (or to) the system memory, it translates the `virtual`
+address encoded in that instruction to a `physical` address that the
+memory controller can understand.
+
+The physical system memory is divided into page frames, or pages. The
+size of each page is architecture specific. Some architectures allow
+selection of the page size from several supported values; this
+selection is performed at the kernel build time by setting an
+appropriate kernel configuration option.
+
+Each physical memory page can be mapped as one or more virtual
+pages. These mappings are described by page tables that allow
+translation from virtual address used by programs to real address in
+the physical memory. The page tables organized hierarchically.
+
+The tables at the lowest level of the hierarchy contain physical
+addresses of actual pages used by the software. The tables at higher
+levels contain physical addresses of the pages belonging to the lower
+levels. The pointer to the top level page table resides in a
+register. When the CPU performs the address translation, it uses this
+register to access the top level page table. The high bits of the
+virtual address are used to index an entry in the top level page
+table. That entry is then used to access the next level in the
+hierarchy with the next bits of the virtual address as the index to
+that level page table. The lowest bits in the virtual address define
+the offset inside the actual page.
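
The walk described above can be sketched in a few lines of Python. This is
only an illustration, assuming the layout used by x86-64 with 4-level paging
(4 KiB pages, so 12 offset bits, and 9 index bits per level); other
architectures and configurations use different widths. The function name
split_virtual_address and its parameters are made up for this example and are
not a kernel interface.

```python
def split_virtual_address(va, levels=4, bits_per_level=9, page_shift=12):
    """Split a virtual address into per-level table indices and a page offset.

    The defaults are an assumption: x86-64 style 4-level paging, 4 KiB pages.
    Returns (indices, offset) with indices ordered from top level to lowest.
    """
    offset = va & ((1 << page_shift) - 1)      # lowest bits: offset inside the page
    index_mask = (1 << bits_per_level) - 1
    indices = []
    for level in range(levels - 1, -1, -1):    # walk from the top level down
        shift = page_shift + level * bits_per_level
        indices.append((va >> shift) & index_mask)
    return indices, offset


if __name__ == "__main__":
    indices, offset = split_virtual_address(0x00007F1234567ABC)
    print("table indices (top to bottom):", indices)
    print("offset inside page:", hex(offset))
```

Each index selects one entry in the table of its level, and the entry found
there points to the table of the next level; only the final offset is applied
to the physical page itself.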
+
+Huge Pages
+==========
+
+The address translation requires several memory accesses and memory
+accesses are slow relative to CPU speed. To avoid spending precious
+processor cycles on the address translation, CPUs maintain a cache of
+such translations called the Translation Lookaside Buffer (or
+TLB). Usually the TLB is a pretty scarce resource and applications with
+a large memory working set will experience a performance hit because of
+TLB misses.
+
+Many modern CPU architectures allow mapping of the memory pages
+directly by the higher levels in the page table. For instance, on x86,
+it is possible to map 2M and even 1G pages using entries in the second
+and the third level page tables. In Linux such pages are called
+`huge`. Usage of huge pages significantly reduces pressure on TLB,
+improves TLB hit-rate and thus improves overall system performance.
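
To get a feeling for why larger pages relieve the TLB, one can count how many
translations are needed to cover a given working set. The sketch below is
plain arithmetic; the page sizes are the x86 values mentioned above and the
1 GiB working set is an arbitrary example, not a measurement. The helper name
pages_needed is made up for this illustration.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(working_set_bytes, page_size_bytes):
    """Number of pages (and so distinct translations) covering the working set."""
    return -(-working_set_bytes // page_size_bytes)   # ceiling division


if __name__ == "__main__":
    working_set = 1 * GIB
    for name, size in (("4K", 4 * KIB), ("2M", 2 * MIB), ("1G", 1 * GIB)):
        print(name, "pages:", pages_needed(working_set, size))
```

Covering 1 GiB takes 262144 translations with 4K pages, 512 with 2M pages and
a single one with a 1G page, which is why far fewer TLB entries are needed.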
+
+There are two mechanisms in Linux that enable mapping of the physical
+memory with the huge pages. The first one is `HugeTLB filesystem`, or
+hugetlbfs. It is a pseudo filesystem that uses RAM as its backing
+store. For the files created in this filesystem the data resides in
+the memory and is mapped using huge pages. The hugetlbfs is described at
+:ref:`Documentation/admin-guide/mm/hugetlbpage.rst <hugetlbpage>`.
+
+Another, more recent, mechanism that enables use of the huge pages is
+called `Transparent HugePages`, or THP. Unlike the hugetlbfs that
+requires users and/or system administrators to configure what parts of
+the system memory should and can be mapped by the huge pages, THP
+manages such mappings transparently to the user and hence the
+name. See
+:ref:`Documentation/admin-guide/mm/transhuge.rst <admin_guide_transhuge>`
+for more details about THP.
+
+Zones
+=====
+
+Often hardware poses restrictions on how different physical memory
+ranges can be accessed. In some cases, devices cannot perform DMA to
+all the addressable memory. In other cases, the size of the physical
+memory exceeds the maximal addressable size of virtual memory and
+special actions are required to access portions of the memory. Linux
+groups memory pages into `zones` according to their possible
+usage. For example, ZONE_DMA will contain memory that can be used by
+devices for DMA, ZONE_HIGHMEM will contain memory that is not
+permanently mapped into kernel's address space and ZONE_NORMAL will
+contain normally addressed pages.
+
+The actual layout of the memory zones is hardware dependent as not all
+architectures define all zones, and requirements for DMA are different
+for different platforms.
+
+Nodes
+=====
+
+Many multi-processor machines are NUMA - Non-Uniform Memory Access -
+systems. In such systems the memory is arranged into banks that have
+different access latency depending on the "distance" from the
+processor. Each bank is referred as `node` and for each node Linux
+constructs an independent memory management subsystem. A node has it's
+own set of zones, lists of free and used pages and various statistics
+counters. You can find more details about NUMA in
+:ref:`Documentation/vm/numa.rst <numa>` and in
+:ref:`Documentation/admin-guide/mm/numa_memory_policy.rst <numa_memory_policy>`.
+
+Page cache
+==========
+
+The physical memory is volatile and the common case for getting data
+into the memory is to read it from files. Whenever a file is read, the
+data is put into the `page cache` to avoid expensive disk access on
+the subsequent reads. Similarly, when one writes to a file, the data
+is placed in the page cache and eventually gets into the backing
+storage device. The written pages are marked as `dirty` and when Linux
+decides to reuse them for other purposes, it makes sure to synchronize
+the file contents on the device with the updated data.
+
+Anonymous Memory
+================
+
+The `anonymous memory` or `anonymous mappings` represent memory that
+is not backed by a filesystem. Such mappings are implicitly created
+for a program's stack and heap or by explicit calls to the mmap(2) system
+call. Usually, the anonymous mappings only define virtual memory areas
+that the program is allowed to access. The read accesses will result
+in creation of a page table entry that references a special physical
+page filled with zeroes. When the program performs a write, regular
+physical page will be allocated to hold the written data. The page
+will be marked dirty and if the kernel decides to repurpose it,
+the dirty page will be swapped out.
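
From user space, an anonymous mapping can be requested explicitly. The sketch
below uses Python's standard mmap module, where passing -1 as the file
descriptor asks for an anonymous mapping. It only shows the behaviour visible
to the program (untouched memory reads as zeroes, written data is kept) and
does not observe the page table changes the kernel makes underneath. The
helper name anonymous_mapping_demo is made up for this illustration.

```python
import mmap


def anonymous_mapping_demo(length=mmap.PAGESIZE):
    """Create an anonymous mapping, read it, write to it and read it back."""
    with mmap.mmap(-1, length) as mapping:      # -1: no backing file
        before = mapping[:8]                    # never written: reads as zeroes
        mapping[:5] = b"hello"                  # first write to the page
        after = mapping[:8]
    return before, after


if __name__ == "__main__":
    before, after = anonymous_mapping_demo()
    print("before write:", before)
    print("after write: ", after)
```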
+
+Reclaim
+=======
+
+Throughout the system lifetime, a physical page can be used for storing
+different types of data. It can be kernel internal data structures,
+DMA'able buffers for device drivers use, data read from a filesystem,
+memory allocated by user space processes etc.
+
+Depending on the page usage it is treated differently by the Linux
+memory management. The pages that can be freed at any time, either
+because they cache the data available elsewhere, for instance, on a
+hard disk, or because they can be swapped out, again, to the hard
+disk, are called `reclaimable`. The most notable categories of the
+reclaimable pages are page cache and anonymous memory.
+
+In most cases, the pages holding internal kernel data and used as DMA
+buffers cannot be repurposed, and they remain pinned until freed by
+their user. Such pages are called `unreclaimable`. However, in certain
+circumstances, even pages occupied with kernel data structures can be
+reclaimed. For instance, in-memory caches of filesystem metadata can
+be re-read from the storage device and therefore it is possible to
+discard them from the main memory when system is under memory
+pressure.
+
+The process of freeing the reclaimable physical memory pages and
+repurposing them is called (surprise!) `reclaim`. Linux can reclaim
+pages either asynchronously or synchronously, depending on the state
+of the system. When system is not loaded, most of the memory is free
+and allocation request will be satisfied immediately from the free
+pages supply. As the load increases, the amount of the free pages goes
+down and when it reaches a certain threshold (low watermark), an
+allocation request will awaken the ``kswapd`` daemon. It will
+asynchronously scan memory pages and either just free them if the data
+they contain is available elsewhere, or evict to the backing storage
+device (remember those dirty pages?). As memory usage increases even
+more and reaches another threshold - min watermark - an allocation
+will trigger the `direct reclaim`. In this case allocation is stalled
+until enough memory pages are reclaimed to satisfy the request.
+
+Compaction
+==========
+
+As the system runs, tasks allocate and free the memory and it becomes
+fragmented. Although with virtual memory it is possible to present
+scattered physical pages as virtually contiguous range, sometimes it is
+necessary to allocate large physically contiguous memory areas. Such
+need may arise, for instance, when a device driver requires large
+buffer for DMA, or when THP allocates a huge page. Memory `compaction`
+addresses the fragmentation issue. This mechanism moves occupied pages
+from the lower part of a memory zone to free pages in the upper part
+of the zone. When a compaction scan is finished free pages are grouped
+together at the beginning of the zone and allocations of large
+physically contiguous areas become possible.
+
+Like reclaim, the compaction may happen asynchronously in ``kcompactd``
+daemon or synchronously as a result of memory allocation request.
+
+OOM killer
+==========
+
+It may happen, that on a loaded machine memory will be exhausted. When
+the kernel detects that the system runs out of memory (OOM) it invokes
+`OOM killer`. Its mission is simple: all it has to do is to select a
+task to sacrifice for the sake of the overall system health. The
+selected task is killed in a hope that after it exits enough memory
+will be freed to continue normal operation.
diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index 8454be6..ceead68 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -15,12 +15,17 @@ are described in Documentation/sysctl/vm.txt and in `man 5 proc`_.
.. _man 5 proc: http://man7.org/linux/man-pages/man5/proc.5.html
+Linux memory management has its own jargon and if you are not yet
+familiar with it, consider reading
+:ref:`Documentation/admin-guide/mm/concepts.rst <mm_concepts>`.
+
Here we document in detail how to interact with various mechanisms in
the Linux memory management.
.. toctree::
:maxdepth: 1
+ concepts
hugetlbpage
idle_page_tracking
ksm
--
2.7.4
> On 29 May 2018 at 12:37 Mike Rapoport <[email protected]> wrote:
>
> +=================
> +Concepts overview
> +=================
> +
> +The memory management in Linux is complex system that evolved over the
> +years and included more and more functionality to support variety of
> +systems from MMU-less microcontrollers to supercomputers. The memory
> +management for systems without MMU is called ``nommu`` and it
> +definitely deserves a dedicated document, which hopefully will be
> +eventually written. Yet, although some of the concepts are the same,
> +here we assume that MMU is available and CPU can translate a virtual
> +address to a physical address.
> +
> +.. contents:: :local:
> +
> +Virtual Memory Primer
> +=====================
> +
> +The physical memory in a computer system is a limited resource and
> +even for systems that support memory hotplug there is a hard limit on
> +the amount of memory that can be installed. The physical memory is not
> +necessary contiguous, it might be accessible as a set of distinct
> +address ranges. Besides, different CPU architectures, and even
> +different implementations of the same architecture have different view
> +how these address ranges defined.
> +
> +All this makes dealing directly with physical memory quite complex and
> +to avoid this complexity a concept of virtual memory was developed.
> +
> +The virtual memory abstracts the details of physical memory from the
> +application software, allows to keep only needed information in the
> +physical memory (demand paging) and provides a mechanism for the
> +protection and controlled sharing of data between processes.
> +
> +With virtual memory, each and every memory access uses a virtual
> +address. When the CPU decodes the an instruction that reads (or
> +writes) from (or to) the system memory, it translates the `virtual`
> +address encoded in that instruction to a `physical` address that the
> +memory controller can understand.
I spotted an errant "the an" in that paragraph.
I would rewrite that sentence as "When the CPU decodes an instruction that
reads from (or writes to) the system memory," ...
The rest of the document looks good to me, and a nice overview.
Best regards,
Justin.
On Tue, 29 May 2018 14:37:25 +0300
Mike Rapoport <[email protected]> wrote:
> The are terms that seem obvious to the mm developers, but may be somewhat
> obscure for, say, less involved readers.
>
> The concepts overview can be seen as an "extended glossary" that introduces
> such terms to the readers of the kernel documentation.
So as I read through this I thought of all kinds of ways it could be
improved, but I suspect that will always be the case. It's a good intro
as-is, so I've applied it. Thanks!
jon
On 05/29/2018 04:37 AM, Mike Rapoport wrote:
> Hi,
>
> From 2d3ec7ea101a66b1535d5bec4acfc1e0f737fd53 Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <[email protected]>
> Date: Tue, 29 May 2018 14:12:39 +0300
> Subject: [PATCH] docs/admin-guide/mm: add high level concepts overview
>
> The are terms that seem obvious to the mm developers, but may be somewhat
There are [or: These are]
> obscure for, say, less involved readers.
>
> The concepts overview can be seen as an "extended glossary" that introduces
> such terms to the readers of the kernel documentation.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> Documentation/admin-guide/mm/concepts.rst | 222 ++++++++++++++++++++++++++++++
> Documentation/admin-guide/mm/index.rst | 5 +
> 2 files changed, 227 insertions(+)
> create mode 100644 Documentation/admin-guide/mm/concepts.rst
>
> diff --git a/Documentation/admin-guide/mm/concepts.rst b/Documentation/admin-guide/mm/concepts.rst
> new file mode 100644
> index 0000000..291699c
> --- /dev/null
> +++ b/Documentation/admin-guide/mm/concepts.rst
> @@ -0,0 +1,222 @@
> +.. _mm_concepts:
> +
> +=================
> +Concepts overview
> +=================
> +
> +The memory management in Linux is complex system that evolved over the
is a complex
> +years and included more and more functionality to support variety of
support a variety of
> +systems from MMU-less microcontrollers to supercomputers. The memory
> +management for systems without MMU is called ``nommu`` and it
without an MMU
> +definitely deserves a dedicated document, which hopefully will be
> +eventually written. Yet, although some of the concepts are the same,
> +here we assume that MMU is available and CPU can translate a virtual
that an MMU and a CPU
> +address to a physical address.
> +
> +.. contents:: :local:
> +
> +Virtual Memory Primer
> +=====================
> +
> +The physical memory in a computer system is a limited resource and
> +even for systems that support memory hotplug there is a hard limit on
> +the amount of memory that can be installed. The physical memory is not
> +necessary contiguous, it might be accessible as a set of distinct
Change comma to semi-colon or period (and if latter, s/it/It/).
> +address ranges. Besides, different CPU architectures, and even
> +different implementations of the same architecture have different view
views of
> +how these address ranges defined.
> +
> +All this makes dealing directly with physical memory quite complex and
> +to avoid this complexity a concept of virtual memory was developed.
> +
> +The virtual memory abstracts the details of physical memory from the
virtual memory {system, implementation} abstracts
> +application software, allows to keep only needed information in the
software, allowing the VM to keep only needed information in the
> +physical memory (demand paging) and provides a mechanism for the
> +protection and controlled sharing of data between processes.
> +
> +With virtual memory, each and every memory access uses a virtual
> +address. When the CPU decodes the an instruction that reads (or
> +writes) from (or to) the system memory, it translates the `virtual`
> +address encoded in that instruction to a `physical` address that the
> +memory controller can understand.
> +
> +The physical system memory is divided into page frames, or pages. The
> +size of each page is architecture specific. Some architectures allow
> +selection of the page size from several supported values; this
> +selection is performed at the kernel build time by setting an
> +appropriate kernel configuration option.
> +
> +Each physical memory page can be mapped as one or more virtual
> +pages. These mappings are described by page tables that allow
> +translation from virtual address used by programs to real address in
from a virtual address to {a, the} real address in
> +the physical memory. The page tables organized hierarchically.
tables are organized
> +
> +The tables at the lowest level of the hierarchy contain physical
> +addresses of actual pages used by the software. The tables at higher
> +levels contain physical addresses of the pages belonging to the lower
> +levels. The pointer to the top level page table resides in a
> +register. When the CPU performs the address translation, it uses this
> +register to access the top level page table. The high bits of the
> +virtual address are used to index an entry in the top level page
> +table. That entry is then used to access the next level in the
> +hierarchy with the next bits of the virtual address as the index to
> +that level page table. The lowest bits in the virtual address define
> +the offset inside the actual page.
> +
> +Huge Pages
> +==========
> +
> +The address translation requires several memory accesses and memory
> +accesses are slow relative to CPU speed. To avoid spending precious
> +processor cycles on the address translation, CPUs maintain a cache of
> +such translations called the Translation Lookaside Buffer (or
> +TLB). Usually the TLB is a pretty scarce resource and applications with
> +a large memory working set will experience a performance hit because of
> +TLB misses.
> +
> +Many modern CPU architectures allow mapping of the memory pages
> +directly by the higher levels in the page table. For instance, on x86,
> +it is possible to map 2M and even 1G pages using entries in the second
> +and the third level page tables. In Linux such pages are called
> +`huge`. Usage of huge pages significantly reduces pressure on TLB,
> +improves TLB hit-rate and thus improves overall system performance.
> +
> +There are two mechanisms in Linux that enable mapping of the physical
> +memory with the huge pages. The first one is `HugeTLB filesystem`, or
> +hugetlbfs. It is a pseudo filesystem that uses RAM as its backing
> +store. For the files created in this filesystem the data resides in
> +the memory and is mapped using huge pages. The hugetlbfs is described at
> +:ref:`Documentation/admin-guide/mm/hugetlbpage.rst <hugetlbpage>`.
> +
> +Another, more recent, mechanism that enables use of the huge pages is
> +called `Transparent HugePages`, or THP. Unlike the hugetlbfs that
> +requires users and/or system administrators to configure what parts of
> +the system memory should and can be mapped by the huge pages, THP
> +manages such mappings transparently to the user and hence the
> +name. See
> +:ref:`Documentation/admin-guide/mm/transhuge.rst <admin_guide_transhuge>`
> +for more details about THP.
> +
> +Zones
> +=====
> +
> +Often hardware poses restrictions on how different physical memory
> +ranges can be accessed. In some cases, devices cannot perform DMA to
> +all the addressable memory. In other cases, the size of the physical
> +memory exceeds the maximal addressable size of virtual memory and
> +special actions are required to access portions of the memory. Linux
> +groups memory pages into `zones` according to their possible
> +usage. For example, ZONE_DMA will contain memory that can be used by
> +devices for DMA, ZONE_HIGHMEM will contain memory that is not
> +permanently mapped into kernel's address space and ZONE_NORMAL will
> +contain normally addressed pages.
> +
> +The actual layout of the memory zones is hardware dependent as not all
> +architectures define all zones, and requirements for DMA are different
> +for different platforms.
> +
> +Nodes
> +=====
> +
> +Many multi-processor machines are NUMA - Non-Uniform Memory Access -
> +systems. In such systems the memory is arranged into banks that have
> +different access latency depending on the "distance" from the
> +processor. Each bank is referred as `node` and for each node Linux
is referred to as a `node`
> +constructs an independent memory management subsystem. A node has it's
its
> +own set of zones, lists of free and used pages and various statistics
> +counters. You can find more details about NUMA in
> +:ref:`Documentation/vm/numa.rst <numa>` and in
> +:ref:`Documentation/admin-guide/mm/numa_memory_policy.rst <numa_memory_policy>`.
> +
> +Page cache
> +==========
> +
> +The physical memory is volatile and the common case for getting data
> +into the memory is to read it from files. Whenever a file is read, the
> +data is put into the `page cache` to avoid expensive disk access on
> +the subsequent reads. Similarly, when one writes to a file, the data
> +is placed in the page cache and eventually gets into the backing
> +storage device. The written pages are marked as `dirty` and when Linux
> +decides to reuse them for other purposes, it makes sure to synchronize
> +the file contents on the device with the updated data.
> +
> +Anonymous Memory
> +================
> +
> +The `anonymous memory` or `anonymous mappings` represent memory that
> +is not backed by a filesystem. Such mappings are implicitly created
> +for a program's stack and heap or by explicit calls to the mmap(2) system
> +call. Usually, the anonymous mappings only define virtual memory areas
> +that the program is allowed to access. The read accesses will result
> +in creation of a page table entry that references a special physical
> +page filled with zeroes. When the program performs a write, regular
write, a regular
> +physical page will be allocated to hold the written data. The page
> +will be marked dirty and if the kernel decides to repurpose it,
> +the dirty page will be swapped out.
> +
> +Reclaim
> +=======
> +
> +Throughout the system lifetime, a physical page can be used for storing
> +different types of data. It can be kernel internal data structures,
> +DMA'able buffers for device drivers use, data read from a filesystem,
> +memory allocated by user space processes etc.
> +
> +Depending on the page usage it is treated differently by the Linux
> +memory management. The pages that can be freed at any time, either
> +because they cache the data available elsewhere, for instance, on a
> +hard disk, or because they can be swapped out, again, to the hard
> +disk, are called `reclaimable`. The most notable categories of the
> +reclaimable pages are page cache and anonymous memory.
> +
> +In most cases, the pages holding internal kernel data and used as DMA
> +buffers cannot be repurposed, and they remain pinned until freed by
> +their user. Such pages are called `unreclaimable`. However, in certain
> +circumstances, even pages occupied with kernel data structures can be
> +reclaimed. For instance, in-memory caches of filesystem metadata can
> +be re-read from the storage device and therefore it is possible to
> +discard them from the main memory when system is under memory
> +pressure.
> +
> +The process of freeing the reclaimable physical memory pages and
> +repurposing them is called (surprise!) `reclaim`. Linux can reclaim
> +pages either asynchronously or synchronously, depending on the state
> +of the system. When system is not loaded, most of the memory is free
When {the, a} system
> +and allocation request will be satisfied immediately from the free
requests
or
and an allocation request
> +pages supply. As the load increases, the amount of the free pages goes
> +down and when it reaches a certain threshold (low watermark), an
> +allocation request will awaken the ``kswapd`` daemon. It will
> +asynchronously scan memory pages and either just free them if the data
> +they contain is available elsewhere, or evict to the backing storage
> +device (remember those dirty pages?). As memory usage increases even
> +more and reaches another threshold - min watermark - an allocation
> +will trigger the `direct reclaim`. In this case allocation is stalled
s/the//
> +until enough memory pages are reclaimed to satisfy the request.
> +
> +Compaction
> +==========
> +
> +As the system runs, tasks allocate and free the memory and it becomes
> +fragmented. Although with virtual memory it is possible to present
> +scattered physical pages as virtually contiguous range, sometimes it is
> +necessary to allocate large physically contiguous memory areas. Such
> +need may arise, for instance, when a device driver requires large
requires a large
> +buffer for DMA, or when THP allocates a huge page. Memory `compaction`
> +addresses the fragmentation issue. This mechanism moves occupied pages
> +from the lower part of a memory zone to free pages in the upper part
> +of the zone. When a compaction scan is finished free pages are grouped
> +together at the beginning of the zone and allocations of large
> +physically contiguous areas become possible.
> +
> +Like reclaim, the compaction may happen asynchronously in ``kcompactd``
in the
> +daemon or synchronously as a result of memory allocation request.
of a memory allocation request.
> +
> +OOM killer
> +==========
> +
> +It may happen, that on a loaded machine memory will be exhausted. When
no comma.
> +the kernel detects that the system runs out of memory (OOM) it invokes
> +`OOM killer`. Its mission is simple: all it has to do is to select a
> +task to sacrifice for the sake of the overall system health. The
> +selected task is killed in a hope that after it exits enough memory
> +will be freed to continue normal operation.
thanks for doing this overview.
--
~Randy
Hi Randy,
Thanks for the review! I always have trouble with articles :)
The patch below addresses most of your comments.
On Fri, Jun 01, 2018 at 05:09:38PM -0700, Randy Dunlap wrote:
> On 05/29/2018 04:37 AM, Mike Rapoport wrote:
> > Hi,
> >
> > From 2d3ec7ea101a66b1535d5bec4acfc1e0f737fd53 Mon Sep 17 00:00:00 2001
> > From: Mike Rapoport <[email protected]>
> > Date: Tue, 29 May 2018 14:12:39 +0300
> > Subject: [PATCH] docs/admin-guide/mm: add high level concepts overview
> >
> > The are terms that seem obvious to the mm developers, but may be somewhat
Huh, I'm afraid it's too late to change the commit message :(
> There are [or: These are]
>
> > obscure for, say, less involved readers.
> >
> > The concepts overview can be seen as an "extended glossary" that introduces
> > such terms to the readers of the kernel documentation.
> >
> > Signed-off-by: Mike Rapoport <[email protected]>
> > ---
> > Documentation/admin-guide/mm/concepts.rst | 222 ++++++++++++++++++++++++++++++
> > Documentation/admin-guide/mm/index.rst | 5 +
> > 2 files changed, 227 insertions(+)
> > create mode 100644 Documentation/admin-guide/mm/concepts.rst
> >
> > diff --git a/Documentation/admin-guide/mm/concepts.rst b/Documentation/admin-guide/mm/concepts.rst
> > new file mode 100644
> > index 0000000..291699c
> > --- /dev/null
> > +++ b/Documentation/admin-guide/mm/concepts.rst
[...]
> > +All this makes dealing directly with physical memory quite complex and
> > +to avoid this complexity a concept of virtual memory was developed.
> > +
> > +The virtual memory abstracts the details of physical memory from the
>
> virtual memory {system, implementation} abstracts
>
> > +application software, allows to keep only needed information in the
>
> software, allowing the VM to keep only needed information in the
>
> > +physical memory (demand paging) and provides a mechanism for the
> > +protection and controlled sharing of data between processes.
> > +
My intention was "virtual memory concept allows ... and provides ..."
I didn't want to repeat "concept", so I've just omitted it.
Somehow, I don't feel that "system" or "implementation" fits here...
>
> --
> ~Randy
>
--
Sincerely yours,
Mike.
From 60e74f6ef29789f22555c4fdbbb85215e506f6d0 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <[email protected]>
Date: Mon, 4 Jun 2018 15:09:54 +0300
Subject: [PATCH] docs/admin-guide/mm/concepts.rst: grammar fixups
The patch is mostly about adding 'a' and 'the' and updating indentation.
Suggested-by: Randy Dunlap <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
---
Documentation/admin-guide/mm/concepts.rst | 39 ++++++++++++++++---------------
1 file changed, 20 insertions(+), 19 deletions(-)
diff --git a/Documentation/admin-guide/mm/concepts.rst b/Documentation/admin-guide/mm/concepts.rst
index 291699c..ab7a0f9 100644
--- a/Documentation/admin-guide/mm/concepts.rst
+++ b/Documentation/admin-guide/mm/concepts.rst
@@ -4,13 +4,13 @@
Concepts overview
=================
-The memory management in Linux is complex system that evolved over the
-years and included more and more functionality to support variety of
+The memory management in Linux is a complex system that evolved over the
+years and included more and more functionality to support a variety of
systems from MMU-less microcontrollers to supercomputers. The memory
-management for systems without MMU is called ``nommu`` and it
+management for systems without an MMU is called ``nommu`` and it
definitely deserves a dedicated document, which hopefully will be
eventually written. Yet, although some of the concepts are the same,
-here we assume that MMU is available and CPU can translate a virtual
+here we assume that an MMU is available and a CPU can translate a virtual
address to a physical address.
.. contents:: :local:
@@ -21,10 +21,10 @@ Virtual Memory Primer
The physical memory in a computer system is a limited resource and
even for systems that support memory hotplug there is a hard limit on
the amount of memory that can be installed. The physical memory is not
-necessary contiguous, it might be accessible as a set of distinct
+necessary contiguous; it might be accessible as a set of distinct
address ranges. Besides, different CPU architectures, and even
-different implementations of the same architecture have different view
-how these address ranges defined.
+different implementations of the same architecture have different views
+of how these address ranges defined.
All this makes dealing directly with physical memory quite complex and
to avoid this complexity a concept of virtual memory was developed.
@@ -48,8 +48,9 @@ appropriate kernel configuration option.
Each physical memory page can be mapped as one or more virtual
pages. These mappings are described by page tables that allow
-translation from virtual address used by programs to real address in
-the physical memory. The page tables organized hierarchically.
+translation from a virtual address used by programs to the real
+address in the physical memory. The page tables are organized
+hierarchically.
The tables at the lowest level of the hierarchy contain physical
addresses of actual pages used by the software. The tables at higher
@@ -121,8 +122,8 @@ Nodes
Many multi-processor machines are NUMA - Non-Uniform Memory Access -
systems. In such systems the memory is arranged into banks that have
different access latency depending on the "distance" from the
-processor. Each bank is referred as `node` and for each node Linux
-constructs an independent memory management subsystem. A node has it's
+processor. Each bank is referred as a `node` and for each node Linux
+constructs an independent memory management subsystem. A node has its
own set of zones, lists of free and used pages and various statistics
counters. You can find more details about NUMA in
:ref:`Documentation/vm/numa.rst <numa>` and in
@@ -149,7 +150,7 @@ for program's stack and heap or by explicit calls to mmap(2) system
call. Usually, the anonymous mappings only define virtual memory areas
that the program is allowed to access. The read accesses will result
in creation of a page table entry that references a special physical
-page filled with zeroes. When the program performs a write, regular
+page filled with zeroes. When the program performs a write, a regular
physical page will be allocated to hold the written data. The page
will be marked dirty and if the kernel will decide to repurpose it,
the dirty page will be swapped out.
@@ -181,8 +182,8 @@ pressure.
The process of freeing the reclaimable physical memory pages and
repurposing them is called (surprise!) `reclaim`. Linux can reclaim
pages either asynchronously or synchronously, depending on the state
-of the system. When system is not loaded, most of the memory is free
-and allocation request will be satisfied immediately from the free
+of the system. When the system is not loaded, most of the memory is free
+and allocation requests will be satisfied immediately from the free
pages supply. As the load increases, the amount of the free pages goes
down and when it reaches a certain threshold (high watermark), an
allocation request will awaken the ``kswapd`` daemon. It will
@@ -190,7 +191,7 @@ asynchronously scan memory pages and either just free them if the data
they contain is available elsewhere, or evict to the backing storage
device (remember those dirty pages?). As memory usage increases even
more and reaches another threshold - min watermark - an allocation
-will trigger the `direct reclaim`. In this case allocation is stalled
+will trigger `direct reclaim`. In this case allocation is stalled
until enough memory pages are reclaimed to satisfy the request.
Compaction
@@ -200,7 +201,7 @@ As the system runs, tasks allocate and free the memory and it becomes
fragmented. Although with virtual memory it is possible to present
scattered physical pages as virtually contiguous range, sometimes it is
necessary to allocate large physically contiguous memory areas. Such
-need may arise, for instance, when a device driver requires large
+need may arise, for instance, when a device driver requires a large
buffer for DMA, or when THP allocates a huge page. Memory `compaction`
addresses the fragmentation issue. This mechanism moves occupied pages
from the lower part of a memory zone to free pages in the upper part
@@ -208,13 +209,13 @@ of the zone. When a compaction scan is finished free pages are grouped
together at the beginning of the zone and allocations of large
physically contiguous areas become possible.
-Like reclaim, the compaction may happen asynchronously in ``kcompactd``
-daemon or synchronously as a result of memory allocation request.
+Like reclaim, the compaction may happen asynchronously in the ``kcompactd``
+daemon or synchronously as a result of a memory allocation request.
OOM killer
==========
-It may happen, that on a loaded machine memory will be exhausted. When
+It may happen that on a loaded machine memory will be exhausted. When
the kernel detects that the system runs out of memory (OOM) it invokes
`OOM killer`. Its mission is simple: all it has to do is to select a
task to sacrifice for the sake of the overall system health. The
--
2.7.4
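
An aside that is not part of the patch: the anonymous memory paragraph touched above says that reads from a fresh anonymous mapping see zeroes and that a write gets a regular page to hold the data. A minimal user-space sketch of the visible half of that behavior follows, assuming Python's standard mmap module, where a file descriptor of -1 requests an anonymous mapping. The function name demo_anonymous_mapping is hypothetical. User space cannot observe which physical page backs a read, so this only shows the zero-fill and the write-back, not the page table details.

```python
import mmap


def demo_anonymous_mapping():
    """Read from, then write to, a fresh anonymous mapping (illustrative only)."""
    # fileno=-1 asks for an anonymous mapping, i.e. one not backed by a file.
    with mmap.mmap(-1, mmap.PAGESIZE) as region:
        # A read before any write observes zero-filled memory.
        before = region[:8]
        # The first write is the point at which the kernel has to provide
        # a regular page to hold the data (not observable from here).
        region[:5] = b"hello"
        after = region[:8]
    return before, after


before, after = demo_anonymous_mapping()
print(before, after)
```

The sketch copies the bytes out of the mapping before it is closed, so the returned values stay valid after the with block ends.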
On 06/04/2018 05:22 AM, Mike Rapoport wrote:
> Hi Randy,
>
> Thanks for the review! I always have trouble with articles :)
> The patch below addresses most of your comments.
>
> On Fri, Jun 01, 2018 at 05:09:38PM -0700, Randy Dunlap wrote:
>> On 05/29/2018 04:37 AM, Mike Rapoport wrote:
>>> Hi,
>>>
>>> From 2d3ec7ea101a66b1535d5bec4acfc1e0f737fd53 Mon Sep 17 00:00:00 2001
>>> From: Mike Rapoport <[email protected]>
>>> Date: Tue, 29 May 2018 14:12:39 +0300
>>> Subject: [PATCH] docs/admin-guide/mm: add high level concepts overview
>>>
>>> The are terms that seem obvious to the mm developers, but may be somewhat
>
> Huh, I'm afraid it's too late to change the commit message :(
Sure.
>> There are [or: These are]
>>
>>> obscure for, say, less involved readers.
>>>
>>> The concepts overview can be seen as an "extended glossary" that introduces
>>> such terms to the readers of the kernel documentation.
>>>
>>> Signed-off-by: Mike Rapoport <[email protected]>
>>> ---
>>> Documentation/admin-guide/mm/concepts.rst | 222 ++++++++++++++++++++++++++++++
>>> Documentation/admin-guide/mm/index.rst | 5 +
>>> 2 files changed, 227 insertions(+)
>>> create mode 100644 Documentation/admin-guide/mm/concepts.rst
>>>
>>> diff --git a/Documentation/admin-guide/mm/concepts.rst b/Documentation/admin-guide/mm/concepts.rst
>>> new file mode 100644
>>> index 0000000..291699c
>>> --- /dev/null
>>> +++ b/Documentation/admin-guide/mm/concepts.rst
>
> [...]
>
>>> +All this makes dealing directly with physical memory quite complex and
>>> +to avoid this complexity a concept of virtual memory was developed.
>>> +
>>> +The virtual memory abstracts the details of physical memory from the
>>
>> virtual memory {system, implementation} abstracts
>>
>>> +application software, allows to keep only needed information in the
>>
>> software, allowing the VM to keep only needed information in the
>>
>>> +physical memory (demand paging) and provides a mechanism for the
>>> +protection and controlled sharing of data between processes.
>>> +
>
> My intention was "virtual memory concept allows ... and provides ..."
> I didn't want to repeat "concept", so I've just omitted it.
>
> Somehow, I don't feel that "system" or "implementation" fit here...
OK. Thanks for the update.
> Subject: [PATCH] docs/admin-guide/mm/concepts.rst: grammar fixups
> The patch is mostly about adding 'a' and 'the' and updating indentation.
I would say that it's mostly about improving readability.
Acked-by: Randy Dunlap <[email protected]>
--
~Randy