2019-08-29 16:36:30

by Michal Hocko

[permalink] [raw]
Subject: [PATCH] mm, oom: consider present pages for the node size

From: Michal Hocko <[email protected]>

constrained_alloc calculates the size of the oom domain by using
node_spanned_pages which is incorrect because this is the full range of
the physical memory range that the numa node occupies rather than the
memory that backs that range which is represented by node_present_pages.

Sparsely populated nodes (e.g. after memory hot remove or simply sparse
due to memory layout) can have really a large difference between the
two. This shouldn't really cause any real user observable problems
because the oom calculates a ratio against totalpages and used memory
cannot exceed present pages but it is confusing and wrong from code
point of view.

Noticed-by: David Hildenbrand <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
mm/oom_kill.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index eda2e2a0bdc6..16af3da97d08 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -287,7 +287,7 @@ static enum oom_constraint constrained_alloc(struct oom_control *oc)
!nodes_subset(node_states[N_MEMORY], *oc->nodemask)) {
oc->totalpages = total_swap_pages;
for_each_node_mask(nid, *oc->nodemask)
- oc->totalpages += node_spanned_pages(nid);
+ oc->totalpages += node_present_pages(nid);
return CONSTRAINT_MEMORY_POLICY;
}

@@ -300,7 +300,7 @@ static enum oom_constraint constrained_alloc(struct oom_control *oc)
if (cpuset_limited) {
oc->totalpages = total_swap_pages;
for_each_node_mask(nid, cpuset_current_mems_allowed)
- oc->totalpages += node_spanned_pages(nid);
+ oc->totalpages += node_present_pages(nid);
return CONSTRAINT_CPUSET;
}
return CONSTRAINT_NONE;
--
2.20.1


2019-08-29 17:02:23

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH] mm, oom: consider present pages for the node size

On 29.08.19 18:34, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> constrained_alloc calculates the size of the oom domain by using
> node_spanned_pages which is incorrect because this is the full range of
> the physical memory range that the numa node occupies rather than the
> memory that backs that range which is represented by node_present_pages.
>
> Sparsely populated nodes (e.g. after memory hot remove or simply sparse
> due to memory layout) can have really a large difference between the
> two. This shouldn't really cause any real user observable problems
> because the oom calculates a ratio against totalpages and used memory
> cannot exceed present pages but it is confusing and wrong from code
> point of view.
>
> Noticed-by: David Hildenbrand <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> mm/oom_kill.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index eda2e2a0bdc6..16af3da97d08 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -287,7 +287,7 @@ static enum oom_constraint constrained_alloc(struct oom_control *oc)
> !nodes_subset(node_states[N_MEMORY], *oc->nodemask)) {
> oc->totalpages = total_swap_pages;
> for_each_node_mask(nid, *oc->nodemask)
> - oc->totalpages += node_spanned_pages(nid);
> + oc->totalpages += node_present_pages(nid);
> return CONSTRAINT_MEMORY_POLICY;
> }
>
> @@ -300,7 +300,7 @@ static enum oom_constraint constrained_alloc(struct oom_control *oc)
> if (cpuset_limited) {
> oc->totalpages = total_swap_pages;
> for_each_node_mask(nid, cpuset_current_mems_allowed)
> - oc->totalpages += node_spanned_pages(nid);
> + oc->totalpages += node_present_pages(nid);
> return CONSTRAINT_CPUSET;
> }
> return CONSTRAINT_NONE;>

Thanks!

Reviewed-by: David Hildenbrand <[email protected]>

--

Thanks,

David / dhildenb

2019-08-29 22:21:21

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH] mm, oom: consider present pages for the node size

On Thu, 29 Aug 2019, Michal Hocko wrote:

> From: Michal Hocko <[email protected]>
>
> constrained_alloc calculates the size of the oom domain by using
> node_spanned_pages which is incorrect because this is the full range of
> the physical memory range that the numa node occupies rather than the
> memory that backs that range which is represented by node_present_pages.
>
> Sparsely populated nodes (e.g. after memory hot remove or simply sparse
> due to memory layout) can have really a large difference between the
> two. This shouldn't really cause any real user observable problems
> because the oom calculates a ratio against totalpages and used memory
> cannot exceed present pages but it is confusing and wrong from code
> point of view.
>
> Noticed-by: David Hildenbrand <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>

Acked-by: David Rientjes <[email protected]>