Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932976AbaJGXc7 (ORCPT ); Tue, 7 Oct 2014 19:32:59 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:39668 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932122AbaJGXUA (ORCPT ); Tue, 7 Oct 2014 19:20:00 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Michal Hocko , David Rientjes , Nishanth Aravamudan , Mel Gorman , Andrew Morton , Linus Torvalds Subject: [PATCH 3.14 16/37] mm: exclude memoryless nodes from zone_reclaim Date: Tue, 7 Oct 2014 16:19:33 -0700 Message-Id: <20141007231827.565961673@linuxfoundation.org> X-Mailer: git-send-email 2.1.2 In-Reply-To: <20141007231827.043235686@linuxfoundation.org> References: <20141007231827.043235686@linuxfoundation.org> User-Agent: quilt/0.63-1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.14-stable review patch. If anyone has any objections, please let me know. ------------------ From: Michal Hocko commit 70ef57e6c22c3323dce179b7d0d433c479266612 upstream. We had a report about strange OOM killer strikes on a PPC machine although there was a lot of swap free and a tons of anonymous memory which could be swapped out. In the end it turned out that the OOM was a side effect of zone reclaim which wasn't unmapping and swapping out and so the system was pushed to the OOM. Although this sounds like a bug somewhere in the kswapd vs. zone reclaim vs. direct reclaim interaction numactl on the said hardware suggests that the zone reclaim should not have been set in the first place: node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 node 0 size: 0 MB node 0 free: 0 MB node 2 cpus: node 2 size: 7168 MB node 2 free: 6019 MB node distances: node 0 2 0: 10 40 2: 40 10 So all the CPUs are associated with Node0 which doesn't have any memory while Node2 contains all the available memory. Node distances cause an automatic zone_reclaim_mode enabling. Zone reclaim is intended to keep the allocations local but this doesn't make any sense on the memoryless nodes. So let's exclude such nodes for init_zone_allows_reclaim which evaluates zone reclaim behavior and suitable reclaim_nodes. Signed-off-by: Michal Hocko Acked-by: David Rientjes Acked-by: Nishanth Aravamudan Tested-by: Nishanth Aravamudan Acked-by: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- mm/page_alloc.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1869,7 +1869,7 @@ static void __paginginit init_zone_allow { int i; - for_each_online_node(i) + for_each_node_state(i, N_MEMORY) if (node_distance(nid, i) <= RECLAIM_DISTANCE) node_set(i, NODE_DATA(nid)->reclaim_nodes); else @@ -4933,7 +4933,8 @@ void __paginginit free_area_init_node(in pgdat->node_id = nid; pgdat->node_start_pfn = node_start_pfn; - init_zone_allows_reclaim(nid); + if (node_state(nid, N_MEMORY)) + init_zone_allows_reclaim(nid); #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP get_pfn_range_for_nid(nid, &start_pfn, &end_pfn); #endif -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/