Date: Thu, 13 Feb 2014 14:41:04 -0800 (PST)
From: David Rientjes
To: Raghavendra K T
cc: Andrew Morton, Fengguang Wu, David Cohen, Al Viro, Damien Ramonda,
    Jan Kara, Linus Torvalds, Nishanth Aravamudan, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH V5] mm readahead: Fix readahead fail for no local
    memory and limit readahead pages
In-Reply-To: <52FC98A6.1000701@linux.vnet.ibm.com>
References: <1390388025-1418-1-git-send-email-raghavendra.kt@linux.vnet.ibm.com>
    <20140206145105.27dec37b16f24e4ac5fd90ce@linux-foundation.org>
    <20140206152219.45c2039e5092c8ea1c31fd38@linux-foundation.org>
    <52F4B8A4.70405@linux.vnet.ibm.com>
    <52F88C16.70204@linux.vnet.ibm.com>
    <52F8C556.6090006@linux.vnet.ibm.com>
    <52FC6F2A.30905@linux.vnet.ibm.com>
    <52FC98A6.1000701@linux.vnet.ibm.com>

On Thu, 13 Feb 2014, Raghavendra K T wrote:

> Thanks David, unfortunately even after applying that patch, I do not see
> the improvement.
>
> Interestingly numa_mem_id() seem to still return the value of a
> memoryless node.
> May be per cpu _numa_mem_ values are not set properly. Need to dig out ....
>

I believe ppc will be relying on __build_all_zonelists() to set numa_mem_id()
to be the proper node, and that relies on the ordering of the zonelist
built for the memoryless node.
It would be very strange if local_memory_node() is returning a memoryless
node because it is the first zone for node_zonelist(GFP_KERNEL) (why would
a memoryless node be on the zonelist at all?).

I think the real problem is that build_all_zonelists() is only called at
init when the boot cpu is online, so it's only setting numa_mem_id()
properly for the boot cpu.  Does it return a node with memory if you
toggle /proc/sys/vm/numa_zonelist_order?  Do

	echo node > /proc/sys/vm/numa_zonelist_order
	echo zone > /proc/sys/vm/numa_zonelist_order
	echo default > /proc/sys/vm/numa_zonelist_order

and check if it returns the proper value at either point.  This will force
build_all_zonelists() to run again and numa_mem_id() to point to the proper
node since all cpus are now online.

So the prerequisite for CONFIG_HAVE_MEMORYLESS_NODES is that there is an
arch-specific set_numa_mem() that makes this mapping correct like ia64
does.  If that's the case, then (1) it's completely undocumented and (2)
Nishanth's patch is incomplete because anything that adds
CONFIG_HAVE_MEMORYLESS_NODES needs to do the proper set_numa_mem() for it
to be any different than numa_node_id().
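
The toggle test above could be wrapped in a small helper so the check runs after each write. This is only a sketch: ZONELIST_ORDER and toggle_zonelist_order are names made up for illustration, and whatever reports the value of numa_mem_id() (for example a debug printk read back through dmesg) is assumed to be supplied by the caller as the check command. Writing to the real sysctl needs root.

```sh
#!/bin/sh
# Sysctl path taken from the mail. ZONELIST_ORDER is a hypothetical
# override so the helper can be dry-run against a scratch file.
ZONELIST_ORDER=${ZONELIST_ORDER:-/proc/sys/vm/numa_zonelist_order}

# Write each ordering in turn; after every write, run the caller's
# check command (if one was given) so the results can be compared.
toggle_zonelist_order() {
    for order in node zone default; do
        echo "$order" > "$ZONELIST_ORDER" || return 1
        echo "after '$order':"
        if [ "$#" -gt 0 ]; then
            "$@"
        fi
    done
}
```

Something like `toggle_zonelist_order sh -c 'dmesg | tail -n 5'` would then show the most recent kernel messages after each of the three writes, assuming the kernel has been instrumented to log the per-cpu value.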