Message-ID: <52E72083.4090703@cn.fujitsu.com>
Date: Tue, 28 Jan 2014 11:14:11 +0800
From: Tang Chen
To: Dave Jones, David Rientjes, tglx@linutronix.de, mingo@redhat.com,
    hpa@zytor.com, akpm@linux-foundation.org, zhangyanfei@cn.fujitsu.com,
    guz.fnst@cn.fujitsu.com, x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] numa, mem-hotplug: Fix stack overflow in numa when seting kernel nodes to unhotpluggable.
References: <1390456168-28259-1-git-send-email-tangchen@cn.fujitsu.com>
    <52E70165.8070709@cn.fujitsu.com> <20140128025537.GA21730@redhat.com>
In-Reply-To: <20140128025537.GA21730@redhat.com>

On 01/28/2014 10:55 AM, Dave Jones wrote:
> On Tue, Jan 28, 2014 at 09:01:25AM +0800, Tang Chen wrote:
> > On 01/28/2014 08:32 AM, David Rientjes wrote:
> > > On Wed, 22 Jan 2014, David Rientjes wrote:
> > >
> > >>>  arch/x86/mm/numa.c | 2 +-
> > >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > >>> index 81b2750..ebefeb7 100644
> > >>> --- a/arch/x86/mm/numa.c
> > >>> +++ b/arch/x86/mm/numa.c
> > >>> @@ -562,10 +562,10 @@ static void __init numa_init_array(void)
> > >>>          }
> > >>>      }
> > >>>
> > >>> +static nodemask_t numa_kernel_nodes __initdata;
> > >>>  static void __init numa_clear_kernel_node_hotplug(void)
> > >>>  {
> > >>>      int i, nid;
> > >>> -    nodemask_t numa_kernel_nodes;
> > >>>      unsigned long start, end;
> > >>>      struct memblock_type *type = &memblock.reserved;
> > >>>
> > >>
> > >> Isn't this also a bugfix since you never initialize numa_kernel_nodes when
> > >> it's allocated on the stack with NODE_MASK_NONE?
> > >>
> > >
> > > This hasn't been answered and the patch still isn't in linux-kernel yet
> > > Dave tested it as good. I'm suspicious of the changelog that indicates
> > > this nodemask is the result of a stack overflow itself which only manages
> > > to reproduce itself in the init patch slightly more than 50% of the time.
> > > How is that possible?
> > >
> > > I think the changelog should indicate this also fixes an uninitialized
> > > nodemask issue.
> >
> > Hi David,
> >
> > I'm still working on this problem, but unfortunately nothing new for now.
> > And the test till now shows no more problem here.
> >
> > I'm digging into it, but need more time.
> >
> > I'll resend a new patch and modify the changelog soon. Before we find the
> > root cause, I think we can use this patch as a temporary solution.
>
> Ok, I hit the 2nd bug again (oops in next_zones_zonelist...)
>
> I did a bisect with the patch above applied each step of the way.
> This time I got a plausible looking result....
>
>
> a0acda917284183f9b71e2d08b0aa0aea722b321 is the first bad commit
> commit a0acda917284183f9b71e2d08b0aa0aea722b321
> Author: Tang Chen
> Date:   Tue Jan 21 15:49:32 2014 -0800
>
>     acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable
>
>
> Reverting this commit of course removes the whole function from above,
> so we haven't really learned anything new, other than that commit is broken,
> even after the above fix-up.

If we revert this commit, memory hot-remove won't be able to work.
Let's try to fix it before the merge window closes.

>
>     Dave
>
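
For anyone following the uninitialized-nodemask point David raised in the quoted thread, here is a minimal userspace sketch of why moving the mask from the stack to static storage also changes its initial contents. It is only an illustration under stated assumptions: demo_nodemask_t, DEMO_MAX_NODES and the demo_* helpers are hypothetical stand-ins, not the kernel's nodemask API, and the sketch does not attempt to reproduce the reported oops. In C, an object with static storage duration starts out zeroed, while an automatic (stack) object holds indeterminate bits until it is explicitly cleared.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a node bitmap; not the kernel's nodemask_t. */
#define DEMO_MAX_NODES 1024u
#define DEMO_WORD_BITS (8u * (unsigned)sizeof(unsigned long))
#define DEMO_WORDS ((DEMO_MAX_NODES + DEMO_WORD_BITS - 1u) / DEMO_WORD_BITS)

typedef struct {
    unsigned long bits[DEMO_WORDS];
} demo_nodemask_t;

/* Set the bit for node nid. */
static void demo_node_set(unsigned nid, demo_nodemask_t *m)
{
    assert(nid < DEMO_MAX_NODES);
    m->bits[nid / DEMO_WORD_BITS] |= 1UL << (nid % DEMO_WORD_BITS);
}

/* Return 1 if the bit for node nid is set, 0 otherwise. */
static int demo_node_isset(unsigned nid, const demo_nodemask_t *m)
{
    assert(nid < DEMO_MAX_NODES);
    return (int)((m->bits[nid / DEMO_WORD_BITS] >> (nid % DEMO_WORD_BITS)) & 1UL);
}

/* Count the set bits in the whole mask. */
static int demo_weight(const demo_nodemask_t *m)
{
    int n = 0;
    for (unsigned i = 0; i < DEMO_WORDS; i++) {
        unsigned long w = m->bits[i];
        while (w) {          /* clear the lowest set bit each round */
            w &= w - 1;
            n++;
        }
    }
    return n;
}

/* Static storage duration: the C language guarantees this starts all-zero. */
static demo_nodemask_t static_mask;

/* Mark two nodes in the static mask and report how many are set. */
static int mark_static(void)
{
    demo_node_set(0, &static_mask);
    demo_node_set(3, &static_mask);
    return demo_weight(&static_mask);
}

/* Same work with a stack mask: it must be cleared before use, because an
 * automatic object has indeterminate contents until it is written. */
static int mark_stack(void)
{
    demo_nodemask_t m;
    memset(&m, 0, sizeof(m)); /* plays the role of an explicit "empty mask" init */
    demo_node_set(0, &m);
    demo_node_set(3, &m);
    return demo_weight(&m);
}
```

Calling mark_static() and mark_stack() from a small main() should report the same count in both cases. Without the memset() in mark_stack(), the stack variant would start from indeterminate bits, so stray nodes could appear set, which is the kind of side effect a changelog would want to mention.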