Date: Fri, 14 Dec 2018 23:17:26 -0500
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com, linux-mm@kvack.org, Andrew Morton, Shakeel Butt,
    Michal Hocko, Johannes Weiner, Tejun Heo, Roman Gushchin
Subject: [PATCH] fork,memcg: fix crash in free_thread_stack on memcg charge fail
Message-ID: <20181214231726.7ee4843c@imladris.surriel.com>

Changeset 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
will result in fork failing if allocating a kernel stack for a task
in dup_task_struct exceeds the kernel memory allowance for that cgroup.

Unfortunately, it also results in a crash.

This is due to the code jumping to free_stack and calling
free_thread_stack when the memcg kernel stack charge fails, but
without tsk->stack pointing at the freshly allocated stack.

This in turn results in the vfree_atomic in free_thread_stack
oopsing with a backtrace like this:

 #5 [ffffc900244efc88] die at ffffffff8101f0ab
 #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86
 #7 [ffffc900244efce0] general_protection at ffffffff818ff082
    [exception RIP: llist_add_batch+7]
    RIP: ffffffff8150d487  RSP: ffffc900244efd98  RFLAGS: 00010282
    RAX: 0000000000000000  RBX: ffff88085ef55980  RCX: 0000000000000000
    RDX: ffff88085ef55980  RSI: 343834343531203a  RDI: 343834343531203a
    RBP: ffffc900244efd98   R8: 0000000000000001   R9: ffff8808578c3600
    R10: 0000000000000000  R11: 0000000000000001  R12: ffff88029f6c21c0
    R13: 0000000000000286  R14: ffff880147759b00  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7
 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37
#10 [ffffc900244efe98] _do_fork at ffffffff810884e0
#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff
#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43
    RIP: 000000000049b948  RSP: 00007ffcdb307830  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000896030  RCX: 000000000049b948
    RDX: 0000000000000000  RSI: 00007ffcdb307790  RDI: 00000000005d7421
    RBP: 000000000067370f   R8: 00007ffcdb3077b0   R9: 000000000001ed00
    R10: 0000000000000008  R11: 0000000000000246  R12: 0000000000000040
    R13: 000000000000000f  R14: 0000000000000000  R15: 000000000088d018
    ORIG_RAX: 000000000000003a  CS: 0033  SS: 002b

The simplest fix is to assign tsk->stack right where it is allocated.
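
For illustration only, here is a minimal user-space sketch of the
failure mode described above. It is not kernel code: struct model_task,
model_alloc_stack() and model_charge_failure() are hypothetical names,
malloc() stands in for the real stack allocator, and the "stale" pointer
stands in for whatever tsk->stack happened to contain before it was
assigned. The sketch assumes the cleanup path only ever looks at
tsk->stack, as the description above implies.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct task_struct: only the stack pointer. */
struct model_task {
	unsigned long *stack;
};

/*
 * Models alloc_thread_stack_node(). With patched != 0 it records the
 * stack in the task at allocation time, which is what the fix does.
 */
static unsigned long *model_alloc_stack(struct model_task *tsk, int patched)
{
	unsigned long *stack = malloc(4096);

	if (stack && patched)
		tsk->stack = stack;
	return stack;
}

/*
 * Models the error path taken when the memcg charge fails.
 * Returns 1 if cleanup would free the real stack, 0 if it would free
 * the stale value left in tsk->stack, -1 if the allocation failed.
 */
static int model_charge_failure(int patched)
{
	static unsigned long stale_word;
	/* Assumption: tsk->stack holds an unrelated value at this point. */
	struct model_task tsk = { .stack = &stale_word };
	unsigned long *stack = model_alloc_stack(&tsk, patched);
	int freed_real_stack;

	if (!stack)
		return -1;

	/* The charge fails here; the cleanup path only sees tsk->stack. */
	freed_real_stack = (tsk.stack == stack);

	free(stack);	/* release the model allocation in either case */
	return freed_real_stack;
}
```

Calling model_charge_failure(0) models the unpatched ordering, where the
cleanup path sees the stale pointer; model_charge_failure(1) models the
patched ordering, where it sees the stack that was just allocated.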

Fixes: 9b6f7e163cd0 ("mm: rework memcg kernel stack accounting")
Cc: Andrew Morton
Cc: Shakeel Butt
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Tejun Heo
Cc: Roman Gushchin
Signed-off-by: Rik van Riel
---
 kernel/fork.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 07cddff89c7b..e2a5156bc9c3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -240,8 +240,10 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 	 * free_thread_stack() can be called in interrupt context,
 	 * so cache the vm_struct.
 	 */
-	if (stack)
+	if (stack) {
 		tsk->stack_vm_area = find_vm_area(stack);
+		tsk->stack = stack;
+	}
 	return stack;
 #else
 	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
@@ -288,7 +290,10 @@ static struct kmem_cache *thread_stack_cache;
 static unsigned long *alloc_thread_stack_node(struct task_struct *tsk,
 						  int node)
 {
-	return kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
+	unsigned long *stack;
+	stack = kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
+	tsk->stack = stack;
+	return stack;
 }
 
 static void free_thread_stack(struct task_struct *tsk)
-- 
All rights reversed.