Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758001Ab2K0EzY (ORCPT ); Mon, 26 Nov 2012 23:55:24 -0500 Received: from mail-vc0-f174.google.com ([209.85.220.174]:45023 "EHLO mail-vc0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757957Ab2K0EzW (ORCPT ); Mon, 26 Nov 2012 23:55:22 -0500 MIME-Version: 1.0 In-Reply-To: <50B4145C.3010406@gmail.com> References: <1352721091-27022-1-git-send-email-walken@google.com> <50A16212.8090507@gmail.com> <50B4145C.3010406@gmail.com> Date: Mon, 26 Nov 2012 20:55:21 -0800 Message-ID: Subject: Re: [PATCH 0/3] fix missing rb_subtree_gap updates on vma insert/erase From: Michel Lespinasse To: Sasha Levin Cc: Andrew Morton , Rik van Riel , Hugh Dickins , linux-kernel@vger.kernel.org, linux-mm@kvack.org Content-Type: text/plain; charset=ISO-8859-1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5491 Lines: 106 On Mon, Nov 26, 2012 at 5:16 PM, Sasha Levin wrote: > I've built today's -next, and got the following BUG pretty quickly (2-3 hours): > > [ 1556.479284] BUG: unable to handle kernel paging request at 0000000000412000 > [ 1556.480036] IP: [] validate_mm+0x34/0x130 > [ 1556.480036] PGD 31739067 PUD 4fbc4067 PMD 1c936067 PTE 0 > [ 1556.480036] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > [ 1556.480036] Dumping ftrace buffer: > [ 1556.480036] (ftrace buffer empty) > [ 1556.480036] CPU 0 > [ 1556.480036] Pid: 10274, comm: trinity-child29 Tainted: G W 3.7.0-rc6-next-20121126-sasha-00015-gb04382b-dirty #201 > [ 1556.480036] RIP: 0010:[] [] validate_mm+0x34/0x130 > [ 1556.480036] RSP: 0018:ffff88004fbc7d08 EFLAGS: 00010206 > [ 1556.480036] RAX: 0000000000412000 RBX: 0000000000000000 RCX: 0000000000000000 > [ 1556.512120] RDX: 0000000000000000 RSI: ffff88001c1a6008 RDI: ffff88001c1a6000 > [ 1556.512120] RBP: ffff88004fbc7d38 R08: ffff8800371e7808 R09: ffff88004fb56cf0 > [ 1556.512120] R10: 0000000000000001 R11: 0000000000001000 R12: ffff88001c1a6000 > [ 1556.512120] R13: ffff8800371e7b00 R14: 0000000000000000 R15: ffff88001c1a6000 > [ 1556.512120] FS: 00007f4e0f8e3700(0000) GS:ffff8800bfc00000(0000) knlGS:0000000000000000 > [ 1556.512120] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 1556.512120] CR2: 0000000000412000 CR3: 000000002faec000 CR4: 00000000000406f0 > [ 1556.512120] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 1556.512120] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 1556.512120] Process trinity-child29 (pid: 10274, threadinfo ffff88004fbc6000, task ffff88004fbb0000) > [ 1556.512120] Stack: > [ 1556.512120] ffff8800bf80aa80 ffff88001c1a6000 ffff88004fb56cf0 ffff8800371e7818 > [ 1556.512120] ffff8800371e7808 ffff88001c1a6000 ffff88004fbc7d88 ffffffff8123843c > [ 1556.512120] 0000000000000001 ffff88004fb56da8 ffff880000000000 ffff8800371e7818 > [ 1556.512120] Call Trace: > [ 1556.512120] [] vma_link+0xcc/0xf0 > [ 1556.512120] [] mmap_region+0x40c/0x5a0 > [ 1556.512120] [] do_mmap_pgoff+0x2ab/0x310 > [ 1556.512120] [] ? vm_mmap_pgoff+0x6c/0xb0 > [ 1556.512120] [] vm_mmap_pgoff+0x84/0xb0 > [ 1556.512120] [] sys_mmap_pgoff+0x193/0x1a0 > [ 1556.512120] [] ? trace_hardirqs_on_caller+0x118/0x140 > [ 1556.512120] [] sys_mmap+0x1d/0x20 > [ 1556.512120] [] tracesys+0xe1/0xe6 > [ 1556.512120] Code: 31 f6 41 55 41 54 49 89 fc 53 31 db 48 83 ec 08 4c 8b 2f 4d 85 ed 74 75 0f 1f 80 00 00 00 00 49 8b 85 88 00 00 > 00 48 85 c0 74 0e <48> 8b 38 31 f6 48 83 c7 08 e8 0e bc a4 02 49 8b 45 78 4d 8d 7d > [ 1556.512120] RIP [] validate_mm+0x34/0x130 > [ 1556.512120] RSP > [ 1556.512120] CR2: 0000000000412000 > [ 1557.729958] ---[ end trace d2a29e98cc9e2568 ]--- > > The bit that's failing is: > > struct vm_area_struct *vma = mm->mmap; // mm->mmap = 0x412000 > while (vma) { > struct anon_vma_chain *avc; > vma_lock_anon_vma(vma); // BOOM! > > > Thanks, > Sasha Thanks for the report. I believe we actually have mm->mmap = ffff8800371e7b00 (r13); That first vma has its anon_vma field pointing to 0000000000412000 (rax) and we fault while trying to read anon_vma->root. I don't see what could cause vma->anon_vma to be an invalid non-null value. This looks very much like there might be some kind of memory corruption occuring, but I can't tell where it would come from. Going back into your previous reports, we also never really identified the root cause of your two reports at the start of the thread with subject: "mm: NULL ptr deref in anon_vma_interval_tree_verify" (on Oct 18th and Oct 25th). At some point we thought that taking the anon_vma lock in validate_mm would prevent a race and fix the issue, but further inspection convinced us that this shouldn't be necessary - so, in the end, we still don't know what caused us to crash in these two cases either (and, I was tempted to suggest memory corruption at the time too). So, I'm not sure what to do. One thing to keep in mind is that CONFIG_DEBUG_VM_RB was only recently introduced (between v3.6 and v3.7-rc1); before that the code existed only as an #ifdef in mm/mmap.c. So, it might help if you could run your trinity test on these few kernel versions: - first, on v3.6, after editing mm/mmap.c to replace #undef DEBUG_MM_RB with #define DEBUG_MM_RB; (if this fails here, then what we have is a latent bug that CONFIG_DEBUG_VM_RB just happened to reveal) - then, on v3.7-rc1 with no further changes (if this fails here, then the issue is likely with my rbtree intervals stuff that made it into v3.7-rc1) - finally, on the akpm-base branch in linux-next tree (if this fails here, then the issue may be some corruption caused by one of the trees other than -mm) Sorry to request this; I'm really not sure what else to try at this point :/ -- Michel "Walken" Lespinasse A program is never fully debugged until the last user dies. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/